# Motivating Category Theory for Haskell for Non-mathematicians

**The Issue: Why Study Category Theory?
**

Two days ago, there was another interesting post by Andrzej Jaworski, entitled “[Haskell-cafe] Mathematics for Uninterested,” dated “Sat, 17 Jan 2009 05:15:25 +0100,” on the Haskell-cafe mailing list on the Haskell programming language, offering an explanation of why mathematics is important for learning Haskell. In particular, he wrote:

MATHEMATICS_IS_ABOUT_SIMPLIFYING_THINGS.

MATHEMATICS = MATHEMATICAL_KNOWLEDGE + FORMAL_METHODS_of_THINKING

You need only the second part for Haskell.

He then continued:

… Category Theory generalizes our mathematical experience along the line that we know

more that we can prove.

Nevertheless, many non-mathematicians continue to have much difficulty in motivating study of category theory. Some regard it as too theoretical. Others regard it as too abstract. And still others regard it as too dry.

While I cannot argue too much against the second point (indeed, category theory has been nicknamed “abstract nonsense,” although not necessarily pejoratively), and cannot truly deny the first for those who do not prefer theory, I have something to say from personal experience, as a student who overcame math phobia to appreciate beauty in mathematics, about the third.

Each student learns differently. Some learn best by seeing examples. Some learn best by studying theory. Still others learn best by actually solving problems.

**Case Study: My Personal Experience in Approaching Mathematics**

Several years ago, when I was still living in Manhattan, I read an article in a local newsletter about how different students had different learning styles in primary (and possibly secondary) school, and how putting different students into the wrong type of class inhibited their learning. In particular, the article discussed how most students in the classes could be grouped into either of two different learning styles: those who learned by example, and those who learned by theory. Students who learned by example needed to see concrete examples of the mathematics in action in order to infer the reasoning; conversely, students who learned by theory learned best by studying the general theory from which to deduce specific applications. Neither style was correct; they were simply different, and suited different types of students.

The same can probably be said for learning category theory. Personally, I learn best from a combination of first studying theory, and actually applying the theory to incrementally more sophisticated problems. Without the theory, I cannot understand the examples, but without actually applying the theory, I tend to forget about it very quickly. Applying the theory, especially in a creative manner, makes it stick in my brain.

However, although I eventually was able to major in computer science, getting there was a long, arduous, grueling odyssey. The most difficult part was conquering a required course in the design and analysis of algorithms. I had come to college from a self-educated background, studying on my own since fifth grade in elementary school without any teacher, using only books, because of family circumstances. In the end, I wound up studying throughout my secondary school years at home.

However, my interests had been in writing and multimedia applications. As a result, prior to coming to college, while I had had extensive practice in writing, I had had minimal exposure to mathematics. My first exposure to college-level mathematics resulted in an outbreak of math phobia.

Nevertheless, I still wished to study something related to computers in college. Because the school happened to be a research-oriented school with a theoretically-oriented faculty, the only computer-related topic available happened to be computer science, and the focus was on theory.

Theoretical computer science happens to have a heavy focus on mathematics. However, I happened to be struggling with math phobia. I needed to find something interesting about mathematics if I was to remain in computer science.

Fortunately, one very generous mathematics student offered to tutor me *gratis *once per week over one summer. He gave me what he termed “an elementary course from an advanced viewpoint.” He started out from axiomatic set theory, focusing on lemmas and proofs of theorems, and gave me an intuitive grasp of such elementary concepts as set comprehension, power sets, ordered pairs, and Cartesian products. I later learned about Russell’s paradox, and found it rather interesting. At the end of the summer, I was able to take and pass an undergraduate course at my college in axiomatic set theory. Then I discovered *Gödel, Escher, Bach: an Eternal Golden Braid*, was fascinated, and the next semester, I proceeded on to pass recursive function theory. Then I took a course in automata theory. Then I took a course (albeit in the philosophy department) in meta-logic. Then I took a graduate-level course in recursion equations (domain theory).

After then taking a year’s leave of absence to study discrete mathematics for computer science outside of my official curriculum, I was able to return and eventually pass the required course in the design and analysis of algorithms. The next semester, I then took an additional course in advanced algorithms, which turned out to be significantly less of a hassle than the first course in algorithms (the exercises were more abstract, but the workload was considerably less, so I went to the library and did some research for each problem set). One of my favorite courses later wound up being one in formal semantics of programming languages, where I was exposed to this fascinating creature, the lambda calculus.

Yet somehow along the way … I forgot how to program well. I had to struggle through another required course in systems programming in C, and this time *acquired *programming phobia, which I had not had before. Oh dear, I had forgotten how to use pointers in coding linked lists! I only recovered after auding a course using Scheme to teach computer science under a professor who also taught Haskell. Then I audited another course under this professor in Haskell. Thank goodness! No more pointers, no more memory management! Instead, higher-order functions!

**A Short, Thin, Elementary, Yet Influential Book and Its Approach**

One of the most influential books in my struggle to conquer algorithms was a most unassuming, very thin, informal book called *Naive Set Theory*, by Paul R. Halmos. This book had no explicit exercises marked as such, and the treatment was almost conversational in tone, yet almost every sentence was an implicit exercise in abstract thinking. Each chapter was only three to five pages long, with almost no equations, and the entire book was only about a hundred pages long. The covers together were barely thicker combined than the text. Yet it still covered the following topics:

the Axiom of Extension

the Axiom of Specification

unordered pairs

unions and intersections

complements and powers

unordered pairs

relations

functions

families

inverses and composites

numbers

the Peano Axioms

arithmetic

order

the Axiom of Choice

Zorn’s Lemma

well ordering

transfinite recursion

ordinal numbers

sets of ordinal numbers

ordinal arithmetic

the Schröder-Bernstein Theorem

countable sets

cardinal arithmetic

cardinal numbers

Imagine covering an entire topic in mathematics with almost no equations, and no explict exercises, and only three to five pages per chapter! This was easily one of the shortest books I ever read in college, yet one of the most influential: Not only did it resolve the math phobia, but it also imbued me with an appreciation for elegant proofs, and hence for beauty in mathematics.

**A Related Topic: Computability Theory
**

Another experience that helped in my endeavor was an independent seminar with one researcher in automata theory (it was actually this experience which eventually led to my auditing a course in this subject). My employer for one summer was a programmer who had graduated from the same university. He helped me set up an independent once-a-week seminar (actually a private tutorial) covering the book *Elements of the Theory of Computation* (first hardcover edition), by Harry R. Lewis and Christopher Papadimitriou. I would read the book, collect questions, and bring them to the seminar every week. This book covered such topics as computability, the diagonalization lemma, unsolvability, finite state automata, Turing machines, mathematical logic, and other related topics. What I liked about this book was the aspect that, although it gave a very formal, abstract treatment to computability theory, otherwise it did not assume any specific knowledge of any area of mathematics.

I think that a book similar to the above two for category theory would be highly useful, at least for students with the same learning pattern as myself. For such students, I believe that what is most important is the intuition; the formalism can come later. Mathematics is not about formalism; it is about insight: insight into patterns and structure, and the beauty that comes with such patterns. While a book with many exercises would probably be necessary at some point, for a first course in category theory for a non-mathematician, an approach similar to Halmos’ would seem most effective.

**Set Theory vs. Category Theory: The Conflict in Approach**

The only problem with recommending a book on set theory to a student of category theory is that the approaches do not overlap. Set theory studies what is inside sets; category theory studies what is outside them.

Specifically, set theory starts out by defining the empty set, then defining basic axioms, and then defining certain operations on sets and functions operating on sets in leading up to lemmas and theorems about properties of sets. However, the focus is essentially on what is inside the sets.

However, category theory ignores what is inside, and instead focuses on structure-preserving mappings (known as “homomorphisms”) between different sets having a specific structure. These homomorphisms lead to theorems about the structure of the sets. Thus, it is these mappings in category theory that are the focus, not explicitly what is inside the sets. This is the origin of the nickname of “abstract nonsense” in reference to category theory.

Therefore, there is a risk that exposing a new student of Haskell first to set theory could actually confuse the student upon subsequent exposure to category theory. Therefore, I might substitute instead an elementary introduction to category theory with an informal, somewhat conversational tone, such as the following:

*Conceptual Mathematics: A First Introduction to Categories*

by F. William Lawvere and Stephen Hoel Schanuel

Cambridge: Cambridge University Press, 1997

As discussed in my blog entry “Learning Haskell through Category Theory, and Adventuring in Category Land: Like Flatterland, Only About Categories,”this is an elementary introduction to category theory, with a conversational tone, using such examples as Galileo and the flight of a bird for illustration.

**Advanced Topics for Further Study
**

Once the student has had a taste of category theory, there are two possible alternatives: Either broaden the topics, or deepen them. The latter has already been discussed in my above-mentioned blog post “Learning Haskell through Category Theory, and Adventuring in Category Land: Like Flatterland, Only About Categories“; here I discuss the former.

As mentioned above, **automata theory** is one possibility for advanced study for students who have acquired a taste for theory and wish to explore more:

*Elements of the Theory of Computation* (first hardcover edition)

by Harry R. Lewis and Christopher Papadimitriou

New Jersey: Prentice Hall, Inc., June 1981

As discussed above, a formal, rigorous treatment of the theory of computation. The book assumes the ability to reason abstractly, but does not assume familiarity with any specific areas of mathematics. Most readers either love or hate this book, depending on whether they are theoretically-inclined.

Another possibility is **recursive function theory**:

*Computability, Complexity, and Languages, Second Edition: Fundamentals of Theoretical Computer Science (Computer Science and Scientific Computing)*

by Martin Davis, Ron Sigal, and Elaine J. Weyuker

San Francisco, California: Morgan Kaufmann Publishers, 1994

This book, like *Elements of the Theory of Computation*, is formal, but does not assume any specific background in mathematics. One of the authors of this book, Ron Sigal, used to be my professor for set theory, recursive function theory, and recursion equations (domain theory) in college. He was friendly and approachable, had an exceptionally clear teaching style, and usually spent the first class in an interesting lecture discussing the historical background of the subject.

Still another possibility is my nemesis, the **design and analysis of algorithms**:

*Compared to What?: An Introduction to the Analysis of Algorithms (Principles of Computer Science Series)*

by Gregory J.E. Rawlins

New York: W. H. Freeman, November 15, 1991

This was my favorite book on algorithms in college, and played an instrumental role in allowing me to pass this subject. Without this book, I probably would have had a much more difficult time in clearing this subject. This book approaches each algorithm from the perspective of how the algorithm is arrived at, rather than just handing the solution ready-made, interspersing numerous entertaining illustrations and quotes from works by Lewis Carroll. I further wrapped this book in kaleidoscopic bookwrap to strengthen the feeling of being in algorithm wonderland while reading this book, and the effect worked like a charm: I learned the algorithms, finally passed the course, and completed the major.

*Introduction to Algorithms, Second Edition*

by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein

Cambridge, Massachusetts: The MIT Press, September 1, 2001

This is the second edition of a classic work in the subject. It provides a rigorous theoretical treatment of algorithms. The first edition of this book forced the author of this blog to undertake a seemingly never-ending journey through set theory, discrete mathematics, recursive function theory, automata theory, meta-logic, and recursion equations (domain theory) in order to clear the topics covered by this book. I haven’t read the second edition yet, but the first edition was very complete and thorough, and was basically a mathematical encyclopedia on algorithms, complete with a brief introduction to prerequisite topics, detailed development of proofs, numerous exercises that led to other parts of the main text, numerous starred exercises of special difficulty (my professor hired a student who did all the starred exercises on his problem sets as a summer research intern), detailed pseudocode, analysis of the pseudocode, proofs of the analysis of the pseudocode, and highly detailed illustrations covering details of the algorithms and their proofs. Of course, this book has no solutions. Hey, who needs solutions? Nobody said you needed sleep, right?

*The Art of Computer Programming* (currently, Volumes 1 to 3 and Fascicles 0 to 4 of Volume 4, with Volume 5 planned for release in 2015, and Volumes 6 and 7 planned for later release)

by Donald E. Knuth

Reading, Massachusetts: Addison-Wesley, 1997 to 2006 and pending

The original classic work on the design and analysis of algorithms. Still in progress. Originally begun in 1962, written by compiler expert Donald E. Knuth. Volume 1 covers fundamental algorithms, volume 2 covers seminumerical algorithms, volume 3 covers sorting and searching, volume 4 covers combinatorical algorithms, volume 5 is scheduled to cover syntactic algorithms, volume 6 is scheduled to cover the theory of context-free languages, and volume 7 is scheduled to cover compiler techniques. Exercises range in difficulty from “warm-up” exercises to research problems.

One last topic is **algebra**, for which there is reputedly a very approachable work focusing on a limited scope of topics and exploring them to great depth:

*Topics in Algebra*

by I. N. Herstein

Xerox Corporation, 1975

Although I haven’t read this book, it has earned very high reviews for exposition and flow. The book is reputedly designed so that a few basic ideas are focused on, and problems in them explored, so that each new topic flows seemlessly from the previous one.

**Conclusion**

Let students take the book on the subway, train, or bus, and have them think about the categories in their sleep! Gradually, their curiosity will be piqued, and they will understand categories. Then they can read a second book with many exercises. But while they still have curiosity, let them at least visualize and juggle the categories in many ways in their heads before forcing them off! The more easily they can digest their first book, the more likely their curiosity will stay.

Nice post, I definitely identify with coming to CS from a “non-mathy” background–only after college did I overcome my fear of math and start to really enjoy it. Now I wish I had a younger brain so I could learn these things more easily!

I’m not sure I agree that set theory and category theory don’t overlap. In fact, I believe CT completely subsumes ST, insofar as any statement you can make in set theory can be “translated” to CT. The big idea with category theory is that the “contents of an object” are unimportant, because any interesting thing we might have to say about them can be completely expressed using only maps between objects.

Tom Crockett wrote:

> I’m not sure I agree that set theory and category theory don’t overlap. In fact,

> I believe CT completely subsumes ST, insofar as any statement you can

> make in set theory can be “translated” to CT. The big idea with category

> theory is that the “contents of an object” are unimportant, because any

> interesting thing we might have to say about them can be completely

> expressed using only maps between objects.

That’s an interesting statement in that it reminds me of a related idea that was expressed in a paper that I recently chanced upon, “When is one thing equal to some other thing,” by Barry Mazur (see http://www.math.ku.dk/kurser/2006-07/blok1/LAST/kategori.pdf).

In this paper, Mazur writes, “A category is a B.Y.O.S.T. party, i.e., you bring your own set theory to it.” He argues that the concept of “equality” in certain other branches of mathematics has been replaced by that of “equivalence,” or, more specifically, “isomorphism” in category theory; that is, instead of asking, “Are objects A and B equal to each other,” in category theory, one asks, “Does there exist an isomorphism f: A -> B between two objects A and B of a category C?” (page 11).

(He then qualifies the “isomorphism” as a “canonical isomorphism”; i.e., “Does there exist a specific isomorphism f: A -> B between two objects A and B of a category C that is characterized by an explicitly formulated property, or list of properties (so that the isomorphism can be pinpointed, instead of just being assumed to exist)?”)

Essentially (as far as I understand), instead of comparing values, we are comparing structures, and if two structures are shown to be equivalent via a specific isomorphism, then they are equivalent in the category theoretical sense of canonical isomorphism. The reason that equality makes no sense in category theory is that we are comparing structures, not values, and category theory allows the user to supply a bare set theory framework for this comparison.

He then gives an interesting example (pp. 17 to 19) in which he describes the Yoneda Lemma (see http://en.wikipedia.org/wiki/Yoneda_lemma), which “allows the embedding of any category into a category of functors defined on that category,” as a tool that allows encoding the essence of an object X in a given category C via a functor F_x that maps C to the category S of sets. What is interesting about this lemma is that it leads to the conclusion that instead of studying C, we should study the category of all functors of C into S (see http://en.wikipedia.org/wiki/Yoneda_lemma#Generalities).

The Yoneda Lemma is an interesting tool that allows studying a given category by studying the category of all functors of that category into the category of sets, thus creating a link between category theory and set theory. This seems to be another link that can be drawn between the two theories.