There’s something of an internet meme about repeating the exact definition “a monad is just a monoid in the category of endofunctors” (sometimes “just” is emphasized to highlight how simple it is). I’m not fully sure on the meme history, but I believe it is something like, these programming kids get into programming languages that have a “monad” construct, and they’ve never seen this in their other languages (which just have normal stuff like strings, ints, arrays, etc) so they ask what a monad is, and then suddenly they get this stuff from algebra (monoid is from algebra I think?) and category theory which they haven’t studied because, like, not exactly a prerequisite for being a programmer. And the people trying to explain it can’t think of any clearer way to put it, there’s just no better way to explain it (as far as the people who understand it know) so whenever the kid goes to ask someone else they always get the same cryptic definition. That sound right? Anyway, uh, what does the definition mean anyway?
Yeah, your “meme origin” sketch is basically right. In the Haskell / functional programming world (and more broadly among people who’ve seen the category-theoretic formulation), the phrase
“A monad is just a monoid in the category of endofunctors”
(or with the emphasized “just” as if to say “it’s not some mystical thing, it’s this one neat abstract fact”) gets repeated because it’s compact and elegant if you already know what all the words mean. To someone who hasn’t seen category theory, it sounds like being told “a car is just a four-wheeled combustion-powered device with a steering wheel in the category of road vehicles.” It’s accurate, but not immediately illuminating. So beginners ask “what’s a monad?” and get handed this line, which leads to the meme of the cryptic “insider answer” being recycled, often without the unpacking that makes it usable.
So: what does it mean?
Break it down:
1. Monoid (familiar algebraic idea)
A monoid is something with:
- A binary operation
⋆
that’s associative:(a ⋆ b) ⋆ c = a ⋆ (b ⋆ c)
. - An identity element
e
such thate ⋆ x = x = x ⋆ e
.
Example: strings under concatenation. Operation is ++
, identity is the empty string.
2. Category
A category consists of:
- Objects (in programming, types like
Int
,Bool
, etc.). - Morphisms between objects (in programming, functions between types).
- Composition of morphisms that’s associative, and each object has an identity morphism.
The category most functional programmers implicitly work in is something like Hask
: types as objects, pure functions as morphisms.
3. Functor and Endofunctor
A functor is a mapping between categories that preserves structure: it maps objects to objects and morphisms to morphisms in a way that respects composition and identities.
An endofunctor is a functor from a category to itself. In Hask
, examples are type constructors like:
-
Maybe
(which adds “possible failure” context), -
[]
(lists, i.e., nondeterminism or multiple results), -
IO
(side effects), -
State s
(stateful computations), etc.
Each of these takes a type a
and yields a new type (e.g., Maybe a
), and you can map a function over their contents (fmap
).
4. Category of endofunctors
Now consider all endofunctors on Hask
as objects in a new category. The morphisms between those are natural transformations (uniform ways of going from one functor to another). This whole setup itself is a category: the category of endofunctors.
This category has a built-in notion of “combining” endofunctors: functor composition. Given two endofunctors M
and M
again, you can compose them to get M ∘ M
(e.g., Maybe (Maybe a)
).
5. Monoid in that category
A monoid object in this category (with composition as the “multiplication”) is:
- An endofunctor
M
, - A natural transformation
η: Id ⇒ M
(the identity functor injecting intoM
), called the unit, - A natural transformation
μ: M ∘ M ⇒ M
(flattening two layers ofM
into one), called the multiplication,
such that the usual monoid laws hold, but now in the setting of natural transformations and functor composition:
-
Associativity: flattening nested
M
s in either grouping gives the same result (μ ∘ Mμ = μ ∘ μM
), -
Unit laws: injecting then flattening, or flattening then injecting, is a no-op (
μ ∘ Mη = id = μ ∘ ηM
).
6. That’s a monad
That (M, η, μ)
satisfying those laws is what category theory calls a monad. So a monad is literally a monoid in the category whose objects are endofunctors and whose “multiplication” is composition. Hence the pithy definition.
7. Programming translation
In programming (e.g., Haskell) people usually work with:
-
return
≈η
(puts a pure value into the monadic context), -
join
≈μ
(flattensM (M a)
intoM a
), -
bind
(>>=
) is a derived helper that sequences: givenm :: M a
andf :: a -> M b
,m >>= f
isjoin (fmap f m)
.
The monad laws in the familiar form:
- Left identity:
return a >>= f ≡ f a
- Right identity:
m >>= return ≡ m
- Associativity:
(m >>= f) >>= g ≡ m >>= (\x -> f x >>= g)
These correspond exactly to the monoid laws on η
and μ
.
8. Intuition (why people often give other explanations)
Saying “a monad is a way to sequence computations that carry along some context” gives operational intuition: the context might be “possible failure” (Maybe
), “multiple results” (List
), “state threading” (State
), etc. The abstract definition packages all such patterns uniformly: you get a “contextual computation” with a way to inject pure values and a way to compose/flatten nested contexts, obeying clean laws.
9. Example: Maybe
-
η x = Just x
(inject a value), -
μ (Just (Just x)) = Just x
,μ (Just Nothing) = Nothing
,μ Nothing = Nothing
(flatten nested Maybes), -
bind
handles failure automatically.
This satisfies the associativity and identity laws, so Maybe
is a monad.
So the meme is that the answer “a monad is just a monoid in the category of endofunctors” is technically satisfying and compact, but for learners it’s helpful to unpack it in terms of return
/bind
/join
and concrete examples. Once you’ve internalized the structure, the categorical definition becomes a crisp way of seeing why all those disparate-looking patterns (IO, lists, exceptions, state) share the same abstract shape.
If you want, I can give several concrete examples side by side, derive bind
from μ
/fmap
, or draw the commutative diagrams that encode the laws. Which angle would help most next?
This post is probably not as good as the StackOverflow answer.
No comments:
Post a Comment