Fun With Catamorphisms
The word cata (ancient Greek: κατά “down from”) + morphism (ancient Greek: μορφή “form, shape” ) is used to describe a way to collapse a recursive data structure into a new value based on its structure. fold
defined for a list gives you a hint of what a catamorphisms is able to do. It is a very powerful thing to define for any structure since all other functions can be defined in terms of it. The closest equivalent in Object Oriented programming is perhaps the visitor pattern.
To illustrate a basic catamorphism we have to define a recursive datastructure. This one is shamelessly stolen from fsharpforfunandprofit (a great resource for every one interested in functional programming).
The Gift data structure
This models a gift. A gift can be either chocolate or a book and the gift can be wrapped and/or boxed. Boxes and wrappings are the “container objects” and can contain at most one item, This makes the data structure linerally recursive as opposed to a tree for example.
We are using tagged unions
to describe the datatype Gift
. A gift can either be a Book
or a Chocolate
. Where things get a bit more complicated is we add the types Boxed
and Wrapped
. A boxed book is still a gift and a wrapped, boxed book still serve as a great birthday present. One flaw here is that an empty box is still considered a gift but that’s OK for our purpose.
The Test Data
boxGift
is a function that takes a gift and “puts it in a box”. wrapGift
takes a gift and a wrapping and returns the newly wrapped gift. These are functions that often gets called data constructors in functional languages. Something like constructors in OOP. Using these functions to wrap and box our gifts leaves us with a couple of varying gifts.
Life Without Catamorphisms
The first thing we want to do is create a function that takes a gift and tells us whats inside. A form of pretty print for the gift data type.
Here we use the tagged union
to switch on the kind. In case of the both leaf-nodes (chocolate and book) we just return a string that informs us of the taste or title.
The containers (wrapping and boxing) adds the style of paper or just the information that the gift is inside a box. Then it recursively calls whatsInside
to keep on moving to the center of the gift.
Seems to be working great! Lets do another one! This time we want to know the complete cost of the gift. This will take the price of the gift and add the price for the box or the wrapping paper. There are not defined in the data structure and is added in this function instead.
Works the same way as whatsInside
the difference is the return value. This returns a number
while whatsInside
returns a string.
And it seems like it´s adding the values correctly. Great! But I get this nagging feeling that since the two functions are pretty much the same there is some form of reuse we are missing.
Here Comes the Catamorphisms
This is where the catamorphisms enter the stage. A catamorphism is a general way of “collapsing” our gift data structure or generally any recursive data structure. A catamorphism does this by recursion from the bottom up. There are other ways of doing this collapsing but this is a nice and easy way to do it.
How do we create a catamorphism then? There are just a few steps. If we follow them we will be fine.
Step one is to create a function that that takes four functions. One for each of our cases Book
, Chocolate
, Box
and Wrapped
plus the Gift
we want to traverse.
This is our function signature. cataGift
is generic and parameterized by the type T
. The first parameter called fBook is the function that will handle book. It takes a function that takes a Book
and returns a T
. In the case of pretty printing it would be something like this:
and when we want to sum the cost it would look something like this:
fChocolate
behaves the exact same way.
fWrapped
is a bit more complicated. It takes a function that takes a T
and a Wrapping
and then returns a T
. This is because the catamorphism will recurse when it encounters a wrapping and the result from the recursion will be the first parameter a of type T
and the second parameter will be the Wrapping.
our pretty print would look something like this:
Finally our function for boxed will take a Box
and return a T
. This is a simpler version of fWrapped.
If we combine all these functions and switch on the type of gifts the complete function will look something like this.
If the gift it Chocolate
or Book
we will call the provided function on that value. Pretty straight forward. If the Gift
is wrapped we will call the cataGift on the contained gift recursevly. When that finally returns we well call fWrapped
with the result from the recursion and also the wrapping. A boxed gift will just recurse and then call fBoxed
with the result.
whatsInside with cataGift
Lets reimplement whatsInsde
using our new fancy catamorphism.
we are using lambdas here and I hope they are self explanatory. This is the result:
Seems alright!
totalCost with cataGift
How about totalCost
how do we recreate that using cataGift
. In case of Book
and Chocolate
just return the price. In the recursive cases, add our cost to the recursive case and return that. Something like this.
and that will yield
Super! It all works out!
Unwrapper
One nice thing with catamorphisms is how easy it is to convert from one structure to another. If we want to remove the packaging just use cataGift
!
Ok. I created a function called Id
. It’s not really the Identity but it will work in our case. It just takes any number of parameters and returns the first one. That makes it work both with our recursive cases and our simple cases.
Our function takes a Gift
and returns a Gift
. The simple cases just returns the same Book
or Chocolate
using the Id
function. The wrapped case will just return the fist parameter. in our case that is the contents of the wrapped datatype. Great! That will remove the wrapping! When we hit our boxed case we cant easily just return the same box. If we check our function signature fBoxed: (a: T) => T
it takes a T
and returns a T
. In our case the T
is a Gift
and the gift in question is the contents. So we need to re-box our content. Luckily we already have a function that does exactly that. boxGift
our data constructor for boxes. Just use that
The result:
Seems ok. And its easy to do an unboxer too
and a function that tastes all chocolate? Is that possible? Well yes!
Great! And look! They compose!
In any order!
Lets make a list of all gifts
Then we can sum the total costs of all the gifts using a ordinary reduce
And it is easy to create a cata that lists all the contents, discarding andy packageing
One great case for catamorphisms is reuse. We have used the same catamorphism for several different uses instead of creating new tailored functions for everything. That’s great!
There are some drawbacks. TypeScript is not great for recursion. This can be solved using folds (perhaps something to write about). And depending on your use case the types can become quite hairy. TypeScript is not the optimal language for doing this kind of work but as I showed here doable and not that complicated.