F# for Fun and Profit
  • Introduction
  • Getting started
    • Contents of the book
    • "Why use F#?" in one page
    • Installing and using F#
    • F# syntax in 60 seconds
    • Learning F#
    • Troubleshooting F#
    • Low-risk ways to use F# at work
      • Twenty six low-risk ways to use F# at work
      • Using F# for development and devops scripts
      • Using F# for testing
      • Using F# for database related tasks
      • Other interesting ways of using F# at work
  • Why use F#?
    • The "Why use F#?" Series
      • Introduction to the 'Why use F#' series
      • F# syntax in 60 seconds
      • Comparing F# with C#: A simple sum
      • Comparing F# with C#: Sorting
      • Comparing F# with C#: Downloading a web page
      • Four Key Concepts
      • Conciseness
      • Type inference
      • Low overhead type definitions
      • Using functions to extract boilerplate code
      • Using functions as building blocks
      • Pattern matching for conciseness
      • Convenience
      • Out-of-the-box behavior for types
      • Functions as interfaces
      • Partial Application
      • Active patterns
      • Correctness
      • Immutability
      • Exhaustive pattern matching
      • Using the type system to ensure correct code
      • Worked example: Designing for correctness
      • Concurrency
      • Asynchronous programming
      • Messages and Agents
      • Functional Reactive Programming
      • Completeness
      • Seamless interoperation with .NET libraries
      • Anything C# can do...
      • Why use F#: Conclusion
  • Thinking Functionally
    • The "Thinking Functionally" Series
      • Thinking Functionally: Introduction
      • Mathematical functions
      • Function Values and Simple Values
      • How types work with functions
      • Currying
      • Partial application
      • Function associativity and composition
      • Defining functions
      • Function signatures
      • Organizing functions
      • Attaching functions to types
      • Worked example: A stack based calculator
  • Understanding F# ###
    • The "Expressions and syntax" Series
      • Expressions and syntax: Introduction
      • Expressions vs. statements
      • Overview of F# expressions
      • Binding with let, use, and do
      • F# syntax: indentation and verbosity
      • Parameter and value naming conventions
      • Control flow expressions
      • Exceptions
      • Match expressions
      • Formatted text using printf
      • Worked example: Parsing command line arguments
      • Worked example: Roman numerals
    • The "Understanding F# types" Series
      • Understanding F# types: Introduction
      • Overview of types in F#
      • Type abbreviations
      • Tuples
      • Records
      • Discriminated Unions
      • The Option type
      • Enum types
      • Built-in .NET types
      • Units of measure
      • Understanding type inference
    • Choosing between collection functions
    • The "Object-oriented programming in F#" Series
      • Object-oriented programming in F#: Introduction
      • Classes
      • Inheritance and abstract classes
      • Interfaces
      • Object expressions
    • The "Computation Expressions" Series
      • Computation expressions: Introduction
      • Understanding continuations
      • Introducing 'bind'
      • Computation expressions and wrapper types
      • More on wrapper types
      • Implementing a builder: Zero and Yield
      • Implementing a builder: Combine
      • Implementing a builder: Delay and Run
      • Implementing a builder: Overloading
      • Implementing a builder: Adding laziness
      • Implementing a builder: The rest of the standard methods
    • Organizing modules in a project
    • The "Dependency cycles" Series
      • Cyclic dependencies are evil
      • Refactoring to remove cyclic dependencies
      • Cycles and modularity in the wild
    • The "Porting from C#" Series
      • Porting from C# to F#: Introduction
      • Getting started with direct porting
  • Functional Design ###
    • The "Designing with types" Series
      • Designing with types: Introduction
      • Single case union types
      • Making illegal states unrepresentable
      • Discovering new concepts
      • Making state explicit
      • Constrained strings
      • Non-string types
      • Designing with types: Conclusion
    • Algebraic type sizes and domain modelling
    • Thirteen ways of looking at a turtle
      • Thirteen ways of looking at a turtle (part 2)
      • Thirteen ways of looking at a turtle - addendum
  • Functional Patterns ###
    • How to design and code a complete program
    • A functional approach to error handling (Railway oriented programming)
      • Railway oriented programming: Carbonated edition
    • The "Understanding monoids" Series
      • Monoids without tears
      • Monoids in practice
      • Working with non-monoids
    • The "Understanding Parser Combinators" Series
      • Understanding Parser Combinators
      • Building a useful set of parser combinators
      • Improving the parser library
      • Writing a JSON parser from scratch
    • The "Handling State" Series
      • Dr Frankenfunctor and the Monadster
      • Completing the body of the Monadster
      • Refactoring the Monadster
    • The "Map and Bind and Apply, Oh my!" Series
      • Understanding map and apply
      • Understanding bind
      • Using the core functions in practice
      • Understanding traverse and sequence
      • Using map, apply, bind and sequence in practice
      • Reinventing the Reader monad
      • Map and Bind and Apply, a summary
    • The "Recursive types and folds" Series
      • Introduction to recursive types
      • Catamorphism examples
      • Introducing Folds
      • Understanding Folds
      • Generic recursive types
      • Trees in the real world
    • The "A functional approach to authorization" Series
      • A functional approach to authorization
      • Constraining capabilities based on identity and role
      • Using types as access tokens
  • Testing
    • An introduction to property-based testing
    • Choosing properties for property-based testing
  • Examples and Walkthroughs
    • Worked example: Designing for correctness
    • Worked example: A stack based calculator
    • Worked example: Parsing command line arguments
    • Worked example: Roman numerals
    • Commentary on 'Roman Numerals Kata with Commentary'
    • Calculator Walkthrough: Part 1
      • Calculator Walkthrough: Part 2
      • Calculator Walkthrough: Part 3
      • Calculator Walkthrough: Part 4
    • Enterprise Tic-Tac-Toe
      • Enterprise Tic-Tac-Toe, part 2
    • Writing a JSON parser from scratch
  • Other
    • Ten reasons not to use a statically typed functional programming language
    • Why I won't be writing a monad tutorial
    • Is your programming language unreasonable?
    • We don't need no stinking UML diagrams
    • Introvert and extrovert programming languages
    • Swapping type-safety for high performance using compiler directives
Powered by GitBook
On this page
  • Series contents
  • Rules for creating catamorphisms
  • Catamorphism example: File system domain
  • File system domain: totalSize example
  • File system domain: largestFile example
  • Catamorphism example: Product domain
  • Product domain: productWeight example
  • Product domain: mostUsedVendor example
  • Summary

Was this helpful?

  1. Functional Patterns ###
  2. The "Recursive types and folds" Series

Catamorphism examples

Applying the rules to other domains

PreviousIntroduction to recursive typesNextIntroducing Folds

Last updated 5 years ago

Was this helpful?

This post is the second in a series.

In the , I introduced "catamorphisms", a way of creating functions for recursive types, and listed some rules which can be used to implement them mechanically. In this post, we'll use these rules to implement catamorphisms for some other domains.

Series contents

Here's the contents of this series:

  • Part 1: Introduction to recursive types and catamorphisms

  • Part 2: Catamorphism examples

  • Part 3: Introducing folds

  • Part 4: Understanding folds

  • Part 5: Generic recursive types

  • Part 6: Trees in the real world

Rules for creating catamorphisms

We saw in the previous post that creating a catamorphism is a mechanical process, and the rules were:

  • Create a function parameter to handle each case in the structure.

  • For non-recursive cases, pass the function parameter all the data associated with that case.

  • For recursive cases, perform two steps:

    • First, call the catamorphism recursively on the nested value.

    • Then pass the handler all the data associated with that case, but with the result of the catamorphism replacing the original nested value.

Let's now see if we can apply these rules to create catamorphisms in other domains.

Catamorphism example: File system domain

Let's start with a very crude model of a file system:

  • Each file has a name and a size.

  • Each directory has a name and a size and a list of subitems.

Here's how I might model that:

type FileSystemItem =
    | File of File
    | Directory of Directory
and File = {name:string; fileSize:int}
and Directory = {name:string; dirSize:int; subitems:FileSystemItem list}

I admit it's a pretty bad model, but it's just good enough for this example!

Ok, here are some sample files and directories:

let readme = File {name="readme.txt"; fileSize=1}
let config = File {name="config.xml"; fileSize=2}
let build  = File {name="build.bat"; fileSize=3}
let src = Directory {name="src"; dirSize=10; subitems=[readme; config; build]}
let bin = Directory {name="bin"; dirSize=10; subitems=[]}
let root = Directory {name="root"; dirSize=5; subitems=[src; bin]}

Time to create the catamorphism!

Let's start by looking at the signatures to figure out what we need.

The File constructor takes a File and returns a FileSystemItem. Using the guidelines above, the handler for the File case needs to have the signature File -> 'r.

// case constructor
File  : File -> FileSystemItem

// function parameter to handle File case 
fFile : File -> 'r

That's simple enough. Let's put together an initial skeleton of cataFS, as I'll call it:

let rec cataFS fFile fDir item :'r = 
    let recurse = cataFS fFile fDir 
    match item with
    | File file -> 
        fFile file
    | Directory dir -> 
        // to do

The Directory case is more complicated. If we naively applied the guidelines above, the handler for the Directory case would have the signature Directory -> 'r, but that would be incorrect, because the Directory record itself contains a FileSystemItem that needs to be replaced with an 'r too. How can we do this?

One way is to "explode" the Directory record into a tuple of (string,int,FileSystemItem list), and then replace the FileSystemItem with 'r in there too.

In other words, we have this sequence of transformations:

// case constructor (Directory as record)
Directory : Directory -> FileSystemItem

// case constructor (Directory unpacked as tuple)
Directory : (string, int, FileSystemItem list) -> FileSystemItem
//   replace with 'r ===> ~~~~~~~~~~~~~~          ~~~~~~~~~~~~~~

// function parameter to handle Directory case 
fDir :      (string, int, 'r list)             -> 'r

Another issue is that the data associated with the Directory case is a list of FileSystemItems. How can we convert that into a list of 'rs?

Well, the recurse helper turns a FileSystemItem into an 'r, so we can just use List.map passing in recurse as the mapping function, and that will give us the list of 'rs we need!

Putting it all together, we get this implementation:

let rec cataFS fFile fDir item :'r = 
    let recurse = cataFS fFile fDir 
    match item with
    | File file -> 
        fFile file
    | Directory dir -> 
        let listOfRs = dir.subitems |> List.map recurse 
        fDir (dir.name,dir.dirSize,listOfRs)

and if we look at the type signature, we can see that it is just what we want:

val cataFS :
    fFile : (File -> 'r) ->
    fDir  : (string * int * 'r list -> 'r) -> 
    // input value
    FileSystemItem -> 
    // return value
    'r

So we're done. It's a bit complicated to set up, but once built, we have a nice reusable function that can be the basis for many others.

File system domain: totalSize example

Alrighty then, let's use it in practice.

To start with, we can easily define a totalSize function that returns the total size of an item and all its subitems:

let totalSize fileSystemItem =
    let fFile (file:File) = 
        file.fileSize
    let fDir (name,size,subsizes) = 
        (List.sum subsizes) + size
    cataFS fFile fDir fileSystemItem

And here are the results:

readme |> totalSize  // 1
src |> totalSize     // 16 = 10 + (1 + 2 + 3)
root |> totalSize    // 31 = 5 + 16 + 10

File system domain: largestFile example

How about a more complicated function, such as "what is the largest file in the tree?"

Before we start this one, let's think about what it should return. That is, what is the 'r?

You might think that it's just a File. But what if the subdirectory is empty and there are no files?

So let's make 'r a File option instead.

The function for the File case should return Some file then:

let fFile (file:File) = 
    Some file

The function for the Directory case needs more thought:

  • If list of subfiles is empty, then return None

  • If list of subfiles is non-empty, then return the largest one

let fDir (name,size,subfiles) = 
    match subfiles with
    | [] -> 
        None  // empty directory
    | subfiles -> 
        // return largest one

But remember that 'r is not a File but a File option. So that means that subfiles is not a list of files, but a list of File option.

Now, how can we find the largest one of these? We probably want to use List.maxBy and pass in the size. But what is the size of a File option?

Let's write a helper function to provide the size of a File option, using this logic:

  • If the File option is None, return 0

  • Else return the size of the file inside the option

Here's the code:

// helper to provide a default if missing
let ifNone deflt opt =
    defaultArg opt deflt 

// get the file size of an option    
let fileSize fileOpt = 
    fileOpt 
    |> Option.map (fun file -> file.fileSize)
    |> ifNone 0

Putting it all together then, we have our largestFile function:

let largestFile fileSystemItem =

    // helper to provide a default if missing
    let ifNone deflt opt =
        defaultArg opt deflt 

    // helper to get the size of a File option
    let fileSize fileOpt = 
        fileOpt 
        |> Option.map (fun file -> file.fileSize)
        |> ifNone 0

    // handle File case        
    let fFile (file:File) = 
        Some file

    // handle Directory case        
    let fDir (name,size,subfiles) = 
        match subfiles with
        | [] -> 
            None  // empty directory
        | subfiles -> 
            // find the biggest File option using the helper
            subfiles 
            |> List.maxBy fileSize  

    // call the catamorphism
    cataFS fFile fDir fileSystemItem

If we test it, we get the results we expect:

readme |> largestFile  
// Some {name = "readme.txt"; fileSize = 1}

src |> largestFile     
// Some {name = "build.bat"; fileSize = 3}

bin |> largestFile     
// None

root |> largestFile    
// Some {name = "build.bat"; fileSize = 3}

Again, a little bit tricky to set up, but no more than if we had to write it from scratch without using a catamorphism at all.

Catamorphism example: Product domain

Let's work with a slightly more complicated domain. This time, imagine that we make and sell products of some kind:

  • Some products are bought, with an optional vendor.

  • Some products are made on our premises, built from subcomponents,

    where a subcomponent is some quantity of another product.

Here's the domain modelled as types:

type Product =
    | Bought of BoughtProduct 
    | Made of MadeProduct 
and BoughtProduct = {
    name : string 
    weight : int 
    vendor : string option }
and MadeProduct = {
    name : string 
    weight : int 
    components:Component list }
and Component = {
    qty : int
    product : Product }

Note that the types are mutally recursive. Product references MadeProduct which references Component which in turn references Product again.

Here are some example products:

let label = 
    Bought {name="label"; weight=1; vendor=Some "ACME"}
let bottle = 
    Bought {name="bottle"; weight=2; vendor=Some "ACME"}
let formulation = 
    Bought {name="formulation"; weight=3; vendor=None}

let shampoo = 
    Made {name="shampoo"; weight=10; components=
    [
    {qty=1; product=formulation}
    {qty=1; product=bottle}
    {qty=2; product=label}
    ]}

let twoPack = 
    Made {name="twoPack"; weight=5; components=
    [
    {qty=2; product=shampoo}
    ]}

Now to design the catamorphism, we need to do is replace the Product type with 'r in all the constructors.

Just as with the previous example, the Bought case is easy:

// case constructor
Bought  : BoughtProduct -> Product

// function parameter to handle Bought case 
fBought : BoughtProduct -> 'r

The Made case is trickier. We need to expand the MadeProduct into a tuple. But that tuple contains a Component, so we need to expand that as well. Finally we get to the inner Product, and we can then mechanically replace that with 'r.

Here's the sequence of transformations:

// case constructor
Made  : MadeProduct -> Product

// case constructor (MadeProduct unpacked as tuple)
Made  : (string,int,Component list) -> Product

// case constructor (Component unpacked as tuple)
Made  : (string,int,(int,Product) list) -> Product
//  replace with 'r ===> ~~~~~~~           ~~~~~~~

// function parameter to handle Made case 
fMade : (string,int,(int,'r) list)      -> 'r

When implementing the cataProduct function we need to the same kind of mapping as before, turning a list of Component into a list of (int,'r).

We'll need a helper for that:

// Converts a Component into a (int * 'r) tuple
let convertComponentToTuple comp =
    (comp.qty,recurse comp.product)

You can see that this uses the recurse function to turn the inner product (comp.product) into an 'r and then make a tuple int * 'r.

With convertComponentToTuple available, we can convert all the components to tuples using List.map:

let componentTuples = 
    made.components 
    |> List.map convertComponentToTuple

componentTuples is a list of (int * 'r), which is just what we need for the fMade function.

The complete implementation of cataProduct looks like this:

let rec cataProduct fBought fMade product :'r = 
    let recurse = cataProduct fBought fMade 

    // Converts a Component into a (int * 'r) tuple
    let convertComponentToTuple comp =
        (comp.qty,recurse comp.product)

    match product with
    | Bought bought -> 
        fBought bought 
    | Made made -> 
        let componentTuples =  // (int * 'r) list
            made.components 
            |> List.map convertComponentToTuple 
        fMade (made.name,made.weight,componentTuples)

Product domain: productWeight example

We can now use cataProduct to calculate the weight, say.

let productWeight product =

    // handle Bought case
    let fBought (bought:BoughtProduct) = 
        bought.weight

    // handle Made case
    let fMade (name,weight,componentTuples) = 
        // helper to calculate weight of one component tuple
        let componentWeight (qty,weight) =
            qty * weight
        // add up the weights of all component tuples
        let totalComponentWeight = 
            componentTuples 
            |> List.sumBy componentWeight 
        // and add the weight of the Made case too
        totalComponentWeight + weight

    // call the catamorphism
    cataProduct fBought fMade product

Let's test it interactively to make sure it works:

label |> productWeight    // 1
shampoo |> productWeight  // 17 = 10 + (2x1 + 1x2 + 1x3)
twoPack |> productWeight  // 39 = 5  + (2x17)

That's as we expect.

Try implementing productWeight from scratch, without using a helper function like cataProduct. Again, it's do-able, but you'll probably waste quite bit of time getting the recursion logic right.

Product domain: mostUsedVendor example

Let's do a more complex function now. What is the most used vendor?

The logic is simple: each time a product references a vendor, we'll give that vendor one point, and the vendor with the highest score wins.

Again, let's think about what it should return. That is, what is the 'r?

You might think that it's just a score of some kind, but we also need to know the vendor name. Ok, a tuple then. But what if there are no vendors?

So let's make 'r a VendorScore option, where we are going to create a little type VendorScore, rather than using a tuple.

type VendorScore = {vendor:string; score:int}

We'll also define some helpers to get data from a VendorScore easily:

let vendor vs = vs.vendor
let score vs = vs.score

Now, you can't determine the most used vendor over until you have results from the entire tree, so both the Bought case and the Made case need to return a list which can added to as we recurse up the tree. And then, after getting all the scores, we'll sort descending to find the vendor with the highest one.

So we have to make 'r a VendorScore list, not just an option!

The logic for the Bought case is then:

  • If the vendor is present, return a VendorScore with score = 1, but as a one-element list rather than as a single item.

  • If the vendor is missing, return an empty list.

let fBought (bought:BoughtProduct) = 
    // set score = 1 if there is a vendor
    bought.vendor
    |> Option.map (fun vendor -> {vendor = vendor; score = 1} )
    // => a VendorScore option
    |> Option.toList
    // => a VendorScore list

The function for the Made case is more complicated.

  • If list of subscores is empty, then return an empty list.

  • If list of subscores is non-empty, we sum them by vendor and then return the new list.

But the list of subresults passed into the fMade function will not be a list of subscores, it will be a list of tuples, qty * 'r where 'r is VendorScore list. Complicated!

What we need to do then is:

  • Turn qty * 'r into just 'r because we don't care about the qty in this case. We now have a list of VendorScore list. We can use List.map snd to do this.

  • But now we would have a list of VendorScore list. We can flatten a list of lists into a simple list using List.collect. And in fact, using List.collect snd can do both steps in one go.

  • Group this list by vendor so that we have a list of key=vendor; values=VendorScore list tuples.

  • Sum up the scores for each vendor (values=VendorScore list) into a single value, so that we have a list of key=vendor; values=VendorScore tuples.

At this point the cata function will return a VendorScore list. To get the highest score, use List.sortByDescending then List.tryHead. Note that maxBy won't work because the list could be empty.

Here's the complete mostUsedVendor function:

let mostUsedVendor product =

    let fBought (bought:BoughtProduct) = 
        // set score = 1 if there is a vendor
        bought.vendor
        |> Option.map (fun vendor -> {vendor = vendor; score = 1} )
        // => a VendorScore option
        |> Option.toList
        // => a VendorScore list

    let fMade (name,weight,subresults) = 
        // subresults are a list of (qty * VendorScore list)

        // helper to get sum of scores
        let totalScore (vendor,vendorScores) =
            let totalScore = vendorScores |> List.sumBy score
            {vendor=vendor; score=totalScore}

        subresults 
        // => a list of (qty * VendorScore list)
        |> List.collect snd  // ignore qty part of subresult
        // => a list of VendorScore 
        |> List.groupBy vendor 
        // second item is list of VendorScore, reduce to sum
        |> List.map totalScore 
        // => list of VendorScores 

    // call the catamorphism
    cataProduct fBought fMade product
    |> List.sortByDescending score  // find highest score
    // return first, or None if list is empty
    |> List.tryHead

Now let's test it:

label |> mostUsedVendor    
// Some {vendor = "ACME"; score = 1}

formulation |> mostUsedVendor  
// None

shampoo |> mostUsedVendor  
// Some {vendor = "ACME"; score = 2}

twoPack |> mostUsedVendor  
// Some {vendor = "ACME"; score = 2}

This isn't the only possible implementation of fMade, of course. I could have used List.fold and done the whole thing in one pass, but this version seems like the most obvious and readable implementation.

It's also true that I could have avoided using cataProduct altogether and written mostUsedVendor from scratch. If performance is an issue, then that might be a better approach, because the generic catamorphism creates intermediate values (such as the list of qty * VendorScore option) which are over general and potentially wasteful.

On other hand, by using the catamorphism, I could focus on the counting logic only and ignore the recursion logic.

So as always, you should consider the pros and cons of reuse vs. creating from scratch; the benefits of writing common code once and using it in a standardized way, versus the performance but extra effort (and potential bugginess) of custom code.

Summary

We've seen in this post how to define a recursive type, and been introduced to catamorphisms.

And we have also seen some uses for catamorphisms:

  • Any function that "collapses" a recursive type, such as Gift -> 'r, can be written in terms of the catamorphism for that type.

  • Catamorphisms can be used to hide the internal structure of the type.

  • Catamorphisms can be used to create mappings from one type to another by tweaking the functions that handle each case.

  • Catamorphisms can be used to create a clone of the original value by passing in the type's case constructors.

But all is not perfect in the land of catamorphisms. In fact, all the catamorphism implementations on this page have a potentially serious flaw.

See you then!

UPDATE: Fixed logic error in mostUsedVendor as pointed out by Paul Schnapp in comments. Thanks, Paul!

In the we'll see what can go wrong with them, how to fix them, and in the process look at the various kinds of "fold".

The source code for this post is available at .

next post
this gist
previous post
Catamorphism example: File system domain
Catamorphism example: Product domain
Iteration vs. recursion
Fold example: File system domain
Common questions about "fold"
Defining a generic Tree type
The Tree type in the real world
Mapping the Tree type
Example: Creating a directory listing
Example: A parallel grep
Example: Storing the file system in a database
Example: Serializing a Tree to JSON
Example: Deserializing a Tree from JSON
Example: Deserializing a Tree from JSON - with error handling
LinkedList: A generic recursive type
Making the Gift domain generic
Defining a generic Container type
A third way to implement the gift domain
Abstract or concrete? Comparing the three designs
A simple recursive type
Parameterize all the things
Introducing catamorphisms
Benefits of catamorphisms
Rules for creating a catamorphism
A flaw in our catamorphism implementation
Introducing fold
Problems with fold
Using functions as accumulators
Introducing foldback
Rules for creating a fold