Implementing a builder: Combine
How to return multiple values at once
In this post we're going to look at returning multiple values from a computation expression using the Combine
method.
The story so far...
So far, our expression builder class looks like this:
And this class has worked fine so far. But we are about to run into a problem...
A problem with two 'yields'
Previously, we saw how yield
could be used to return values just like return
.
Normally, yield
is not used just once, of course, but multiple times in order to return values at different stages of a process such as an enumeration. So let's try that:
But uh-oh, we get an error message:
And if you use return
instead of yield
, you get the same error.
And this problem occurs in other contexts too. For example, if we want to do something and then return, like this:
We get the same error message about a missing 'Combine' method.
Understanding the problem
So what's going on here?
To understand, let's go back to the behind-the-scenes view of the computation expression. We have seen that return
and yield
are really just the last step in a series of continuations, like this:
You can think of return
(or yield
) as "resetting" the indentation, if you like. So when we return/yield
and then return/yield
again, we are generating code like this:
But really this can be simplified to:
In other words, we now have two values in our computation expression. And then the obvious question is, how should these two values be combined to give a single result for the computation expression as a whole?
This is a very important point. Return and yield do not generate an early return from a computation expression. No, the entire computation expression, all the way to the last curly brace, is always evaluated and results in a single value. Let me repeat that. Every part of the computation expression is always evaluated -- there is no short circuiting going on. If we want to short circuit and return early, we have to write our own code to do that (and we'll see how to do that later).
So, back to the pressing question. We have two expressions resulting in two values: how should those multiple values be combined into one?
Introducing "Combine"
The answer is by using the Combine
method, which takes two wrapped values and combines them to make another wrapped value. Exactly how this works is up to us.
In our case, we are dealing specifically with int options
, so one simple implementation that leaps to mind it just to add the numbers together. Each parameter is an option
of course (the wrapped type), so we need to pick them apart and handle the four possible cases:
Running the test code again:
But now we get a different error message:
The Delay
method is a hook that allows you to delay evaluation of a computation expression until needed -- we'll discuss this in detail very soon; but for now, let's create a default implementation:
Running the test code again:
And finally we get the code to complete.
The result of the entire workflow is the sum of all the yields, namely Some 3
.
If we have a "failure" in the workflow (e.g. a None
), the second yield doesn't occur and the overall result is Some 1
instead.
We can have three yields
rather than two:
The result is what you would expect, Some 6
.
We can even try mixing up yield
and return
together. Other than the syntax difference, the overall effect is the same.
Using Combine for sequence generation
Adding numbers up is not really the point of yield
, although you might perhaps use a similar idea for constructing concatenated strings, somewhat like StringBuilder
.
No, yield
is naturally used as part of sequence generation, and now that we understand Combine
, we can extend our "ListBuilder" workflow (from last time) with the required methods.
The
Combine
method is just list concatenation.The
Delay
method can use a default implementation for now.
Here's the full class:
And here it is in use:
And here's a more complicated example with a for
loop and some yield
s.
And the result is:
You can see that by combining for..in..do
with yield
, we are not too far away from the built-in seq
expression syntax (except that seq
is lazy, of course).
I would strongly encourage you to play around with this a bit until you are clear on what is going on behind the scenes. As you can see from the example above, you can use yield
in creative ways to generate all sorts of irregular lists, not just simple ones.
Note: If you're wondering about While
, we're going to hold off on it for a bit, until after we have looked at Delay
in an upcoming post.
Order of processing for "combine"
The Combine
method only has two parameters. So what happens when you combine more than two values? For example, here are four values to combine:
If you look at the output you can see that the values are combined pair-wise, as you might expect.
A subtle but important point is that they are combined "backwards", starting from the last value. First "3" is combined with "4", and the result of that is then combined with "2", and so on.
Combine for non-sequences
In the second of our earlier problematic examples, we didn't have a sequence; we just had two separate expressions in a row.
How should these expressions be combined?
There are a number of common ways of doing this, depending on the concepts that the workflow supports.
Implementing combine for workflows with "success" or "failure"
If the workflow has some concept of "success" or "failure", then a standard approach is:
If the first expression "succeeds" (whatever that means in context), then use that value.
Otherwise use the value of the second expression.
In this case, we also generally use the "failure" value for Zero
.
This approach is useful for chaining together a series of "or else" expressions where the first success "wins" and becomes the overall result.
For example, for the maybe
workflow, it is common to return the first expression if it is Some
, but otherwise the second expression, like this:
Example: Parsing
Let's try a parsing example with this implementation:
We get the following result:
You can see that the first return!
expression is None
, and ignored. So the overall result is the second expression, Some (I 42)
.
Example: Dictionary lookup
In this example, we'll try looking up the same key in a number of dictionaries, and return when we find a value:
We get the following result:
You can see that the first lookup is None
, and ignored. So the overall result is the second lookup.
As you can see, this technique is very convenient when doing parsing or evaluating a sequence of (possibly unsuccessful) operations.
Implementing combine for workflows with sequential steps
If the workflow has the concept of sequential steps, then the overall result is just the value of the last step, and all the previous steps are evaluated only for their side effects.
In normal F#, this would be written:
Or using the semicolon syntax, just:
In normal F#, each expression (other than the last) evaluates to the unit value.
The equivalent approach for a computation expression is to treat each expression (other than the last) as a wrapped unit value, and "pass it into" the next expression, and so on, until you reach the last expression.
This is exactly what bind does, of course, and so the easiest implementation is just to reuse the Bind
method itself. Also, for this approach to work it is important that Zero
is the wrapped unit value.
The difference from a normal bind is that the continuation has a unit parameter, and evaluates to b
. This in turn forces a
to be of type WrapperType<unit>
in general, or unit option
in our case.
Here's an example of sequential processing that works with this implementation of Combine
:
Here's the following trace. Note that the result of the whole expression was the result of the last expression in the sequence, just like normal F# code.
Implementing combine for workflows that build data structures
Finally, another common pattern for workflows is that they build data structures. In this case, Combine
should merge the two data structures in whatever way is appropriate. And the Zero
method should create an empty data structure, if needed (and if even possible).
In the "list builder" example above, we used exactly this approach. Combine
was just list concatenation and Zero
was the empty list.
Guidelines for mixing "Combine" and "Zero"
We have looked at two different implementations for Combine
for option types.
The first one used options as "success/failure" indicators, when the first success "won". In this case
Zero
was defined asNone
The second one was sequential, In this case
Zero
was defined asSome ()
Both cases worked nicely, but was that luck, or are there are any guidelines for implementing Combine
and Zero
correctly?
First, note that Combine
does not have to give the same result if the parameters are swapped. That is, Combine(a,b)
need not be the same as Combine(b,a)
. The list builder is a good example of this.
On the other hand there is a useful rule that connects Zero
and Combine
.
Rule: Combine(a,Zero)
should be the same as Combine(Zero,a)
which should the same as just a
.
To use an analogy from arithmetic, you can think of Combine
like addition (which is not a bad analogy -- it really is "adding" two values). And Zero
is just the number zero, of course! So the rule above can be expressed as:
Rule: a + 0
is the same as 0 + a
is the same as just a
, where +
means Combine
and 0
means Zero
.
If you look at the first Combine
implementation ("success/failure") for option types, you'll see that it does indeed comply with this rule, as does the second implementation ("bind" with Some()
).
On the other hand, if we had used the "bind" implementation of Combine
but left Zero
defined as None
, it would not have obeyed the addition rule, which would be a clue that we had got something wrong.
"Combine" without bind
As with all the builder methods, if you don't need them, you don't need to implement them. So for a workflow that is strongly sequential, you could easily create a builder class with Combine
, Zero
, and Yield
, say, without having to implement Bind
and Return
at all.
Here's an example of a minimal implementation that works:
And here it is in use:
Similarly, if you have a data-structure oriented workflow, you could just implement Combine
and some other helpers. For example, here is a minimal implementation of our list builder class:
And even with the minimal implementation, we can write code like this:
A standalone "Combine" function
In a previous post, we saw that the "bind" function is often used as standalone function, and is normally given the operator >>=
.
The Combine
function too, is often used as a standalone function. Unlike bind, there is no standard symbol -- it can vary depending on how the combine function works.
A symmetric combination operation is often written as ++
or <+>
. And the "left-biased" combination (that is, only do the second expression if the first one fails) that we used earlier for options is sometimes written as <++
.
So here is an example of a standalone left-biased combination of options, as used in a dictionary lookup example.
Summary
What have we learned about Combine
in this post?
You need to implement
Combine
(andDelay
) if you need to combine or "add" more than one wrapped value in a computation expression.Combine
combines values pairwise, from last to first.There is no universal implementation of
Combine
that works in all cases -- it needs to be customized according the particular needs of the workflow.There is a sensible rule that relates
Combine
withZero
.Combine
doesn't requireBind
to be implemented.Combine
can be exposed as a standalone function
In the next post, we'll add logic to control exactly when the internal expressions get evaluated, and introduce true short circuiting and lazy evaluation.
Last updated
Was this helpful?