Implementing a builder: Delay and Run
Controlling when functions execute
In the last few posts we have covered all the basic methods (Bind, Return, Zero, and Combine) needed to create your own computation expression builder. In this post, we'll look at some of the extra features needed to make the workflow more efficient, by controlling when expressions get evaluated.
The problem: avoiding unnecessary evaluations
Let's say that we have created a "maybe" style workflow as before. But this time we want to use the "return" keyword to return early and stop any more processing being done.
Here is our complete builder class. The key method to look at is Combine
, in which we simply ignore any secondary expressions after the first return.
Let's see how it works by printing something, returning, and then printing something else:
The debugging output should look something like the following, which I have annotated:
We can see a problem here. The "Part 2: after return" was printed, even though we were trying to return early.
Why? Well I'll repeat what I said in the last post: return and yield do not generate an early return from a computation expression. The entire computation expression, all the way to the last curly brace, is always evaluated and results in a single value.
This is a problem, because you might get unwanted side effects (such as printing a message in this case) and your code is doing something unnecessary, which might cause performance problems.
So, how can we avoid evaluating the second part until we need it?
Introducing "Delay"
The answer to the question is straightforward -- simply wrap part 2 of the expression in a function and only call this function when needed, like this.
Using this technique, part 2 of the computation expression can be processed completely, but because the expression returns a function, nothing actually happens until the function is called.
But the Combine
method will never call it, and so the code inside it does not run at all.
And this is exactly what the Delay
method is for. Any result from Return
or Yield
is immediately wrapped in a "delay" function like this, and then you can choose whether to run it or not.
Let's change the builder to implement a delay:
As you can see, the Delay
method is given a function to execute. Previously, we executed it immediately. What we're doing now is wrapping this function in another function and returning the delayed function instead. I have added a number of trace statements before and after the function is wrapped.
If you compile this code, you can see that the signature of Delay
has changed. Before the change, it returned a concrete value (an option in this case), but now it returns a function.
By the way, we could have implemented Delay
in a much simpler way, without any tracing, just by returning the same function that was passed in, like this:
Much more concise! But in this case, I wanted to add some detailed tracing information as well.
Now let's try again:
Uh-oh. This time nothing happens at all! What went wrong?
If we look at the output we see this:
Result for Part1 without Part2: <fun:Delay@84-5>
Hmmm. The output of the whole trace
expression is now a function, not an option. Why? Because we created all these delays, but we never "undelayed" them by actually calling the function!
One way to do this is to assign the output of the computation expression to a function value, say f
, and then evaluate it.
This works as expected, but is there a way to do this from inside the computation expression itself? Of course there is!
Introducing "Run"
The Run
method exists for exactly this reason. It is called as the final step in the process of evaluating a computation expression, and can be used to undo the delay.
Here's an implementation:
Let's try one more time:
And the result is exactly what we wanted. The first part is evaluated, but the second part is not. And the result of the entire computation expression is an option, not a function.
When is delay called?
The way that Delay
is inserted into the workflow is straightforward, once you understand it.
The bottom (or innermost) expression is delayed.
If this is combined with a prior expression, the output of
Combine
is also delayed.And so on, until the final delay is fed into
Run
.
Using this knowledge, let's review what happened in the example above:
The first part of the expression is the print statement plus
return 1
.The second part of the expression is the print statement without an explicit return, which means that
Zero()
is calledThe
None
from theZero
is fed intoDelay
, resulting in a "delayed option", that is, a function that will evaluate to anoption
when called.The option from part 1 and the delayed option from part 2 are combined in
Combine
and the second one is discarded.The result of the combine is turned into another "delayed option".
Finally, the delayed option is fed to
Run
, which evaluates it and returns a normal option.
Here is a diagram that represents this process visually:
If we look at the debug trace for the example above, we can see in detail what happened. It's a little confusing, so I have annotated it. Also, it helps to remember that working down this trace is the same as working up from the bottom of the diagram above, because the outermost code is run first.
"Delay" changes the signature of "Combine"
When Delay
is introduced into the pipeline like this, it has an effect on the signature of Combine
.
When we originally wrote Combine
we were expecting it to handle options
. But now it is handling the output of Delay
, which is a function.
We can see this if we hard-code the types that Combine
expects, with int option
type annotations like this:
If this is done, we get an compiler error in the "return" expression:
The error is:
In other words, the Combine
is being passed a delayed function (unit -> 'a
), which doesn't match our explicit signature.
So what happens when we do want to combine the parameters, but they are passed in as a function instead of as a simple value?
The answer is straightforward: just call the function that was passed in to get the underlying value.
Let's demonstrate that using the adding example from the previous post.
In this new version of Combine
, the second parameter is now a function, not an int option
. So to combine them, we must first evaluate the function before doing the combination logic.
If we test this out:
We get the following (annotated) trace:
Understanding the type constraints
Up to now, we have used only our "wrapped type" (e.g. int option
) and the delayed version (e.g. unit -> int option
) in the implementation of our builder.
But in fact we can use other types if we like, subject to certain constraints. In fact, understanding exactly what the type constraints are in a computation expression can clarify how everything fits together.
For example, we have seen that:
The output of
Return
is passed intoDelay
, so they must have compatible types.The output of
Delay
is passed into the second parameter ofCombine
.The output of
Delay
is also passed intoRun
.
But the output of Return
does not have to be our "public" wrapped type. It could be an internally defined type instead.
Similarly, the delayed type does not have to be a simple function, it could be any type that satisfies the constraints.
So, given a simple set of return expressions, like this:
Then a diagram that represents the various types and their flow would look like this:
And to prove that this is valid, here is an implementation with distinct types for Internal
and Delayed
:
And the method signatures in the builder class methods look like this:
Creating this artifical builder is overkill of course, but the signatures clearly show how the various methods fit together.
Summary
In this post, we've seen that:
You need to implement
Delay
andRun
if you want to delay execution within a computation expression.Using
Delay
changes the signature ofCombine
.Delay
andCombine
can use internal types that are not exposed to clients of the computation expression.
The next logical step is wanting to delay execution outside a computation expression until you are ready, and that will be the topic on the next but one post. But first, we'll take a little detour to discuss method overloads.
Last updated
Was this helpful?