Tail recursion for branching problems
A case study of finding paths in trees.
Recently one of my clients asked me for help in learning about graphs. As graphs are a particular favorite topic of mine I took on this challenge with zeal and alacrity. In our ensuing sessions we touched upon some interesting topics regarding recursion, which inspired me to write this article.
I assume you, dear reader, are already somewhat familiar with recursion. Nevertheless, I begin this article with a brief recap of a classic recursive solution: calculating a factorial.
Then I introduce the what and why of tail recursion and how to modify regular recursion to tail recursion.
The meat of this article is a discussion of how to use regular recursion and tail recursion to solve branching problems. The particular branching problem I use as an example is finding paths in a tree.
I discuss not one but two different approaches to finding paths in trees. The first is more flexible, the second is more performant.
Tail recursion
This is a typical recursive implementation of factorial:
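Something along these lines, in Python:

```python
def fac(n):
    # base case: the factorial of 1 is 1
    if n == 1:
        return 1
    # recursive step: n! = n * (n - 1)!
    return n * fac(n - 1)
```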


If we perform the call fac(4), this is what the simplified evaluation diagram looks like:
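Roughly, with each call waiting on the next:

```
fac(4)
= 4 * fac(3)
= 4 * (3 * fac(2))
= 4 * (3 * (2 * fac(1)))
= 4 * (3 * (2 * 1))
= 24
```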


When the deepest recursive step, fac(1), is evaluated, all other recursive steps are still on hold. This means that at some point all recursive steps exist on ‘the stack’ (part of your RAM) at the same time. If your recursion is very deep (you have many recursive steps) you could cause a stack overflow, which crashes your program.
A step must be kept alive for the full duration of the next step, because after the next step there’s still work to do: multiplying the result by n.
Luckily there’s a solution: tail recursion.
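Something along these lines, in Python; the extra product parameter is explained in the recipe below:

```python
def fac(n, product):
    # base case: all the information is already in product
    if n == 1:
        return product
    # the multiplication happens BEFORE the recursive call
    return fac(n - 1, product * n)
```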


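The evaluation now looks roughly like this:

```
fac(4, 1)
= fac(3, 4)
= fac(2, 12)
= fac(1, 24)
= 24
```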


The difference is that the last thing that happens in a step is no longer the multiplication but the recursive call. Nothing needs to happen after the recursive call: all information is passed to the next step, so the current step can safely die. At any point in time, only a single step needs to ever be on the stack. No more stack overflows.
But wait a minute, isn’t returning a value the last thing that happens in a step? Indeed it is. But look in the evaluation diagram: all calls return the exact same value. If a compiler is smart it can recognize tail recursion and perform some trickery to just return whatever the final recursive step returns and immediately kill off intermediate steps after they’ve made their recursive call.
So, tail recursion is a way to enable very deep recursion. You only have to bother with it if you expect your calculations to stretch the limit of your hardware.
Recipe
In tail recursion, at each step we add to an accumulating result and pass it on to the next step. So we need to introduce a new parameter to the recursive function and think of a good initial value. In our base case we finally return the accumulated result.
Let’s take another look at our tail-recursive fac:
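A sketch in Python; the product parameter carries the accumulating result:

```python
def fac(n, product):
    # base case: return the accumulated result
    if n == 1:
        return product
    # accumulate this step's n into product, then make the tail call
    return fac(n - 1, product * n)
```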


The accumulated information is the product of all ns so far. product is now a parameter of fac and in each step we accumulate the n of that particular step to product via multiplication. A good initial value for this product is 1, because 1 * n = n.
Hide details
To do tail recursion we must introduce a new parameter. This has two downsides. One, we have to update all calls to fac. Two, we now expose details that are only relevant to the inner workings of the recursion. When I want the factorial of 3, I shouldn’t have to worry about passing a 1!
Luckily there are ways to circumvent these problems. I’ll briefly show three and discuss a fourth in more detail. Your favorite language will support at least one of them.
Some languages, like Python, allow you to specify a default value for a parameter:
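A sketch, assuming the tail-recursive fac from before:

```python
def fac(n, product=1):
    # product defaults to 1, so callers can simply write fac(4)
    if n == 1:
        return product
    return fac(n - 1, product * n)
```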


You could make an accessor function that hides the details of how to kick off the recursion:
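A sketch; the helper name fac_helper is my own:

```python
def fac_helper(n, product):
    if n == 1:
        return product
    return fac_helper(n - 1, product * n)

def fac(n):
    # the accessor hides the initial value of the accumulator
    return fac_helper(n, 1)
```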


Instead of two outer functions, you could create an outer and an inner function:
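A sketch:

```python
def fac(n):
    # the inner function does the actual recursion
    def helper(n, product):
        if n == 1:
            return product
        return helper(n - 1, product * n)

    # the outer function kicks it off with the initial accumulator value
    return helper(n, 1)
```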


Finally, all tail-recursive functions can trivially be written as while loops:
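A sketch:

```python
def fac(n):
    product = 1
    # the base case was n == 1, so the loop runs while n != 1
    while n != 1:
        # instead of passing arguments, reassign the variables
        product = product * n
        n = n - 1
    return product
```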


Invert the condition of the base case in your recursive function to obtain the condition of the while loop. Instead of passing arguments, reassign values to variables outside the scope of the while loop.
The smart trickery that some compilers do to optimize tail recursion is indeed to rewrite it as a while loop.
Finding paths
Graphs are a fundamental tool for modeling data and one of the first things you want to do with a graph is find paths in it. Before we get started, first some vocabulary.
A graph is a collection of things (vertices, singular: vertex) and relationships between these things (edges). Typically a graph is visualized as circles with arrows between them.
We’re going to look at a specific kind of graph: a tree. A tree is directed, meaning the arrows point in one direction. It is also acyclic, meaning no path in the tree crosses over itself. Each vertex can have multiple children but only one parent. The root is the only vertex without a parent. If a vertex has no children it’s called a leaf.
I chose to use trees for the examples because finding paths in trees involves very few edge cases. This makes the code easier to read. However, the techniques you’ll see in this article can be used for any graph-related algorithm on any type of graph; indeed, not even necessarily for graph problems but for any type of branching problem.
Soon I’ll go over what I mean by branching problem and why it requires special attention with regards to tail recursion.
As an aside, trees form the very foundation of programming. A program is a tree of instructions and the scope hierarchy is also a tree.
This is the tree we’ll be working with:
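A sketch of a tree consistent with the paths and traversal orders used in the rest of this article (1 is the root; 4, 5, 6 and 7 are leaves):

```
        1
      /   \
     2     3
    / \   / \
   4   5 6   7
```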


Modeled as an adjacency list:
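In Python, mapping each vertex to the list of its children:

```python
tree = {
    1: [2, 3],
    2: [4, 5],
    3: [6, 7],
    4: [],
    5: [],
    6: [],
    7: [],
}
```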


Our goal is to write the function all_complete_paths that returns all complete paths starting at a given vertex, like so:
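Something along these lines:

```
assert all_complete_paths(1) == [[1, 2, 4], [1, 2, 5], [1, 3, 6], [1, 3, 7]]
```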


I start with an approach implemented first with regular recursion, followed by tail recursion. Afterwards I discuss a second approach, again implemented with both regular and tail recursion, that is more performant but requires extra attention to detail.
After both approaches are covered I briefly discuss how the first and less performant approach is more flexible.
First approach
Let’s first specify our task. In each step we are given a vertex and we should return a list of paths. A path is a list of vertices. So we return a list of lists of vertices.
Now the approach. First we think about our base case. When should our recursion end? When the given vertex is a leaf. From a leaf v there is but a single path: [v]. But we should return a list of paths, so for a leaf we return [[v]].
Up next is the recursive step. Given a v that is not a leaf, we should obtain the result of the next step (by making a recursive call) and somehow add information to it to create the result of the current step.
The crux is that within the current step for v, we might have to do multiple recursive calls, one for each child of v. Alright, now the time-travelling magic of writing recursive functions: let’s assume our function all_complete_paths already works. We can call it for a child to obtain all paths starting at that child. If we do this for all children we have all paths that end at a leaf and stop right before v. To obtain all paths that start at v, all we have to do is add v to the front of all of these child paths.
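A sketch of this approach in Python, assuming the adjacency list from above:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    children = tree[v]
    # base case: a leaf has exactly one complete path, [v]
    if not children:
        return [[v]]
    # recursive step: prepend v to every path starting at a child
    paths = []
    for child in children:
        for path in all_complete_paths(child):
            paths.append([v] + path)
    return paths
```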


Calling all_complete_paths(1) leads to the following simplified evaluation diagram:
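Sketched as a call tree:

```
all_complete_paths(1)
├─ all_complete_paths(2)
│  ├─ all_complete_paths(4) = [[4]]
│  └─ all_complete_paths(5) = [[5]]
│  = [[2, 4], [2, 5]]
└─ all_complete_paths(3)
   ├─ all_complete_paths(6) = [[6]]
   └─ all_complete_paths(7) = [[7]]
   = [[3, 6], [3, 7]]
= [[1, 2, 4], [1, 2, 5], [1, 3, 6], [1, 3, 7]]
```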


Tail-recursive
Whereas fac had a single recursive call each step, here a step can have multiple recursive calls. This makes fac linear and all_complete_paths branching.
The strategy we’ve seen to go from non-tail recursion to tail recursion is to pass along information to the next step by introducing a new parameter. Even if we apply that strategy to a branching problem like this, after the first recursive call there is still work left to do: making the second recursive call. Clearly this strategy isn’t sufficient for branching problems.
Branching problems can be made tail-recursive, however. Try to think of a solution before you continue reading.
In a branching step we want to do multiple things: there are multiple plans. Specifically, we want to do something for each child. In regular recursion these plans are also immediately put into action: a recursive call is made for each child. Unfortunately, in tail recursion we can only make one recursive call.
What we must do is separate creating plans from putting a plan into action. We can let a step produce two plans, because producing plans does not involve making recursive calls. These plans are then passed to the next step using a single recursive call.
The next step can then take one plan from the pile of plans. Putting this one plan into action involves either 1) adding to the accumulating result, or 2) producing more plans. Taking action no longer involves making a recursive call. The recursive call is used only to pass along the pile of plans and the accumulating result.
If a plan describes a base case, the action is to only add to the accumulating result. A plan was taken from the pile and no new plans were added: the pile of plans shrinks.
If a step can no longer take a plan from the pile because the pile is empty, the overall recursive process is complete and the accumulating result can be returned as the final result.
Let’s go over the specific approach for our path finding. A plan is a path-under-construction, and we look at the last vertex in that path.
If that last vertex is a leaf, we hit our base case and add the path to a list of complete paths.
If it’s not a leaf, it has children and we construct a new path-under-construction for each child. We do this by adding the child to the end of the current path. Be careful here: if you’re working with mutable lists, you need to make a copy of the current path before adding a child. If you don’t, you will add both children to the same path, which is wrong.
Our plans, or paths-under-construction (plural), are kept in the parameter incomplete_paths. Initially the only path-under-construction (singular) we know is [v], so the initial value for incomplete_paths should be [[v]].
The accumulating result is the list of completed paths, completed_paths. Its initial value is [].
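A sketch; the helper name find_paths and taking plans with pop are my own choices:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    # initially the only path-under-construction is [v]
    return find_paths([[v]], [])

def find_paths(incomplete_paths, completed_paths):
    # general base case: the pile of plans is empty
    if not incomplete_paths:
        return completed_paths
    path = incomplete_paths.pop()  # take one plan from the pile
    last = path[-1]
    children = tree[last]
    # specific base case: the path ends in a leaf, so it is complete
    if not children:
        return find_paths(incomplete_paths, completed_paths + [path])
    # produce a new plan per child; path + [child] makes a fresh copy
    for child in children:
        incomplete_paths.append(path + [child])
    return find_paths(incomplete_paths, completed_paths)
```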


Note that we sort of have two base cases now. One for our specific recursive problem: when we reach a leaf. The other is the base case for our general strategy: when the pile of plans is empty.
Here follows in detail what happens if we call all_complete_paths(1):
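Roughly, taking the last plan from the pile each step:

```
incomplete_paths                     completed_paths
[[1]]                                []
[[1, 2], [1, 3]]                     []
[[1, 2], [1, 3, 6], [1, 3, 7]]       []
[[1, 2], [1, 3, 6]]                  [[1, 3, 7]]
[[1, 2]]                             [[1, 3, 7], [1, 3, 6]]
[[1, 2, 4], [1, 2, 5]]               [[1, 3, 7], [1, 3, 6]]
[[1, 2, 4]]                          [[1, 3, 7], [1, 3, 6], [1, 2, 5]]
[]                                   [[1, 3, 7], [1, 3, 6], [1, 2, 5], [1, 2, 4]]
```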


Probability tree
In another article I use this tail-recursive approach (written as a while loop) to calculate the probabilities of people surviving a deadly game.
Second approach
The first non-tail-recursive approach was, for a vertex v, to find all complete paths starting at the children of v and then add v to the front of each of these paths.
This is an entirely different approach. In a step we have, in addition to v, the ‘path so far’. We add v to it. If v is a leaf we hit our base case and we add the path (now with v at the end) to our list of complete paths. If v is not a leaf we make a recursive call for each child, passing along the path that is now extended with v.
This approach works because each step is given a v and the path that stops right before v: the path from the root up to and including the parent of v.
You might notice that it is actually quite similar to the tail-recursive solution we saw earlier: a path is extended with v and in some way passed to a next step.
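A sketch of this approach in Python; as discussed next, this first version is flawed:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v, path, complete_paths):
    path.append(v)  # extend the path-so-far with v
    children = tree[v]
    if not children:
        complete_paths.append(path)  # base case: the path is complete
    for child in children:
        # set up the circumstances for the next step
        all_complete_paths(child, path, complete_paths)
```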


There are a couple of problems with this implementation.
Most obviously, nothing is ever returned. This is due to the very nature of this approach. A step only sets up the right circumstances for the next step: it doesn’t do anything with the result of the next step.
Trying to force an implementation that doesn’t fit the approach is asking for complexity and headaches. So instead of returning values let’s think of something else.
The only reason we would want a return somewhere is to extract the result from our recursion. However, instead of extracting the value from our recursion via return, we can let the recursion place its result in something that lives outside of the recursion. We can accomplish this by removing complete_paths as a parameter of the recursive function and instead having it live outside the scope of the recursive function.
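A sketch, using an inner recursive function so that complete_paths lives outside it:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []  # lives outside the recursion

    def recurse(v, path):
        path.append(v)
        children = tree[v]
        if not children:
            complete_paths.append(path)  # still buggy, as discussed next
        for child in children:
            recurse(child, path)

    recurse(v, [])
    return complete_paths
```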


Now at least we can access the result of the recursion. However, if we run all_complete_paths(2) we get some weird behavior:
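Something like this comes out, instead of the expected [[2, 4], [2, 5]]:

```
>>> all_complete_paths(2)
[[2, 4, 5], [2, 4, 5]]
```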


The trouble is caused by path being the same mutable list in each of the recursive steps. This mutability causes trouble in two ways. One, it makes the contents of complete_paths change. Two, the second child of a vertex is given a path that is made ‘dirty’ by the first child.
We fix the first problem by not appending path to complete_paths, but a copy of path:
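A sketch with that single change:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []

    def recurse(v, path):
        path.append(v)
        children = tree[v]
        if not children:
            complete_paths.append(path.copy())  # a copy, not path itself
        for child in children:
            recurse(child, path)

    recurse(v, [])
    return complete_paths
```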


We can fix the second problem in two ways. Either each step gets its own copy of path, or each step ‘cleans up’ its mutation to path.
To give each step its own copy, a step can either copy the path it’s given, or create a copy to give to the next steps. Both options are fine. Let’s go for immediately copying the given path:
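A sketch; each step now copies the path it is given:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []

    def recurse(v, path):
        path = path.copy()  # each step works on its own copy
        path.append(v)
        children = tree[v]
        if not children:
            complete_paths.append(path.copy())
        for child in children:
            recurse(child, path)

    recurse(v, [])
    return complete_paths
```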


Now that we’re making a copy of the given path at the start of each step anyway, you might feel safe removing the copy we make when appending to complete_paths. I’ve had my fair share of mutation bugs, so I’m keeping it there. Who knows, you might refactor this code a little at some point and the copying starts to matter again. I’m happy I usually work in a language where everything is immutable by default, so I don’t have to worry about such things.
Below is a new simplified evaluation diagram of performing the call all_complete_paths(2). The three unique copies, one for each step, are marked as path, path' and path''. Mutation bugs solved, complete_paths is what we want it to be!
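Roughly:

```
all_complete_paths(2)
recurse(2, [])    path   = [2]
recurse(4, path)  path'  = [2, 4]   → complete_paths = [[2, 4]]
recurse(5, path)  path'' = [2, 5]   → complete_paths = [[2, 4], [2, 5]]
```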


Instead of giving each step its own copy of path, we can ‘clean up’ the mutation to path as the very last thing we do in each step:
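A sketch; the cleanup is the pop at the end of the step:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []

    def recurse(v, path):
        path.append(v)
        children = tree[v]
        if not children:
            complete_paths.append(path.copy())
        for child in children:
            recurse(child, path)
        path.pop()  # clean up our own mutation before this step ends

    recurse(v, [])
    return complete_paths
```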


If we now call all_complete_paths(2):
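Roughly:

```
>>> all_complete_paths(2)
[[2, 4], [2, 5]]
```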


As you can see, the cleanup implementation also brings complete_paths to our desired state.
This cleanup version of all_complete_paths is the fastest we’ve seen so far, because instead of making additional copies it mutates in place.
Let’s take a look at how to turn this into tail recursion.
Tail-recursive
Like before, instead of making two recursive calls, in a step we take one plan from the pile (here, which vertex v we should process next) and either contribute to our result (complete_paths) or add new plans (the children of v) to the pile.
Note that the recursive call now happens after removing v from the path, making the recursive call the last thing that happens in a step, thus satisfying the condition for tail recursion.
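A sketch of this (still flawed) version; the inner helper and the pile of vertex plans are my own reconstruction:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []
    path = []

    def recurse(plans):
        if not plans:  # the pile of plans is empty: done
            return
        v = plans.pop()  # take one plan: the next vertex to process
        path.append(v)
        children = tree[v]
        if not children:
            complete_paths.append(path.copy())
        for child in children:
            plans.append(child)
        path.pop()  # remove v again... but this happens too early!
        recurse(plans)  # the recursive call is the last thing in a step

    recurse([v])
    return complete_paths
```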


If we call all_complete_paths(1) we get:
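Something like this comes out; all the prefixes are missing:

```
>>> all_complete_paths(1)
[[7], [6], [5], [4]]
```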


Woe! 1 was removed from path too early. It should be removed from path only after all of its children have been processed. But how are we supposed to make that happen in tail recursion, where going to the next step is the last thing that happens?
The order in which things happen is decided by our pile of plans. In a step, instead of only adding plans to process children, we also include a plan to eventually clean up the mutation we did in this step. So now we have two types of plans: either a plan to clean up mutation or a plan to process a vertex. We only clean up if we encounter a plan that says so.
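A sketch; I use the string 'clean up' as the clean-up plan, and any other plan is a vertex to process:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []
    path = []

    def recurse(plans):
        if not plans:
            return
        plan = plans.pop()
        if plan == "clean up":
            path.pop()  # only clean up when a plan says so
        else:
            path.append(plan)
            # clean up this mutation only after the whole subtree is done
            plans.append("clean up")
            children = tree[plan]
            if not children:
                complete_paths.append(path.copy())
            for child in children:
                plans.append(child)
        recurse(plans)

    recurse([v])
    return complete_paths
```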




In this particular approach, not only do we not care about return values, we also deliberately work with in-place mutation. Making recursive calls and passing parameters feels a bit superfluous.
We saw in the beginning of this article that tail recursion can be written as a loop. Let’s see what that looks like:
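A sketch:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    complete_paths = []
    path = []
    plans = [v]
    while plans:  # inverted base-case condition
        plan = plans.pop()
        if plan == "clean up":
            path.pop()
        else:
            path.append(plan)
            plans.append("clean up")
            children = tree[plan]
            if not children:
                complete_paths.append(path.copy())
            for child in children:
                plans.append(child)
    return complete_paths
```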


Dangerous optimization
Assuming you work with mutable lists, there might be cases where you don’t necessarily need to copy a complete path.
Let’s say that sometimes you need the paths for further computation, and other times you only want to print them. Only if you collect the paths is it truly necessary to make a copy. For printing, you don’t need to copy.
How to support both cases with maximal code reuse?
We can extract traversing the paths into a function traverse_paths, and have a parameter f, which is a function that is called each time a complete path is found. The actual path, not a copy, is passed as an argument to f. In f you can decide whether to copy or not.
Note that the integrity of the passed path is vital to the continued operation of traverse_paths. Exposing the mutable path to f is EXTREMELY dangerous. It sends shivers down my spine. I’m dead serious. I recommend that you don’t do this.
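A sketch; print_all_complete_paths is a hypothetical caller that only prints:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def traverse_paths(v, f):
    path = []
    plans = [v]
    while plans:
        plan = plans.pop()
        if plan == "clean up":
            path.pop()
        else:
            path.append(plan)
            plans.append("clean up")
            children = tree[plan]
            if not children:
                f(path)  # DANGER: f receives the actual mutable path
            for child in children:
                plans.append(child)

def all_complete_paths(v):
    complete_paths = []
    # collecting: here a copy IS necessary
    traverse_paths(v, lambda path: complete_paths.append(path.copy()))
    return complete_paths

def print_all_complete_paths(v):
    # printing: no copy needed
    traverse_paths(v, print)
```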


To make it safe again, you could pass a copy of path to f, but that defeats the purpose of extracting to traverse_paths. Then you might as well write print(all_complete_paths(1)). I just wanted to take the opportunity to show a higher-order function (in this case, a function that has another function as a parameter).
Depth vs. breadth
The mutate-and-clean-up version of all_complete_paths might be the fastest, but it isn’t as flexible.
Let’s take our first tail-recursive algorithm and print the last vertex of the path we process in each step:
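A sketch, adding a single print to that version; the helper name find_paths is my own:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    return find_paths([[v]], [])

def find_paths(incomplete_paths, completed_paths):
    if not incomplete_paths:
        return completed_paths
    path = incomplete_paths.pop()  # take the last plan from the pile
    print(path[-1])  # print the vertex processed in this step
    children = tree[path[-1]]
    if not children:
        return find_paths(incomplete_paths, completed_paths + [path])
    for child in children:
        incomplete_paths.append(path + [child])
    return find_paths(incomplete_paths, completed_paths)
```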


Calling all_complete_paths(1) will print 1 3 7 6 2 5 4.
If we take another look at our tree


we see that the vertices are printed in so-called depth-first order. The algorithm first tries to go deeper into the tree. Note that 1 2 4 5 3 6 7 would also be depth-first.
The opposite of depth-first is breadth-first. In a breadth-first order all vertices at the same depth are processed before the vertices of the next depth. Examples of breadth-first orders for this tree are 1 2 3 4 5 6 7 and 1 3 2 7 6 5 4.
Sometimes it can be useful to traverse vertices in breadth-first order instead of depth-first order, for example if you want to mark the shortest path to a cell in a flood-fill algorithm.
Spend a moment to think of how to change the code such that the vertices are printed in breadth-first order instead of depth-first.
The solution is elegantly simple. Taking the last plan from the pile causes the algorithm to be depth-first. Taking the first plan will cause it to be breadth-first:
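A sketch, switching pop() to pop(0) in the printing version; the helper name find_paths is my own:

```python
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def all_complete_paths(v):
    return find_paths([[v]], [])

def find_paths(incomplete_paths, completed_paths):
    if not incomplete_paths:
        return completed_paths
    path = incomplete_paths.pop(0)  # take the FIRST plan: breadth-first
    print(path[-1])
    children = tree[path[-1]]
    if not children:
        return find_paths(incomplete_paths, completed_paths + [path])
    for child in children:
        incomplete_paths.append(path + [child])
    return find_paths(incomplete_paths, completed_paths)
```

(As an aside, popping from the front of a Python list takes linear time; for long queues collections.deque with popleft is more efficient.)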


Because it’s so simple to switch between orders, you could easily add a parameter to let a caller decide the order.
Only for this algorithm (no in-place mutation, tail-recursive) is changing between depth-first and breadth-first so simple. In the other versions of all_complete_paths covered in this article the depth-first order is much more baked into the core.
Out of memory
Tail recursion is useful because it prevents stack overflows, but it isn’t bulletproof. Tail-recursive implementations can still grow too large for memory.
To make our branching path-finding algorithms tail-recursive we introduced a pile of plans. The pile of plans grows proportionally to the depth of the tree. If the tree is extremely large, you will run out of memory. This happened to me while working on some statistics.
A note on testing
In the beginning of this article I described the behavior of all_complete_paths(1) with the following test:
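Something along these lines:

```
assert all_complete_paths(1) == [[1, 2, 4], [1, 2, 5], [1, 3, 6], [1, 3, 7]]
```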


However, some of the various implementations of all_complete_paths we’ve seen might yield the same paths but in a different order. This test would fail for those particular implementations.
You might not really care about the order, as long as all the paths are there. Then you can test against a set instead of a list:
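Something along these lines:

```
assert set(all_complete_paths(1)) == {[1, 2, 4], [1, 2, 5], [1, 3, 6], [1, 3, 7]}
```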


Unfortunately, in some languages, including Python, it’s impossible to create sets of lists. The above code will cause an error. Depending on your language you might have to write a small helper function:
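A sketch; the helper name paths_to_set is my own. Converting each path to a tuple makes it hashable:

```python
def paths_to_set(paths):
    # tuples, unlike lists, are hashable and can live in a set
    return {tuple(path) for path in paths}

# usage with any of the implementations:
# assert paths_to_set(all_complete_paths(1)) == {(1, 2, 4), (1, 2, 5), (1, 3, 6), (1, 3, 7)}
```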

