Testing Input & Output
Using functions to write advanced tests.
In a previous article I wrote about what testing is and why automated tests help improve your code, and I showed the simplest way to start writing automated tests in any language.
In this article I’ll build upon that simple approach to show a more advanced one. Although the previous article will give you more detail, this article can be read stand-alone.
I refactor and extend 23 lines of code in small and clear steps, using nothing more than the familiar concept of a function, to end up with a version that is automatically tested in a nice way. At the very end I briefly discuss how what you’ve seen is related to some more advanced topics: higher order functions, closures and polymorphism.
Although the more advanced testing approach discussed in this article can be used for all sorts of IO programs (such as making web requests and writing to databases) the example program in this article is a small game that is played in the console.
This game is the Guess My Number exercise of MITx 6.001x Introduction to Computer Science and Programming Using Python. A client of mine wrote a working program for this exercise but asked me to help him improve his program’s structure. This article is based on our coaching sessions.
In the Guess My Number exercise you are tasked with programming a small game. The game asks you to think of a number between 0 and 100. It guesses a number and asks you to reply with ‘too high’ (h), ‘too low’ (l) or ‘correct’ (c). It continues to guess until you answer ‘correct’.
After a user has put in l, h, q (an invalid input) and c, this is what the console should look like:
Below is the code my client came to me with. Manual testing verifies that ‘it works’, meaning the program’s behavior is correct.
My client’s goal was to change (hopefully improve!) the program’s structure while keeping the program’s behavior the same.
In this article I’ll show how we first pinned down the program’s behavior with automated tests. Having done that we were able to make changes to the structure without having to spend time and energy on manually checking if the behavior had remained the same.
First we’ll focus on automating the input. We’ll still have to manually verify the behavior by looking at what is printed in the console. Automatically checking the output will come afterwards.
Automating the input
Let’s look at line 13 of the code:
answer = input().
input is a standard Python function. Calling it halts the program until a user types something in the console and hits the enter key. Whatever was typed is returned as a string.
We want to use the game code for two purposes: to manually play the game and to test the code automatically. Calling input triggers a manual action so we can’t use input for testing.
Because we want to call the game code multiple times (for playing and for testing) we wrap it in a function. Next we want to decide per call to this function if we use input (for playing) or a different function (for testing). To accomplish this we replace input with the newly introduced parameter get_input.
Passing functions as arguments might seem strange to you, but if you run this program you will notice that the behavior is exactly the same as before.
Indeed, nothing has changed: we haven’t automated anything yet.
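A minimal sketch of this refactor. The original 23-line program is not shown here, so the loop below is an assumed bisection-style guessing game with illustrative messages; the only parts taken from the text are the wrapping function play_game and its get_input parameter.

```python
def play_game(get_input):
    """Guess a number between 0 and 100 by repeatedly halving the range."""
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}?")
        answer = get_input()  # was: answer = input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            print("Sorry, I did not understand your input.")
    print(f"Game over. Your secret number was: {guess}")


# Manual run, commented out so the sketch can be executed without a keyboard:
# play_game(input)
```

Note that input is passed without parentheses: we hand over the function itself, and play_game decides when to call it.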
play_game(input) is a manual run of the game. We want to call play_game with a different function as argument such that the input is automated. We want to be able to write something like play_game(get_input_automated).
It is very important that our play_game function cannot notice any difference between input and get_input_automated! Before we replaced input with a parameter, what did the game code expect of input? That it is a function that can be called without arguments and that every call to it returns a string. These expectations must also be met by get_input_automated.
Let’s use the l, h, q, c example of before as our first case. What should the behavior of the function get_input_automated be? The first time get_input is called it should return "l", the second time "h", the third time "q" and the fourth time "c".
To achieve this behavior we can use a list and the function pop. inputs.pop(0) removes the first element from the list inputs and returns that removed element.
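As a quick illustration of pop on a list (the variable names first and second are only for this example):

```python
inputs = ["l", "h", "q", "c"]

first = inputs.pop(0)   # removes and returns the element at index 0
second = inputs.pop(0)  # the list has shifted, so this is the old second element

print(first, second, inputs)  # prints: l h ['q', 'c']
```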
We can implement get_input_automated as follows:
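A sketch of such an implementation, assuming the list is simply a module-level variable named inputs:

```python
# The list lives outside the function, so every call takes from the same list.
inputs = ["l", "h", "q", "c"]


def get_input_automated():
    # No arguments, returns a string: the same expectations that input meets.
    return inputs.pop(0)
```

Four consecutive calls return "l", "h", "q" and "c". A fifth call would raise an IndexError because the list is empty by then.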
Note that it is crucial that inputs is created outside the scope of get_input_automated. If it were created inside we would get different behavior:
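To make the difference concrete, here is a sketch of the variant that should be avoided (the variable answers is only there to show the result):

```python
def get_input_automated():
    inputs = ["l", "h", "q", "c"]  # a fresh list is built on every call
    return inputs.pop(0)           # so this is always the first element


answers = [get_input_automated() for _ in range(4)]
print(answers)  # prints: ['l', 'l', 'l', 'l']
```

A game fed by this function would never receive "c", so it would never end.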
When inputs is created inside the scope of get_input_automated, a new list is created each time get_input_automated is called. But we don’t want that. We want all calls to get_input_automated to take from the same list. To accomplish that we separate creating the list from calling get_input_automated by creating the list outside the scope of get_input_automated. Shortly we’ll look at how to do this a bit more nicely but for now let’s add it to our program:
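A sketch of the program at this stage. The body of play_game is an assumed stand-in for the original game code (a bisection-style loop with illustrative messages); the list inputs and the function get_input_automated follow the text.

```python
def play_game(get_input):
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}?")
        answer = get_input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            print("Sorry, I did not understand your input.")
    print(f"Game over. Your secret number was: {guess}")


inputs = ["l", "h", "q", "c"]


def get_input_automated():
    return inputs.pop(0)


# Automated run: the answers come from the list instead of the keyboard.
play_game(get_input_automated)

# Manual run, disabled for now:
# play_game(input)
```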
Note that playing a manual game is currently commented out. If we run this code we see a full game being printed in the console, without us having to provide input manually.
Take a few moments to appreciate how cool this is. Automation achieved!
What if we want multiple test cases? Let’s say we want to also automate the case in which the player inputs l followed by c. We could do something like this:
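A sketch of two test cases written side by side. The name get_inputs_l_c comes from the text; inputs_l_h_q_c, get_inputs_l_h_q_c and inputs_l_c are assumed names. The calls to play_game are shown as comments so that the snippet runs on its own.

```python
inputs_l_h_q_c = ["l", "h", "q", "c"]


def get_inputs_l_h_q_c():
    return inputs_l_h_q_c.pop(0)


# play_game(get_inputs_l_h_q_c)

inputs_l_c = ["l", "c"]


def get_inputs_l_c():
    return inputs_l_c.pop(0)


# play_game(get_inputs_l_c)
```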
This structure could be improved. It repeats names a lot (three per test case) and if you squint a little you’ll see that the two test cases are almost duplicates.
Apart from their names, get_inputs_l_c and its counterpart for the first test case differ in only one aspect: which list they pop from. We can create a function that has the list to pop from as a parameter and returns a function that pops from this list.
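A sketch of such a function, using the name create_get_input and the parameter inputs. The variables holding the lists and the returned functions are assumed names.

```python
def create_get_input(inputs):
    # Each call to create_get_input produces a new function
    # that pops from the list it was given.
    def get_input():
        return inputs.pop(0)

    return get_input


inputs_l_h_q_c = ["l", "h", "q", "c"]
get_inputs_l_h_q_c = create_get_input(inputs_l_h_q_c)

inputs_l_c = ["l", "c"]
get_inputs_l_c = create_get_input(inputs_l_c)
```

The two returned functions are independent: calling one does not affect the list of the other.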
Removing duplication by refactoring in this way is called extract function.
We can clean this up further by inlining: replacing variables by the values assigned to them.
The complete code so far looks like this:
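A sketch of what the complete program could look like at this point. As before, the body of play_game is an assumed stand-in for the original game code; the lists and the functions created from them are inlined into the calls.

```python
def play_game(get_input):
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}?")
        answer = get_input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            print("Sorry, I did not understand your input.")
    print(f"Game over. Your secret number was: {guess}")


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)

    return get_input


# Automated runs, one per test case.
play_game(create_get_input(["l", "h", "q", "c"]))
play_game(create_get_input(["l", "c"]))

# Manual run, disabled for now:
# play_game(input)
```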
Automating checking the output
According to the simple testing method of the previous article we verify the behavior of the program by checking the value that is returned by a call to the function we wrapped around our initial code.
In our current situation that would look something like this:
We’ll first try this method. Afterwards we’ll try something new which is tidier when testing these kinds of programs.
The return value method
What should play_game return so we can verify it behaves correctly? If we test play_game manually we look at the lines that are printed to the console. So let’s make play_game not only print but also return those lines as a list of strings. We can then have an automated test as follows:
We can replace print with print_and_archive. Given a message, this function will both print it and append it to a list.
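A sketch of the return value method. The function name print_and_archive comes from the text; the list name messages, the game loop and its messages are assumptions made for this example.

```python
def play_game(get_input):
    messages = []  # assumed name for the list of archived lines

    def print_and_archive(message):
        print(message)
        messages.append(message)

    low, high = 0, 100
    print_and_archive("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print_and_archive(f"Is your secret number {guess}?")
        answer = get_input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            print_and_archive("Sorry, I did not understand your input.")
    print_and_archive(f"Game over. Your secret number was: {guess}")
    return messages


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)

    return get_input


# The automated test: compare the returned lines with the expected ones.
assert play_game(create_get_input(["l", "c"])) == [
    "Please think of a number between 0 and 100!",
    "Is your secret number 50?",
    "Is your secret number 75?",
    "Game over. Your secret number was: 75",
]
```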
Although this approach does work, it is not so nice because now we’re always doing something we don’t want. When playing a manual game we collect strings for no reason, although perhaps we could live with that. A true sin, however, is that in a test we still print.
Waste of resources aside, if we first run the tests and finally the manual game, the console will already contain a bunch of prints that have nothing to do with our current playing session! Zero stars, thumbs down, money back.
In all seriousness, if the output of our program were not printed to a console but, let’s say, written to a database, we wouldn’t want every single one of our tests to actually write to that database.
For our game we could clear the console after the tests but it is nicer to prevent the issue by not printing in the tests.
Next I’ll show a new approach that no longer has the wrapping function return a value and will no longer print in the tests.
The parameter method
We made the input a parameter because we want different ways of inputting for playing and for testing. Similarly, we can refactor the way of outputting to be a parameter as well.
In play_game, let’s replace print with the new parameter put_out. We also add print as an argument wherever play_game is called.
Just as when we made the input a parameter, this refactor does not change any behavior so we still have not automated checking the output.
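Now we create an alternative to print that we can pass as put_out. What did play_game expect of print? That it is a function that can be called with a single string as argument. Our alternative to print will also have to meet this expectation.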
Before we asserted a list of strings returned by play_game. Now we’ll assert the list messages_put_out that is filled by the function we pass as put_out.
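A sketch of the parameter method for a single test case. The names put_out and messages_put_out come from the text; put_out_automated is a hypothetical name, and the game loop with its messages is an assumed stand-in for the original code.

```python
def play_game(get_input, put_out):
    low, high = 0, 100
    put_out("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        put_out(f"Is your secret number {guess}?")
        answer = get_input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            put_out("Sorry, I did not understand your input.")
    put_out(f"Game over. Your secret number was: {guess}")


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)

    return get_input


messages_put_out = []


def put_out_automated(message):  # hypothetical name
    # Called with one string, like print, but it stores instead of printing.
    messages_put_out.append(message)


play_game(create_get_input(["l", "h", "q", "c"]), put_out_automated)

assert messages_put_out == [
    "Please think of a number between 0 and 100!",
    "Is your secret number 50?",
    "Is your secret number 75?",
    "Is your secret number 62?",
    "Sorry, I did not understand your input.",
    "Is your secret number 62?",
    "Game over. Your secret number was: 62",
]

# Manual run, disabled for now:
# play_game(input, print)
```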
If we do this for multiple test cases we get duplication similar to before:
Again we have some duplication. This time the assert is part of the duplication. If we extract a function, the assert should become part of that function. Doing so results in a new test function.
Both test cases will run and pass, and afterwards a manual game will be started. The tests did not print anything and our game starts in a clean console!
Now we have solidly pinned down the original program’s behavior with automated tests without printing to the console. Only in our manual game do we print to the console, which is what we want. We can now fully focus on changing the structure of our original program, which has become the body of play_game, without having to spend any time or energy on checking if we broke anything.
It took some effort to get to this point, but writing tests is an investment that might (probably will) pay off and is worth considering!
There are names for some things we’ve seen in this article. Let’s go over them.
Higher order function
A higher order function is a function that satisfies one or both of the following:
- at least one parameter is a function
- it returns a function
This makes play_game a higher order function as both parameters, get_input and put_out, are functions.
create_get_input is a higher order function because it returns a function.
Closure
The function get_input returned by create_get_input has dynamic behavior. Each time we call it, it returns a different value. We control exactly what this dynamic behavior is: each time return the next string from a list.
Each time we call get_input a scope is created and then destroyed. And yet there is coordinated behavior between calls: each consecutive call returns the next string from the same list. How is this possible?
Each call to get_input operates on the same state that exists beyond any call to get_input. In this case that state is the variable inputs, which is a parameter of the function create_get_input. inputs is not part of the scope of get_input so it is ‘kept alive’ after a call to get_input has completed.
A closure is a function that accesses state that is maintained beyond the lifetime of calls to this function. This is typically used to create coordinated behavior between individual calls to this function.
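In Python this kept-alive state can be inspected directly. The sketch below uses create_get_input as described in this article; the variables first, second, free_variables and closed_over are only there for illustration.

```python
def create_get_input(inputs):
    def get_input():
        # inputs is not local to get_input: it belongs to the enclosing call
        # of create_get_input and outlives every individual call to get_input.
        return inputs.pop(0)

    return get_input


get_input = create_get_input(["l", "c"])

first = get_input()
second = get_input()

# Python records which outer variables a function closes over ...
free_variables = get_input.__code__.co_freevars
# ... and keeps their current values in the function's closure cells.
closed_over = get_input.__closure__[0].cell_contents

print(first, second, free_variables, closed_over)
```

After the two calls, the list held in the closure is empty: both calls took from the very same list.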
I want to say many things about closures. However, to maintain focus I decided to put that in a future article.
If you are more experienced you might recognize that to achieve this coordinated behavior you can also use objects with methods. Indeed you can. You could say methods are closures. However, in this case creating closures with functions is adequate.
Polymorphism
Let’s compare the original code for playing a game with our final version:
The only real difference is that the original version of the game code called input and print directly.
Originally the game code depended specifically on the function input. Now it depends on the abstract expectation that get_input is any function that can be called without arguments and each time will return a string. Likewise, the game code depends on an abstract expectation of put_out.
We can call play_game with other functions, as long as they satisfy the expectations. This is exactly what we do in our tests.
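To illustrate, here is a sketch that calls play_game with functions that look nothing like input and print but still meet the expectations. The game loop is an assumed stand-in for the original code, and the lambdas and the names answers and collected are made up for this example.

```python
def play_game(get_input, put_out):
    low, high = 0, 100
    put_out("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        put_out(f"Is your secret number {guess}?")
        answer = get_input()
        if answer == "c":
            break
        elif answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        else:
            put_out("Sorry, I did not understand your input.")
    put_out(f"Game over. Your secret number was: {guess}")


# Answers drawn from an iterator, output collected with the list's own append.
answers = iter(["l", "c"])
collected = []
play_game(lambda: next(answers), collected.append)

# A player who always answers "c", with the output discarded entirely.
play_game(lambda: "c", lambda message: None)
```

play_game cannot tell these apart from input and print: each get_input takes no arguments and returns a string, and each put_out accepts one string.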
How the expectations of the game code are met can now take many shapes. Many, poly. Shapes, morphs. Polymorphism.
Typically polymorphism is explained with interfaces and objects, or classes and instances. These are more elaborate tools that also solve the problem of defining expectations and creating concrete realizations of these expectations.
Often the simple and familiar function is adequate for solving this problem. Moreover, in my experience students who first obtain a strong understanding of functions and closures have an easy time when they later learn about interfaces and classes.
Very many difficulties in programming arise because it matters when or how many times certain code runs. The more you write your code in such a way that it no longer matters when or how many times it runs, the fewer of these problems you’ll see.
In this article we saw an example of this phenomenon. Originally our game code printed in the console. It matters how many times this code runs because we don’t want to see prints that have nothing to do with our current playing session.
After we refactored in such a way that we could run the game code without printing, it no longer mattered how many times we ran it and we could run it many times (once for each test case) without having to worry.