Testing Input & Output
Using functions to write advanced tests.
In a previous article I wrote about what testing is and why automated tests help improve your code, and I showed the simplest way to start writing automated tests in any language.
In this article I’ll build upon that simple approach to show a more advanced one. Although the previous article will give you more detail, this article can be read stand-alone.
I refactor and extend 23 lines of code in small and clear steps, using nothing more than the familiar concept of a function, to end up with a version that is automatically tested in a nice way. At the very end I briefly discuss how what you’ve seen is related to some more advanced topics: higher order functions, closures and polymorphism.
Although the more advanced testing approach discussed in this article can be used for all sorts of IO programs (such as making web requests and writing to databases) the example program in this article is a small game that is played in the console.
This game is the Guess My Number exercise of MITx 6.001x Introduction to Computer Science and Programming Using Python. A client of mine wrote a working program for this exercise but asked me to help him improve his program’s structure. This article is based on our coaching sessions.
The exercise
For the Guess My Number exercise you are tasked to program a small game. The game asks you to think of a number between 0 and 100. It guesses a number and asks you to reply with ‘too high’ (h), ‘too low’ (l) or ‘correct’ (c). It continues to guess until you answer ‘correct’.
After a user has put in h, l, q (an invalid input) and c, this is what the console should look like:
Below is the code my client came to me with. Manual testing verifies that ‘it works’, meaning the program’s behavior is correct.
My client’s goal was to change (hopefully improve!) the program’s structure while keeping the program’s behavior the same.
In this article I’ll show how we first pinned down the program’s behavior with automated tests. Having done that we were able to make changes to the structure without having to spend time and energy on manually checking if the behavior had remained the same.
First we’ll focus on automating the input. We’ll still have to manually verify the behavior by looking at what is printed in the console. Automatically checking the output will come afterwards.
Automating the input
Let’s look at the line of the code that reads the player’s answer: answer = input(). input is a standard Python function. Calling it halts the program until a user types something in the console and hits the enter key. Whatever was typed is returned as a string.
We want to use the game code for two purposes: to manually play the game and to test the code automatically. Calling input triggers a manual action so we can’t use input for testing.
Because we want to call the game code multiple times (for playing and for testing) we wrap it in a function. Next we want to decide per call to this function whether we use input (for playing) or a different function (for testing). To accomplish this we replace input with the newly introduced parameter get_input:
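A minimal sketch of this step is shown below. The client's program is not reproduced here, so the game body and the exact message texts are assumptions; only the name play_game, the parameter get_input and the replaced call answer = input() come from the article.

```python
def play_game(get_input):
    # Assumed game logic: halve the search range after every answer.
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()  # was: answer = input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            print(f"Game over. Your secret number was: {guess}")
            break
        else:
            print("Sorry, I did not understand your input.")


# A manual game passes the built-in function itself, without parentheses:
# play_game(input)
```

The manual call is left as a comment so the sketch can be loaded without waiting for keyboard input.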
Passing functions as arguments might seem strange to you, but if you run this program you will notice that the behavior is exactly the same as before.
Indeed, nothing has changed: we haven’t automated anything yet. play_game(input) is a manual run of the game. We want to call play_game with a different function as argument such that the input is automated. We want to do something like play_game(get_input_automated).
It is very important that our play_game function cannot notice any difference between input and get_input_automated! Before we replaced input with a parameter, what did the game code expect of input? That it is a function that can be called without arguments and that every call to it returns a string. These expectations must also be met by get_input_automated.
Let’s use the h, l, q, c example from before as our first case. What should the behavior of the function get_input_automated be? The first time get_input is called it should return "h", the second time "l", the third time "q" and the fourth time "c".
To achieve this behavior we can use a list and the list method pop. inputs.pop(0) removes the first element from the list inputs and returns that removed element.
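As a small illustration of pop(0) on its own (the list contents follow the h, l, q, c example; the variable names first and second are only for this sketch):

```python
inputs = ["h", "l", "q", "c"]

first = inputs.pop(0)   # removes and returns the first element
second = inputs.pop(0)  # the list is now shorter, so this is the next element

print(first, second)  # h l
print(inputs)         # ['q', 'c']
```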
We can implement get_input_automated as follows:
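One possible implementation, assuming the h, l, q, c case, could look like this:

```python
inputs = ["h", "l", "q", "c"]  # created once, outside the function


def get_input_automated():
    # Every call takes the next string from the same shared list.
    return inputs.pop(0)


print(get_input_automated())  # h
print(get_input_automated())  # l
```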
Note that it is crucial that inputs is created outside the scope of get_input_automated. If it were created inside we would get different behavior:
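The difference can be seen in a short sketch where the list is created inside the function instead:

```python
def get_input_automated():
    inputs = ["h", "l", "q", "c"]  # a fresh list on every call
    return inputs.pop(0)


print(get_input_automated())  # h
print(get_input_automated())  # h again, not l
```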
If inputs is created inside the scope of get_input_automated, a new list is created each time get_input_automated is called. But we don’t want that. We want all calls to get_input_automated to take from the same list. To accomplish that we separate creating the list from calling get_input_automated by creating the list outside the scope of get_input_automated. Shortly we’ll look at how to do this a bit more nicely but for now let’s add it to our program:
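Put together, the program at this stage might look roughly like the sketch below. The game body and its messages are assumptions standing in for the client's code; the names play_game, get_input, inputs and get_input_automated are the ones used in the text.

```python
def play_game(get_input):
    # Assumed game logic: halve the search range after every answer.
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            print(f"Game over. Your secret number was: {guess}")
            break
        else:
            print("Sorry, I did not understand your input.")


inputs = ["h", "l", "q", "c"]


def get_input_automated():
    return inputs.pop(0)


play_game(get_input_automated)  # automated game
# play_game(input)              # manual game
```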
Note that playing a manual game is currently commented out. If we run this code we see a full game being printed in the console, without us having to provide input manually.
Take a few moments to appreciate how cool this is. Automation achieved!
What if we want multiple test cases? Let’s say we also want to automate the case in which the player inputs l followed by c. We could do something like this:
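A sketch of what two such test cases could look like, using the function names mentioned in the text (the list names are assumptions):

```python
inputs_h_l_q_c = ["h", "l", "q", "c"]
def get_inputs_h_l_q_c():
    return inputs_h_l_q_c.pop(0)

inputs_l_c = ["l", "c"]
def get_inputs_l_c():
    return inputs_l_c.pop(0)

# Each function would then be passed to the game:
# play_game(get_inputs_h_l_q_c)
# play_game(get_inputs_l_c)
```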
This structure could be improved. It repeats names a lot (three per test case) and if you squint a little you’ll see that lines 1-3 and 5-7 are almost duplicates.
Apart from their names, get_inputs_h_l_q_c and get_inputs_l_c only differ in one aspect: which list they pop from. We can create a function that has the list to pop from as a parameter and returns a function that pops from this list.
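Such a function could be sketched as follows. The name create_get_input is the one used later in this article; the list variable names are assumptions.

```python
def create_get_input(inputs):
    # Returns a new function that pops from the given list on every call.
    def get_input():
        return inputs.pop(0)
    return get_input


inputs_h_l_q_c = ["h", "l", "q", "c"]
get_inputs_h_l_q_c = create_get_input(inputs_h_l_q_c)

inputs_l_c = ["l", "c"]
get_inputs_l_c = create_get_input(inputs_l_c)
```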
Removing duplication by refactoring in this way is called extract function.
We can clean this up further by inlining: replacing variables by the values assigned to them.
The complete code so far looks like this:
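As a rough sketch of the state of the program at this point, with both variables inlined (the game body and messages are assumptions, not the client's code):

```python
def play_game(get_input):
    # Assumed game logic: halve the search range after every answer.
    low, high = 0, 100
    print("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            print(f"Game over. Your secret number was: {guess}")
            break
        else:
            print("Sorry, I did not understand your input.")


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    return get_input


play_game(create_get_input(["h", "l", "q", "c"]))  # automated case 1
play_game(create_get_input(["l", "c"]))            # automated case 2
# play_game(input)                                 # manual game
```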
Automating checking the output
According to the simple testing method of the previous article we verify the behavior of the program by checking the value that is returned by a call to the function we wrapped around our initial code.
In our current situation that would look something like this:
We’ll first try this method. Afterwards we’ll try something new which is tidier when testing these kinds of programs.
The return value method
What should play_game return so we can verify it behaves correctly? If we test play_game manually we look at the lines that are printed to the console. So let’s make play_game not only print but also return those lines as a list of strings. We can then have an automated test as follows:
We can replace print with a different function named print_and_archive. Given a message the function will both print it and append it to the list named printed. Ultimately play_game returns printed.
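A sketch of the return value method, assuming the same stand-in game logic and messages as a placeholder for the client's code. The names print_and_archive and printed come from the text.

```python
def play_game(get_input):
    printed = []

    def print_and_archive(message):
        print(message)           # still prints, also during tests
        printed.append(message)  # and keeps a copy for the caller

    low, high = 0, 100
    print_and_archive("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        print_and_archive(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            print_and_archive(f"Game over. Your secret number was: {guess}")
            break
        else:
            print_and_archive("Sorry, I did not understand your input.")
    return printed


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    return get_input


assert play_game(create_get_input(["l", "c"])) == [
    "Please think of a number between 0 and 100!",
    "Is your secret number 50? (h/l/c)",
    "Is your secret number 75? (h/l/c)",
    "Game over. Your secret number was: 75",
]
```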
Although this approach works, it is not so nice because we are now always doing something we don’t want. When playing a manual game we collect strings for no reason, although perhaps we could live with that. A true sin, however, is that in a test we still print.
Waste of resources aside, if we first run the tests and then the manual game, the console will already contain a bunch of prints that have nothing to do with our current playing session! Zero stars, thumbs down, money back.
In all seriousness, if the output of our program were not printed to a console but, let’s say, written to a database, we would not want every single one of our tests to actually write to that database.
For our game we could clear the console after the tests but it is nicer to prevent the issue by not printing in the tests.
Next I’ll show a new approach that no longer has the wrapping function return a value and will no longer print in the tests.
The parameter method
We made the input a parameter because we want different ways of inputting for playing and for testing. Similarly, we can refactor the way of outputting to be a parameter as well.
Within play_game, let’s replace print with the new parameter put_out. We also add print as a second argument to calls to play_game.
Just as when we made the input a parameter, this refactor does not change any behavior so we still have not automated checking the output.
Now we create an alternative to print to use for testing. Let’s name this function put_out. What did play_game expect of print? That it is a function that can be called with a string as argument, any number of times. put_out will also have to meet this expectation.
Before, we asserted a list of strings returned by play_game. Now we’ll assert the list messages_put_out that is filled by put_out.
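The parameter method could be sketched like this. The game logic and message texts are again assumptions; play_game, put_out, messages_put_out and create_get_input are the names used in the text.

```python
def play_game(get_input, put_out):
    # Assumed game logic; every former print call now goes through put_out.
    low, high = 0, 100
    put_out("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        put_out(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            put_out(f"Game over. Your secret number was: {guess}")
            break
        else:
            put_out("Sorry, I did not understand your input.")


def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    return get_input


messages_put_out = []


def put_out(message):
    messages_put_out.append(message)  # collect instead of printing


play_game(create_get_input(["l", "c"]), put_out)
assert messages_put_out == [
    "Please think of a number between 0 and 100!",
    "Is your secret number 50? (h/l/c)",
    "Is your secret number 75? (h/l/c)",
    "Game over. Your secret number was: 75",
]

# A manual game still prints to the console:
# play_game(input, print)
```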
If we do this for multiple test cases we get duplication similar to before:
Again we have some duplication. This time the assert is part of the duplication. If we extract a function, assert should become part of that function. Doing so results in the new function test.
Both test cases run and pass, and afterwards a manual game is started. The tests did not print anything and our game starts in a clean console!
Now we have solidly pinned down the original program’s behavior with automated tests without printing to the console. Only in our manual game do we print to the console, which is what we want. We can now fully focus on changing the structure of our original program, which has become the body of play_game, without having to spend any time or energy on checking if we broke anything.
It took some effort to get to this point, but writing tests is an investment that might (probably will) pay off and is worth considering!
Vocabulary
There are names for some things we’ve seen in this article. Let’s go over them.
Higher order function
A higher order function is a function that satisfies one or both of the following:
- at least one parameter is a function
- it returns a function
This makes play_game a higher order function as both parameters get_input and put_out are functions. create_get_input is a higher order function because it returns a function.
A function get_input returned by create_get_input has dynamic behavior. Each time we call it, it returns a different value. We control exactly what this dynamic behavior is: each time return the next string from a list.
Closure
Each time we call get_input a scope is created and then destroyed. And yet there is coordinated behavior between calls: each consecutive call returns the next string from the same list. How is this possible?
Each call to get_input operates on the same state that exists beyond any call to get_input. In this case that state is the variable inputs, which is a parameter of the function create_get_input. inputs is not part of the scope of get_input so it is ‘kept alive’ after a call to get_input has completed.
A closure is a function that accesses state that is maintained beyond the lifetime of calls to this function. This is typically used to create coordinated behavior between individual calls to this function.
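To make this concrete, the sketch below creates two closures and shows that each keeps its own list alive between calls. The names get_input_a and get_input_b are only for illustration; the last line uses Python's __closure__ attribute to peek at the captured variable.

```python
def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)  # inputs belongs to the enclosing call
    return get_input


get_input_a = create_get_input(["h", "c"])
get_input_b = create_get_input(["l", "c"])

print(get_input_a())  # h
print(get_input_b())  # l, from a separate list that get_input_a never touches
print(get_input_a())  # c, continuing where the previous call left off

# The captured list lives on in the function object itself.
print(get_input_a.__closure__[0].cell_contents)  # []
```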
I want to say many things about closures. However, to maintain focus I decided to put that in a future article.
If you are more experienced you might recognize that to achieve this coordinated behavior you can also use objects with methods. Indeed you can. You could say methods are closures. However, in this case creating closures with functions is adequate.
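For comparison, here is a sketch of the object-based alternative. The class name InputQueue is hypothetical; the bound method behaves like the closure returned by create_get_input.

```python
class InputQueue:  # hypothetical name, not from the article
    def __init__(self, inputs):
        self.inputs = list(inputs)  # state kept on the object

    def get_input(self):
        return self.inputs.pop(0)


queue = InputQueue(["l", "c"])
get_input = queue.get_input  # a bound method, callable without arguments

print(get_input())  # l
print(get_input())  # c
```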
Polymorphism
Let’s compare the original code for playing a game with our final version:
The only difference really is that the original version of the game code called input and print directly whereas the refactored version calls input and print indirectly. We created an indirection.
Originally the game code depended specifically on the function input. Now it depends on the abstract expectation that get_input is any function that can be called without arguments and each time will return a string. Likewise, the game code depends on an abstract expectation of put_out.
We can call play_game with input and print as arguments. Both functions satisfy the expectations of the game code. We can also call play_game with other functions, as long as they satisfy the expectations. This is exactly what we do in our tests.
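One way to write these expectations down is with type hints. This is only a sketch: the alias names GetInput and PutOut are assumptions, the game body is a stand-in, and the scripted input uses an iterator instead of create_get_input to show yet another shape that fits.

```python
from typing import Callable

GetInput = Callable[[], str]    # called without arguments, returns a string
PutOut = Callable[[str], None]  # called with a string, result is not used


def play_game(get_input: GetInput, put_out: PutOut) -> None:
    # Assumed game logic: halve the search range after every answer.
    low, high = 0, 100
    put_out("Please think of a number between 0 and 100!")
    while True:
        guess = (low + high) // 2
        put_out(f"Is your secret number {guess}? (h/l/c)")
        answer = get_input()
        if answer == "h":
            high = guess
        elif answer == "l":
            low = guess
        elif answer == "c":
            put_out(f"Game over. Your secret number was: {guess}")
            break
        else:
            put_out("Sorry, I did not understand your input.")


# One expectation, several shapes:
scripted = iter(["l", "c"])
collected = []
play_game(lambda: next(scripted), collected.append)  # scripted in, collected out
# play_game(input, print)                            # console in, console out
```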
How the expectations of the game code are met can now take many shapes. Many, poly. Shapes, morphs. Polymorphism.
Typically polymorphism is explained with interfaces and objects, or classes and instances. These are more elaborate tools that also solve the problem of defining expectations and creating concrete realizations of these expectations.
Often the simple and familiar function is adequate for solving this problem. What’s more, in my experience students who first obtain a strong understanding of functions and closures have an easy time when they later learn about interfaces and classes.
Closing words
Very many difficulties in programming arise because it matters when or how many times certain code runs. The more you write your code in such a way that it no longer matters when or how many times it runs, the fewer of these problems you’ll see.
In this article we saw an example of this phenomenon. Originally our game code printed in the console. It matters how many times this code runs because we don’t want to see prints that have nothing to do with our current playing session.
After we refactored in such a way that we could run the game code without printing, it no longer mattered how many times we ran it and we could run it many times (once for each test case) without having to worry.