In a previous article I wrote about what testing is and why automated tests help improve your code, and I showed the simplest way to start writing automated tests in any language.

In this article I’ll build upon that simple approach to show a more advanced one. Although the previous article will give you more detail, this article can be read stand-alone.

I refactor and extend 23 lines of code in small and clear steps, using nothing more than the familiar concept of a function, to end up with a version that is automatically tested in a nice way. At the very end I briefly discuss how what you’ve seen is related to some more advanced topics: higher order functions, closures and polymorphism.

Although the more advanced testing approach discussed in this article can be used for all sorts of IO programs (such as making web requests and writing to databases) the example program in this article is a small game that is played in the console.

This game is the Guess My Number exercise of MITx 6.001x Introduction to Computer Science and Programming Using Python. A client of mine wrote a working program for this exercise but asked me to help him improve his program’s structure. This article is based on our coaching sessions.

The exercise

For the Guess My Number exercise you are tasked to program a small game. The game asks you to think of a number between 0 and 100. It guesses a number and asks you to reply with ‘too high’ (h), ‘too low’ (l) or ‘correct’ (c). It continues to guess until you answer ‘correct’.

After a user has put in h, l, q (an invalid input) and c, this is what the console should look like:

Please think of a number between 0 and 100!
Is your secret number 50?
Enter 'h', 'l' or 'c'.
h
Is your secret number 25?
Enter 'h', 'l' or 'c'.
l
Is your secret number 38?
Enter 'h', 'l' or 'c'.
q
Sorry, I did not understand your input.
Is your secret number 38?
Enter 'h', 'l' or 'c'.
c
Game over. Your secret number was: 38

Below is the code my client came to me with. Manual testing verifies that ‘it works’, meaning the program’s behavior is correct.

low = 0
high = 100
keep_going = True

print("Please think of a number between 0 and 100!")

while keep_going:
    guess = round((low + high) / 2)
    
    print("Is your secret number " + str(guess) + "?")
    print("Enter 'h', 'l' or 'c'.")

    answer = input()
    
    if (answer == "c"):
        print("Game over. Your secret number was: " + str(guess))
        break
    elif (answer == "h"):
        high = guess
    elif (answer == "l"):
        low = guess
    else:
        print("Sorry, I did not understand your input.")

My client’s goal was to change (hopefully improve!) the program’s structure while keeping the program’s behavior the same.

In this article I’ll show how we first pinned down the program’s behavior with automated tests. Having done that we were able to make changes to the structure without having to spend time and energy on manually checking if the behavior had remained the same.

First we’ll focus on automating the input. We’ll still have to manually verify the behavior by looking at what is printed in the console. Automatically checking the output will come afterwards.

Automating the input

Let’s look at this line of the code: answer = input(). input is a standard Python function. Calling it halts the program until a user types something in the console and hits the enter key. Whatever was typed is returned as a string.

We want to use the game code for two purposes: to manually play the game and to test the code automatically. Calling input triggers a manual action so we can’t use input for testing.

Because we want to call the game code multiple times (for playing and for testing) we wrap it in a function. Next we want to decide per call to this function if we use input (for playing) or a different function (for testing). To accomplish this we replace input with the newly introduced parameter get_input:

def play_game(get_input):
    low = 0
    high = 100
    keep_going = True

    print("Please think of a number between 0 and 100!")

    while keep_going:
        guess = round((low + high) / 2)
        
        print("Is your secret number " + str(guess) + "?")
        print("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        if (answer == "c"):
            print("Game over. Your secret number was: " + str(guess))
            break
        elif (answer == "h"):
            high = guess
        elif (answer == "l"):
            low = guess
        else:
            print("Sorry, I did not understand your input.")

play_game(input)

Passing functions as arguments might seem strange to you, but if you run this program you will notice that the behavior is exactly the same as before.

Indeed, nothing has changed: we haven’t automated anything yet. play_game(input) is a manual run of the game. We want to call play_game with a different function as argument such that the input is automated. We want to do something like this:

def get_input_automated():
    # something

play_game(get_input_automated)

It is very important that our play_game function cannot notice any difference between input and get_input_automated! Before we replaced input with a parameter, what did the game code expect of input? That it is a function that can be called without arguments and that every call to it returns a string. These expectations must also be met by get_input_automated.
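To make this expectation concrete, here is a minimal sketch. Both function names are made up for illustration and are not part of the game code: always_correct is about the simplest function that meets the expectation, and ask_three_times is a caller that relies on nothing but the expectation.

```python
def always_correct():
    # Hypothetical stand-in for input: takes no arguments
    # and returns a string on every call.
    return "c"

def ask_three_times(get_input):
    # Hypothetical caller: like the game code, it only assumes that
    # get_input can be called without arguments and returns a string.
    return [get_input() for _ in range(3)]

answers = ask_three_times(always_correct)
print(answers)  # ['c', 'c', 'c']
```

The caller cannot tell whether the strings came from a keyboard or from a function like this one, and that is the property we need for testing.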

Let’s use the h, l, q, c example from before as our first case. What should the behavior of the function get_input_automated be? The first time it is called it should return "h", the second time "l", the third time "q" and the fourth time "c".

To achieve this behavior we can use a list and the function pop. inputs.pop(0) removes the first element from the list inputs and returns that removed element.

inputs = ["l", "h", "q", "c"]

print(inputs)        # ["l", "h", "q", "c"]
print(inputs.pop(0)) #  "l"
print(inputs)        # ["h", "q", "c"]
print(inputs.pop(0)) #  "h"
print(inputs)        # ["q", "c"]
print(inputs.pop(0)) #  "q"
print(inputs)        # ["c"]
print(inputs.pop(0)) #  "c"
print(inputs)        # []

We can implement get_input_automated as follows:

inputs = ["h", "l", "q", "c"]
def get_input_automated():
    return inputs.pop(0)

print(get_input_automated()) # "h"
print(get_input_automated()) # "l"
print(get_input_automated()) # "q"
print(get_input_automated()) # "c"

Note that it is crucial inputs is created outside the scope of get_input_automated. If it were created inside we would get different behavior:

def get_input_automated():
    inputs = ["h", "l", "q", "c"]
    return inputs.pop(0)

print(get_input_automated()) # "h"
print(get_input_automated()) # "h"
print(get_input_automated()) # "h"
print(get_input_automated()) # "h"

If inputs is created inside the scope of get_input_automated, a new list is created each time get_input_automated is called. But we don’t want that. We want all calls to get_input_automated to take from the same list. To accomplish that we separate creating the list from calling get_input_automated by creating the list outside the scope of get_input_automated. Shortly we’ll look at how to do this a bit more nicely, but for now let’s add it to our program:

def play_game(get_input):
    low = 0
    high = 100
    keep_going = True

    print("Please think of a number between 0 and 100!")

    while keep_going:
        guess = round((low + high) / 2)
        
        print("Is your secret number " + str(guess) + "?")
        print("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        if (answer == "c"):
            print("Game over. Your secret number was: " + str(guess))
            break
        elif (answer == "h"):
            high = guess
        elif (answer == "l"):
            low = guess
        else:
            print("Sorry, I did not understand your input.")

# tests

inputs = ["h", "l", "q", "c"]
def get_input_automated():
    return inputs.pop(0)

play_game(get_input_automated)

# manual game
# play_game(input)

Note that playing a manual game is currently commented out on the last line. If we run this code we see a full game being printed in the console, without us having to provide input manually.

Please think of a number between 0 and 100!
Is your secret number 50?
Enter 'h', 'l' or 'c'.
Is your secret number 25?
Enter 'h', 'l' or 'c'.
Is your secret number 38?
Enter 'h', 'l' or 'c'.
Sorry, I did not understand your input.
Is your secret number 38?
Enter 'h', 'l' or 'c'.
Game over. Your secret number was: 38

Take a few moments to appreciate how cool this is. Automation achieved!

What if we want multiple test cases? Let’s say we want to also automate the case in which the player inputs l followed by c. We could do something like this:

inputs_h_l_q_c = ["h", "l", "q", "c"]
def get_inputs_h_l_q_c():
    return inputs_h_l_q_c.pop(0)

inputs_l_c = ["l", "c"]
def get_inputs_l_c():
    return inputs_l_c.pop(0)

play_game(get_inputs_h_l_q_c)
play_game(get_inputs_l_c)

This structure could be improved. It repeats names a lot (three per test case) and if you squint a little you’ll see that the two list-and-function definitions are almost duplicates.

Apart from their names, get_inputs_h_l_q_c and get_inputs_l_c only differ in one aspect: which list they pop from. We can create a function that has the list to pop from as a parameter and returns a function that pops from this list.

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

inputs_h_l_q_c = ["h", "l", "q", "c"]
get_inputs_h_l_q_c = create_get_input(inputs_h_l_q_c)

inputs_l_c = ["l", "c"]
get_inputs_l_c = create_get_input(inputs_l_c)

play_game(get_inputs_h_l_q_c)
play_game(get_inputs_l_c)

Removing duplication by refactoring in this way is called extract function.

We can clean this up further by inlining: replacing variables by the values assigned to them.

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

get_inputs_h_l_q_c = create_get_input(["h", "l", "q", "c"])
get_inputs_l_c = create_get_input(["l", "c"])

play_game(get_inputs_h_l_q_c)
play_game(get_inputs_l_c)

Inlining the two remaining variables as well gives:

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

play_game(create_get_input(["h", "l", "q", "c"]))
play_game(create_get_input(["l", "c"]))

The complete code so far looks like this:

def play_game(get_input):
    low = 0
    high = 100
    keep_going = True

    print("Please think of a number between 0 and 100!")

    while keep_going:
        guess = round((low + high) / 2)
        
        print("Is your secret number " + str(guess) + "?")
        print("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        if (answer == "c"):
            print("Game over. Your secret number was: " + str(guess))
            break
        elif (answer == "h"):
            high = guess
        elif (answer == "l"):
            low = guess
        else:
            print("Sorry, I did not understand your input.")

# tests

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

play_game(create_get_input(["h", "l", "q", "c"]))
play_game(create_get_input(["l", "c"]))

# manual game
play_game(input)

Automating checking the output

According to the simple testing method of the previous article we verify the behavior of the program by checking the value that is returned by a call to the function we wrapped around our initial code.

In our current situation that would look something like this:

assert play_game(create_get_input(["l","c"])) == # something

We’ll first try this method. Afterwards we’ll try something new which is tidier when testing these kinds of programs.

The return value method

What should play_game return so we can verify it behaves correctly? If we test play_game manually we look at the lines that are printed to the console. So let’s make play_game not only print but also return those lines as a list of strings. We can then have an automated test as follows:

assert play_game(create_get_input(["l","c"])) == [
        'Please think of a number between 0 and 100!',
        'Is your secret number 50?',
        "Enter 'h', 'l' or 'c'.",
        'Is your secret number 75?',
        "Enter 'h', 'l' or 'c'.",
        'Game over. Your secret number was: 75'
]

We can replace print with a different function named print_and_archive. Given a message the function will both print it and append it to the list named printed. Ultimately play_game returns printed.

def play_game(get_input):
    printed = []
    
    def print_and_archive(message):
        printed.append(message)
        print(message)

    low = 0
    high = 100
    keep_going = True

    print_and_archive("Please think of a number between 0 and 100!")

    while keep_going:
        guess = round((low + high) / 2)
        
        print_and_archive("Is your secret number " + str(guess) + "?")
        print_and_archive("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        if (answer == "c"):
            print_and_archive("Game over. Your secret number was: " + str(guess))
            break
        elif (answer == "h"):
            high = guess
        elif (answer == "l"):
            low = guess
        else:
            print_and_archive("Sorry, I did not understand your input.")
    
    return printed

# tests

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

assert play_game(create_get_input(["h", "l", "q", "c"])) == [
        'Please think of a number between 0 and 100!',
        'Is your secret number 50?',
        "Enter 'h', 'l' or 'c'.",
        'Is your secret number 25?',
        "Enter 'h', 'l' or 'c'.",
        'Is your secret number 38?',
        "Enter 'h', 'l' or 'c'.",
        'Sorry, I did not understand your input.',
        'Is your secret number 38?',
        "Enter 'h', 'l' or 'c'.",
        'Game over. Your secret number was: 38'
]

assert play_game(create_get_input(["l","c"])) == [
        'Please think of a number between 0 and 100!',
        'Is your secret number 50?',
        "Enter 'h', 'l' or 'c'.",
        'Is your secret number 75?',
        "Enter 'h', 'l' or 'c'.",
        'Game over. Your secret number was: 75'
]

# manual game
play_game(input)

Although this approach does work, it is not so nice because now we’re always doing something we don’t want. When playing a manual game we collect strings for no reason, although perhaps we could live with that. A true sin, however, is that in a test we still print.

Waste of resources aside, if we first run the tests and finally the manual game, the console will already contain a bunch of prints that have nothing to do with our current playing session! Zero stars, thumbs down, money back.

In all seriousness, if the output of our program was not printing to a console but let’s say writing to a database, we don’t want every single one of our tests to actually write to a database.

For our game we could clear the console after the tests but it is nicer to prevent the issue by not printing in the tests.

Next I’ll show a new approach that no longer has the wrapping function return a value and will no longer print in the tests.

The parameter method

We made the input a parameter because we want different ways of inputting for playing and for testing. Similarly, we can refactor the way of outputting to be a parameter as well.

Within play_game, let’s replace print with the new parameter put_out. We also add print as a second argument to calls to play_game.

def play_game(get_input, put_out):
    low = 0
    high = 100
    keep_going = True

    put_out("Please think of a number between 0 and 100!")

    while keep_going:
        guess = round((low + high) / 2)
        
        put_out("Is your secret number " + str(guess) + "?")
        put_out("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        if (answer == "c"):
            put_out("Game over. Your secret number was: " + str(guess))
            break
        elif (answer == "h"):
            high = guess
        elif (answer == "l"):
            low = guess
        else:
            put_out("Sorry, I did not understand your input.")

# tests

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

play_game(create_get_input(["h", "l", "q", "c"]), print)
play_game(create_get_input(["l", "c"]), print)

# manual game
# play_game(input, print)

Just as when we made the input a parameter, this refactor does not change any behavior so we still have not automated checking the output.

Now we create an alternative to print to use for testing. Let’s name this function put_out. What did play_game expect of print? That it is a function that can be called with a string as argument, any number of times. put_out will also have to meet this expectation.

Before we asserted a list of strings returned by play_game. Now we’ll assert the list messages_put_out that is filled by put_out.

messages_put_out = []
def put_out(message):
    messages_put_out.append(message)

play_game(create_get_input(["l","c"]), put_out)

assert messages_put_out == [
        'Please think of a number between 0 and 100!',
        'Is your secret number 50?',
        "Enter 'h', 'l' or 'c'.",
        'Is your secret number 75?',
        "Enter 'h', 'l' or 'c'.",
        'Game over. Your secret number was: 75'
]

If we do this for multiple test cases we get duplication similar to before:

messages_put_out_h_l_q_c = []
def put_out_h_l_q_c(message):
    messages_put_out_h_l_q_c.append(message)

play_game(create_get_input(["h","l","q","c"]), put_out_h_l_q_c)

assert messages_put_out_h_l_q_c == [
        ...
]


messages_put_out_l_c = []
def put_out_l_c(message):
    messages_put_out_l_c.append(message)

play_game(create_get_input(["l","c"]), put_out_l_c)

assert messages_put_out_l_c == [
        ...
]

Again we have some duplication. This time the assert is part of the duplication. If we extract to function, assert should become part of the function. Doing so results in the new function test.

# tests

def create_get_input(inputs):
    def get_input():
        return inputs.pop(0)
    
    return get_input

def test(inputs, expected_messages_put_out):
    messages_put_out = []

    def put_out(message):
        messages_put_out.append(message)

    play_game(create_get_input(inputs), put_out)

    assert messages_put_out == expected_messages_put_out

test(["h","l","q","c"], [
    'Please think of a number between 0 and 100!',
    'Is your secret number 50?',
    "Enter 'h', 'l' or 'c'.",
    'Is your secret number 25?',
    "Enter 'h', 'l' or 'c'.",
    'Is your secret number 38?',
    "Enter 'h', 'l' or 'c'.",
    'Sorry, I did not understand your input.',
    'Is your secret number 38?',
    "Enter 'h', 'l' or 'c'.",
    'Game over. Your secret number was: 38'
])

test(["l","c"], [
    'Please think of a number between 0 and 100!',
    'Is your secret number 50?',
    "Enter 'h', 'l' or 'c'.",
    'Is your secret number 75?',
    "Enter 'h', 'l' or 'c'.",
    'Game over. Your secret number was: 75'
])

# manual game
play_game(input, print)

Both test cases will run and pass (the two calls to test) and afterwards a manual game will be started (the last line). The tests did not print anything and our game starts in a clean console!

Now we have solidly pinned down the original program’s behavior with automated tests without printing to the console. Only in our manual game do we print to the console, which is what we want. We can now fully focus on changing the structure of our original program, which has become the body of play_game, without having to spend any time or energy on checking if we broke anything.

It took some effort to get to this point, but writing tests is an investment that might (probably will) pay off and is worth considering!
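The same idea carries over to other kinds of output, such as the database case mentioned earlier. Below is a minimal sketch under stated assumptions: register_player and its save parameter are hypothetical names invented for this illustration, and no real database library is involved. In production save would write to a database; in the test it only remembers what it was asked to save.

```python
def register_player(name, save):
    # Hypothetical program logic: clean up the name, then hand it to `save`.
    # The function does not know (or care) where `save` puts the value.
    cleaned = name.strip().title()
    save(cleaned)

# Test: collect what would have been written instead of touching a database.
saved = []
register_player("  ada lovelace ", saved.append)
assert saved == ["Ada Lovelace"]
```

Just like put_out in the game, saved.append meets the only expectation register_player has: a function that can be called with one value.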

Vocabulary

There are names for some things we’ve seen in this article. Let’s go over them.

Higher order function

A higher order function is a function that satisfies one or both of the following:

  • at least one parameter is a function
  • it returns a function

This makes play_game a higher order function as both parameters get_input and put_out are functions. create_get_input is a higher order function because it returns a function.
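As a small illustration outside the game, the sketch below shows one function of each kind. The names apply_twice and make_adder are made up for this example.

```python
def apply_twice(f, x):
    # Higher order: one of its parameters (f) is a function.
    return f(f(x))

def make_adder(n):
    # Higher order: it returns a function.
    def add(x):
        return x + n
    return add

add_three = make_adder(3)
result = apply_twice(add_three, 10)
print(result)  # 16
```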

A function get_input returned by create_get_input has dynamic behavior. Each time we call it, it returns a different value. We control exactly what this dynamic behavior is: each time return the next string from a list.
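One detail worth knowing: inputs.pop(0) empties the list that was passed in. That is harmless in this article because each list is used only once. If you ever want to leave the caller’s list untouched, a possible variant (a sketch, not the version used in this article) reads from an iterator instead:

```python
def create_get_input(inputs):
    remaining = iter(inputs)  # an iterator remembers how far we have read

    def get_input():
        # Returns the next string; raises StopIteration when none are left.
        return next(remaining)

    return get_input

scripted = ["l", "c"]
get_input = create_get_input(scripted)
first = get_input()
second = get_input()
print(first, second)  # l c
print(scripted)       # ['l', 'c'] (the original list is unchanged)
```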

Closure

Each time we call get_input a scope is created and then destroyed. And yet there is coordinated behavior between calls: each consecutive call returns the next string from the same list. How is this possible?

Each call to get_input operates on the same state, which exists beyond any single call to get_input. In this case that state is the variable inputs, which is a parameter of the function create_get_input. inputs is not part of the scope of get_input so it is ‘kept alive’ after a call to get_input has completed.

A closure is a function that accesses state that is maintained beyond the lifetime of calls to this function. This is typically used to create coordinated behavior between individual calls to this function.
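A classic way to see this in isolation is a counter. The sketch below uses made-up names (make_counter, increment); the nonlocal statement is needed here because the closure assigns a new value to count, whereas get_input only calls a method on inputs.

```python
def make_counter():
    count = 0  # state that outlives each individual call to increment

    def increment():
        nonlocal count  # rebind the variable of the enclosing function
        count += 1
        return count

    return increment

counter = make_counter()
values = [counter(), counter(), counter()]
print(values)  # [1, 2, 3]

other = make_counter()  # a second closure gets its own, independent state
other_first = other()
print(other_first)  # 1
```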

I want to say many things about closures. However, to maintain focus I decided to put that in a future article.

If you are more experienced you might recognize that to achieve this coordinated behavior you can also use objects with methods. Indeed you can. You could say methods are closures. However, in this case creating closures with functions is adequate.
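For comparison, this is roughly what the object-based alternative could look like. ScriptedInput is a hypothetical class written for this illustration; it is not part of the article’s code.

```python
class ScriptedInput:
    # Hypothetical class: keeps the remaining inputs as object state.
    def __init__(self, inputs):
        self.inputs = list(inputs)  # copy, so the caller's list is left alone

    def get_input(self):
        return self.inputs.pop(0)

scripted = ScriptedInput(["l", "c"])
get_input = scripted.get_input  # a bound method remembers its object
answers = [get_input(), get_input()]
print(answers)  # ['l', 'c']
```

Because scripted.get_input can be called without arguments and returns a string each time, it could be passed to play_game in the same way as a function returned by create_get_input.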

Polymorphism

Let’s compare the original code for playing a game with our final version:

...

print("Please think of a number between 0 and 100!")

...
    
    print("Is your secret number " + str(guess) + "?")
    print("Enter 'h', 'l' or 'c'.")

    answer = input()
    
    ...
        print("Game over. Your secret number was: " + str(guess))
    ...
        print("Sorry, I did not understand your input.")

And the final version:

def play_game(get_input, put_out):
    ...

    put_out("Please think of a number between 0 and 100!")

    ...
        
        put_out("Is your secret number " + str(guess) + "?")
        put_out("Enter 'h', 'l' or 'c'.")

        answer = get_input()
        
        ...
            put_out("Game over. Your secret number was: " + str(guess))
        ...
            put_out("Sorry, I did not understand your input.")

play_game(input, print)

The only difference really is that the original version of the game code called input and print directly whereas the refactored version calls input and print indirectly. We created an indirection.

Originally the game code depended specifically on the function input. Now it depends on the abstract expectation that get_input is any function that can be called without arguments and each time will return a string. Likewise, the game code depends on an abstract expectation of put_out.

We can call play_game with input and print as arguments. Both functions satisfy the expectations of the game code. We can also call play_game with other functions, as long as they satisfy the expectations. This is exactly what we do in our tests.
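If you want to write these expectations down, Python’s type hints are one option. The sketch below is an assumption-laden illustration: greet is a hypothetical mini-program with the same two expectations as play_game, and the aliases GetInput and PutOut are names I made up. Python does not enforce type hints at runtime; they serve as documentation and can be checked by a separate type checker.

```python
from typing import Callable

GetInput = Callable[[], str]    # no arguments, returns a string
PutOut = Callable[[str], None]  # takes a string, return value is not used

def greet(get_input: GetInput, put_out: PutOut) -> None:
    # Hypothetical mini-program with the same two expectations as play_game.
    put_out("What is your name?")
    put_out("Hello, " + get_input() + "!")

# Any pair of functions that meets the expectations will do.
messages = []
greet(lambda: "Ada", messages.append)
print(messages)  # ['What is your name?', 'Hello, Ada!']
```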

How the expectations of the game code are met can now take many shapes. Many, poly. Shapes, morphs. Polymorphism.

Typically polymorphism is explained with interfaces and objects, or classes and instances. These are more elaborate tools that also solve the problem of defining expectations and creating concrete realizations of these expectations.

Often the simple and familiar function is adequate for solving this problem. Moreover, in my experience students who first obtain a strong understanding of functions and closures have an easy time when they later learn about interfaces and classes.

Closing words

Very many difficulties in programming arise because it matters when or how many times certain code runs. The more you write your code in such a way that it no longer matters when or how many times it runs, the fewer of these problems you’ll see.

In this article we saw an example of this phenomenon. Originally our game code printed in the console. It matters how many times this code runs because we don’t want to see prints that have nothing to do with our current playing session.

After we refactored in such a way that we could run the game code without printing, it no longer mattered how many times we ran it and we could run it many times (once for each test case) without having to worry.