Let slip the dice of war
A computer-aided analysis of the probabilities of battle outcomes in the game RISK
Contents
Over the years there has been ample work done and shared on calculating win percentages for battles in RISK. However, what I find more useful while playing is knowing how many troops I need, against a given amount of opposing troops, to win with a certain probability. Sometimes I need to take a lot of risk, other times I want a lot of certainty. Additionally, I usually play with a non-standard rule. Some territories, called capitals, can defend with up to three dice. I couldn’t find any data on capitals.
So I had to do some work myself. In this article I share my findings and discuss how I obtained them. With some knowledge of basic probability theory you can follow most of it. I also share the code I wrote that did the heavy lifting, to be adapted by anyone who wants to calculate results for their variation of RISK. I won’t discuss the code in this article so if you have any questions, send me a message.
After I completed my calculations and analysis, I did find an article that tries to answer the same questions as I do. However, it is much more difficult to understand (elite level mathematics) and I daresay the results in said article are erroneous. I discuss this some more later in this article.
Results
The graphs below show the minimal amount of attacking troops (A) required on the territory from which the attack is launched to conquer a territory defended by D troops, with a probability of at least p.
The graphs serve to sharpen your intuitive understanding of the game. If you want to look up exact numbers, consult the csv files in the repository (is this cheating?).
Note that winning with a probability of 100% is impossible. If you’re extremely unlucky, with any amount of attacking troops there is always a chance of losing against a single defending troop.
The graphs on the left show data for standard territories (which defend with up to two dice), the graphs on the right for capitals (up to three).
However, it is easier to create rules of thumb from the relative amount of attacking troops required. We obtain these by dividing A by D.
Although these up-to-and-including-200-defending-troops graphs are still mostly to get a feel for the numbers. Let’s zoom in on up to 50 troops.
Now we can read more clearly. For example, to conquer a standard territory with 10 defending troops with a probability of at least 95%, you need about twice as many troops.
The spikes at the beginning of the lines in the capital graph make sense, because they occur at D=3, when capitals start making use of their unique advantage.
Personally, I mostly use the 99%, 75% and 25% lines. To acquire a card I must conquer at least one territory in my turn, which I want to do with 99% certainty. Otherwise, I usually try to attack with a 75% probability of conquering. If I want to take extra care to hold a territory, I put so many defending troops in it that the attacker will have only a 25% chance of victory.
What originally set me on the path of calculating all of this was that I wanted to understand just how much better capitals were at defending than standard territories. The juxtaposed graphs above clearly show that for the same D and p, it takes signfinicantly more troops to conquer a capital. To show the relative defensive strength of a capital, I divided the capital graph by the standard territory graph.
Roughly, to conquer a capital you need 1.5 to 2 times as many troops as you would need to conquer a standard territory.
Approach
How to mathematically calculate the probability of winning has been described in detail [0] [1]. To calculate it with computer programming has been somewhat documented to a lesser extent [2] [3]. My approach is to calculate win percentages for an ever increasing amount of attacking troops versus a fixed amount of defending troops until the probability of the attacking side winning is high enough.
From here on I switch from ‘I’ to ‘we’, because when I write an explanation I always feel like a wizard guiding a traveler through a foggy swamp.
Creating a probability tree
In our calculations we work with battle states, which look like [A D]
, representing the amount of troops each side has left. A
is exactly one less than the A of the resulting graphs, because one troop must always remain in the territory from which the attack was launched.
Before we look at probabilties, let’s first look at the transitions from one state to another, a so called ‘state space’.
In state [1 1]
, each side has a single troop and throws with a single die. Either the attacking side suffers one loss (the loss is [1 0]
) or the defending side suffers one loss (the loss is [0 1]
). Loss [1 0]
leads to end state [0 1]
, loss [0 1]
leads to end state [1 0]
. An end state is a state in which the battle has ended: one side has 0 troops remaining.
Here is a bigger state space:
Note that some states have different losses than others. E.g. unlike state [1 1]
, state [3 2]
can have losses [0 2]
, [1 1]
and [2 0]
.
To obtain a probability tree we want to replace each loss with its associated probability, resulting in something like this:
First we need to find out, for a state, how many dice each side will use. As the literature shows and is widely known, for both sides it is best to throw with as many dice as possible.
The amount of dice rolled by the attacker is whichever is lowest: 3 or A
. For the defender, whichever is lowest: 2 or D
.
Both sides roll their dice. The highest roll of the attacker is compared to the highest roll of the defender; the second highest roll of the attacker is compared to the second highest roll of defender.
Let’s look at [3 2]
dice (3 dice for the attacker, 2 for the defender) where the attacker rolls [3 2 6]
and the defender rolls [4 3]
.
6 is compared to 4, resulting in a loss for the defender, denoted as loss [0 1]
.
3 is compared to 3, resulting in a loss for the attacker, denoted as loss [1 0]
. Summing these two losses, overall the losses are [1 1]
.
With [3 2]
dice, [[3 2 6] [4 3]]
is just one of the possible roll outcomes.Using [3 2]
dice, there are many equally likely roll outcomes (assuming all dice are fair): 6 * 6 * 6 * 6 * 6 = 7776
.
Each roll outcome corresponds to a loss:
|
|
Resulting in 7776 losses. However, there are ducplicates. We can count how many time each loss occurs:
|
|
So when throwing with [3 2]
dice, 2890 of 7776 equally likely occurrences result in the loss [0 2]
. We can now say that when throwing with [3 2]
dice, the probability of the loss [0 2]
is 2890/7776
.
We can similarly obtain the possible losses and their probabilities for all relevant amounts of dice:
|
|
Putting together what we have so far: by calculating the used dice for a state we can determine the losses and associated probabilties that lead to its child states. If we start building the probability tree at the largest state we want with so called depth-first recursion, we end up with an almost complete probability tree. Here is an example, starting at state [2 2]
:
However, because states [1 2]
and [2 1]
are not reachable from [2 2]
they aren’t included in the tree. To include them, we can restart the recursion from these states. Now when the recursion reaches state [1 1]
, we can stop because the tree from [1 1]
onwards has already been constructed.
The full probability tree for all states up to and including [3 3]
looks as follows:
Calculating end-state->p
What we’re after is to calculate for a state, for each reachable end state the probability of reaching that end state. We can do that by looking at each path from the state to an end state. The probability of reaching an end state is the product of the probabilities that make up the path.
The probability of reaching end state [2 0]
from state [3 2]
is 2611/7776 * 125/216 = 326375/1679616
.
If there are multiple paths to an end-state, the probability of reaching the end-state is the sum of each path’s probability.
The probability of reaching end state [1 0]
from state [3 2]
is the sum of the probability of the blue-and-green path (2275/7776 * 55/216 * 15/36
) and the probability of the yellow-and-green path (2611/7776 * 91/216 * 15/36
): 625625/20155392 + 1188005/20155392 = 906815/10077696
, roughly 0.09
or 9%.
By traversing all paths starting at state [3 2]
(7 paths in total) we can obtain end-state->p
for state [3 2]
:
|
|
Note that these probabilities should and indeed do sum to exactly 1.
Calculating win percentages
The probability of the attacker winning a battle between 3 attacking troops (so 4 attacking troops on the territory from which the attack is launched) and 2 defending troops is the probabilty of reaching end state [1 0]
, [2 0]
or [3 0]
. We calculate this by taking the sum of the probabilities of reaching these end states: 906815/10077696 + 326375/1679616 + 1445/3888 = 6610505/10077696
, roughly 0.656
or 65.6%.
No path traversal needed
To obtain state->end-state->p
, one implementation of this probability tree approach is to calculate the probability tree and afterwards traverse all paths. However, with low effort we can construct the tree and calculate the probabilities at the same time. This approach is just a short-cut version of the path traversal approach. The underlying mathematical idea is the same.
An end state can reach exactly one end state, itself, with a probability of 1. E.g. the end-state->p
for state [0 1]
is {[0 1] 1}
.
We can calculate end-state->p
for a state using, for each child of the state, the probability of reaching that child and the child’s end-state->p
. This is possible only if we calculate the end-state->p
for each state in the right order. Whenever we start the calculating for a state, the calculations for all of its children should have already been completed. We can do this with a depth-first approach as shown before.
By multiplying the probability of reaching a child with the probability of the child reaching an end state, we obtain the probability of the current state reaching that end state. E.g. the probability of [1 1]
reaching its child [0 1]
is 21/36
; [0 1]
reaches a single end state, [0 1]
with a probability of 1
. So [1 1]
reaches end states [0 1]
with a probability of 21/36 * 1 = 21/36
. Through child [1 0]
, [1 1]
reaches end state [1 0]
with a probability of 15/36
.
With [1 1]
and [2 0]
done we can calculate for [2 1]
. Via child [2 0]
it can reach end state [2 0]
with a probability of 161/216 * 1 = 161/216
. Via child [1 1]
it can reach end states [0 1]
and [1 0]
, respectively with probabilities 55/216 * 21/36 = 385/2592
and 55/216 * 15/36 = 275/2592
.
Similar to the path traversal approach, when two children can reach the same end state, the probabilities should be added. The probability of the current state reaching the end state is the probability of reaching it via the first child added to the probability of reaching it via the second child.
Capitals
The only difference a capital brings to the table is the possibility for an extra defending die, changing the maximum allowed dice for defenders from 2 to 3. This results in three new amounts of dice: [1 3]
, [2 3]
and [3 3]
.
|
|
Unfortunately, this means capitals have an entirely different state space. Compare the following transitions to the previous state space of standard territories:
Not only do some states suddenly lead to different states (such as [3 3]
now leading to [0 3]
), the probability of transitions can also be different. The loss [1 0]
from state [1 3]
to state [0 3]
for capitals has an associated probability of 225/1296
, whereas the same transition for standard territories has an associated probability of 55/216
.
So for capitals we have to recalculate the end-state->p
for all states. Calculating win percentages and minimum required amounts of troops stays the same.
Calculating the results
To find the minimal amount of attacking troops needed to conquer D
defending troops with a probability of at least p, we increase A
until the probability of the attacking side winning is at least p.
To obtain the resulting graphs, do this for all combinations of D
(1, 2, .., 200) and p (0.01, 0.05, …, 0.99). Don’t forget to add 1 to A
to obtain A. Luckily we have to calculate the end-state->p
for a state [A D]
only once and can then reuse it for each different p.
Correctness
I tested the code that computes end-state->p
by comparing the results to semi-manually calculated results via the path traversal method.
To test the code that calculates win percentages, I compared my results to the ones of Table 1 in 1. Note that my results are exact (e.g. 1/3
) whereas theirs are rounded (0.333
), so I had to round down my results as well.
Working with exact numbers by using ratios demands much more computing power, but completely removes the problems caused by rounding, mainly loss of information. Using ratios made the computation take many hours, whereas the rounded numbers version took mere minutes.
With all of the complicated and tested code neatly tucked away, the code that is responsible for finding the lowest A given a D and p remains so simple I found no value in writing automated tests for it. Still, I had it peer reviewed and we find it falls in the category ‘obviously correct’. Yet another form of quality assurance is making the code publicly available.
Compared to previous work
Atfter I completed my work I found an article [4] that, like this article, tries to answer the question “What is the minimal amount of attacking troops (A) required on the territory from which the attack is launched to conquer a territory defended by D troops, with a probability of at least p?".
Actually, the semantics of their reported results are communicated ambiguously: “the minimum number of attacking armies A required to have an n% chance of conquering D defenders." (description of Table 1). Is it supposed to mean exactly an n% chance,closest to n% or at least n%?
They report in Table 1 (the AC rows) that A=3 for D=2 and p=0.2 (n=20%). According to my calculations the probability of the attacking side winning against two defending troops is around 0.106 for two attacking troops and 0.363 for three attacking troops. This suggests that indeed the authors' intent is at least n%, the same as mine.
I compared their reported results to mine, and out of 98 results 17 were different. In all 17 cases, their result is one less than mine.
|
|
Assuming my results are correct, something in the other article is incorrect. Is their mathematical reasoning flawed or did they make a mistake putting numbers into their formulas? I can’t say.
The authors never explicitly express the formulas they used to calculate these results. They discuss at length how to approximate the results and add that “one could easily adapt the virtual formulas to obtain exact formulas”. I, for one, am unable to do so and hence can’t reproduce their results.
Closing words
Early on I planned to release an actual tool that players could use to look up information while playing. After a while I realized such a thing has no place in the playing of RISK. Apart from questionable ethics, using a tool is too slow. Players have to rely on their knowledge and intuition.
I hope my colorful graphs made you understand the odds of battles in RISK a little better. Certainly, the process of doing this project has greatly improved my understanding, something which reading an article could never hope to rival. Programming will give you a deep, intimate understanding of the thing you’re modeling.
One reason why mathematics is useful is that it exchanges labor, which takes time, for abstract models. The downside is that as an approach becomes more abstract, fewer people are able to understand it and correct mistakes. A programmatic approach tends to lean more towards the labor side (such as collecting data and counting it in some way) which can make it easier to understand. This topic warrants its own article. Stay tuned!
With the end-state->p
for all states we could answer more detailed questions such as “How many attacking troops do I need to conquer D defending troops with a probability of at least p such that I have at least T troops remaining?". When calculating a win percentage, instead of looking at all states [A 0]
, only look at the ones where T <= A
… Out of scope for this article, but feel free to investigate.
References
- [0] Blatt, 2002
- RISKy Business: An In-Depth Look at the Game RISK
- [1] Osborne, 2003
- Markov Chains for the RISK Board Game Revisited
Erratum: in Table 3, D and A are swapped.
- [2] Berry, 2011
- RISK Analysis
- [3] McLoone, 2017
- How to Win at Risk: Exact Probabilities
- [4] Hendel, Hoffman, Manack, Wagaman, 2014
- Taking the risk out of RISK: Conquer odds in the board game RISK
Erratum: ‘table 4’ mentioned in the text is actually Table 1.