Monday, November 30, 2020

Who has an advantage in Risk, attacker or defender?

Let's first assume that we're talking about the frequent case where the attacker is rolling 3 dice, while the defender can only roll 2 dice, but ties go to the defender. More dice is better. Winning ties is better. But which of those wins out? Let's see.

One thing that's tricky here is that we compare the highest attacker die to the highest defender die, and the second highest attacker die to the lowest defender die. So we have to consider the joint probability distribution of the two attacker dice and the two defender dice. We can treat the attacker and defender separately, though, since they are independent.


Let's start by working with the defender, and looking at the CDF of the better roll of 2d6. Recall the CDF $F_X(x) = P(X \leq x)$. Let's say $D_1$ is the the higher die of the defender, with $D_2$ being the lower die. Let's say that X is the result of rolling a 1d6.

\begin{align} F_{D_1}(d) = P(D_1 \leq d) = P(X \leq d)^2 \end{align}

That is, the probability that the highest die is less than or equal to $d$ is the probability that both dice are less than or equal to $d$.

\begin{align} F_{D_1}(d) = \left ( \frac{d}{6} \right)^2 \end{align}

What about $F_{D_2}(d)$? Similar to above, we can say that

\begin{align} F_{D_2}(d) &= P(D_2 \leq d) \\ &= 1 - P(D_2 > d) \\ &= 1 - P(X > d)^2 \\ &= 1 - \left ( \frac{6-(d+1)+1}{6} \right)^2 \\ &= 1 - \left ( \frac{6-d}{6} \right)^2 \\ &= \frac{6^2}{6^2} - \frac{6^2-12d+d^2}{6^2} \\ &= \frac{6^2-6^2+12d-d^2}{6^2} \\ &= \frac{12d-d^2}{6^2} \\ &= \frac{d}{6}\cdot \frac{12-d}{6} . \end{align}

Those make sense, though we should check that they work out at say $d=1$ and $d=6$. But those are assuming that they're independent. Given that the highest die comes out to be $d_1$, what's the distribution of $D_2$? Or what is $P(D_2 = d_2 | D_1 = d_1)$. Well, it's certainly 0 if $d_2 > d_1$.


This seems to be a hard one.


It may just be easier to calculate the joint probability directly.

\begin{align} P(D_1=d_1, D_2=d_2) = \begin{cases} 0 & d_1 < d_2 \\ P(d_1\text{ and }d_2\text{ on 2d6}) & d_1 = d_2 \\ P(d_1\text{ and }d_2\text{ on 2d6}) & d_1 > d_2 \end{cases} \end{align}

Let's drill down more, in the $d_1=d_2$ case, it's just the probability of getting both of that value when rolling 2d6.

\begin{align} P(d_1\text{ on 1d6}) &= \frac{1}{6}\\ P(d_1\text{ on both of 2d6}) &= P(d_1\text{ on 1d6})^2 \\ &=\left ( \frac{1}{6} \right ) ^2 \\ &=\frac{1}{36} \\ P(D_1=d_1, D_2=d_2) & = \frac{1}{36}, \quad d_1=d_2 \end{align}

Okay, now the $d_1 > d_2$ case. If we have two different values, there's two ways we could get them. Let's say we have two dice $A$ and $B$. We roll $d_1$ on $A$ and $d_2$ on $B$, or $d_1$ on $B$ and $d_2$ on $A$. The probability of rolling $d_1$ on a particular die is $1/6$, similarly for $d_2$. Thus, the probability of getting $D_1=d_1$ and $D_2=d_2$ is twice the probability of rolling $d_1$ and then $d_2$.

\begin{align} P(D_1=d_1, D_2=d_2) & = 2\cdot\frac{1}{6}\cdot\frac{1}{6}, \quad d_1>d_2\\ & = 2\cdot\frac{1}{36}, \quad d_1>d_2\\ & = \frac{1}{18}, \quad d_1>d_2 \end{align} \begin{align} P(D_1=d_1, D_2=d_2) = \begin{cases} 0 & d_1 < d_2 \\ \frac{1}{36} & d_1 = d_2 \\ \frac{1}{18} & d_1 > d_2 \end{cases} \end{align}

This makes some sense, and we can see that the probabilities are going to add up to one, though we should probably check more explicitly. You can think of this as rolling two dice in order. Then, if the second one came out larger, we switch the order. Thus, all the probabilities for $d_1 < d_2$ went from $1/36$ to 0, and all the probabilities for $d_1 > d_2$ went from $1/36$ to $2/36=1/18$. I think, ultimately, this last explanation is probably the way to think about it.


So let's take that kind of thinking and apply it to the attacker. Here we have three dice, but only use two. If we use this algorithmic approach we can conceive of rolling three dice, and then reordering them to be from largest to smallest. With two dice, we reordered, or we didn't. With three dice, there are more cases. Let's assume that we have three distinct results on our rolls. Then there are $3! = 3 \cdot 2 \cdot 1 = 6$ ways to order them. Thus,

\begin{align} P(A_1=a_1, A_2=a_2, A_3=a_3) &= 3! \cdot \left ( \frac{1}{6} \right ) ^3\\ &= 6 \cdot \frac{1}{6^3}\\ &= \frac{1}{6^2}\\ &= \frac{1}{36}. \end{align}


Interestingly, we only care about two of the values. So we want to sum up for all possible $a_3$. However, we've assumed here that $a_1 > a_2 > a_3$. Thus, the number of cases to sum is limited. The number of cases where $a_3 < a_2$ is equal to $a_2 - 1$.

\begin{align} P(A_1=a_1, A_2=a_2) &= \sum_{a_3 < a_2} P(A_1=a_1, A_2=a_2, A_3=a_3) \\ &= (a_2-1) \cdot \frac{1}{36} \\ &= \frac{a_2-1}{36} \end{align}

Oops, we left out one above. We left one case out. We were assuming $a_3 < a_2$, but that doesn't show up in the equation as we wrote it. We only really need to assume $a_3 \leq a_2$ (otherwise, it won't occur at all and the probability is zero). Thus, we can fix our equation as follows.

\begin{align} P(A_1=a_1, A_2=a_2) &= \sum_{a_3 \leq a_2} P(A_1=a_1, A_2=a_2, A_3=a_3) \\ &= a_2 \cdot \frac{1}{36} \\ &= \frac{a_2}{36} \end{align}

Oops, now I think I've done some double counting. Because if $a_1 > a_2 = a_3$ then there aren't $3!=6$ orderings There are only $(3!)/(2!) = 6/2 = 3$ ways. Thus,

\begin{align} P(A_1=a_1, A_2=a_2, A_3=a_2) &= 3 \cdot \frac{1}{6^3}\\ &= \frac{1/2}{36}. \end{align}

So

\begin{align} P(A_1=a_1, A_2=a_2) &= P(A_1=a_1, A_2=a_2, A_3 < a_2) + P(A_1=a_1, A_2=a_2, A_3=a_2) \\ &= \sum_{a_3 < a_2} P(A_1=a_1, A_2=a_2, A_3=a_3) + P(A_1=a_1, A_2=a_2, A_3=a_2) \\ &= (a_2-1) \cdot \frac{1}{36} + \frac{1/2}{36}\\ &= \frac{a_2-1/2}{36} \end{align}

We now have more cases to consider. As before, we need to consider the all equal case $a_1=a_2=a_3$, which, similar to before, occurs with probability $1/6^3 = 1/216$. We also have to consider all the cases where two of the dice are equal, but not to the third. We considered the $a_1 > a_2=a_3$ case above. But we also have $a_1=a_2 > a_3$. Again, we only have 3 orderings, so

\begin{align} P(A_1=a_1, A_2=a_1, A_3 < a_1) &= 3 \cdot \left ( \frac{1}{6} \right ) ^3\\ &= \frac{1/2}{36} . \end{align}

Note: I'm writing this weird fraction to keep a common denominator of 36. Perhaps we should really be working with a denominator of 72. Again we combine to get the probability independent of $a_3$.

\begin{align} P(A_1=a_1, A_2=a_1) &= P(A_1=a_1, A_2=a_1, A_3 < a_1) + P(A_1=a_1, A_2=a_1, A_3=a_1) \\ &= \sum_{a_3 < a_1} P(A_1=a_1, A_2=a_1, A_3=a_3) + P(A_1=a_1, A_2=a_1, A_3=a_1) \\ &= (a_1-1) \cdot \frac{1/2}{36} + \frac{1}{216}\\ &= (a_1-1) \cdot \frac{3}{6^3} + \frac{1}{6^3}\\ &= \frac{3\cdot (a_1-1) + 1}{6^3} \\ &= \frac{3 a_1-2}{6^3} \end{align}

Do these all add up to 1?

\begin{align} \sum_{a_1} P(A_1=a_1, A_2=a_1) + \sum_{a_1 > a_2} P(A_1 = a_1, A_2 = a_2) &= 1 \\ \sum_{a_1=1}^6 \frac{3 a_1 - 2}{6^3} + \sum_{a_1=2}^6 \sum_{a_2=1}^{a_1-1} \frac{a_2-1/2}{36} &= 1 \\ \sum_{a_1=1}^6 \frac{3 a_1 - 2}{6^3} + \sum_{a_1=2}^6 \sum_{a_2=1}^{a_1-1} \frac{a_2-1/2}{36} &= 1 \end{align}

Recall that,

\begin{align} \sum_{k=1}^n k = \frac{n(n+1)}{2}. \end{align}

Thus, continuing on, we have as follows.

\begin{align} \frac{3 \cdot \frac{6(6+1)}{2} - 2\cdot 6}{6^3} + \sum_{a_1=2}^6 \left (\frac{(a_1-1)a_1}{2}-\frac{1}{2}\cdot (a_1-1)\right) \cdot \frac{1}{36} &= 1 \\ \frac{51}{6^3} + \sum_{a_1=2}^6 \frac{a_1^2-2a_1+1}{2} \cdot \frac{1}{36} &= 1 \end{align}

Now we have the sum of a squared term, which is given by,

\begin{align} \sum_{k=1}^n k^2 = \frac{n(n+1)(2n+1)}{6}. \end{align} \begin{align} \frac{51}{6^3} + \frac{\left(\frac{6(6+1)(12+1)}{6}-1\right)-2 \cdot \left(\frac{6(6+1)}{2}-1\right) + 5}{2} \cdot \frac{1}{36} &= 1 \\ \frac{51}{6^3} + \frac{\left((6+1)(12+1)-1\right)-2 \cdot \left(3(6+1)-1\right)+5}{2} \cdot \frac{1}{36} &= 1 \\ \frac{51}{6^3} + \frac{(90)-2 \cdot (20)+5}{2} \cdot \frac{1}{36} &= 1 \\ \frac{51}{6^3} + \frac{(90)-(40)+5}{2} \cdot \frac{1}{36} &= 1 \\ \frac{51}{6^3} + \frac{55}{2 \cdot 36} &= 1 \\ \frac{51}{6^3} + \frac{55}{2 \cdot 36} &=1 \\ \frac{3 \cdot 17}{6^3} + \frac{55}{2 \cdot 36} &=1 \\ \frac{17}{2 \cdot 6^2} + \frac{55}{2 \cdot 36} &=1 \\ \frac{17+55}{2 \cdot 6^2} &=1 \\ \frac{72}{72} &=1 \end{align}

Wow, we've made a big mistake here somewhere!  Yay, I fixed it. 


Note, that the first time I did this, I made a few mistakes. First, I copied the wrong expression in for one of the terms, then I made some algebra and arithmetic mistakes. That's why it's helpful to do these kinds of checks. It is also possible that I could have made such a mistake deriving the expressions themselves instead of in the check combining them.


Let's summarize our result for the attacker.

\begin{align} P(A_1=a_1, A_2=a_2) = \begin{cases} 0 & a_1 < a_2 \\ \frac{3a_1 - 2}{6^3} & a_1 = a_2 \\ \frac{a_2 - 1/2}{36} & a_1 > a_2 \end{cases} \end{align}

Recall the result for the defender.

\begin{align} P(D_1=d_1, D_2=d_2) = \begin{cases} 0 & d_1 < d_2 \\ \frac{1}{36} & d_1 = d_2 \\ \frac{1}{18} & d_1 > d_2 \end{cases} \end{align}



Also, I could use a computer to help me evaluate the expression to confirm that my mistake was in the evaluation and not in deriving the probabilities. A computer simulation could also be helpful, but less conclusive. I draw the distinction as this. You can use a computer to evaluate or to simulate. When you have a computer evaluate, you're using it as a glorified calculator or a plotting tool. Maybe you use it to solve some transcendental equations. If you use a computer to simulate, then you aren't doing much thinking, you're just setting up the problem, running one or more experiments, and hoping the result tells you something both correct and meaningful. Here, though, we can also brute force all the combinations. This is still using the computer like a calculator, but not bothering to simplify before chugging away.


With the probability distributions for both the attacker's and the defender's dice rolls, we can do the comparison and see what the outcome of a round of combat is. In this case there are three outcomes,

1) attacker loses two units,

2) defender loses two units, or

3) attacker and defender each lose one unit.

These cases come from the results of our die rolls.

\begin{align} P(\text{attacker loses two units}) &= P(A_1 \leq D_1, A_2 \leq D_2) \\ P(\text{defender loses two units}) &= P(A_1 > D_1, A_2 > D_2) \\ P(\text{both lose one unit}) &= P(A_1 \leq D_1, A_2 > D_2) + P(A_1 > D_1, A_2 \leq D_2) \end{align}

Note the use of $\leq$, corresponding to the fact that ties go to the defender. Also note that there are two ways that both can lose one unit: attacker can win with the top die and lose with the lower die, and vice versa.  We can compute these by looking at all the cases.

\begin{align} P(A_1 \leq D_1, A_2 \leq D_2) &= \sum_{d_1=1}^6 \sum_{d_2=1}^{d_1} P(A_1 \leq d_1, A_2 \leq d_2) \cdot P(D_1 = d_1, D_2 = d_2) \end{align} \begin{align} P(A_1 \leq d_1, A_2 \leq d_2) = \sum_{a_1=1}^{d_1} \sum_{a_2=1}^{d_2} P(A_1 = a_1, A_2 = a_2) \end{align} \begin{align} P(A_1 \leq D_1, A_2 \leq D_2) &= \sum_{d_1=1}^6 \sum_{d_2=1}^{d_1} \sum_{a_1=1}^{d_1} \sum_{a_2=1}^{d_2} P(A_1 = a_1, A_2 = a_2) \cdot P(D_1 = d_1, D_2 = d_2) \end{align}


Because our expressions for $P(A_1=a_1, A_2=a_2)$ and $P(D_1 = d_1, D_2 = d_2)$ depend on whether $a_1=a_2$ and whether $d_1=d_2$, respectively, we can break up the sum to look at the four cases.

\begin{align} 1: & \quad a_1=a_2 \quad d_1=d_2 \\ 2: & \quad a_1=a_2 \quad d_1>d_2 \\ 3: & \quad a_1>a_2 \quad d_1=d_2 \\ 4: & \quad a_1>a_2 \quad d_1>d_2 \end{align}

We can label the summations related to these cases at $s_1$--$s_4$.

\begin{align} P(A_1 \leq D_1, A_2 \leq D_2) &= s_1 + s_2 + s_3 + s_4 \end{align}

In many of the cases, collapsing some terms to be equal simplifies the summation.

\begin{align} s_1 &= \sum_{d_1=1}^6 \sum_{d_2=d_1}^{d_1} \sum_{a_1=1}^{d_1} \sum_{a_2=a_1}^{a_1} P(A_1 = a_1, A_2 = a_2) \cdot P(D_1 = d_1, D_2 = d_2) \\ &=\sum_{d_1=1}^6 \sum_{a_1=1}^{d_1} P(A_1 = a_1, A_2 = a_1) \cdot P(D_1 = d_1, D_2 = d_1) \\ &=\sum_{d_1=1}^6 \sum_{a_1=1}^{d_1} \left (\frac{3a_1-2}{6^3} \right) \cdot \frac{1}{36} \\ &=\sum_{d_1=1}^6\frac{3 \frac{d_1 \cdot (d_1+1)}{2}-2d_1}{6^5}\\ &=\sum_{d_1=1}^6\frac{\frac{3}{2}d_1^2 -\frac{1}{2}d_1}{6^5}\\ &= \frac{1}{6^5} \cdot \left ( \frac{3}{2}\cdot \frac{6 \cdot (6+1) \cdot (2\cdot 6 +1 )}{6} - \frac{1}{2} \cdot \frac{6 \cdot (6+1)}{2} \right ) \\ &=\frac{6\cdot 7 \cdot 13 - 6 \cdot 7 }{2^2 \cdot 6^5}\\ &=\frac{6\cdot 7 \cdot 2 \cdot 6 }{2^2 \cdot 6^5}\\ &= \frac{7}{2 \cdot 6^3}\\ &\approx 0.0162 \end{align}

Now on to the next term.

\begin{align} s_2 &= \sum_{d_1=1}^6 \sum_{d_2=1}^{d_1-1} \sum_{a_1=1}^{d_1} \sum_{a_2=a_1}^{a_1} P(A_1 = a_1, A_2 = a_2) \cdot P(D_1 = d_1, D_2 = d_2) \\ &=\sum_{d_1=1}^6 \sum_{d_2=1}^{d_1-1} \sum_{a_1=1}^{d_1} P(A_1 = a_1, A_2 = a_1) \cdot P(D_1 = d_1, D_2 = d_2) \\ &=\sum_{d_1=1}^6 \sum_{d_2=1}^{d_1-1} \sum_{a_1=1}^{d_1} \left (\frac{3a_1-2}{6^3} \right) \cdot \frac{1}{18} \\ &=\sum_{d_1=1}^6 \sum_{d_2=1}^{d_1-1}\frac{3 \frac{d_1 \cdot (d_1+1)}{2}-2d_1}{3 \cdot 6^4}\\ &=\sum_{d_1=1}^6 (d_1-1) \cdot \frac{3 d_1^2 -d_1}{6^5}\\ &=\sum_{d_1=1}^6 \frac{3 d_1^3 -4d_1^2+d_1}{6^5}\\ \end{align}

To solve this part, we need to know what the sum of a sequence of cubed terms is.

\begin{align} \sum_{k=1}^n k^3 = \frac{n^2(n+1)^2}{2^2}. \end{align}

I will avoid proving the above, as I did with the previous summations. This is not one I had known existed, but it was easy to find. A proof by induction should work here easily as well.

\begin{align} s_2 &= \frac{3 \cdot\frac{6^2 \cdot 7^2}{2^2} - 4\cdot\frac{6 \cdot 7 \cdot 13}{6} + \frac{6\cdot7}{2}}{6^5}\\ &= \frac{3 \cdot 3^2\cdot7^2 - 4\cdot7\cdot13+3\cdot7}{6^5}\\ &=\frac{2^2\cdot5\cdot7^2}{6^5}\\ &\approx 0.126 \end{align}


Alright, I'm trying to make a point, but this is getting too tiresome. We haven't even completed one of the scenarios! Let's just write a program to run all the cases, there aren't that many, only $6^5=7776$, and then be done with it.


The construction is this: we'll run 5 loops, one for each of the 5 dice, each taking the values from 1 to 6 to cover all possible cases. The first three loops correspond to the attacker's dice, the last two to the defender's dice. Then we sort the values and compare to see which of the three scenarios we end up in. We'll initialize three counters to zero before the loops and use them to keep track of how many times each case occurs. After the loops complete, we can check that they sum to the required 7776 cases as a basic check that we didn't skip any outcomes. From the counts, we can divide by 7776 to get the probabilities of each case.


\begin{align} P(\text{attacker loses two units}) &= \frac{2275}{7776} \approx 0.293 \\ P(\text{defender loses two units}) &= \frac{2890}{7776} \approx 0.372 \\ P(\text{both lose one unit}) &= \frac{2611}{7776} \approx 0.336 \end{align}


As we can see, all three outcomes are quite likely, but the probability that the defender loses two units is the highest, which gives the edge to the attacker.


Note that while I have used a computer to obtain this result (and indeed to compute several values and generate many plots in the past), we have not performed a simulation. That is, we are not limited here by the randomness of the computer and the number of trials run. On the other hand, we've used the same technique as when we first looked at the distribution of 2d6: counting all the cases. This can be an exhaustive tedious method, but in cases where we can make the computer do it for us it is often the most efficient path to a solution. It doesn't necessarily give us a lot of insight here, but neither was all the analysis we did above.

1 comment: