Monday, October 19, 2020

Simplifying Combat

Many "dudes on a map'' games, involve combat mini-games that mostly consist of taking a break from the normal game, and taking turns rolling lots of dice, losing some units, and repeating until only one player has units left. Sometimes you may be able to target specific units (e.g. Buck Rogers), play a card that influences the dice rolling that round (e.g. Star Wars: Rebellion), or decide whether to retreat or give up (e.g. War of the Ring).

This is similar to combat in many RPGs, or skill challenges in D&D 4th edition.


However, some games (e.g. Bottom of the 9th) don't offer any choices in between rounds of rolling dice. My contention is that such a game is likely bad design, simply based on the math. That is, you could replace any decision-free process with a single-step process with equivalent probabilities. Even for games with choices in between rounds of combat, the value of having such a choice should outweigh the cost of game time and complexity.


Let's look at a theoretical game. Suppose in this game when you move a number of units into an enemy territory you engage in rounds of combat. Each round you roll one die for each attacker and destroy one defender with probability $p$. Similarly, your opponent rolls one die for each defender and destroys one attacker with the same probability $p$. Each round is simultaneous; that is, remove destroyed units at the end of each round. If only one player has units at the end of a round, then stop. Suppose you start with $A$ attackers, and your opponent starts with $D$ defenders.


The combat will end with either the attacker remaining with 1 to $A$ units, the defender will remain with 1 to $D$ units, or both armies will be destroyed. If we could compute the probability of each of these results, then we could simply use the result instead of all the dice rolling.


What's tricky is that there's a probability of $(1-p)^{(A+D)}$ that no units are destroyed in a round. This comes from all $A+D$ units missing, which occurs with probability $1-p$, since $P(\text{hit}) + P(\text{miss}) = 1$. That means that the combat could go on for a long time. In fact, it's not guaranteed to ever end. So, while it's relatively easy to show the probability distribution of the state of combat after $r$ rounds, the end of the combat is not so clear.


Well, let's start by looking at what happens in round $r$ and see what the distribution is first. Let's denote $a_r$ and $d_r$ as the number of attackers and defenders, respectively, remaining at the end of round $r$. Let's also just say that $a_0 = A$ and $d_0 = D$.


Actually, let's look at the simple case of one attacking unit and one defending unit. After the first round of combat, there are four possible outcomes: both units are destroyed, the defender is destroyed, the attacker is destroyer, or neither are destroyed. In the first three outcomes the combat ends. In the last, another round commences with the same possible outcomes and the same probabilities, since the combat starts in the same state. The probabilities are,

\begin{align} P(\text{both destroyed in first round}) &= p^2 \\ P(\text{attacker destroyed in first round}) &= p\cdot (1-p) \\ P(\text{defender destroyed in first round}) &= p\cdot (1-p) \\ P(\text{neither destroyed in first round}) &= (1-p)^2 . \end{align}

Then, the probability of what happens in the second round is the same as the first, multiplied by the probability that you get to the second round.

\begin{align} P(\text{both destroyed in second round}) &= p^2\cdot (1-p)^2 \\ P(\text{attacker destroyed in second round}) &= p\cdot (1-p)^3 \\ P(\text{defender destroyed in second round}) &= p\cdot (1-p)^3 \\ P(\text{neither destroyed in second round}) &= (1-p)^4 . \end{align}

The probability that the combat ends with both destroyed is equal to the probability that it happens in the first round, plus the probability that it happens in the second round, and so on.

\begin{align} &P(\text{both destroyed at end of combat}) = \sum_{r=1}^\infty P(\text{both destroyed in round } r)\\ &= \sum_{r=1}^\infty P(\text{both destroyed in a round}) \cdot P(\text{get to round } r)\\ &= \sum_{r=1}^\infty p^2 \cdot (1-p)^{2(r-1)} \\ &= p^2 \sum_{r=1}^\infty \cdot \left((1-p)^2\right)^{r-1} \\ &= p^2 \sum_{r=0}^\infty \cdot \left((1-p)^2\right)^r \end{align}

Recall that (Honestly, right now I'm just relying on Wikipedia, but I'm pretty sure I have derived or will derive this.)

\begin{align} \sum_{i=0}^\infty a^i &= \frac{1}{1-a} \end{align}

Thus,

\begin{align} P(\text{both destroyed at end of combat}) &= p^2 \sum_{r=1}^\infty \cdot \left((1-p)^2\right)^r \\ &= p^2 \cdot \frac{1}{1 - (1-p)^2} \end{align}

Similarly,

\begin{align} P(\text{attacker destroyed at end of combat}) &= p \cdot (1-p) \cdot \frac{1}{1 - (1-p)^2} \\ P(\text{defender destroyed at end of combat}) &= p \cdot (1-p) \cdot \frac{1}{1 - (1-p)^2} \end{align}


As a check, do these probabilities add up to one? Let's find out.

\begin{align} &P(\text{both destroyed at end of combat}) \\ &+ P(\text{attacker destroyed at end of combat}) \\ &+ P(\text{defender destroyed at end of combat}) = \\ &=p^2 \cdot \frac{1}{1 - (1-p)^2} + p \cdot (1-p) \cdot \frac{1}{1 - (1-p)^2} + p \cdot (1-p) \cdot \frac{1}{1 - (1-p)^2} \\ &=p \cdot (p + 2\cdot (1-p)) \frac{1}{1 - (1-p)^2} \\ &=p \cdot (p + 2\cdot (1-p)) \frac{1}{1 - (1-2p+p^2)} \\ &=p \cdot (p + 2\cdot (1-p)) \frac{1}{2p-p^2} \\ &=p \cdot (p + 2\cdot (1-p)) \frac{1}{p\cdot(2-p)} \\ &=(p + 2\cdot (1-p)) \frac{1}{2-p} \\ &=(p + 2-2p) \frac{1}{2-p} \\ &=(2-p) \frac{1}{2-p} \\ &=1 \end{align}

So we're correct. (Initially I had $(1-p)^2$ in the numerator because I had $r$ instead of $r-1$ in the early equations above. As indicated by my errata, these checks are important.) That's mostly algebra, but I need a lot more explanation here.


We used an infinite series above, but we could instead argue that we can simply ignore the case where neither are destroyed in a round since we only care about the end state. Getting such a result simply delays the final outcome. Thus, we take the probabilities of the remaining three cases and divide by their sum to get the probability of ending in the corresponding states.

\begin{align} P(\text{both destroyed at end of combat}) &= \frac{P(\text{both destroyed in first round})}{P(\text{something destroyed in first round})} \\ P(\text{attacker destroyed at end of combat}) &= \frac{P(\text{attacker destroyed in first round})}{P(\text{something destroyed in first round})} \\ P(\text{defender destroyed at end of combat}) &= \frac{P(\text{defender destroyed in first round})}{P(\text{something destroyed in first round})} \end{align}

Where $P(\text{something destroyed in first round})$ is the sum of the three first-round probabilities above.

\begin{align} &\begin{split} P(\text{something destroyed in first round}) &= P(\text{both destroyed in first round}) \\ &+ P(\text{attacker destroyed in first round}) \\ &+ P(\text{defender destroyed in first round}) \end{split} \\ &= p^2 + p\cdot (1-p) + p\cdot (1-p) \\ &= p^2 +2\cdot p - 2\cdot p^2 \\ &= 2 \cdot p - p^2 \\ &= p \cdot (2-p) \end{align}

Plugging in, we find that,

\begin{align} P(\text{both destroyed at end of combat}) &= \frac{p^2}{p \cdot (2-p)} \\ P(\text{attacker destroyed at end of combat}) &= \frac{p\cdot (1-p)}{p \cdot (2-p)} \\ P(\text{defender destroyed at end of combat}) &= \frac{p\cdot (1-p)}{p \cdot (2-p)} \end{align}

These are the same probabilities that we calculated earlier. Note that the two different versions of the equations are written a little differently, but $1-(1-p)^2 = p\cdot (2-p)$.


That's enough for now. I'll later expand this same idea to more complicated combat mechanics.

Monday, October 12, 2020

Errata: Are you a Cylon in Battlestar Galactica?

There's an error in my previous analysis of the probability of me being a Cylon if you've gotten to look at one of my loyalty cards randomly and it was a Human card. We seemingly innocently transformed the previous scenario into the condition that I have one or more human cards, which are not quite the same thing. While it is true that I have one or more human cards, you have additional information: that a randomly selected card was human. Actually, even if you made the statement about having one or more Cylon cards, we'd still be in the same situation, as all Cylon detection abilities either allow you to look at all Loyalty cards or one randomly (see this handy chart: https://boardgamegeek.com/filepage/41329/crisis-card-chart). If there were an ability that allowed the target to select the card to be observed, the analysis would hold.

To see what's happening here, let's look at a three player game to make the number of cases simpler. You, Chelsea, and I are playing and you get to look at one of our loyalty cards. All of us have two cards, you know you're human, so there's one Cylon card among the four loyalty cards that Chelsea and I have. I could either have the Cylon card or not, without loss of generality we can say that it's my "first'' card. You randomly get one of my cards. Table 1 shows all the cases, which occur with equal probability.


\begin{align*} \begin{array}{c|c|c|c|c} \text{Scenario #} & \text{Loyalty card 1 } & \text{Loyal card 2} & \text{Card # observed} & \text{Card observed} \\ \hline 1 & \text{Cylon} & \text{Human} & 1 & \text{Cylon} \\ 2 & \text{Cylon} & \text{Human} & 2 & \text{Human} \\ 3 & \text{Human} & \text{Human} & 1 & \text{Human} \\ 4 & \text{Human} & \text{Human} & 2 & \text{Human} \\ \end{array} \end{align*}

Table 1: Cylon Detection in Battlestar Galactica


As we can see here, the probability of me being a Cylon given that you randomly looked at one of my cards, which was human, is $1/3$. There are three cases where you see a human card, one of which involves me having a Cylon card.


Let's go back to the five player case. The differentiating factor is that the probability of me getting one Cylon card is not $1/2$, but instead $24/56=3/7\approx0.429$ as we computed before, and the probability of me getting two human cards is $30/56=15/28\approx0.536$. We don't care about the two Cylon card case, as we're only looking at the scenarios where you see a human card. The probability of that happening is as follows.

\begin{align} P(\text{see human}) &= \underbrace{0.5}_\text{Chance to see each of my cards.} \cdot P(\text{1 Cylon}) + P(\text{human})\\ &=0.5 \cdot \frac{24}{56} + \frac{30}{56} \\ &=\frac{12+30}{56} \\ &=\frac{3}{4}\\ &=0.75 \end{align}

The probability that I am Cylon, given that you see a human card from me uses this probability in the denominator.

\begin{align} P(\text{Cylon} | \text{see human}) &= \frac{P(\text{1 Cylon} \cap \text{see human})}{P(\text{see human})} \\ &= \frac{0.5 \cdot P(\text{1 Cylon})}{P(\text{see human})} \\ &= \frac{0.5 \cdot \frac{24}{56}}{\frac{3}{4}}\\ &=\frac{12}{56} \cdot \frac{4}{3}\\ &=\frac{4}{14}\\ &=\frac{2}{7} \end{align}

This is the same probability as if we saw that the first loyalty card was human before the sleeper agent phase.

Monday, October 5, 2020

Isn't it better to use a dice deck in Catan?

First of all, better is a value judgement. Let's examine the properties and you can evaluate whether you like them better or not.

The idea of a dice deck in Catan is that you have a deck of 36 cards, where instead of rolling 2d6 you flip over a card and it gives you the number for resource generation. (See here, although I'll ignore the events and the New Year card.) The values on the cards match the distribution of 2d6, such that the first draw has the same probability as dice. However, if you draw the deck to exhaustion, then you're guaranteed to get exactly five results of "6".


Let's start with a question about using dice: if the probability of rolling a six is $5/36$, then do we expect to roll 5 sixes out of 36 rolls? First, let me say that in probability, expect is a word with particular meaning. It refers to the average, or mean value, or something. We write it as an operator using a special capital E. It being an operator means that we write it to the left and it operates on whatever is to the right of it. So when we write $\mathbb{E} X$, we're taking the expectation of the random variable $X$. I'm writing with special typesetting tools (Well, just $\LaTeX$.), but if you're writing with pencil and paper, on a blackboard, or on an internet forum, you may not have access to such special symbols. Then just a capital E will do.


So our question, reformulated is whether the expected value of the number of occurrences of the number six is $5/36$—i.e. $P(\text{rolling a six on 2d6})$—or is it some other value? To find out, let's calculate the expected value of the number of occurrences of the number six on 2d6. Wow, that's long winded. Let's assign it a variable name, $Y$.

\begin{align} Y = \text{number of results equal to six across 36 rolls of 2d6}. \end{align}

How do we count this? Let's assign names to these 36 rolls. We'll call the results of the 36 2d6 rolls $X_1$—$X_{36}$.

As we've been saying, we know from our previous analysis of 2d6 that

\begin{align} P(X_i = 6) = \frac{5}{36} \quad i \in \{1,2,\cdots,36\}. \end{align}

Recall that the probability of something not happening is equal to 1 minus the probability of it happening, or

\begin{align} P(\text{not some thing}) = 1 - P(\text{some thing}). \end{align}

So

\begin{align} P(X_i \neq 6) = 1 - P(X_i = 6) = 1 - \frac{5}{36} = \frac{31}{36}. \end{align}

Let's define the random variables $Y_1$—$Y_{36}$ to be 1 if $X_i=6$ and 0 if $X_i \neq 6$. We write this as,

\begin{align} Y_i = \begin{cases} 1 & X_i = 6 \\ 0 & X_i \neq 6 \end{cases} \quad i \in \{1,2,\cdots,36\}. \end{align}

From this definition, we can see that the sum of these $Y_i$ is the count $Y$ we are looking for.

\begin{align} Y = \sum^{36}_{i=1} Y_i \end{align}

What then, is the expected number of sixes $\mathbb{E}Y$? Expectation is a linear operator. That means a few things, that we'll go into elsewhere, but what's important here is that it means that since $Y_i$ are independent, we can distribute the expectation across the different terms in the summation $Y_i$. Basically, we can move it from the left to the right of the summation, like this

\begin{align} \mathbb{E}Y = \mathbb{E} \sum^{36}_{i=1} Y_i = \sum^{36}_{i=1} \mathbb{E}Y_i. \end{align}

As mentioned above, the expected value of a random variable is the average so,

\begin{align} \mathbb{E}Y_i &= \sum^1_{y=0} y \cdot P(Y_i = y). \end{align}

Note that the summation above is from 0 to 1 because those are the only two values possible for $Y_i$. Using the probabilities of $X_i = 6$ and $X_i \neq 6$ above, we can find

\begin{align} \mathbb{E}Y_i &= 0 \cdot P(Y_i = 0) + 1 \cdot P(Y_i = 1)\\ \mathbb{E}Y_i &= 0 \cdot P(X_i \neq 6) + 1 \cdot P(X_i = 6)\\ \mathbb{E}Y_i &= 0 \cdot \frac{31}{36} + 1 \cdot \frac{5}{36}\\ \mathbb{E}Y_i &= \frac{5}{36}. \end{align}

Plugging this back into our equation for $\mathbb{E}Y$ above yields,

\begin{align} \mathbb{E}Y &= \sum^{36}_{i=1}\frac{5}{36} \\ \mathbb{E}Y &= 36 \cdot\frac{5}{36} \\ \mathbb{E}Y &= 5 . \end{align}

So we do expect to get 5 sixes when rolling 2d6 36 times. Note that above, we changed from summing $5/36$ 36 times to multiplies $5/36$ by 36. That's because those are equivalent. That's what multiplication is! We can do this whenever we are summing a constant value.


The same thing also holds for all other result here, not just six.


We can also ask, what's the distribution of $Y$? Or, what's the probability of getting 5 sixes, or 4, or 3, or however many? We could try to follow the approaches we've used to find the distribution of the sums of dice. But we have 36 rolls here, that doesn't sound like a fun calculation. On the other hand, each $Y_i$ has a simple distribution, so maybe it's not so bad.


Fortunately, this is a problem with an existing solution that has been well studied: the binomial distribution. We've used it before, but this time we'll derive what the distribution of $Y$ is. It's probably not that bad. Let's change our notation a little bit, though. The binomial distribution is a parameterized distribution, which means it takes parameters. Basically, there's a bunch of different versions of the distribution. A normal (I mean, regular) single die has a uniform distribution, but the particulars of the uniform distribution depend on the number of faces of the die as a parameter. Similarly a binomial distribution, written as $Y ~ B(n, p)$ takes two parameters: the number of trials, $n$, and the probability of the success of each one, $p$. The number of trials is the number of times that we roll the die. Here we'll use 36 trials to match the 36 cards in the dice deck. The probability, $p$, depends on which face we're focused on. For a result of six, the probability of getting that value in each trial, which we call a success, is $5/36$.


So then in each trial we either get a success, with probability $p$, or failure with probability $1-p$. We're interested in the count of the successes. The extremes are relatively easy to calculate. The probability that we get $n$ successes in $n$ trials is the probability that we succeed on every roll.

\begin{align} f_Y(n) = p^n \end{align}

Similarly, the probability that we get no successes is the probability that we fail on every roll.

\begin{align} f_Y(0) = (1-p)^n \end{align}

Now, let's consider some result in between, where we get $k$ successes. This could happen by the first $k$ rolls succeeding and the last $n-k$ rolls failing.

\begin{align} p^k \cdot (1-p)^{n-k} \end{align}

However, we could also fail the first $n-k$ rolls, and succeed in the last $k$.

\begin{align} (1-p)^{n-k} \cdot p^k \end{align}

Note that these occur with the same probability. We can construct several different distinct orders of success and failure that result in $k$ successes. All of them share this same $p^k \cdot (1-p)^{n-k}$ term, but we need to multiply this by the number of ways that we can construct a sequence with $k$ successes in $n$ trials. We need the number of ways to select $k$ elements out of a set of size $n$. We know that this is $\binom{n}{k} = \frac{n!}{k! (n-k)!}$. Thus, we can write the pmf of $Y$ as follows.

\begin{align} f_Y(k) &= \binom{n}{k} \cdot p^k \cdot (1-p)^{n-k} \\ f_Y(k) &= \binom{36}{k} \cdot \left(\frac{5}{36}\right)^k \cdot \left(1-\frac{5}{36}\right )^{n-k} \end{align}

By replacing $p$ with the probabilities for each valid result on 2d6, we can get the distributions for the number of times that each of those results appear across a sequence of 36 rolls. These distributions are plotted in Fig. 1. Note that these random variables are not independent. We analyzed each separately, but it's clear that in the case when there are an excess of one value, it decreases the probability of getting a large number of another value.


Figure 1: Probabilities of rolling 2d6 

We can contrast that with the certainty of a dice deck expressed in terms of the probability mass functions shown in Fig. 2. Here we know the number of times that each value appears, and so each value has a probability of 1 at the number of times that it occurs in the deck. While these values correspond to both the average and peak values of the distribution of 2d6, it does not capture the sizable variation. This is particularly evidence for the results that come up more rarely. It's almost as likely to get no twos across 36 rolls as to get 1 two. At the other extreme, if we consider the number of sevens, we see that the probability of getting exactly the expected number of sevens is significantly lower than for twos. Getting 4 or 7 of them, instead of 6,may seem close to the average and within desired variation, but neither is possible with the dice deck.


Figure 2: Probabilities using a dice deck


There are a couple of extreme cases where what we get compared to what we want may break down with the dice deck. First, if near the end of the game enough of the deck remains that it's pretty clear that it will not get reshuffled, but certain values have already been exhausted, the benefits of building certain settlements and cities is quite different than with dice. With dice you can build on a twelve and hope that it pays out, but with the dice deck you may know that it never will. The other extreme case is when the deck does get reshuffled very near the end of the game. Through most of the game you've been playing with the knowledge that the deck ensures that the rolls across several turns produce a set distribution. However, if you end shortly after a reshuffle, you now have very similar probability properties to that of dice, where there's the possibility that something more unexpected may come up.


By modifying the simple dice deck that we considered you can adjust the properties between these two extremes. For example, could reshuffle before reaching the end of the deck. By adjusting the time of reshuffling, you can slide on a spectrum between a our pure example and using dice. Of course, if you shuffle after every draw you get the same probabilities as dice. In that case, dice are better, because they're so much faster to use.


So far, we've covered differences between dice and a deck of cards which basically boil down to this: a deck has memory. What we draw on one term influences the next, because the state of the deck is changed. If you draw a seven, it's less likely you'll draw another until the deck is reshuffled. On the other hand, each roll of the dice is independent. There's no influence of one roll on the next.


There's another fundamental difference between the two: the deck is shuffled at the beginning. This means that the result of each draw is predetermined (unless you believe the entire deck can be in a state of superposition). If you make a poor choice, you can wallow in regret, as you knew that if you had made a different choice you could have had a better outcome. Not so with dice. Following the butterfly effect, any decisions you make before a roll influence that roll. This is not in a controllable way, but just that minute differences in the timing or action of a die roll may yield a different result. Thus, no second guessing is necessary, as if you had decided to do something different, the dice may still have been against you.