Monday, July 27, 2020

What is the best weapon in D&D?

There's something interesting about the damage progression for the Monk class in D&D 3.5. For a Medium sized Monk, the damage progression is as shown in Table 1 (source: SRD http://www.d20srd.org/srd/classes/monk.htm).

\begin{align*}
\begin{array}{c|c}
\text{Level} & \text{Damage Dice} \\ \hline
\text{1--3} & 1d6 \\
\text{4--7} & 1d8 \\
\text{8--11} & 1d10 \\
\text{12--15} & 2d6 \\
\text{16--19} & 2d8 \\
\text{20} & 2d10
\end{array}
\end{align*}
Table 1: Monk Damage Progression in D&D 3.5

For levels 1–11 things are pretty clear. Each time the damage dice increase it's strictly better. However, that's not true when moving from 1d10 to 2d6. Certainly, the minimum and maximum are both higher for 2d6 (2 and 12, respectively, as opposed to 1 and 10 for 1d10). The average is also better: 7 versus 5.5. But often we don't care about such metrics. For example, say our Monk is fighting some opponents that each have 5 hit points each. In such as case, we'd like to take them out in one hit as often as possible. Thus, the probability of rolling 5 or higher is of particular interest.

To consider the general case, let's look at the probabilities of rolling a given number or higher for each of the two damage rolls, as shown in Table 2.
\begin{align*}
\begin{array}{ccc}
x & P(1d10 \geq x) & P(2d6 \geq x) \\ \hline
1 & 1 & 1 \\
2 & 0.9 & 1 \\
3 & 0.8 & 0.972 \\
4 & 0.7 & 0.917 \\
5 & 0.6 & 0.833 \\
6 & 0.5 & 0.722 \\
7 & 0.4 & 0.583 \\
8 & 0.3 & 0.417 \\
9 & 0.2 & 0.278 \\
10 & 0.1 & 0.167 \\
11 & 0 & 0.0833 \\
12 & 0 & 0.0278
\end{array}
\end{align*}
Table 2: Monk Damage Probabilities at Levels 11 & 12

This is actually strictly better. But it makes you wonder whether there are cases where it looks like something is better, but it's actually worse. If we had looked at the pmf for 1d10 vs 2d6 we might have been misled, because at 10, 2d6 looks worse (lower probability of getting a 10), even though it is strictly better when considering the probability of rolling 10 or higher.

So let's change the question to be about the best weapon. There are many different weapons in Dungeons and Dragons. Even just looking at the 5th edition Player Basic Rules download, there is almost a full page table on page 47. Weapons differ in cost, dice rolled for damage, type of damage dealt, weight, and properties such as whether it is thrown, requires two-hands to use, etc..

To contain the discussion, let's first just focus on which does the most damage. A lot of the other characteristics of the weapon matter in very context-specific ways. Glancing through the table it's pretty easy to narrow down the list into two potential groups. Those weapons which do 1d12 damage (e.g. a greataxe), and those which do 2d6 damage (i.e. a maul). Both can do up to 12 damage. The maul always does at least 2 damage, so that's better. What about average damage? The average roll for a dN is $(N+1)/2$. This means the average greataxe damage is 6.5, while the average maul damage is 7.

Let's compare the distributions. We've already computed 2d6 for something else. 1d12 is pretty simple, $1/12$ for every possibility. Fig. 1 shows the probability mass functions (PMFs), while Fig. 2 shows the cumulative distribution functions (CDFs). We see some cases in which 2d6 is better, while in some cases 1d12 is better. Look at rolling at least a specific number, that's what matters, which is close to the complementary cumulative distribution function (CCDF), which is plotted in Fig. 3. Here we see that the probability of rolling towards the higher end of the damage distribution with a greataxe is greater than with a maul. Consider just the probability of rolling 12 or higher. With a greataxe the probability is $1/12$, with a maul it's $1/36$.

Figure 1: Damage PMF of Greataxe and Maul 

Figure 2: Damage CDF of Greataxe and Maul

Figure 3: Damage CCDF of Greataxe and Maul
Now let's consider enemies with different number of hit points. We'll look at three possibilities: a goblin with 7 HP, a hobgoblin with 11 HP, and a bugbear with 27 HP. We can find the probability of eliminating each of these enemies in a single round by evaluating the CCDF at one less than the number of HP. This is impossible for the bugbear, as there's no way to roll 27 damage in a single attack with these weapons (Note: throughout this analysis I'm ignoring any bonuses that may increase the damage. I'm also ignoring whether or not you actually hit first). For the goblin, looking at the CCDF at 6, we see that the maul has a higher probability of being eliminated in a single round. For the hobgoblin we get the opposite result. The CCDF at 10 is higher for the greataxe.

Looking at just the probability of elimination in the first round is a small part of the analysis. Next, we'll look at the distribution of the the round in which the target is eliminated, $R$. To determine this, we need a straightforward way to check whether or not we've eliminated a target in a given round. In this case, I found it easier to look at the CDF of elimination, that is, the probability that the target is eliminated by a certain round of combat. In round $r$, we've rolled damage for each weapon $r$ times. If the damage if at least equal to the target's HP, then it must have been eliminated by round $r$. We previously learned how to use convolution to compute the PMF of rolling multiple dice. We can apply this iteratively to get the PMF of rolling $r$d12 for a greataxe or $2r$d6 for a maul in total by round $r$. Performing the running sum to get the CDF from the PMF is straightforward. Then the CDF of the round of elimination, $F_R(r)$, is given by,
\begin{align}
F_R(r) = 1 - F_{D(r)}(h-1) ,
\end{align}
where $D(r)$ is the total damage by the relevant weapon by round $r$ and $h$ is the number of hit-points of the target. Running through the computations for our selected targets we get the plots in Figures 4, 5, and 6. We can see that the maul is almost universally better. The two exception points are the aforementioned eliminating a hobgoblin in 1 round, as well as the probability of eliminating a bugbear in 3 rounds.

Figure 4: Probabilities of eliminating a goblin by a given round 
Figure 5: Probabilities of eliminating a hobgoblin by a given round
Figure 6: Probabilities of eliminating a bugbearby a given round
While we risk the danger of the reductionism we saw earlier in looking at the average damage per roll for each of the two weapons, and noting that perhaps the result is obvious given the above distributions, there remains some allure in combining the above information into a single metric with which we can compare the two weapons and declare which is the best definitively, if not correctly. To accordance with this, we'll compute the expected value of the number of rounds to eliminate a target, $\mathbb{E}R$. We'll do this not just for the selected targets, but for a wide range of hit points, from 1 to 1000 (I did not compute every point in between, and started log spacing after 20 hit points to save time). To do this, two steps are required. First, we must find the CDF for each hit-point valued target. Second, we recall and use the following relationship to calculate the expected value.
\begin{align}
\mathbb{E} R &= \sum_{r=0}^\infty P(R > r) \\
&= \sum_{r=0}^\infty 1 - F_R(r)
\end{align}
We can combine this with our earlier equation for $F_R(r)$ to come up with a simpler expressoin based on the CDF of the total damage up through round $r$.
\begin{align}
\mathbb{E} R &= \sum_{r=0}^\infty 1 - (1 - F_{D(r)}(h-1) ) \\
& = \sum_{r=0}^\infty F_{D(r)}(h-1) \\
\end{align}
While the above sum is shown over an infinite range, we know that it is limited. In the case of the greataxe, since we do at least 1 damage per round, we need only consider up to $h$ rounds. Similarly, for the maul, which does at least 2 damage per round, we need only consider up to $\lfloor \frac{h}{2} \rfloor$ rounds. The results are plotted in Fig. 7. While it's hard to see for low values of the target's HP, it clearly shows that the maul requires fewer rounds, on average, for any number of hit points.

Figure 7: Expected number of rounds to eliminate a target vs. hit points

One might propose the following argument to approximate the number of rounds expected for elimination. Since we know the average damage per roll, all we must do is to take the target's HP and divide by this average damage to find the expected number of rounds to elimination. To aid in discussion of this thought, consider the plot in Fig. 8, which shows the expected number of rounds to elimination per target HP. Also plotted are asymptotes, which are equal to one divided by the average damage per round. That is, if this argument were true, the plot would coincide with these dotted lines. It is not true, and thus they do not coincide. However, they do seem to predict values which the curves are approaching asymptotically, as emphasized in Fig. 9. For a small number of hit points, the non-linear impacts of needing to roll a whole number of attacks means that the inverse of the average is a poor estimate. For example, to eliminate a target with only 1 HP, at 1 round is always needed. An attempt to incorporate this is shown in Fig. 10, which shows estimates of the $R$ per HP, by estimating $R$ as HP divided by the average damage per round, rounding up to the nearest integer number of rounds, and then dividing by HP. We can see that this shows a similar shape, but has discontinuities at various points, corresponding to when the number of rounds is rounded up to the next integer value. If we consider a target like the hobgoblin with 11 HP, there are a number of possible rounds of elimination. To determine $\mathbb{E}R$, first the distribution of $R$ is considered and then averaging is done. The plot in Fig. 10 averages first, and then tries to come up with the number of rounds. Since there is essentially non-linearity involved, we cannot change the order of the averaging without introducing error. The nonlinearity I'm talking about is this: in the round of elimination there may be some excess damage done. Going into that around you may only need to do 3 more points of damage, but it's likely that more damage will be done. Thus, this damage, while it contributes to the average damage per round, does not affect the number of rounds until elimination. As the number of hit points that a target has increases, this excess damage becomes less and less of an impact, which is why the plots approach the asymptotes. Across these plots we see that the maul is universally better than the greataxe.

Figure 8: Expected number of rounds per hit-point to eliminate a target vs. hit points 

Figure 9: Expected number of rounds per hit-point to eliminate a target vs. hit points (zoomed in)
Figure 10: Expected number of rounds per hit-point to eliminate a target vs. hit points compared to a crude estimate\label

That was a little disappointing. I was really expecting to find some scenarios where the greataxe is better. Sure, if we really need to eliminate a hobgoblin in 1 round it's better, but that's too specific to satisfy me. The consistency provided by rolling 2d6 just seems to overpower the slightly better chance of a high roll on 1d12. So what if we compare against something that we might expect to be worse. What if we compare 1d12 to 2d4? When we look at the same comparison, shown in Fig. 11, we see that for target HP less than 5, rolling 2d4 results in a faster expected elimination. Here, the higher minimum appears to prevail over the lower average damage, the latter of which wins for large target HP.

Figure 11: $\mathbb{E}R$ for 1d12 and 2d4

Monday, July 6, 2020

Modeling value in Century: Spice Road

When playing Century: Spice Road (or the Golem Edition, I imagine), it's easy to notice some patterns in the values of the cards.

The Merchant cards are less clear, so let me push them aside first. Not all Merchant cards are equal. Some comparisons are hard to make and depend a lot on the situation. For example, is it better to have a card that transforms 2 saffron (red) into 2 cardamom (green), or 3 saffron into 3 cardamom? If you only have two saffron, the first may be better. With three it's likely the latter. With four saffron we're back to likely preferring the first, as we can perform the trade twice. That is, unless we only need three cardamom and need to hold onto a saffron. You could perhaps come up with some metric that attempts to incorporate all of these types of considerations. However, there's a clear counter example. There's one Merchant card that gives you three turmeric (yellow) and another that gives you four. The latter is strictly better than the first. Even if you can't always make use of the fourth turmeric, it is all that the first card is and more. This one example is all we need to show that Merchant cards are not all balanced.

Now let's look at the Point cards.

Figure 1: Points cards

The pattern I noticed when looking at a bunch of these is that it appears that the number of points, $p$, follows a simple formula based on the number of turmeric (yellow), $y$, saffron (red), $r$, cardamom (green), $g$, and cinnamon (brown), $b$.
\begin{align}
p = y + 2r + 3g + 4b
\end{align}
That is, the number of points is equal to one point for each turmeric, while each higher level of spice is worth one more point each. This tracks for a large number of the Point cards (24 out of 36). Then you run into a card like the following.

Figure 2: Another Point card

Here we'd predict the card to be worth 12 points.
\begin{align}
12 = 1 \cdot 3 + 2 \cdot 1 + 4 \cdot 3 + 4 \cdot 1
\end{align}
However, it's worth two additional points. Why? If you look at this spreadsheet, posted by "GameSnake", it becomes quite clear what the pattern is. There's a column specifying the "Spice cost", matching our analysis up to this point, as well as a "Delta" column. This clearly shows that a Point card is worth an extra point if it requires three different types of spices to claim, and two extra points if it requires four different types of spices. This makes some sense, as it may be more difficult to produce a variety of different spices than just having a good combination to produce exclusively one or two. There's no difference between cards that require only one type of spice and cards that require two types of spice. So our revised equation becomes,
\begin{align}
p = y + 2r + 3g + 4b + \max(n-2, 0) ,
\end{align}

where $n$ is the number of distinct spices needed to claim the Point card. Note that $n$ is dependent on $y$, $r$, $g$, and $b$. We could write an equation without this intermediate variable, but it'd be quite a bit more cumbersome. This equation exactly predicts the value of every Point card in the game, which you can check against in Table 1.


Spices
Victory points Spice cost Delta
YYRR 6 6 0
YYYRR 7 7 0
RRRR 8 8 0
YYGG 8 8 0
YYRRR 8 8 0
YYYGG 9 9 0
RRGG 10 10 0
RRRRR 10 10 0
YYBB 10 10 0
YYGGG 11 11 0
YYYBB 11 11 0
GGGG 12 12 0
RRBB 12 12 0
RRRGG 12 12 0
RRGGG 13 13 0
GGBB 14 14 0
RRRBB 14 14 0
YYBBB 14 14 0
GGGGG 15 15 0
BBBB 16 16 0
RRBBB 16 16 0
GGGBB 17 17 0
GGBBB 18 18 0
BBBBB 20 20 0
YYRB 9 8 1
RRGB 12 11 1
YGGB 12 11 1
YYRRGG 13 12 1
YYRRBB 15 14 1
YYGGBB 17 16 1
RRGGBB 19 18 1
YRGB 12 10 2
YYYRGB 14 12 2
YRRRGB 16 14 2
YRGGGB 18 16 2
YRGBBB 20 18 2

Table 1: Century: Spice Road Points cards. (data from this spreadsheet)

Note: while I leveraged the table above, I did not read any existing analyses of the game.  I imagine that others have come to similar conclusions.