Monday, April 6, 2020

How often am I going to hit in D&D?

There are several versions of Dungeons and Dragons (D&D), but the to-hit mechanics generally involve trying to roll high on a d20. (A d20 is a 20-sided die numbered 1–20. We'll use this terminology often, especially in the context of roleplaying games. Similarly, 2d6 refers to rolling two six-sided dice, each numbered one through six, generally taking the sum.) The specifics vary, so we'll focus on the most recent (5th) edition as of this writing. The rules state (https://media.wizards.com/2018/dnd/downloads/DnD_BasicRules_2018.pdf p. 76) that you roll 1d20, add modifiers, and compare that to a target number equal to the Armor Class (AC) of your target plus other modifiers. If the roll plus modifiers is equal to or above the target number, you hit. Otherwise, you miss. If we let $D$ be a random variable that represents the result of the d20 roll, $m$ be the total of the modifiers, and $t$ be the target number, then we can write this mathematically as follows.
\begin{align}
D + m & \ge t && \text{Hit}\\
D + m & <   t && \text{Miss}
\end{align}
We can rearrange this to create an effective to-hit number $t' = t - m$, since the probabilities only depend on this difference, not the parameters individually.
\begin{align}
D  & \ge t' = t - m  && \text{Hit}\\
D  & <   t' = t - m  && \text{Miss}
\end{align}
The edge cases are obvious. If $t' \leq 1$, then a hit is guaranteed, since you always roll at least 1. Similarly, if $t' > 20$, then a miss is guaranteed, since you cannot roll higher than 20 on a d20. (If you're screaming at me for missing a rule, just keep reading, I'll get to it.)
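
As a quick sanity check on the rearrangement, here is a minimal Python sketch. The function names `effective_target` and `is_hit` are made up for illustration, as are the example numbers (AC 15, +5 to hit); it also ignores any special rules for particular die results.

```python
def effective_target(target: int, modifier: int) -> int:
    """Return t' = t - m, the number the bare d20 must meet or beat."""
    return target - modifier


def is_hit(roll: int, modifier: int, target: int) -> bool:
    """Hit when D + m >= t (no special rules for particular rolls)."""
    return roll + modifier >= target


# Illustrative numbers only: target number 15, +5 in modifiers.
t_prime = effective_target(15, 5)

# Both forms of the inequality agree for every face of the die.
for roll in range(1, 21):
    assert is_hit(roll, 5, 15) == (roll >= t_prime)
```

Only the difference $t - m$ matters here, so a target of 15 with a +5 modifier behaves exactly like a target of 10 with no modifier.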

For the rest of the cases, we look at the probability. Assuming a fair die with a fair roll, the probability of landing on any one of the sides is equal. This is essentially our definition of a fair die, but it makes sense. If you ignore the numbers, the faces are all symmetrical; you can reposition the die with any face up and the appearance of the shape remains the same. We call this a uniform distribution, because the probability of rolling a number is uniformly distributed between the integers 1 through 20. We can write this as follows.
\begin{align}
P(D = d) = \begin{cases}
0 &\qquad d < 1 \\
0 &\qquad d > 20\\
\frac{1}{20} &\qquad d \in \mathbb{Z}, 1 \leq d \leq 20 \\
0 &\qquad \text{else}
\end{cases}
\end{align}
I've tried to both be precise and follow our train of thought. (While I endeavor to be precise, I may occasionally get lazy or overlook something.) To explain the precise bit: $d \in \mathbb{Z}$ means that $d$ is in the set of all integers. We can simplify this to remove the multiple zero cases.
\begin{align}
P(D = d) = \begin{cases}
\frac{1}{20} &\qquad d \in \mathbb{Z}, 1 \leq d \leq 20 \\
0 &\qquad \text{else}
\end{cases}
\end{align}
This is essentially the probability mass function, which is like the probability density function, whose name you might recall, but for discrete-valued random variables. (Discrete-valued means the values may only come from a list of discrete, or distinct, values. We'll often be concerned with integers or whole numbers; anything in between is not allowed. Discrete is opposed to continuous, meaning that the value can be anything along a spectrum. A digital clock shows discrete values; an analog clock shows continuous values.) This is often written as $f_D(d) = P(D = d)$. Often it's useful to plot this function, although here it's fairly uninteresting.
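
The simplified probability mass function translates almost directly into code. This is just a sketch: the name `pmf_d20` is made up, and it uses Python's `fractions.Fraction` so that the arithmetic stays exact rather than floating point.

```python
from fractions import Fraction


def pmf_d20(d) -> Fraction:
    """P(D = d) for a fair d20: 1/20 on the integers 1..20, otherwise 0."""
    if isinstance(d, int) and 1 <= d <= 20:
        return Fraction(1, 20)
    return Fraction(0)


# Summing the pmf over every face should give exactly 1.
total = sum(pmf_d20(d) for d in range(1, 21))
assert total == 1
```

Anything that isn't an integer from 1 to 20, such as 0, 21, or 2.5, falls into the "else" case and gets probability zero.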

But we are interested in the probability of hitting, $P(\text{hit}) = P(D \geq t')$. We can get this by summing up all the cases that satisfy a hit. If we knew nothing about the probability mass function (pmf), then we'd have to compute the sum of the pmf from the effective to-hit number, $t'$, to infinity ($\infty$). Since the pmf is zero above 20, we can instead stop the sum at the maximum: 20.
\begin{align}
P(\text{hit}) &= P(D \geq t')\\
 &= \sum_{d = t'}^{20} f_D(d)  \\
 &= \sum_{d = t'}^{20} P(D = d)
\end{align}
We can then substitute in our previous expression for $P(D = d)$.
\begin{align}
P(D \geq t') = \begin{cases}
1 & \qquad  t' \leq 1 \\
\sum\limits_{d = t'}^{20} \frac{1}{20} &\qquad t' \in \mathbb{Z}, 1 < t' \leq 20 \\
0 & \qquad t' > 20
\end{cases}
\end{align}
Now, if we look at that summation, we can write it more explicitly.
\begin{align}
\sum\limits_{d = t'}^{20} \frac{1}{20} &= \underbrace{\frac{1}{20}}_{d=t'} + \underbrace{\frac{1}{20}}_{d=t'+1} + \ldots + \underbrace{\frac{1}{20}}_{d=20}\\
\sum\limits_{d = t'}^{20} \frac{1}{20} &= \underbrace{\frac{1}{20} + \frac{1}{20} + \ldots + \frac{1}{20}}_{20 - t' + 1 \text{ terms}} \\
\sum\limits_{d = t'}^{20} \frac{1}{20} &= \frac{20-t'+1}{20} \\
\sum\limits_{d = t'}^{20} \frac{1}{20} &= \frac{21-t'}{20}
\end{align}
Thus, putting this together we can get the following equation.
\begin{align}
P(\text{hit}) = \begin{cases}
1 & \qquad  t' \leq 1 \\
\frac{21-t'}{20} &\qquad t' \in \mathbb{Z}, 1 < t' \leq 20 \\
0 & \qquad t' > 20
\end{cases}
\end{align}
The middle term is consistent with the other cases when $t' = 1$ and when $t' = 21$. The first needs to be true, because 1 is a valid effective target number. The fact that
\begin{align}
\left .\frac{21-t'}{20}\right \vert_{t'=1} = \frac{21 - 1}{20} = 1
\end{align}
is true is a check that we derived our equation correctly. The second case, $t' = 21$, is possibly a happy accident, but it can give us the warm and fuzzies.
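
Another way to check the derivation is to count faces by brute force and compare against the closed form. The sketch below assumes integer values of $t'$; the names `p_hit_basic` and `p_hit_counted` are made up for illustration.

```python
from fractions import Fraction


def p_hit_basic(t_prime: int) -> Fraction:
    """Closed form for P(D >= t') on a fair d20, straight from the three cases."""
    if t_prime <= 1:
        return Fraction(1)
    if t_prime > 20:
        return Fraction(0)
    return Fraction(21 - t_prime, 20)


def p_hit_counted(t_prime: int) -> Fraction:
    """Brute force: count the faces that meet or beat t', then divide by 20."""
    hits = sum(1 for d in range(1, 21) if d >= t_prime)
    return Fraction(hits, 20)


# The two agree well past both ends of the die's range.
for t_prime in range(-5, 31):
    assert p_hit_basic(t_prime) == p_hit_counted(t_prime)
```

The loop covers $t' = 1$ and $t' = 21$ as well, so it also confirms that the middle case lines up with its neighbors at both ends.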

Whoops! We forgot some of the attack rules. Rolling a natural 1 always misses, and rolling a natural 20 always hits. Here "natural" means the unmodified result of the die, $D$. This means the probability of hitting is at most $\frac{19}{20} = 0.95$ and at least $\frac{1}{20} = 0.05$. We need to replace the endpoints of our equation with these values.
\begin{align}
P(\text{hit}) = \begin{cases}
0.95 & \qquad  t' \leq 2 \\
\frac{21-t'}{20} &\qquad t' \in \mathbb{Z}, 2 < t' < 20 \\
0.05 & \qquad t' \geq 20
\end{cases}
\end{align}
Here we've also modified when the endpoints occur. If the target number is 2 or less, it's all the same. Rolling anything except a 1 succeeds. Similarly, if the target value is 20 or higher, it's all the same because rolling a 20 always hits.
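
To double-check the final piecewise result, we can enumerate all twenty faces with the natural 1 and natural 20 rules applied and compare. As before, this is a sketch for integer $t'$; the function names and the example numbers (AC 15, +5 to hit) are made up.

```python
from fractions import Fraction


def p_hit(t_prime: int) -> Fraction:
    """P(hit) on a d20 where a natural 1 always misses and a natural 20 always hits."""
    if t_prime <= 2:
        return Fraction(19, 20)
    if t_prime >= 20:
        return Fraction(1, 20)
    return Fraction(21 - t_prime, 20)


def p_hit_enumerated(t_prime: int) -> Fraction:
    """Brute force over the 20 faces, applying the natural 1/20 rules directly."""
    hits = 0
    for d in range(1, 21):
        if d == 1:
            continue  # natural 1 always misses
        if d == 20 or d >= t_prime:
            hits += 1  # natural 20 always hits; otherwise compare to t'
    return Fraction(hits, 20)


for t_prime in range(-5, 31):
    assert p_hit(t_prime) == p_hit_enumerated(t_prime)

# Illustrative numbers only: +5 to hit against AC 15 gives t' = 10.
print(p_hit(15 - 5))  # 11/20
```

With those example numbers, eleven faces (10 through 20) hit, for a probability of $\frac{11}{20} = 0.55$.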

We've derived the answer to the question, but it doesn't tell the whole story. We've skipped the implications of rolling a 20, which also affects how many dice are rolled for damage. We've also ignored advantage and disadvantage. These are important considerations, but we'll save them for another day.

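Note that the cumulative distribution function (CDF) is conventionally defined as $F_D(d) = P(D \leq d)$, summing from the bottom up. If it were defined the other way around, as $P(D \geq d)$, it would be exactly what we computed here. We'll probably get into this later.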
