Figure 1: PMF of 3d6 compared to a bell curve |

Let's go back and derive the pmf we just saw.

At first we might attempt to extend the method we used to compute the pmf of 1d$N$+1d$M$ in our discussion related to the dots in Catan. This works up to a point. We know there are $6^3= 216$ possible outcomes from rolling 3d6. For each sum we can count the number of ways to generate that sum with three numbers valued 1–6 by looking at the number of ways to put two separators on the inside of that many pips. For example, for a sum of 6, we can see that there are 5 choose 2, or $\binom{5}{2} = 5! / (3! \cdot 2!) = 10$ ways to do it. The 5 slots are illustrated below.

* | * | * | * | * | *

Unfortunately, this breaks down as we get to larger sums. For example, let's consider a sum of 9. We might think that there are 7 choose 2 possible combinations. However, some of those are invalid, like the one below.

* | * * * * * * * | *

Here we've chosen the separators such that the three dice sum to 9 with the values 1, 7, and 1. However, 7 is not a valid result on 1d6. We've over counted. We may be able to figure out how much we're over counting by and subtract that number out. However, it's going to be kludgey and probably not give us an equation from which we can glean anything intuitively. Thus, I'm not very interested in doing that. Let's find a better way.

To do that, I'll introduce the probability generating function (pgf). This is related to the moment generating function and the characteristic function of a random variable in ways that we won't get into here. The probability generating function $G_X(z)$ of a non-negative random variable $X$ with probability mass function $f_X(x)$ is defined as the expected value of $z^X$.

\begin{align}

G_X(z) &= \mathbb{E} z^X \\

&= \sum_{x=0}^\infty f_X(x) z^x

\end{align}

This looks very much like the $z$-transform of the probability mass function, the only exception being that we have a positive exponent of $z$, whereas the $z$-transform has a negative exponent. However, given that this function is only expected to converge for complex values of $z$ where $|z| \leq 1$, this difference is relatively minor. This gives us access to a bunch of tools and properties if we're familiar with the $z$-transform, which is related to Fourier and Laplace transforms, but for discrete-time signals. We won't get into all of them, but I'll show you the one we'll use here.

Let's say that $Y$ is the sum of rolling 3d6, where each die has its own result given by $X_1$, $X_2$, and $X_3$. The pmf of any of $X_1$, $X_2$, and $X_3$ is given as follows.

\begin{align}

f_X(x) &= \begin{cases}

\dfrac{1}{6} & \qquad x \in \{1, 2, 3, 4, 5, 6\} \\

0 &\qquad \text{else}

\end{cases}

\end{align}

Thus, the pgf is given by.

\begin{align}

G_X(z) &= \frac{1}{6} \left ( z^1 + z^2 + z^3 + z^4 + z^5 + z^6 \right )

\end{align}

Looking at the pgf of $Y$, we find an interesting property.

\begin{align}

G_Y(z) &= \mathbb{E} z^Y \\

&= \mathbb{E} z^{X_1+X_2+X_3} \\

&= \mathbb{E} z^{X_1} \cdot z^{X_2} \cdot z^{X_3} \\

&= \mathbb{E} z^{X_1}\mathbb{E}\cdot z^{X_2}\mathbb{E}\cdot z^{X_3} \\

&=G_{X_1}(z) \cdot G_{X_2}(z) \cdot G_{X_3}(z) \\

&= G_X(z)^3

\end{align}

When we sum random variables, the pgf of the sum is the product of the pgfs of the addends. Let's do this first for the 2d6 case, where $Y_2 = X_1 + X_2$.

\begin{align}

G_{Y_2}(z) &= G_X(z)^2 \\

&= \left( \frac{1}{6} \left ( z^1 + z^2 + z^3 + z^4 + z^5 + z^6 \right ) \right)^2 \\

&= \frac{1}{6^2} \left ( z^1 + z^2 + z^3 + z^4 + z^5 + z^6 \right ) ^2

\end{align}

Now let's multiply this out.

\begin{align*}

G_{Y_2}(z) = \frac{1}{6^2} ( z^2 +& z^3 + &z^4 +& z^5 +& z^6 +& z^7 + &&&&& \\

&z^3 + &z^4 + &z^5 + &z^6 + &z^7 + &z^8 +&&&&\\

&& z^4 + &z^5 + &z^6 + &z^7 + &z^8 + &z^9+&&&\\

&& &z^5 + &z^6 + &z^7 + &z^8 + &z^9 +&z^{10}+&&\\

&& & &z^6 + &z^7 + &z^8 + &z^9 +&z^{10} +&z^{11}+&\\

&&& & &z^7 + &z^8 + &z^9 +&z^{10}+&z^{11}+&z^{12} ) \hfill\\

\end{align*}

For our simple case, I think it's pretty clear that $f_{Y_2}(y)$ is the coefficient of the $z^y$ term. By grouping the like terms, which I've aligned vertically, we can see the familiar triangular shape of the 2d6 pmf.

A more computationally efficient way of looking at this is to take advantage of a property of $z$-transforms. Namely, that multiplication in the $z$ domain is convolution in the $x$ domain. (This is often stated as multiplication in the frequency domain is convolution in the time domain, although those are not the types of variables we're looking at. The opposite is also true: multiplication in the time domain is convolution in the frequency domain.) The convolution of two functions of integers, $f()$ and $h()$, is written as follows.

\begin{align}

f(x) * h(x) &= \sum_{n=-\infty}^\infty f(n) \cdot h(x-n)

\end{align}

Convolution is symmetric, so we can write either function first.

\begin{align}

f(x) * h(x) &= \sum_{n=-\infty}^\infty f(x-n) \cdot h(n) \\

&= h(x) * f(x)

\end{align}

Let's understand what this is doing graphically a little bit. The following figures depict various values of the convolution computation. The first plot in each figure shows one pmf function. The second plot shows the second pmf, which is flipped horizontally and shifted by the value at which the convolution is being computed. The third plot shows the multiplication of these two functions for each value of $n$. The fourth plot shows the result of the convolution, which at each value of $x$ is equal to the sum of the third plot across all $n$ for that value of $x$. As $x$ changes, it moves the flipped pmf horizontally. When it overlaps with the first pmf the result of the convolution is nonzero. When the two pmfs line up the convolution is maximized. Since the pmfs are either $1/6$ or 0, the product term takes on the value $1/36$ for some set of $n$. As the overlap increases, the result of the convolution increases linearly. Note that for $x=8$ the overlap starts decreasing, and so does the result of the convolution.

Figure 2: Illustrating convolution to find 2d6 pmf |

We can repeat this process to find the pmf of 3d6 by convolving the previous result with the pmf of 1d6. In this case, the product term increases linearly once the overlap starts and continues through $x=8$. Up to the point, it's somewhat clear that the resulting pmf is given by,

\begin{align}

f_Y(y) &= \sum_{n=2}^{y-1} f_{Y_2}(n) .

\end{align}

Since $f_{Y_2}(n)$ is linear in that range, we can easily substitute it in.

\begin{align}

f_Y(y) &= \sum_{n=2}^{y-1} \frac{n-1}{6^2} \\

&= \sum_{n=1}^{y-2} \frac{n}{6^2}

\end{align}

You may recall that the sum of integers 1 to $n$ is equal to $n\cdot(n+1)/2$. If not, we can quickly prove this by induction. Consider $n=2$, the sum,

\begin{align}

\sum_{k=1}^2 k &= 1 + 2 = 3.

\end{align}

And when we evaluate $n\cdot(n+1)/2$ for $n=2$ we get $2 \cdot (2+1) / 2 = 3$. So it holds here. Now, let's assume it's true for $n$ and check if it holds for $n+1$.

\begin{align}

\sum_{k=1}^{n+1} k&= \left ( \sum_{k=1}^{n} k \right )+ (n+1) \\

&= \frac{n \cdot (n+1)}{2} + (n+1) \\

&=\frac{n \cdot (n+1) + 2\cdot(n+1)}{2} \\

&=\frac{(n+2) \cdot (n+1)}{2} \\

&=\frac{(n+1) \cdot (n+2)}{2}

\end{align}

We see that it does, thus it hold for all integer $n$. Multiplying this out we see clearly that this is a quadratic function.

\begin{align}

\sum_{k=1}^{n} k &= \frac{n \cdot (n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2} n

\end{align}

This holds for the other regions of this function as well. In general, if we calculate a running sum of a linear function, we'll get a quadratic function. If we do it piecewise, it holds piecewise. This extends to higher orders as well, as expressed in Faulhaber's formula.

Below we see the same plots showing how $f_Y(y)$ is calculated using the following convolution.

\begin{align}

f_Y(y) = f_{Y_2}(y) * f_X(y)

\end{align}

Figure 3: Illustrating convolution to find 3d6 pmf |

This solution appears unsatisfying in some ways. We have a result we can express in an equation, but we can't look at it directly and glean much information. However, what we often care about are having an intuitive understanding of what the result is and an easy way to calculate it numerically. We do have both of those. The central limit theorem and the properties of convolution we looked at here gives us the intuitive understanding. We can also extend this by looking at specific critical values, such as the extreme and central values. Calculating it is also simple. Many tools have simple functions to compute this convolution, so that we can easily tabulate or plot any specific case that we're interested in.

## No comments:

## Post a Comment