Tools of the Trade

As we work our way through some entry-level statistics, it will be useful to keep track of the various "tools" we have at our disposal. I'll try my best to keep these in the order they appear, starting with some "axioms" we'll use to get started. I don't know if we're going to use 100% of the starting axioms, but I'm going to include them all anyway. Check back periodically for additions as we discover them.

1 Beginning Equations

1.1 Wallis Product

This is an odd little identity that I thought was nothing more than a fun curiosity until I started researching more for these posts. As it turns out, it's a very valuable tool that was discovered by John Wallis in 1656. It says: $$ \prod _{n=1}^{\infty} \frac{2n}{2n-1}\cdot \frac{2n}{2n+1} = \left(\frac{2}{1} \cdot \frac{2}{3} \right) \left(\frac{4}{3} \cdot \frac{4}{5} \right) \left(\frac{6}{5} \cdot \frac{6}{7} \right) \cdots = \frac{\pi}{2} $$ It's definitely a little surprising to see a product of rational numbers equal $\frac{\pi}{2}$. It turns out that with some clever algebra, the product can be rewritten like so: $$ \lim_{n \rightarrow \infty} \frac{2^{4n}(n!)^{4}}{((2n)!)^{2}(2n+1)} = \frac{\pi}{2} $$ This might look like absolute useless garbage, but it turns out it will be useful to state it this way in certain instances. A proof of this formula can be found in the source linked in section 1.6. Also, for what it's worth, I've done a little poking around and it turns out that a lot of seemingly random $\pi$ terms appearing in formulas and identities can be explained with this formula, which brings us to our second tool:
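If you want to convince yourself the two forms really are the same thing, here's a quick numerical sketch. The first function computes the partial Wallis product out to $N$ terms; the second evaluates the factorial form directly (the function names are just my own):

```python
import math

def wallis_partial(N):
    """Partial Wallis product: prod over n=1..N of (2n)/(2n-1) * (2n)/(2n+1)."""
    prod = 1.0
    for n in range(1, N + 1):
        prod *= (2 * n) / (2 * n - 1) * (2 * n) / (2 * n + 1)
    return prod

def wallis_factorial(n):
    """Equivalent factorial form: 2^(4n) (n!)^4 / ((2n)!)^2 / (2n+1)."""
    return 2 ** (4 * n) * math.factorial(n) ** 4 / (
        math.factorial(2 * n) ** 2 * (2 * n + 1))

# Both should creep up toward pi/2 ~ 1.5708 as n grows.
print(wallis_partial(100_000), wallis_factorial(200), math.pi / 2)
```

The convergence is slow (the error shrinks roughly like $1/N$), which is part of why the identity is more useful as an algebraic tool than as a way to actually compute $\pi$.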

1.2 Stirling’s Approximation

Stirling's Approximation is a very useful formula that allows us to approximate factorials using more elementary operations. It is as follows: $$ n! \approx \sqrt{2\pi n}\left( \frac{n}{e} \right)^{n} $$ It might not be immediately obvious why we would want to approximate $n!$ when frankly it's not that difficult to work with, but at the very minimum the right-hand side doesn't require $n$ to be an integer, which can help us extend the factorial beyond integers. (Note there is an exact function that does this, known as the gamma function, but we're going to stick with this for right now.) The source at the bottom contains an excellent proof of the previous two formulas that only requires some calc II as a prerequisite. It uses integration by parts to derive the Wallis Product, and then uses some cleverly defined series to derive Stirling's Approximation. And yes, the $\pi$ in Stirling's Approximation comes from the Wallis Product. Many other derivations of Stirling's Approximation that I found rely on the Euler-Maclaurin formula. There's nothing wrong with this (it's an extremely powerful tool for approximating series using integrals and vice versa), but I believe I've read that Stirling would have used the Wallis Product to derive it, so I prefer this proof. (Also, full disclosure: I don't yet have a great grasp of the Euler-Maclaurin formula.)
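It's worth seeing how good the approximation actually is. A short sketch comparing it against the exact factorial (the relative error shrinks as $n$ grows, roughly like $\frac{1}{12n}$):

```python
import math

def stirling(n):
    """Stirling's approximation: sqrt(2*pi*n) * (n/e)^n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

for n in (1, 5, 10, 20):
    exact = math.factorial(n)
    approx = stirling(n)
    rel_err = abs(exact - approx) / exact
    print(f"n={n:>2}  exact={exact}  approx={approx:.2f}  rel_err={rel_err:.4f}")
```

Even at $n = 10$ the relative error is under one percent, which is why the approximation shows up so often in derivations where the exact factorial would be unwieldy.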

1.3 Binomial Distribution

Let's define a random variable $X$ that represents the number of successes in a given number of random trials. If we are running $n$ trials, each with a probability of success $p$, the likelihood of getting exactly $k$ successes is: $$ P(X = k) = {n \choose k}p^{k}(1-p)^{n-k} $$ Feel free to google a proof of this one if you like. They're all pretty similar.
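A quick sanity check of the formula: compute the PMF directly and compare it with a Monte Carlo simulation of the trials (the choice of $n = 10$, $p = 0.3$, $k = 3$ here is just an arbitrary example):

```python
import math
import random

def binom_pmf(n, k, p):
    """P(X = k) = C(n, k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

random.seed(0)
n, p, k = 10, 0.3, 3
trials = 100_000

# Simulate `trials` runs of n Bernoulli(p) trials, count runs with exactly k successes.
hits = sum(1 for _ in range(trials)
           if sum(random.random() < p for _ in range(n)) == k)

print(binom_pmf(n, k, p), hits / trials)
```

The two numbers should agree to a couple of decimal places, and summing the PMF over all $k$ from $0$ to $n$ gives exactly 1, as any valid distribution must.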

1.4 Linearity of Expectation

If we have two random variables, $X_{1}$ and $X_{2}$, we can state the following: $$ E(X_{1}+ X_{2}) = E(X_{1}) + E(X_{2}) $$ This might seem kind of obvious, but it's a very useful result. There are absolutely no conditions on the distributions of $X_{1}$ and $X_{2}$; they don't even need to be independent.
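To drive home that no independence is needed, here's a sketch with two deliberately dependent variables: $X_{2}$ is completely determined by $X_{1}$, yet the sample means still add up exactly (the die example is just for illustration):

```python
import random

random.seed(1)
N = 200_000
sx1 = sx2 = ssum = 0

for _ in range(N):
    x1 = random.randint(1, 6)  # a fair die roll
    x2 = 7 - x1                # completely determined by x1, so NOT independent
    sx1 += x1
    sx2 += x2
    ssum += x1 + x2

# Sample versions of E(X1), E(X2), and E(X1 + X2)
print(sx1 / N, sx2 / N, ssum / N)
```

Here $X_{1} + X_{2} = 7$ every single time, so $E(X_{1}+X_{2}) = 7 = 3.5 + 3.5 = E(X_{1}) + E(X_{2})$, despite the two variables being perfectly correlated.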

1.5 Properties of Mean and Variance

Here are some useful properties of the mean and variance of random variables that we'll use quite a bit. Note that I'll use "the expected value of $X$", or $E(X)$, and the mean, or $\mu$, interchangeably. I'm not sure if that's correct, but it's a habit I can't seem to break. Now, let's define a random variable $X$. If $X$ is discrete and can take on all values in some set $K$ with probability mass function $P$, we have the following expression for the mean: $$ E(X) = \mu = \sum_{k \in K} k\cdot P(X=k) $$

And if $X$ is continuous for all values between $a$ and $b$ with probability density function $f$, we have the following: $$ E(X) = \mu = \int_{a}^{b} x f(x)dx $$ It shouldn't be too much of a surprise that the two formulas look kind of similar. Anyway, using the two definitions above we can derive the following expression for the expected value of $X$ multiplied by some constant $a$: $$ E(aX) = aE(X) $$

Moving on from the mean, we define the variance of $X$, or $var(X)$, as follows: $$ var(X) = \sigma^{2} = E((X-\mu)^{2}) = E(X^{2}) - (E(X))^{2} $$ The last form tends to be a slightly more convenient way to calculate variance for arbitrary random variables. Note that using the above equation, we have the following property for any constant $a$: $$ var(aX) = a^{2}var(X) $$
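All of these properties can be checked directly on a small discrete example. Here's a sketch using a fair six-sided die, computing $E(X)$ and $var(X)$ from the definitions and then verifying the scaling rules for an arbitrary constant (I picked $a = 3$):

```python
# Fair six-sided die: each value 1..6 with probability 1/6
vals = range(1, 7)
pmf = {k: 1 / 6 for k in vals}

mean = sum(k * pmf[k] for k in vals)         # E(X) from the discrete definition
second = sum(k * k * pmf[k] for k in vals)   # E(X^2)
var = second - mean ** 2                     # var(X) = E(X^2) - (E(X))^2

a = 3
mean_aX = sum(a * k * pmf[k] for k in vals)              # E(aX), from the definition
second_aX = sum((a * k) ** 2 * pmf[k] for k in vals)     # E((aX)^2)
var_aX = second_aX - mean_aX ** 2                        # var(aX)

# Expect: E(aX) = a*E(X) and var(aX) = a^2 * var(X)
print(mean, var, mean_aX, var_aX)
```

For the die, $E(X) = 3.5$ and $var(X) = \frac{35}{12} \approx 2.917$, so scaling by $a = 3$ gives a mean of $10.5$ and a variance of $26.25$, exactly as the two properties predict.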

1.6 Source for Stirling and Wallis proof
