Z vs. T
However, we don't know the population standard deviation $\sigma$. We can only estimate it from our original sample. Therefore, our new standardized variable is the quotient of two random variables: the centered sample mean, $\bar{x} - \mu$, and the scaled sample standard deviation, $S/\sqrt{n}$. $$T = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$$ Now, how does this affect our distribution?
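As a quick illustration of the formula, here is a minimal sketch that computes T for a single sample. The helper name `t_statistic`, the three data points, and the value of `mu` are made up for the example.

```python
import math
import statistics


def t_statistic(sample, mu):
    """Return (x_bar - mu) / (S / sqrt(n)) for a sample and a population mean mu."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu) / (s / math.sqrt(n))


sample = [1.0, 2.0, 3.0]  # hypothetical data: mean 2, sample stdev 1
t_value = t_statistic(sample, mu=1.5)
print(t_value)  # about 0.866, i.e. 0.5 * sqrt(3)
```

Both the numerator and the denominator change from sample to sample, which is why T does not simply follow the standard normal distribution.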
Shape of the Distribution
Simulation
To see how the sample standard deviation behaves, I simulated 40,000 samples of size 3 from a normal distribution (mean = 1.5, stdev = 0.2) and recorded the standard deviation of each sample:
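    import numpy as np
    from statistics import stdev
    from matplotlib import pyplot as plt

    # Draw 40,000 samples of size 3 and record each sample's standard deviation
    deviations = []
    for i in range(40000):
        deviations.append(stdev(np.random.normal(1.5, 0.2, 3)))

    # Histogram of the sample standard deviations
    plt.hist(deviations, bins=30)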
It produced this distribution:
As you can see, it is skew right. So, the median of this distribution is less than its mean. The mean is 0.1785, and the median is 0.1680.
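The mean and median can be checked with a few lines of code. This is a sketch rather than the exact run behind the numbers above: it uses only the standard library, a fixed seed, and 20,000 samples instead of 40,000 (all of these are my assumptions), so the printed values will differ slightly.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, chosen only for reproducibility

# Standard deviation of each of 20,000 samples of size 3 from N(1.5, 0.2^2)
deviations = [
    statistics.stdev([rng.gauss(1.5, 0.2) for _ in range(3)])
    for _ in range(20000)
]

mean_s = statistics.mean(deviations)
median_s = statistics.median(deviations)
print(f"mean = {mean_s:.4f}, median = {median_s:.4f}")
```

The median coming out below the mean is what we expect from a right-skewed distribution.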
From this, can we guess the shape of our final distribution?
Our numerator is normally distributed, because it’s the mean of normal variables. Our denominator is skew-right, so we expect values below its mean more often than values above it. If the denominator is smaller, then the quotient is larger in magnitude.
So, we should expect to see more extreme values, and fewer values close to 0 (which would require a larger denominator). Therefore, this should look like the normal distribution with heavier tails.
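One way to test this guess is to simulate T directly and count how often it lands far from 0. The sketch below assumes the same setup as before (samples of size 3 from a normal distribution with mean 1.5 and stdev 0.2). The cutoff of 2, the seed, and the 20,000 repetitions are arbitrary choices of mine.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed, chosen only for reproducibility
mu, sigma, n = 1.5, 0.2, 3


def simulate_t():
    """Draw one sample of size n and return its T value."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    return (x_bar - mu) / (s / math.sqrt(n))


t_values = [simulate_t() for _ in range(20000)]

# Share of simulated T values with |T| > 2
tail_t = sum(abs(t) > 2 for t in t_values) / len(t_values)
# Share of a standard normal beyond +/- 2, for comparison
tail_z = 2 * (1 - statistics.NormalDist().cdf(2))

print(f"P(|T| > 2) is about {tail_t:.3f}, P(|Z| > 2) = {tail_z:.3f}")
```

If the tails really are heavier, `tail_t` should come out well above `tail_z`.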
Theory
So, we got a general understanding of the denominator, but what is the actual shape of the distribution? For that, we should look at the $\chi^2$ distribution. I will make a post on this, but for now, there are two theorems:
If $X_1, X_2, \dots, X_n$ are a random sample from a normal distribution with variance $\sigma^2$, and $S^2$ is the sample variance, then $\dfrac{(n-1)S^2}{\sigma^2}$ follows a $\chi^2_{n-1}$ distribution.
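A simulation can't prove this theorem, but it can give a sanity check. A $\chi^2_{k}$ variable has mean $k$ and variance $2k$, so for $n = 3$ the scaled sample variance should have mean close to 2 and variance close to 4. The sketch below checks that, reusing the earlier parameters. The seed and the number of repetitions are my own choices.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed, chosen only for reproducibility
mu, sigma, n = 1.5, 0.2, 3

# (n - 1) * S^2 / sigma^2 for each of 20,000 simulated samples
scaled_variances = [
    (n - 1) * statistics.variance([rng.gauss(mu, sigma) for _ in range(n)]) / sigma**2
    for _ in range(20000)
]

mean_v = statistics.mean(scaled_variances)
var_v = statistics.variance(scaled_variances)
print(f"mean = {mean_v:.3f} (theory: {n - 1}), variance = {var_v:.3f} (theory: {2 * (n - 1)})")
```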
$\chi^2$ distributions are skew-right, explaining the shape of our result. Now, let's analyze T again.
$$T = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$$
$$ = \dfrac{\bar{x}- \mu}{\frac{S}{\sigma}\frac{\sigma}{\sqrt{n}}}$$
$$ = \dfrac{\bar{x}- \mu}{\sqrt{\frac{S^2}{\sigma^2}}\frac{\sigma}{\sqrt{n}}}$$
$$ = \dfrac{\bar{x}- \mu}{\sqrt{\frac{(n-1)S^2}{\sigma^2}/(n-1)}\frac{\sigma}{\sqrt{n}}}$$
$$ = \dfrac{(\bar{x}- \mu)/\frac{\sigma}{\sqrt{n}}}{\sqrt{\frac{(n-1)S^2}{\sigma^2}/(n-1)}}$$
The numerator is a standard normal variable (mean 0, variance 1), because $\bar{x}$ is normal with mean $\mu$ and variance $\dfrac{\sigma^2}{n}$. The denominator is the square root of a chi-squared variable with $n-1$ degrees of freedom divided by $n-1$. For normal samples, $\bar{x}$ and $S^2$ are independent. So, a t-distributed variable with $n-1$ degrees of freedom is a standard normal variable divided by the square root of an independent chi-squared variable over its degrees of freedom.
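The algebra above can be checked numerically on a single sample: computing T directly should give the same number as building it from the standardized mean and the scaled sample variance. The data points below are made up, and `z` and `v` are just my names for the two pieces.

```python
import math
import statistics

mu, sigma = 1.5, 0.2
sample = [1.4, 1.7, 1.3]  # hypothetical data
n = len(sample)

x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

# T computed directly from its definition
t_direct = (x_bar - mu) / (s / math.sqrt(n))

# The two pieces from the last line of the derivation
z = (x_bar - mu) / (sigma / math.sqrt(n))  # standardized sample mean
v = (n - 1) * s**2 / sigma**2              # scaled sample variance
t_from_parts = z / math.sqrt(v / (n - 1))

print(t_direct, t_from_parts)
```

The two values agree up to floating-point rounding, and $\sigma$ cancels out of the ratio, which is why T can be computed without knowing it.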