Sample Mean - Intuition
Let’s say we want the mean height of 1 trillion people.
There are 2 methods:
- population mean - measure the height of all 1 trillion people, sum them together then divide by 1 trillion
- sample mean - get a sample of 100 people, measure the height of those 100 people, sum them together then divide by 100
The first method gives the actual population mean 𝜇, since that is the definition of the mean over a population. However, measuring 1 trillion people is impractical.
The second method is much easier: we only need to measure the height of 100 people. However, the sample mean may not accurately reflect the population mean. How well it does so is judged by calculating the sample mean’s:
- expected value - denoted as 𝐄(sample mean) = ?
- variance - denoted as 𝐕𝐚𝐫(sample mean) = ?
Ideally, we want:
- 𝐄(sample mean) = population mean
- 𝐕𝐚𝐫(sample mean) as close to 0 as possible
Not surprisingly, the more observations we take from the population to calculate the sample mean, the better it behaves:
- 𝐄(sample mean) equals the population mean for every sample size
- 𝐕𝐚𝐫(sample mean) gets closer to 0
The sections below go through the mathematics of why this is the case.
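As a rough illustration of the two methods, the sketch below uses a simulated stand-in population of 100,000 heights instead of 1 trillion people. The population (Normal with mean 170 cm and standard deviation 10 cm), the seed, and the variable names are all assumptions made for the example, not real data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical stand-in population: 100,000 simulated heights in cm.
population = [rng.gauss(170.0, 10.0) for _ in range(100_000)]

# Method 1: population mean, which needs every member of the population.
population_mean = statistics.fmean(population)

# Method 2: sample mean, which needs only 100 randomly chosen members.
sample = rng.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"population mean:     {population_mean:.2f}")
print(f"sample mean (n=100): {sample_mean:.2f}")
```

The two numbers should be close but not identical. The gap between them is the sampling error that the expected value and variance of the sample mean describe.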
Sample Mean - Definition / Formula
- sample mean (𝑋̅) = [𝑋₁ + 𝑋₂ + … + 𝑋ₙ] / 𝑛
where:
- each 𝑋ᵢ is a random observation drawn independently from the same population
- 𝑛 is the sample size
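The expected value and variance of 𝑋̅ derived in the following sections only require the 𝑋ᵢ to be independent and identically distributed with mean 𝜇 and finite variance 𝜎². In order to draw conclusions about the distribution of 𝑋̅ (such as treating 𝑋̅ as Normal), AT LEAST 1 of the following cases must occur:
- samples are drawn from a population/distribution that is Normal(unknown mean = 𝜇, unknown variance = 𝜎²)
- sample size 𝑛 is large (no restriction on what distribution the population exhibits, as long as its variance is finite)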
Sample Mean - Expected Value / Mean
𝐄(sample mean 𝑋̅) = 𝜇
The expected value/mean of the sample mean 𝑋̅ is the population mean 𝜇. That is, the mean of 𝑋̅ is the same as the mean of the individual 𝑋ᵢ, whatever the sample size 𝑛.
proof
- 𝐄(sample mean 𝑋̅) = 𝐄([𝑋₁ + 𝑋₂ + … + 𝑋ₙ] / 𝑛) # substitution of sample mean formula
- 𝐄(sample mean 𝑋̅) = (1/𝑛)(𝐄[𝑋₁] + 𝐄[𝑋₂] + … + 𝐄[𝑋ₙ]) # using the linearity of expectation
- 𝐄(sample mean 𝑋̅) = (1/𝑛)(𝜇 + 𝜇 + … + 𝜇) # the 𝑋ᵢ are identically distributed, which means they have the same mean 𝜇
- 𝐄(sample mean 𝑋̅) = (1/𝑛)(𝑛𝜇) # there are 𝑛 𝜇’s
- 𝐄(sample mean 𝑋̅) = 𝜇
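The result can be checked numerically. The sketch below is only a simulation under assumed settings: the population is taken to be Uniform(0, 10), so 𝜇 = 5, and the helper name `sample_mean`, the seed, and the repetition count are chosen for illustration. It averages many sample means for several sample sizes.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

MU = 5.0  # mean of the assumed Uniform(0, 10) population

def sample_mean(n):
    """Draw n independent Uniform(0, 10) values and return their average."""
    return statistics.fmean([rng.uniform(0.0, 10.0) for _ in range(n)])

# For each sample size, average 20,000 independent sample means.
averages = {}
for n in (2, 10, 50):
    averages[n] = statistics.fmean([sample_mean(n) for _ in range(20_000)])
    print(f"n={n:>2}: average of sample means = {averages[n]:.3f} (mu = {MU})")
```

The average stays near 𝜇 even for 𝑛 = 2, which matches the proof: the expected value of 𝑋̅ does not depend on the sample size.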
Sample Mean - Variance
𝐕𝐚𝐫(sample mean 𝑋̅) = 𝜎²/𝑛
Therefore, as the sample size 𝑛 increases, 𝐕𝐚𝐫(sample mean 𝑋̅) goes to 0. This is what we want!
proof
- 𝐕𝐚𝐫(sample mean 𝑋̅) = 𝐕𝐚𝐫([𝑋₁ + 𝑋₂ + … + 𝑋ₙ] / 𝑛) # substitution of sample mean formula
- 𝐕𝐚𝐫(sample mean 𝑋̅) = 𝐕𝐚𝐫((1/𝑛)𝑋₁ + (1/𝑛)𝑋₂ + … + (1/𝑛)𝑋ₙ) # rewrite as a linear combination of the 𝑋ᵢ
- 𝐕𝐚𝐫(sample mean 𝑋̅) = (1/𝑛²)𝐕𝐚𝐫(𝑋₁) + (1/𝑛²)𝐕𝐚𝐫(𝑋₂) + … + (1/𝑛²)𝐕𝐚𝐫(𝑋ₙ) # the 𝑋ᵢ are independent, so the variance of the sum is the sum of the variances, and 𝐕𝐚𝐫(𝑎𝑋) = 𝑎²𝐕𝐚𝐫(𝑋)
- 𝐕𝐚𝐫(sample mean 𝑋̅) = (1/𝑛²)𝜎² + (1/𝑛²)𝜎² + … + (1/𝑛²)𝜎² # the 𝑋ᵢ are identically distributed, which means they have the same variance 𝜎²
- 𝐕𝐚𝐫(sample mean 𝑋̅) = 𝑛𝜎²/𝑛² # there are 𝑛 terms, each equal to 𝜎²/𝑛²
- 𝐕𝐚𝐫(sample mean 𝑋̅) = 𝜎²/𝑛
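A simulation can make the 𝜎²/𝑛 result concrete. The sketch below again assumes a Uniform(0, 10) population, whose variance is 100/12. The helper name `sample_mean`, the seed, and the repetition count are illustrative choices. It compares the observed variance of many sample means with 𝜎²/𝑛.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable

SIGMA2 = 100.0 / 12.0  # variance of the assumed Uniform(0, 10) population

def sample_mean(n):
    """Draw n independent Uniform(0, 10) values and return their average."""
    return statistics.fmean([rng.uniform(0.0, 10.0) for _ in range(n)])

# For each sample size, measure the spread of 20,000 independent sample means.
observed = {}
for n in (4, 16, 64):
    means = [sample_mean(n) for _ in range(20_000)]
    observed[n] = statistics.pvariance(means)
    print(f"n={n:>2}: observed Var = {observed[n]:.4f}, sigma^2/n = {SIGMA2 / n:.4f}")
```

Each fourfold increase in 𝑛 should cut the observed variance to roughly a quarter, in line with the formula.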
Sample Mean - Standard Deviation / Standard Error
- 𝐒𝐄(sample mean 𝑋̅) = 𝐒𝐭𝐝(sample mean 𝑋̅) = √(𝜎²/𝑛)
- 𝐒𝐄ˆ(sample mean 𝑋̅) = √(𝑠²/𝑛)
where:
- 𝐒𝐄(…) - is the standard error of the sample mean 𝑋̅
- 𝐒𝐄ˆ(…) - is the estimated standard error of the sample mean 𝑋̅
- 𝑠 - is the sample standard deviation computed from the sample data, used in place of the unknown 𝜎
proof
- 𝐒𝐄(sample mean 𝑋̅) = √(𝐕𝐚𝐫(sample mean 𝑋̅)) # the standard deviation is the square root of the variance
- 𝐒𝐄(sample mean 𝑋̅) = √(𝜎²/𝑛) # substitution of 𝐕𝐚𝐫(sample mean 𝑋̅) = 𝜎²/𝑛
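In practice 𝜎 is unknown, so the estimated standard error is the one computed from data. The sketch below assumes a simulated sample of 100 heights from a Normal population with mean 170 cm and standard deviation 10 cm. These values are made up for illustration, and the true 𝜎 is known here only because the data are simulated.

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical sample: 100 simulated heights in cm.
sample = [rng.gauss(170.0, 10.0) for _ in range(100)]

n = len(sample)
s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
se_hat = math.sqrt(s**2 / n)      # estimated standard error, sqrt(s^2 / n)
se_true = math.sqrt(10.0**2 / n)  # true standard error, sqrt(sigma^2 / n)

print(f"sample mean:              {statistics.fmean(sample):.2f}")
print(f"estimated standard error: {se_hat:.3f}")
print(f"true standard error:      {se_true:.3f}")
```

`statistics.stdev` uses the 𝑛 − 1 divisor, which is the usual definition of the sample standard deviation 𝑠.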
Sample Mean - Distribution
Sampling Distribution of Sample Mean - Sample Mean Distribution