Sampling Distribution of Sample Means

Suppose you are interested in the average height of the American male. So, you go out and collect a random sample of 20 men, find their heights, and average them. This may give you a pretty good estimate of the true mean for the population -- but it is unlikely to give you the exact population mean.

So you go out and collect another random sample of 20 (probably different) men, find their heights, and average them. You expect to see a sample mean in the ballpark of what you got last time, but it will most likely be a different value. Of course, neither of these means can be trusted as the true mean for the population...

So you go out and collect yet another random sample of 20 (probably different) men, find their heights, and average them, to arrive at a third (and probably different) estimate of the true population mean. Always wary of the resulting average, you do this 1000 (or more) times -- and draw a histogram of the "pile-of-averages" you have accumulated.

The distribution reflected in this histogram -- this "pile-of-averages" -- is what we refer to as the sampling distribution of sample means for samples of size $n$ taken from the population in question (here, $n = 20$).
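To make the "pile-of-averages" concrete, here is a minimal simulation sketch using only Python's standard library. The population is an assumption made purely for illustration: heights are taken to be normally distributed with a mean of 69 inches and a standard deviation of 3 inches, and the function names `simulate_sample_means` and `text_histogram` are hypothetical helpers, not part of any library.

```python
import math
import random
import statistics


def simulate_sample_means(num_samples=1000, n=20, pop_mean=69.0, pop_sd=3.0, seed=0):
    """Draw `num_samples` random samples of size `n` and return their means.

    ASSUMPTION: the population is normal with the given mean and standard
    deviation (in inches). These values are illustrative, not real data.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def text_histogram(values, bin_width=0.25, per_mark=5):
    """Return a crude text histogram: one '#' per `per_mark` values in a bin."""
    counts = {}
    for v in values:
        index = math.floor(v / bin_width)  # integer bin index
        counts[index] = counts.get(index, 0) + 1
    lines = []
    for index in sorted(counts):
        left_edge = index * bin_width
        lines.append(f"{left_edge:6.2f} | " + "#" * (counts[index] // per_mark))
    return "\n".join(lines)


# Each entry in sample_means is one "grain" in the pile-of-averages.
sample_means = simulate_sample_means()
print(text_histogram(sample_means))
print("mean of the sample means:", round(statistics.fmean(sample_means), 2))
print("std. dev. of the sample means:", round(statistics.stdev(sample_means), 2))
```

Under these assumptions, the pile should center near the assumed population mean and be noticeably narrower than the population itself, since averaging 20 heights smooths out individual variation.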

If this distribution were a pile of sand, then each grain of sand in that pile would correspond to the sample mean of a different random sample.

Note that one rarely goes out and takes 1000 different samples, finds 1000 different sample means, and then piles them all up just to see the distribution of sample means (except perhaps for instructional purposes).

Instead, knowledge of how this sampling distribution should behave is used to determine the probability of seeing what we see in one given sample. That probability, in turn, gives us a quantifiable measure of how plausible the assumptions used to compute it are.
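As a rough sketch of that idea, suppose we assume the population of heights is normal with a mean of 69 inches and a standard deviation of 3 inches (illustrative values, not real data). Under that assumption, the sample means for samples of size $n$ are themselves normally distributed around the same mean, with standard deviation $\sigma / \sqrt{n}$. The hypothetical helper below uses this to ask how likely it would be to see a sample mean at least as large as the one observed; the observed value of 70.5 is likewise made up for the example.

```python
import math
from statistics import NormalDist


def prob_sample_mean_at_least(observed_mean, assumed_mean, assumed_sd, n):
    """Probability that a sample mean is >= observed_mean.

    ASSUMPTION: the population is normal with mean `assumed_mean` and
    standard deviation `assumed_sd`, so sample means of size `n` follow
    a normal distribution with standard deviation assumed_sd / sqrt(n).
    """
    standard_error = assumed_sd / math.sqrt(n)
    sampling_dist = NormalDist(mu=assumed_mean, sigma=standard_error)
    return 1.0 - sampling_dist.cdf(observed_mean)


# Hypothetical scenario: one sample of 20 men averages 70.5 inches.
p = prob_sample_mean_at_least(observed_mean=70.5, assumed_mean=69.0,
                              assumed_sd=3.0, n=20)
print(f"probability of a sample mean this large or larger: {p:.4f}")
```

A small probability here suggests that the observed sample would be unusual if the assumed population mean were correct, which is the kind of evidence used to call that assumption into question.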