Say you know that the average height of an American woman is close to 5 feet, 4 inches (about 1.63 m). Say you were also told that an auditorium in which 500 adult women are standing is a perfectly representative sample of the American population. That is, you can fairly expect that the average height of the women in the auditorium will also be 5' 4".
If you were to choose three people at random to exit the room, would you expect the average, or mean, of their heights to be exactly 5' 4"? Why or why not? What if you chose 10 people instead? Or 100? Furthermore, what would you expect if you repeated the experiment of measuring the heights of three randomly chosen women in the room over and over, and then averaged these averages?
Over time, you might expect the average of these averages, each of which is called x-bar (x̄) or the sample mean, to approach the population mean of 5' 4". And if you used larger samples, you'd expect this convergence of the sample means toward the true (population) mean to happen more quickly. But why?
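The repeated experiment can be sketched in a few lines of Python. This is a minimal simulation, not a measurement: the heights are generated from a normal distribution with a mean of 64 inches (5' 4") and an assumed standard deviation of 2.5 inches, a figure that does not come from the text, and the function names are hypothetical.

```python
import random
import statistics

def make_auditorium(size=500, mean_in=64.0, sd_in=2.5, seed=0):
    """Simulate heights in inches; sd_in is an assumption, not a sourced figure."""
    rng = random.Random(seed)
    return [rng.gauss(mean_in, sd_in) for _ in range(size)]

def average_of_sample_means(heights, n, trials, seed=1):
    """Draw `trials` random samples of size n and average their means (x-bar values)."""
    rng = random.Random(seed)
    sample_means = [statistics.fmean(rng.sample(heights, n)) for _ in range(trials)]
    return statistics.fmean(sample_means)

heights = make_auditorium()
room_mean = statistics.fmean(heights)

# Compare the average of the x-bar values with the room's own mean
# as the number of repeated three-person samples grows.
for trials in (10, 100, 10_000):
    avg = average_of_sample_means(heights, n=3, trials=trials)
    print(f"{trials:>6} samples of 3: average x-bar = {avg:.2f} in (room mean {room_mean:.2f} in)")
```

With only a handful of repetitions the average of the x̄ values can sit noticeably away from the room's mean; with thousands it should settle very close to it.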
The answers to the above questions lie in the statistical realm of sampling distributions. But first, some terminology and definitions are in order.
The population mean is the average over the largest possible group of individuals you are studying, usually an accepted, empirically determined value. Thus if your auditorium contains 500 American women, the entire set of American women is the greater population implied.
p represents a similar concept: a known population proportion, such as "the proportion of dogs worldwide that can run over 15 miles per hour is 0.40 (40 percent)." p̂, called "p-hat," is the proportion observed in a single sample of a given size (e.g., 10 dogs) drawn from the at-large population. Like x̄, it varies from one sample to the next.
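For example, one group of 10 randomly selected dogs might have an average speed of 17.8 MPH, the next 14.3 MPH, the next 12.8 MPH and so on until you have analyzed as many samples as you like. Each of these averages is an x̄; the corresponding p̂ for a group is the fraction of its 10 dogs that run faster than 15 miles per hour.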
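To make the difference between a sample mean and a sample proportion concrete, here is a small sketch that computes both for one sample of 10 dogs. The individual speeds are invented purely for illustration, and `sample_proportion` is a hypothetical helper name.

```python
import statistics

def sample_proportion(speeds_mph, threshold_mph=15.0):
    """Return p-hat: the fraction of the sample running faster than the threshold."""
    if not speeds_mph:
        raise ValueError("sample must not be empty")
    fast = sum(1 for speed in speeds_mph if speed > threshold_mph)
    return fast / len(speeds_mph)

# Hypothetical speeds (MPH) for one sample of 10 dogs; not real data.
sample = [17.8, 14.3, 12.8, 21.0, 9.5, 16.2, 15.0, 18.4, 11.1, 13.7]

x_bar = statistics.fmean(sample)   # sample mean of the speeds
p_hat = sample_proportion(sample)  # share of dogs strictly faster than 15 MPH

print(f"x-bar = {x_bar:.2f} MPH, p-hat = {p_hat:.2f}")
```

Note that a dog running exactly 15 MPH is not counted as "over 15" here; that boundary choice is part of the sketch, not something the text specifies.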
Sampling distributions allow you to judge whether the pool you are taking samples from is truly representative of the greater population. This is because, according to the Central Limit Theorem, as the sample size n rises, the distribution of the x-bar (x̄) values approaches a normal (bell-shaped) distribution centered on the true population mean, and that distribution grows narrower as n increases.
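In symbols, a standard way to write this result is shown below. The symbols μ (population mean) and σ (population standard deviation) are notation introduced here rather than taken from the text, and the formulas assume independent draws from a population that is large relative to the sample.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
E[\bar{x}] = \mu, \qquad
\mathrm{SD}(\bar{x}) = \frac{\sigma}{\sqrt{n}}, \qquad
\bar{x} \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^{2}}{n}\right) \quad \text{for large } n
```

The middle two expressions say that x̄ is centered on the population mean for any n, while its typical distance from that mean shrinks in proportion to the square root of the sample size.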
Back to the women in the auditorium: Over time, you might expect the average of the sample means (the x̄ values) to approach the population mean of 5' 4" no matter how many data points (n) you include in each x-bar. And if you use larger samples, such as 100 people or dogs at a time instead of 10, you'd expect both that each individual x̄ will tend to be closer to the true mean and that fewer instances of x̄ need to be averaged to get close to this true mean.
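The effect of sample size on how tightly the x̄ values cluster can be checked with a short simulation. As a sketch under stated assumptions, the auditorium is again simulated with a mean of 64 inches and an assumed standard deviation of 2.5 inches (not a figure from the text), and `spread_of_sample_means` is a hypothetical helper.

```python
import random
import statistics

def spread_of_sample_means(population, n, trials=2000, seed=0):
    """Standard deviation of x-bar across repeated random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)

# Simulated auditorium of 500 heights in inches (assumed sd of 2.5 in).
pop_rng = random.Random(42)
population = [pop_rng.gauss(64.0, 2.5) for _ in range(500)]

# Larger samples should give x-bar values that vary less from trial to trial.
spreads = {n: spread_of_sample_means(population, n) for n in (3, 10, 100)}
for n, spread in spreads.items():
    print(f"n = {n:>3}: standard deviation of x-bar = {spread:.2f} in")
```

The spread should fall as n goes from 3 to 10 to 100, which is why fewer large samples are needed to pin down the mean.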
For example, if you chose three women, you would not be surprised if their average height was 5' 9" or 5' 1" because a single very tall or very short "outlier" can throw off an average a lot when the number of data points is small.
But if you ran repeated trials of 100 women and saw x-bar values of 5' 8.2", 5' 7.3", and so on, you would have reason to conclude that the group of 500 in the auditorium was not, in fact, a representative sample of American women.
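One way to quantify "reason to conclude" is to ask how many standard errors an observed x̄ lies from the expected mean. The sketch below does this for an x̄ of 5' 8.2" (68.2 inches) with n = 100, assuming a population standard deviation of 2.5 inches; that figure is an assumption for illustration, and the function name is hypothetical.

```python
import math

def z_score_of_sample_mean(x_bar, mu, sigma, n):
    """Distance of a sample mean from mu, measured in standard errors (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error

# 5' 8.2" = 68.2 in; population mean 5' 4" = 64 in; sigma = 2.5 in is assumed.
z = z_score_of_sample_mean(x_bar=68.2, mu=64.0, sigma=2.5, n=100)
print(f"z = {z:.1f} standard errors above the expected mean")
```

Under a normal sampling distribution, values more than about three standard errors from the mean are already rare, so a result this far out would be very hard to attribute to chance.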
You can find the value of x-bar for any sample quickly by referring to a page like the one in the Resources. To compile these values into a sampling distribution, you can use spreadsheet programs such as Microsoft Excel or Google Sheets, which have various prepackaged statistical tools for uses like these.