While it is often impossible to sample an entire population of organisms, you can make valid scientific arguments about a population by sampling a subset. For those arguments to hold up, you have to sample enough organisms that your estimates are statistically reliable. A little bit of critical thinking about the questions you are asking and the answers you hope to get can help guide you in choosing an appropriate number of samples.
Estimated Population Size
Defining your population will help you estimate the population size. For example, if you are studying a single flock of ducks, then your population would consist of all the ducks in that flock. If, however, you are studying all of the ducks on a particular lake, then your population size would need to reflect all of the ducks in all of the flocks on the lake. Population sizes of wild organisms are often unknown and sometimes unknowable, so it is acceptable to hazard an educated guess about the total population size. If the population is large, then this number will not have a strong influence on the statistical calculation of the sample size needed.
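To see how little a large population size matters, here is a minimal Python sketch. It assumes the standard finite population correction, which this article does not otherwise use, together with a 95 percent confidence level (z-score of 1.96), a 5 percent margin of error and an expected proportion of 0.5, the values worked through later in the article. The function names and the three population sizes are illustrative choices, not part of the original method.

```python
import math


def base_sample_size(z: float = 1.96, p: float = 0.5, margin_of_error: float = 0.05) -> float:
    """Unrounded sample size for a very large population."""
    return z**2 * p * (1 - p) / margin_of_error**2


def adjusted_sample_size(population_size: int) -> int:
    """Apply the finite population correction (an assumption of this sketch)."""
    n0 = base_sample_size()
    adjusted = n0 / (1 + (n0 - 1) / population_size)
    # Round up, because you cannot measure a fraction of a duck.
    return math.ceil(adjusted)


# Hypothetical guesses at the total number of ducks.
for population_size in (500, 10_000, 1_000_000):
    print(population_size, adjusted_sample_size(population_size))
```

Under these assumptions the required sample is 218 ducks for a population of 500, 370 for a population of 10,000 and 385 for a population of one million, so the accuracy of your guess matters mostly when the population is small.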
Margin of Error
The amount of error you are willing to accept in your estimate is called the margin of error. It is the distance above and below your sample estimate within which you expect the true population value to fall, and it is usually expressed as a percentage, such as plus or minus 5 percent. The margin of error is related to, but not the same as, the standard deviation, which measures how spread out your numbers are around your sample mean. Let's say that you are measuring the wingspans of the duck population described above and you find a mean wingspan of 24 inches. To calculate the standard deviation, you will need to determine how different each measurement is from the mean, square each of those differences, add them together, divide by the number of samples and then take the square root of the result. If your standard deviation is 6 inches and the wingspans are roughly normally distributed, then about 68 percent of the ducks in your sample will have wingspans between 18 (= 24 - 6) and 30 (= 24 + 6) inches, and about 95 percent will fall within two standard deviations of the mean, between 12 and 36 inches.
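The standard deviation steps described above can be sketched in Python as follows. The wingspan values are made up for illustration and were chosen so that the mean is 24 inches and the standard deviation is 6 inches, matching the figures in the example. The sketch divides by the number of samples, as the text does.

```python
import math

# Hypothetical wingspan measurements, in inches.
wingspans = [15, 21, 24, 27, 33]

# Step 1: find the sample mean.
mean = sum(wingspans) / len(wingspans)

# Step 2: square the difference between each measurement and the mean.
squared_differences = [(w - mean) ** 2 for w in wingspans]

# Step 3: add the squared differences and divide by the number of samples.
variance = sum(squared_differences) / len(wingspans)

# Step 4: take the square root of the result.
standard_deviation = math.sqrt(variance)

print(mean, standard_deviation)
```

For this made-up data the script prints a mean of 24.0 and a standard deviation of 6.0. Python's built-in `statistics.pstdev` performs the same calculation, while `statistics.stdev` divides by one less than the number of samples, which is the usual choice when estimating a population's spread from a sample.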
Confidence Level
The confidence level tells you how sure you can be that the true population value lies within your margin of error; the range defined by that margin is called the confidence interval. This is another value that you determine ahead of time, and in turn it will help determine how rigorously you will need to sample your population. Researchers typically choose confidence levels of 90, 95 or 99 percent. For example, suppose you choose a 95 percent confidence level and a 5 percent margin of error, and you find that 90 percent of the ducks you measure have a wingspan of at least 24 inches. You can then be 95 percent confident that between 85 and 95 percent of the ducks in the whole population have a wingspan of at least 24 inches. Your confidence level corresponds to a z-score, which you can look up in statistical tables. The z-score for our 95 percent confidence level is equal to 1.96.
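Instead of looking the z-score up in a table, you can compute it. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) and assumes a two-sided interval, which is the usual convention behind the 1.96 figure. The function name `z_score` is an illustrative choice.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided z-score for a confidence value given as a fraction, e.g. 0.95."""
    # Split the leftover probability evenly between the two tails.
    tail = (1 - confidence) / 2
    # Invert the standard normal cumulative distribution function.
    return NormalDist().inv_cdf(1 - tail)


for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%} -> z = {z_score(confidence):.3f}")
```

The three values printed are approximately 1.645, 1.960 and 2.576, which match the entries you would find in a standard z-table.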
The last value you need is p, the proportion of the population that you expect to have the trait you are measuring. When you have no prior estimate of this proportion, assume that p is equal to 0.5, because that value makes p * (1 - p) as large as possible and therefore gives a conservative sample size that ensures you are sampling a representative portion of the population. With a 5 percent margin of error (ME) and a z-score (z) of 1.96, our formula for sample size translates from: sample size = (z^2 * p * (1 - p)) / ME^2 to sample size = (1.96^2 * 0.5 * (1 - 0.5)) / 0.05^2. Working through the equation, we move to (3.8416 * 0.25) / 0.0025 = 0.9604 / 0.0025 = 384.16. Because you cannot measure a fraction of a duck, round up. Since you are unsure of the size of your duck population, you should measure 385 ducks in order to estimate a proportion, such as the share of ducks with a wingspan of at least 24 inches, to within 5 percentage points at a 95 percent confidence level.
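The same arithmetic can be checked with a few lines of Python. This is a minimal sketch of the formula above; the function name `sample_size` and its parameter names are illustrative, and the result is rounded up to the next whole duck.

```python
import math


def sample_size(z: float, p: float, margin_of_error: float) -> int:
    """Sample size for estimating a proportion in a large population."""
    unrounded = z**2 * p * (1 - p) / margin_of_error**2
    # Always round up so the sample is at least as large as the formula requires.
    return math.ceil(unrounded)


# Values from the worked example: z = 1.96, p = 0.5, ME = 0.05.
print(sample_size(z=1.96, p=0.5, margin_of_error=0.05))
```

With the values from the worked example the function returns 385. Tightening the margin of error to 0.03 while keeping the other values raises the result to 1,068, which shows how quickly the required sample grows as you demand more precision.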