How to Find Sample Standard Deviation


Statistical tests such as the t-test depend directly on the concept of the standard deviation. Any student of statistics or science will use standard deviations regularly, and needs to understand what a standard deviation means and how to find one from a set of data. Thankfully, the only thing you need is the original data. The calculations can be tedious when you have a lot of data, and in those cases you should use built-in functions or a spreadsheet to do them automatically. To understand the key concept, however, all you need is a basic example that you can easily work out by hand. At its core, the sample standard deviation uses your sample to estimate how much the quantity you’ve chosen varies across the whole population.

TL;DR (Too Long; Didn't Read)

Using n to mean sample size, μ for the mean of the data, x_i for each individual data point (from i = 1 to i = n), and Σ as a summation sign, the sample variance (s^2) is:

s^2 = \frac{\sum_{i=1}^{n} (x_i - μ)^2}{n - 1}

And the sample standard deviation is:

s = \sqrt{s^2}

Standard Deviation vs. Sample Standard Deviation

Statistics revolves around making estimates for whole populations based on smaller samples from the population, and accounting for any uncertainty in the estimate in the process. Standard deviations quantify the amount of variation in the population you’re studying. If you’re trying to find the average height, you will get a cluster of results around the mean (the average) value, and the standard deviation describes the width of the cluster and the distribution of heights across the population.

The “sample” standard deviation estimates the true standard deviation for the whole population based on a small sample from the population. Most of the time, you won’t be able to sample the whole population in question, so the sample standard deviation is often the right version to use.

Finding the Sample Standard Deviation

You need your results and the number (n) of people in your sample. First, calculate the mean of the results (μ) by adding up all of the individual results and then dividing the total by the number of measurements.

As an example, the heart rates (in beats per minute) of five men and five women are:

71, 83, 63, 70, 75, 69, 62, 75, 66, 68

Which leads to a mean of:

\begin{aligned} μ &= \frac{71 + 83 + 63 + 70 + 75 + 69 + 62 + 75 + 66 + 68}{10} \\ &= \frac{702}{10} \\ &= 70.2 \end{aligned}
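The same mean can be checked in a few lines of Python. This is only a minimal sketch: the variable names (`heart_rates`, `n`, `mean`) are chosen here for illustration and are not part of the original example.

```python
# Heart rates (beats per minute) from the example above.
heart_rates = [71, 83, 63, 70, 75, 69, 62, 75, 66, 68]

n = len(heart_rates)       # sample size
total = sum(heart_rates)   # sum of all measurements
mean = total / n           # the mean, written as μ in the text

print(n, total, mean)      # prints: 10 702 70.2
```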

The next stage is to subtract the mean from each individual measurement, and then square the result. As an example, for the first data point:

(71 - 70.2)^2 = 0.8^2 = 0.64

And for the second:

(83 - 70.2)^2 = 12.8^2 = 163.84

You continue in this fashion through the data, and then add these results up. So for the example data, the sum of these values is:

0.64 + 163.84 + 51.84 + 0.04 + 23.04 + 1.44 + 67.24 + 23.04 + 17.64 + 4.84 = 353.6
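Squaring ten differences by hand is error-prone, so it can help to verify the total with a short script. The sketch below assumes the same ten heart rates; the names `squared_deviations` and `sum_of_squares` are illustrative only.

```python
heart_rates = [71, 83, 63, 70, 75, 69, 62, 75, 66, 68]
mean = sum(heart_rates) / len(heart_rates)

# Subtract the mean from each measurement and square the result.
squared_deviations = [(x - mean) ** 2 for x in heart_rates]

# Add the squared deviations together.
sum_of_squares = sum(squared_deviations)

# Round for display, since floating-point arithmetic leaves tiny residues.
print([round(d, 2) for d in squared_deviations])
print(round(sum_of_squares, 2))
```

Up to floating-point rounding, the first two entries should agree with the 0.64 and 163.84 worked out above, and the total with 353.6.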

The next stage distinguishes between the sample standard deviation and the population standard deviation. For the sample standard deviation, you divide this result by the sample size minus one (n − 1). In our example, n = 10, so n − 1 = 9.

This result gives the sample variance, denoted by s^2, which for the example is:

s^2 = \frac{353.6}{9} = 39.289

The sample standard deviation (​s​) is just the positive square root of this number:

s = \sqrt{39.289} = 6.268

If you were calculating the population standard deviation (σ), the only difference is that you would divide by n rather than n − 1.
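To see how much the choice of divisor matters for this data set, the two versions can be computed side by side. This is a rough sketch that starts from the sum of squared differences found above (353.6); the variable names are assumptions made for illustration.

```python
import math

sum_of_squares = 353.6   # total of the squared differences from the example
n = 10                   # sample size

# Sample version: divide by n - 1.
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Population version: divide by n.
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)

print(f"sample:     variance = {sample_variance:.3f}, sd = {sample_sd:.3f}")
print(f"population: variance = {population_variance:.3f}, sd = {population_sd:.3f}")
```

The sample line should reproduce the 39.289 and 6.268 found by hand, while the population figures come out slightly smaller because the divisor is larger.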

The whole formula for sample standard deviation can be expressed using the summation symbol Σ, with the sum being over the whole sample, and x_i representing the ith result out of n. The sample variance is:

s^2 = \frac{\sum_{i=1}^{n} (x_i - μ)^2}{n - 1}

And the sample standard deviation is simply:

s = \sqrt{s^2}
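The formula translates fairly directly into code. The function below is a minimal sketch, not a definitive implementation: `sample_std_dev` is a hypothetical helper name, and the comparison against `statistics.stdev` from Python's standard library is included only as a sanity check.

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation using the n - 1 divisor (hypothetical helper)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    # Sum of squared differences between each value and the mean.
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / (n - 1))


heart_rates = [71, 83, 63, 70, 75, 69, 62, 75, 66, 68]

by_hand = sample_std_dev(heart_rates)
from_library = statistics.stdev(heart_rates)   # also uses the n - 1 divisor

print(round(by_hand, 3), round(from_library, 3))   # prints: 6.268 6.268
```

For real data sets, the library function is usually the safer choice; the hand-written version is here only to mirror the formula step by step.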

Mean Deviation vs. Standard Deviation

The mean deviation differs slightly from the standard deviation. Instead of squaring the differences between the mean and each value, you instead just take the absolute difference (ignoring any minus signs), and then find the average of those. For the example in the previous section, the first and second data points (71 and 83) give:

x_1 - μ = 71 - 70.2 = 0.8 \\ x_2 - μ = 83 - 70.2 = 12.8

The third data point gives a negative result:

x_3 - μ = 63 - 70.2 = -7.2

But you just remove the minus sign and take this as 7.2.

The sum of all of these absolute differences, divided by n, gives the mean deviation. In the example:

\begin{aligned} &\frac{0.8 + 12.8 + 7.2 + 0.2 + 4.8 + 1.2 + 8.2 + 4.8 + 4.2 + 2.2}{10} \\ &= \frac{46.4}{10} \\ &= 4.64 \end{aligned}

This differs substantially from the sample standard deviation of 6.268 calculated before, because the mean deviation doesn’t involve squares and roots, so large differences from the mean are not given extra weight.
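The mean deviation is just as easy to check in code. This sketch again assumes the ten heart rates from the earlier example, and the names `absolute_deviations` and `mean_deviation` are illustrative.

```python
heart_rates = [71, 83, 63, 70, 75, 69, 62, 75, 66, 68]
n = len(heart_rates)
mean = sum(heart_rates) / n

# Take the absolute difference from the mean, which drops any minus signs.
absolute_deviations = [abs(x - mean) for x in heart_rates]

# Average the absolute differences (dividing by n, not n - 1).
mean_deviation = sum(absolute_deviations) / n

print(round(mean_deviation, 2))   # prints: 4.64
```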
