Say you want to know how your purebred 12-week-old puppy's weight compares to that of dogs of the same age, sex and breed around the world. Because you can track down a database offering a sizable number of dogs meeting the right criteria, you can compare your dog's weight to the population average, and get a sense of how your pooch compares to its peers.
But what if you're trying to assess how a given value compares to the population mean when you don't have a complete picture of the data?
For example, say you want to know how the average height of British women in your town compares to the published figures from the U.K. National Health Service. If you live in the U.S. and can only find 11 women who qualify, how do you interpret the height data in this limited sample?
The first situation calls for something known as a z-score, while the second is analyzed using a t-score. These related ways of handling descriptive statistics provide the same general information about how a given data point compares to a "typical" point. The use of t-score statistics may require a t-score table, calculator or both.
Descriptive Statistics: The Basics
The mean, or average, of a set of data points is the sum of their individual values divided by the total number of points n. A population mean is usually denoted by μ. In a normal distribution, a bilaterally symmetrical ("bell") curve is centered about the mean, and a value is as likely to fall on one side of the mean as on the other. The standard deviation (SD), which measures how widely the points are spread, is denoted by σ.
- In a normal distribution, about 68 percent of points fall within +/− 1 SD of the mean and about 95 percent fall within +/− 2 SD.
The size of the SD in relation to the mean hints at the shape of the curve; a larger SD is associated with a wider distribution and a smaller SD with a narrower distribution.
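As a rough illustration of these definitions, the Python sketch below computes a mean and an SD and then checks the 68 and 95 percent figures against a standard normal curve. The weights list is made-up data, and treating it as a complete population (hence pstdev rather than stdev) is an assumption made only for this example.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical puppy weights in kilograms (illustrative values only).
weights = [4.1, 4.8, 5.0, 5.3, 5.9]

mu = mean(weights)       # population mean, assuming the list is the whole population
sigma = pstdev(weights)  # population SD under the same assumption

# Standard normal curve: mean 0, SD 1.
std_normal = NormalDist(mu=0, sigma=1)
within_1_sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_2_sd = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"mean = {mu:.2f}, SD = {sigma:.2f}")
print(f"share within 1 SD: {within_1_sd:.1%}, within 2 SD: {within_2_sd:.1%}")
```

The two shares come out slightly above 68 and 95 percent, which is why those figures are usually quoted as approximations.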
Z-Scores and T-Scores Defined
A z-score is the number of SDs, written as a decimal, by which a value lies above or below the population mean μ (where Z = 0). Scores one SD above and below the mean are given values of 1.00 and −1.00, those two SD above and below get scores of 2.00 and −2.00, and so on. The formula is:
Z = (x − μ)/σ
Here, x is the value of the point being evaluated, and μ and σ are the population mean and population SD, respectively.
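A z-score is the distance of a value from the population mean, measured in population SDs. A minimal Python sketch of that calculation follows; the function name z_score and the puppy figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def z_score(x, population_mean, population_sd):
    """Return how many population SDs the value x lies from the population mean."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    return (x - population_mean) / population_sd


# Hypothetical example: a 6.2 kg puppy in a population averaging 5.0 kg with an SD of 0.8 kg.
z = z_score(6.2, 5.0, 0.8)
print(f"z = {z:.2f}")  # roughly 1.5 SD above the mean
```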
A t-score is calculated from a similar formula, with important differences:
t = (x̄ − μ)/(s/√n)
Here, x̄ is the sample mean, μ is again the population mean, s is the sample SD, and n is the number of data points.
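A t-score divides the gap between the sample mean and the population mean by the sample SD over the square root of n. The sketch below applies that to a small sample in Python. The 11 heights and the reference mean of 162.0 cm are invented placeholders, not published figures, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def t_score(sample_mean, population_mean, sample_sd, n):
    """Return t = (sample mean - population mean) / (sample SD / sqrt(n))."""
    if n < 2 or sample_sd <= 0:
        raise ValueError("need n >= 2 and a positive sample SD")
    return (sample_mean - population_mean) / (sample_sd / math.sqrt(n))


def t_score_from_data(values, population_mean):
    """Compute the t-score directly from raw sample values."""
    # stdev() is the sample SD (it divides by n - 1), which is what s denotes here.
    return t_score(mean(values), population_mean, stdev(values), len(values))


# Hypothetical heights in centimeters for a sample of 11 women.
heights = [158.0, 160.5, 161.0, 162.5, 163.0, 163.5, 164.0, 165.5, 166.0, 167.5, 169.0]
assumed_population_mean = 162.0  # placeholder, not an official statistic

t = t_score_from_data(heights, assumed_population_mean)
print(f"n = {len(heights)}, t = {t:.3f}")
```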
Why Use T-Score Statistics?
When your sample contains fewer than 30 data points (n < 30), you should use t-score calculations rather than a z-score to analyze your data. As n grows larger, graphs of t-scores come to approximate those of z-scores, because a larger sample is statistically more likely to resemble an "infinitely" large random sample of the population of interest.
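One way to see t-scores approaching z-scores is to compare the cutoff values that enclose the middle 90 percent of each curve as the sample grows. The Python sketch below does this with a homemade numerical approximation (Simpson's rule plus bisection) so that it needs only the standard library. The function names are hypothetical, and in practice a statistics package such as SciPy would normally be used instead.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule, using the curve's symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Two-tailed cutoff: the t value enclosing the central `confidence` share."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the approximate CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z_cutoff = NormalDist().inv_cdf(0.95)  # two-tailed 90 percent cutoff for z
for n in (5, 10, 25, 30, 100, 1000):
    print(f"n = {n:4d}  t cutoff = {t_critical(0.90, n - 1):.3f}  z cutoff = {z_cutoff:.3f}")
```

The t cutoff shrinks toward the z cutoff as n increases, which is the convergence described above.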
The remaining parameter you need to round out this analysis is the confidence interval you will be using. For most "two-tailed" tests, this is either 90 percent or 95 percent.
Say you have a class of 25 university students and their average score on a surprise test of Harry Potter knowledge is 64 percent, with an SD of 15 percent. If the population mean is 60 percent, calculate the t-score, and determine whether it falls within a 90 percent confidence interval.
First, calculate t using the above equation:
t = (64 − 60)/(15/√25) = 4/(15/5) = 4/3 = 1.333.
Now refer to a t-score chart (see the Resources for an example). On these charts, df stands for degrees of freedom, equal to (n − 1). Since n = 25, df = 24.
Reading across the appropriate row to get to the column corresponding to a confidence interval of 90 percent, you see that this value is 1.711. Since 1.333 < 1.711, you conclude that while your class is above average, it is not significantly so by your own definition. (What if you had chosen a confidence interval of 80 percent? Or 70 percent?)
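The same worked example can be checked in a few lines of Python. This is only a sketch: the variable names are arbitrary, and the cutoff of 1.711 is copied from the chart reading above rather than computed.

```python
import math

sample_mean = 64.0      # class average, in percent
population_mean = 60.0  # population average, in percent
sample_sd = 15.0        # sample SD, in percent
n = 25                  # number of students

t = (sample_mean - population_mean) / (sample_sd / math.sqrt(n))
df = n - 1

# Cutoff read from the t-score chart for df = 24 at 90 percent (two-tailed).
critical_90 = 1.711

significant = abs(t) > critical_90
print(f"t = {t:.3f}, df = {df}, significant at 90 percent: {significant}")
# prints: t = 1.333, df = 24, significant at 90 percent: False
```

Replacing critical_90 with the chart value for a different confidence interval is one way to explore the closing question.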