Probability is a measure of how likely something is to happen (or not happen). It is usually expressed as a ratio: the number of ways an event can happen divided by the total number of possible outcomes. Think about throwing a die: The number one has a one in six chance of coming up on any given throw. Reliability, statistically speaking, just means consistency. If you measure something five times and come up with estimates that are fairly close together, your estimate can be considered reliable. How reliability is calculated depends on how many measurements--and measurers--there are.
Define "success" for the event of interest. Say we are interested in knowing the probability of rolling a four on a die. Think about each roll of the die as a trial, in which we either "succeed" (roll a four) or "fail" (roll any other number). On each die, there is one "success" face and five "failure" faces. The number of "success" faces (one) will become your numerator in the final calculation.
Determine the total number of possible outcomes for the event of interest. Using the example of tossing a die, the total number of outcomes is six, because there are six different numbers on the die. This will become your denominator in the final calculation.
Divide the number of successful outcomes by the total number of possible outcomes. In our die example, the probability would be 1/6 (one successful outcome out of six possible outcomes for each roll of the die).
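The three steps above can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for the example, and it assumes a fair six-sided die where every face is equally likely.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each face is equally likely.
faces = [1, 2, 3, 4, 5, 6]

# "Success" is rolling a four; count the faces that qualify.
successes = [face for face in faces if face == 4]

# Numerator: number of success faces. Denominator: total possible outcomes.
p_four = Fraction(len(successes), len(faces))

print(p_four)  # 1/6
```

Using Fraction keeps the answer as an exact ratio instead of a rounded decimal.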
Calculate the probability of two independent events both happening by multiplying the individual probabilities. In our die example, the probability of rolling a four and then rolling a six on the next roll is the product of the individual probabilities: (1/6)x(1/6)=(1/36). This works because the result of the first roll has no effect on the second.
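One way to check the multiplication is to list every possible pair of rolls and count the pairs that match. The sketch below does both; it assumes a fair die and independent rolls, and the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumption: a fair die, with the two rolls independent of each other.
p_four = Fraction(1, 6)
p_six = Fraction(1, 6)

# Multiplication rule: a four on the first roll AND a six on the second.
p_four_then_six = p_four * p_six

# Cross-check by listing all 36 ordered pairs of rolls.
all_pairs = list(product(range(1, 7), repeat=2))
matching = [pair for pair in all_pairs if pair == (4, 6)]
p_enumerated = Fraction(len(matching), len(all_pairs))

print(p_four_then_six, p_enumerated)  # 1/36 1/36
```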
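Calculate the probability of either of two events happening by adding the individual probabilities, provided the events cannot both happen at once. In our die example, the probability of rolling a four or rolling a six on a single roll would be (1/6)+(1/6)=(2/6), which simplifies to 1/3.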
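The addition can be checked the same way, by counting the faces that qualify. This is a sketch under the assumption of a fair die; a single roll cannot show both a four and a six, which is what makes simple addition valid here.

```python
from fractions import Fraction

# Assumption: a fair die. A four and a six cannot both appear on one roll.
p_four = Fraction(1, 6)
p_six = Fraction(1, 6)

# Addition rule: a four OR a six on a single roll.
p_four_or_six = p_four + p_six

# Cross-check by counting the qualifying faces directly.
faces = range(1, 7)
qualifying = [face for face in faces if face in (4, 6)]
p_enumerated = Fraction(len(qualifying), len(faces))

print(p_four_or_six)  # 1/3 (Fraction reduces 2/6 automatically)
```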
Calculating Reliability of Multiple Measurements
Evaluate the change in the mean. If we have a group of five people and weigh each person twice, we end up with two group estimates of weight (the average or "mean"). Compare the two averages to determine whether they are reasonably consistent or whether the measurements differ substantially. This is done with a statistical test called a t-test; because the same people are weighed both times, the paired version of the test is the usual choice.
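As a rough illustration, a paired t statistic can be computed by hand from the difference between each person's two weighings. The weights below are invented for the example, and the cutoff of 2.776 is the standard two-sided 5 percent critical value for four degrees of freedom (five people minus one).

```python
import math
import statistics

# Hypothetical data: five people, each weighed twice (kilograms).
first_weighing = [70.2, 81.5, 65.0, 90.3, 77.8]
second_weighing = [70.6, 81.1, 65.4, 90.0, 78.3]

# Paired test: work with each person's difference between the two weighings.
differences = [b - a for a, b in zip(first_weighing, second_weighing)]
n = len(differences)

mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation
standard_error = sd_diff / math.sqrt(n)

# t statistic: how many standard errors the mean difference is from zero.
t_statistic = mean_diff / standard_error

# Two-sided 5 percent critical value for n - 1 = 4 degrees of freedom.
critical_value = 2.776
means_differ = abs(t_statistic) > critical_value

print(f"mean difference = {mean_diff:.2f}, t = {t_statistic:.2f}")
print("means differ substantially" if means_differ else "means are consistent")
```

With these made-up numbers the t statistic is well below the cutoff, so the two sets of weighings would be treated as consistent.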
Calculate the typical expected error, measured by the standard deviation. If we measured the weight of one person 100 times, some measurements would be very close to the true weight and others would be further away. This spread of measurements reflects the variation expected from random chance, and the standard deviation summarizes how large that spread typically is. For roughly bell-shaped (normal) measurement error, about two-thirds of measurements fall within one standard deviation of the mean and about 95 percent fall within two, so a measurement far outside that range is more likely to be due to something other than random chance.
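A short sketch of this step, using invented readings for one person: it computes the mean and standard deviation, then flags any reading more than two standard deviations from the mean. The two-standard-deviation cutoff is a common rule of thumb chosen for the example, not a fixed rule.

```python
import statistics

# Hypothetical data: ten repeated weighings of the same person (kilograms).
readings = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 70.0, 72.5, 69.7, 70.1]

mean_weight = statistics.mean(readings)
sd_weight = statistics.stdev(readings)  # sample standard deviation

# Rule of thumb (an assumption for this example): flag readings that lie
# more than two standard deviations away from the mean.
cutoff = 2 * sd_weight
unusual = [r for r in readings if abs(r - mean_weight) > cutoff]

print(f"mean = {mean_weight:.2f}, standard deviation = {sd_weight:.2f}")
print("readings worth a second look:", unusual)
```

Here only the 72.5 reading is flagged, which might prompt a check of the scale or of how that reading was taken.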
Calculate the correlation between two sets of measurements. In our weight example, the two sets of measurements could range from having no linear relationship at all (correlation of zero) to rising and falling together perfectly (correlation of one). Evaluating how closely correlated two sets of measurements are is important in determining consistency of measurements. High correlation suggests high reliability, although correlation alone does not detect a constant shift between the two sets, which is why the change in the mean is evaluated as well. Think about the variability that might be introduced by using different scales each time or having different people reading the scales. In experiments and statistical testing, it's important to identify how much variability is due to random chance and how much is due to something we did differently in our measuring.
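The correlation step can be sketched with a small hand-written Pearson correlation function. The function name and the weights are made up for the example. The last part shows that a scale reading a constant 2 kilograms too high still gives a correlation of one, so correlation by itself would miss that kind of error.

```python
import math
import statistics

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical data: five people, each weighed twice (kilograms).
first_weighing = [70.2, 81.5, 65.0, 90.3, 77.8]
second_weighing = [70.6, 81.1, 65.4, 90.0, 78.3]

r = pearson_correlation(first_weighing, second_weighing)
print(f"correlation between weighings = {r:.3f}")

# A scale that reads a constant 2 kg high still correlates perfectly.
shifted = [w + 2.0 for w in first_weighing]
r_shifted = pearson_correlation(first_weighing, shifted)
print(f"correlation with shifted scale = {r_shifted:.3f}")
```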