What is the difference between population and sample standard deviation?

Population standard deviation (σ) divides by N (the total population size), while sample standard deviation (s) divides by n-1, where n is the sample size. The n-1 correction (Bessel's correction) accounts for the fact that dividing by n tends to underestimate the population's variability. Use the sample standard deviation when your data is a subset of a larger population.

What does a high standard deviation mean?

A high standard deviation means the data points are spread far from the mean, indicating high variability. A low standard deviation means data points cluster tightly around the mean. For example, test scores of 70, 75, 80 have low SD, while 30, 75, 95 have high SD.

What is the 68-95-99.7 rule?

For normally distributed data, approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations. This rule helps you quickly estimate how unusual a data point is.

Standard Deviation Formula: How to Calculate with Examples

Updated March 21, 2026 | 10 min read

[Figure: normal distribution bell curve showing standard deviation intervals with 68-95-99.7 rule labels]

What Is Standard Deviation?

Standard deviation is a measure of how spread out a set of data values is from the mean (average). A low standard deviation means the data points are clustered closely around the mean, while a high standard deviation means the data points are spread out over a wider range.

Consider two classes that both have a test average of 75%. Class A has scores of 73, 74, 75, 76, 77: very consistent, with low spread. Class B has scores of 55, 65, 75, 85, 95: the same average but wildly different performances. Treating each class as a full population, the standard deviation of Class A is small (about 1.4), while that of Class B is large (about 14.1). The mean alone does not tell the full story; standard deviation tells you how closely the individual values cluster around that mean.
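The two classes can be checked with a few lines of Python. This is a minimal sketch that assumes each class's five scores are treated as a full population, which is what the values of about 1.4 and 14.1 correspond to; the variable names are illustrative.

```python
from statistics import mean, pstdev

class_a = [73, 74, 75, 76, 77]
class_b = [55, 65, 75, 85, 95]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    # pstdev divides by N, treating the scores as the whole population
    print(f"{name}: mean = {mean(scores):.0f}, SD = {pstdev(scores):.1f}")
```

Both classes report the same mean, but the spread differs by a factor of ten.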

Standard deviation is used in virtually every field that deals with data: science (measurement uncertainty), finance (investment risk), manufacturing (quality control), psychology (test score analysis), and sports (player consistency). It is one of the most important concepts in statistics.

Population vs. Sample Standard Deviation

There are two standard deviation formulas, and choosing the right one depends on whether your data is a population or a sample. A population is the entire set of values you care about (e.g., the heights of every student in your school). A sample is a subset of the population (e.g., the heights of 30 randomly selected students).

Population standard deviation (sigma): sigma = sqrt[(1/N) * sum((xi - mu)^2)], where N is the population size, xi is each data point, and mu is the population mean. This formula divides by N.

Sample standard deviation (s): s = sqrt[(1/(n-1)) * sum((xi - x_bar)^2)], where n is the sample size, xi is each data point, and x_bar is the sample mean. This formula divides by (n-1), not n.

Why the difference? When you use a sample to estimate the population standard deviation, dividing by n tends to underestimate the true spread, because the deviations are measured from the sample mean, which sits closer to the sample's own values than the true population mean does. Dividing by (n-1) compensates for this; the adjustment is called Bessel's correction. In most homework and real-world applications, you are working with samples, so use the (n-1) formula. Use the N formula only when you have data for the entire population.
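The two formulas differ only in the divisor, which a short function can make explicit. The sketch below is one possible way to write it; the function name `std_dev`, its `sample` flag, and the height values are hypothetical and chosen only for illustration.

```python
import math
from statistics import pstdev, stdev


def std_dev(data, sample=True):
    """Return the standard deviation of data.

    sample=True divides by n-1 (Bessel's correction); sample=False divides by N.
    """
    n = len(data)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)
    return math.sqrt(sum_sq / (n - 1 if sample else n))


# Hypothetical heights in cm, used only for illustration
heights = [150, 160, 165, 170, 180]

print(f"population SD: {std_dev(heights, sample=False):.2f}")
print(f"sample SD:     {std_dev(heights, sample=True):.2f}")

# Cross-check against Python's standard library
assert math.isclose(std_dev(heights, sample=False), pstdev(heights))
assert math.isclose(std_dev(heights, sample=True), stdev(heights))
```

The sample result is always the larger of the two, since the same sum of squared deviations is divided by a smaller number.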

Step-by-Step Calculation

Let's calculate the sample standard deviation of the data set: 4, 8, 6, 5, 3, 7, 8, 2.

Step 1 — Find the mean: Add all values and divide by n. Sum = 4 + 8 + 6 + 5 + 3 + 7 + 8 + 2 = 43. Mean = 43 / 8 = 5.375.

Step 2 — Find the deviations: Subtract the mean from each data point. (4 - 5.375) = -1.375. (8 - 5.375) = 2.625. (6 - 5.375) = 0.625. (5 - 5.375) = -0.375. (3 - 5.375) = -2.375. (7 - 5.375) = 1.625. (8 - 5.375) = 2.625. (2 - 5.375) = -3.375.

Step 3 — Square each deviation: 1.890625, 6.890625, 0.390625, 0.140625, 5.640625, 2.640625, 6.890625, 11.390625. Squaring eliminates negative signs and gives more weight to values far from the mean.

Completing the Calculation

Step 4 — Sum the squared deviations: 1.890625 + 6.890625 + 0.390625 + 0.140625 + 5.640625 + 2.640625 + 6.890625 + 11.390625 = 35.875.

Step 5 — Divide by (n-1) for sample variance: 35.875 / 7 = 5.125. This value (5.125) is the sample variance: the sum of the squared deviations divided by (n-1), which is roughly the average squared deviation.

Step 6 — Take the square root: sqrt(5.125) = 2.264. The sample standard deviation is approximately 2.26. This means that data points typically deviate about 2.26 units from the mean of 5.375.

As a rough sanity check, look at how many data points fall within one standard deviation of the mean (5.375 +/- 2.26 = about 3.1 to 7.6). In our data (4, 8, 6, 5, 3, 7, 8, 2), four of the eight values (4, 5, 6, and 7) fall in this range, or 50%. That is below the roughly 68% expected for normally distributed data, which is not surprising for such a small data set, which need not follow a normal distribution.
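Steps 1 through 6 map directly onto code. The following sketch mirrors the hand calculation for the same data set; the variable names are illustrative.

```python
import math

data = [4, 8, 6, 5, 3, 7, 8, 2]
n = len(data)

mean = sum(data) / n                     # Step 1: find the mean
deviations = [x - mean for x in data]    # Step 2: subtract the mean
squared = [d ** 2 for d in deviations]   # Step 3: square each deviation
sum_sq = sum(squared)                    # Step 4: sum the squares
variance = sum_sq / (n - 1)              # Step 5: divide by n-1
sd = math.sqrt(variance)                 # Step 6: take the square root

print(f"mean = {mean}, sum of squares = {sum_sq}")
print(f"sample variance = {variance}, sample SD = {sd:.3f}")
```

The intermediate values match the ones worked out by hand: a mean of 5.375, a sum of squares of 35.875, a variance of 5.125, and a standard deviation of about 2.264.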

Understanding Variance

Variance is the square of the standard deviation. It is the average of the squared deviations from the mean, with (n-1) replacing n in the divisor when the data is a sample. Sample variance = s^2 = sum((xi - x_bar)^2) / (n-1). Population variance = sigma^2 = sum((xi - mu)^2) / N.

Variance and standard deviation measure the same thing — data spread — but in different units. If your data is in centimeters, the variance is in square centimeters (cm^2), which is not intuitive. The standard deviation converts back to the original units by taking the square root, making it easier to interpret.

Variance is mathematically more convenient for some operations (it is additive for independent variables and appears in many statistical formulas), which is why it is important in theory. Standard deviation is more interpretable in practice, which is why it is used more in applied settings.

Both are used in AP Statistics, college statistics courses, and data science. You should be comfortable calculating both and converting between them. If a problem asks for variance, do steps 1-5 above. If it asks for standard deviation, do all six steps. The only difference is whether you take the square root in the last step.
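Converting between the two is a single square or square root. As a quick sketch, Python's standard `statistics` module provides both quantities in sample and population form; the data set is the one from the worked example.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

data = [4, 8, 6, 5, 3, 7, 8, 2]

sample_var = variance(data)   # divides by n-1
sample_sd = stdev(data)       # square root of the sample variance
pop_var = pvariance(data)     # divides by N
pop_sd = pstdev(data)         # square root of the population variance

# Standard deviation is the square root of variance, in both forms
assert math.isclose(sample_sd, math.sqrt(sample_var))
assert math.isclose(pop_sd ** 2, pop_var)

print(f"sample:     variance = {sample_var:.3f}, SD = {sample_sd:.3f}")
print(f"population: variance = {pop_var:.3f}, SD = {pop_sd:.3f}")
```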

The 68-95-99.7 Rule (Empirical Rule)

For data that follows a normal distribution (bell curve), the standard deviation tells you approximately what proportion of data falls within each range. This is the 68-95-99.7 rule, also called the empirical rule.

About 68% of data falls within 1 standard deviation of the mean (mean +/- 1 SD). About 95% falls within 2 standard deviations (mean +/- 2 SD). About 99.7% falls within 3 standard deviations (mean +/- 3 SD). Values beyond 3 standard deviations are extremely rare.

Example: If exam scores are normally distributed with a mean of 75 and a standard deviation of 10, then about 68% of students scored between 65 and 85, about 95% scored between 55 and 95, and about 99.7% scored between 45 and 105. A score of 95 is 2 standard deviations above the mean, so only about 2.5% of students scored higher.

This rule is powerful for quick estimates without calculating exact probabilities. If a data point is more than 2 standard deviations from the mean, it is unusual: only about 5% of values lie that far out, roughly 2.5% in each tail. More than 3 standard deviations is very unusual: only about 0.3% of values, roughly 0.15% in each tail. This concept underlies z-scores, hypothesis testing, and confidence intervals.
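The percentages in the rule are rounded values of the normal distribution's cumulative probabilities. The sketch below recomputes them for the exam example (mean 75, SD 10) using `statistics.NormalDist` from Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=10)

shares = {}
for k in (1, 2, 3):
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    # Probability of landing between low and high
    shares[k] = scores.cdf(high) - scores.cdf(low)
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {shares[k]:.1%}")

# Share of students scoring above 95 (2 SD above the mean)
above_95 = 1 - scores.cdf(95)
print(f"above 95: {above_95:.1%}")
```

The exact figures are close to 68.3%, 95.4%, and 99.7%. The upper tail beyond 2 SD comes out near 2.3%, which the rule of thumb rounds to 2.5%.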

When to Use Standard Deviation

Use standard deviation when you want to describe how spread out your data is. It answers the question: "How typical is the mean?" If the standard deviation is small relative to the mean, the mean is a reliable summary of the data. If it is large, individual data points vary widely and the mean may be misleading.

When comparing groups: if Group A has a mean score of 80 (SD = 5) and Group B has a mean score of 78 (SD = 15), Group A performs well more consistently. Group B's higher variability means some students are doing much better and others much worse than the mean suggests.

Standard deviation is also used to calculate z-scores (z = (x - mean) / SD), which standardize data for comparison. A z-score tells you how many standard deviations a particular value is from the mean. This lets you compare values from different distributions — for example, comparing your performance on two different tests with different scales.
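As a small illustration of the z-score formula, the sketch below compares one student's results on two tests with different scales. The function name and all of the scores, means, and standard deviations are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Hypothetical results on two tests with different scales
z_test_1 = z_score(85, mean=75, sd=10)     # test scored out of 100
z_test_2 = z_score(650, mean=500, sd=100)  # test scored out of 800

print(f"test 1: z = {z_test_1:.2f}")
print(f"test 2: z = {z_test_2:.2f}")
```

Under these assumed numbers, the second result is the stronger one relative to its own distribution (1.5 standard deviations above the mean versus 1.0), even though the raw scores cannot be compared directly.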

For data with outliers, consider using the interquartile range (IQR) instead of standard deviation. Standard deviation is sensitive to extreme values because it squares deviations (making large deviations disproportionately influential). The IQR, which measures the spread of the middle 50% of data, is more robust to outliers.
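To see the difference in sensitivity, the sketch below adds a single extreme value to the data set from the worked example and compares both measures. The outlier of 40 and the helper `iqr` are hypothetical, and the quartiles use the default "exclusive" method of `statistics.quantiles`; other quartile conventions give slightly different numbers.

```python
from statistics import quantiles, stdev


def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = quantiles(data, n=4)  # default method is "exclusive"
    return q3 - q1


clean = [4, 8, 6, 5, 3, 7, 8, 2]
with_outlier = clean + [40]  # hypothetical extreme value

for name, data in (("clean", clean), ("with outlier", with_outlier)):
    print(f"{name}: SD = {stdev(data):.2f}, IQR = {iqr(data):.2f}")
```

With this convention the IQR stays at 4.5 for both lists, while the sample standard deviation rises from about 2.3 to about 11.7.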

Practice Problems

Problem 1: Find the population standard deviation of: 10, 12, 14, 16, 18. Solution: Mean = 14. Deviations: -4, -2, 0, 2, 4. Squared: 16, 4, 0, 4, 16. Sum = 40. Variance = 40/5 = 8. SD = sqrt(8) = 2.83.

Problem 2: Find the sample standard deviation of: 22, 25, 29, 31, 33. Solution: Mean = 28. Deviations: -6, -3, 1, 3, 5. Squared: 36, 9, 1, 9, 25. Sum = 80. Variance = 80/4 = 20. SD = sqrt(20) = 4.47.

Problem 3: Test scores have a mean of 82 and a standard deviation of 6. What range contains 95% of the scores (assuming normal distribution)? Solution: Mean +/- 2 SD = 82 +/- 12 = 70 to 94.

Problem 4: Dataset A has mean 50, SD 3. Dataset B has mean 50, SD 12. Which dataset has more consistent values? Solution: Dataset A. Its smaller standard deviation means values are clustered more closely around the mean.
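The numeric answers to Problems 1 through 3 can be checked with Python's standard library. This is a quick verification sketch with illustrative variable names, not part of the expected written solution.

```python
from statistics import pstdev, stdev

# Problem 1: population standard deviation (divide by N)
p1 = pstdev([10, 12, 14, 16, 18])

# Problem 2: sample standard deviation (divide by n-1)
p2 = stdev([22, 25, 29, 31, 33])

# Problem 3: range covering about 95% of scores (mean +/- 2 SD)
mean, sd = 82, 6
p3 = (mean - 2 * sd, mean + 2 * sd)

print(f"Problem 1: {p1:.2f}")
print(f"Problem 2: {p2:.2f}")
print(f"Problem 3: {p3[0]} to {p3[1]}")
```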

ScanSolve Editorial Team

Our team of educators and AI specialists creates step-by-step guides to help students master every subject.
