How to Calculate Variance in Statistics

by Gerald Hanks; Updated September 26, 2017

One of the most basic concepts in statistics is the average, or arithmetic mean, of a set of numbers. The mean signifies a central value for the data set. The variance of a data set measures how far the elements of that data set are spread out from the mean. Data sets in which the numbers are all close to the mean will have a low variance. Those sets in which the numbers are much higher or lower than the mean will have a high variance.

Calculate Mean of the Data Set

Since the variance measures the amount of separation from the mean, the first step in finding the variance of a data set is to find its mean. For instance, a store calculates its daily revenues for seven days:

Day 1: $62,000

Day 2: $64,800

Day 3: $62,600

Day 4: $69,200

Day 5: $66,000

Day 6: $63,900

Day 7: $69,400

The mean of the store's daily revenues for the week is:

(62000+64800+62600+69200+66000+63900+69400)/7 = 457900/7 = $65,414.29
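
The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation above; the variable names (daily_revenues, total, mean_revenue) are chosen for this example and do not come from any particular tool.

```python
# Daily revenues in dollars for Day 1 through Day 7, taken from the example above.
daily_revenues = [62000, 64800, 62600, 69200, 66000, 63900, 69400]

# The arithmetic mean is the sum of the values divided by how many values there are.
total = sum(daily_revenues)
mean_revenue = total / len(daily_revenues)

print(f"Total: ${total:,}")            # Total: $457,900
print(f"Mean:  ${mean_revenue:,.2f}")  # Mean:  $65,414.29
```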

Calculate Squared Differences

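The next step involves calculating the difference between each element in the data set and the mean. Some elements will be higher than the mean and some will be lower, so the positive and negative differences would cancel each other out if they were simply added together. To prevent this, the variance calculation uses the square of each difference, which is never negative.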

Day 1 Sales - Mean Sales: $62,000 - $65,414.29 = -$3,414.29; (-3,414.29)^2 = 11,657,346.94

Day 2 Sales - Mean Sales: $64,800 - $65,414.29 = -$614.29; (-614.29)^2 = 377,346.94

Day 3 Sales - Mean Sales: $62,600 - $65,414.29 = -$2,814.29; (-2,814.29)^2 = 7,920,204.08

Day 4 Sales - Mean Sales: $69,200 - $65,414.29 = +$3,785.71; (+3,785.71)^2 = 14,331,632.65

Day 5 Sales - Mean Sales: $66,000 - $65,414.29 = +$585.71; (+585.71)^2 = 343,061.22

Day 6 Sales - Mean Sales: $63,900 - $65,414.29 = -$1,514.29; (-1,514.29)^2 = 2,293,061.22

Day 7 Sales - Mean Sales: $69,400 - $65,414.29 = +$3,985.71; (+3,985.71)^2 = 15,885,918.37

NOTE: The squared differences are measured in squared dollars, not dollars, which is why they carry no dollar sign. These numbers are used in the next step to calculate the variance.
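
As a rough sketch, the seven squared differences can be reproduced in Python as follows. The list names (differences, squared_differences) are illustrative assumptions. Because the code keeps the mean at full precision instead of rounding it to $65,414.29 first, the printed squares should agree with the figures above to two decimal places.

```python
# Daily revenues in dollars for Day 1 through Day 7, taken from the example above.
daily_revenues = [62000, 64800, 62600, 69200, 66000, 63900, 69400]
mean_revenue = sum(daily_revenues) / len(daily_revenues)

# Difference from the mean for each day, then the square of that difference.
differences = [revenue - mean_revenue for revenue in daily_revenues]
squared_differences = [diff ** 2 for diff in differences]

for day, (diff, squared) in enumerate(zip(differences, squared_differences), start=1):
    print(f"Day {day}: difference = {diff:+,.2f}, squared = {squared:,.2f}")

# The raw differences add up to zero (apart from floating-point rounding),
# so they cannot measure spread on their own; the squares can.
raw_sum_is_zero = abs(sum(differences)) < 1e-6
print(f"Raw differences cancel out: {raw_sum_is_zero}")
```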

Variance and Standard Deviation

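The variance is defined as the mean of the squared differences. Dividing by the number of data points, as in this example, gives the population variance. When the data are only a sample drawn from a larger population, the sum is usually divided by one less than the number of data points instead.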

11,657,346.94 + 377,346.94 + 7,920,204.08 + 14,331,632.65 + 343,061.22 + 2,293,061.22 + 15,885,918.37 = 52,808,571.43

52,808,571.43/7 = 7,544,081.63
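
The variance calculation can be sketched in Python as shown below, first by hand and then with the standard library's statistics module. The variable names are illustrative. Note that statistics.pvariance divides by the number of data points, as this article does, while statistics.variance divides by one less than that and so returns a larger figure for the same data.

```python
import statistics

# Daily revenues in dollars for Day 1 through Day 7, taken from the example above.
daily_revenues = [62000, 64800, 62600, 69200, 66000, 63900, 69400]
mean_revenue = sum(daily_revenues) / len(daily_revenues)

# Manual calculation: add up the squared differences, then divide by the count.
sum_of_squares = sum((revenue - mean_revenue) ** 2 for revenue in daily_revenues)
variance = sum_of_squares / len(daily_revenues)

# Library equivalents: pvariance divides by n, variance divides by n - 1.
population_variance = statistics.pvariance(daily_revenues)
sample_variance = statistics.variance(daily_revenues)

print(f"Sum of squared differences: {sum_of_squares:,.2f}")
print(f"Variance (divide by n):     {variance:,.2f}")
print(f"statistics.pvariance:       {population_variance:,.2f}")
print(f"statistics.variance (n-1):  {sample_variance:,.2f}")
```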

Since the variance is built from squared differences, it is expressed in squared dollars rather than dollars. Taking the square root of the variance returns the result to the original units and gives a clearer indication of the actual spread. In statistics, the square root of the variance is called the standard deviation.

SQRT(7,544,081.63) = $2,746.65
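
A minimal Python sketch of this last step, assuming the same seven revenue figures, might look like the following. It takes the square root of the population variance with math.sqrt and compares the result with statistics.pstdev, which computes the population standard deviation directly.

```python
import math
import statistics

# Daily revenues in dollars for Day 1 through Day 7, taken from the example above.
daily_revenues = [62000, 64800, 62600, 69200, 66000, 63900, 69400]

# Population variance (divide by n), as used in this article.
population_variance = statistics.pvariance(daily_revenues)

# The standard deviation is the square root of the variance.
standard_deviation = math.sqrt(population_variance)

print(f"Standard deviation: ${standard_deviation:,.2f}")
print(f"statistics.pstdev:  ${statistics.pstdev(daily_revenues):,.2f}")
```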

Uses for Variance and Standard Deviation

Both variance and standard deviation are highly useful in statistical analysis. The variance measures the overall spread of a data set from the mean. The standard deviation helps in detecting outliers, or elements of the data set that stray too far from the mean.

In the data set above, the values are widely spread, with only two of the seven daily sales totals (Days 2 and 5) falling within $1,000 of the mean. The data set also shows that two of the seven daily sales totals (Days 4 and 7) are more than one standard deviation above the mean, while two others (Days 1 and 3) are more than one standard deviation below the mean.
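
One way to check these counts is to compare each day's total against the mean plus and minus one standard deviation. The sketch below is only illustrative: the names upper, lower, days_above, days_below and days_near_mean are assumptions made for this example, and the one-standard-deviation cutoff simply mirrors the comparison made in the text.

```python
import statistics

# Daily revenues in dollars for Day 1 through Day 7, taken from the example above.
daily_revenues = [62000, 64800, 62600, 69200, 66000, 63900, 69400]
mean_revenue = sum(daily_revenues) / len(daily_revenues)
standard_deviation = statistics.pstdev(daily_revenues)  # population standard deviation

# Boundaries one standard deviation above and below the mean.
upper = mean_revenue + standard_deviation
lower = mean_revenue - standard_deviation

numbered = list(enumerate(daily_revenues, start=1))
days_above = [day for day, revenue in numbered if revenue > upper]
days_below = [day for day, revenue in numbered if revenue < lower]
days_near_mean = [day for day, revenue in numbered if abs(revenue - mean_revenue) <= 1000]

print(f"More than one standard deviation above the mean: days {days_above}")
print(f"More than one standard deviation below the mean: days {days_below}")
print(f"Within $1,000 of the mean: days {days_near_mean}")
```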
