Variance and standard deviation
Measures of central tendency (mean, median, and mode) provide information on
the data values at the center of the data set. Measures of dispersion
(quartiles, percentiles, ranges) provide information on the spread of the data
around the center. In this section, we will look at two more measures of
dispersion called the variance and the standard deviation.
Variance:
The variance of the data is the average squared distance between the mean and
each data value. It tells us how much the values in the data set vary around
the mean. If the variance is high, the data values lie far from the mean. If
the variance is low, the data values are concentrated close to the mean.
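The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `population_variance` and the height values are made up for the example, and the sum is divided by the number of values, matching the "average squared distance" wording used here.

```python
import statistics


def population_variance(values):
    """Return the average squared distance between each value and the mean."""
    mean = sum(values) / len(values)
    squared_distances = [(x - mean) ** 2 for x in values]
    return sum(squared_distances) / len(values)


# Hypothetical heights in centimeters, chosen only for illustration.
heights_cm = [150, 160, 170, 180, 190]

variance = population_variance(heights_cm)
print(variance)  # 200.0 (the mean is 170, the squared distances sum to 1000)

# The standard library computes the same quantity.
print(statistics.pvariance(heights_cm))
```

Note that `statistics.variance` (without the leading `p`) divides by one less than the number of values instead. That is the sample variance, a related quantity that this section does not cover.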
The variance has the following properties.
· It is never negative, since every term in the variance sum is squared and
therefore either positive or zero.
· It has squared units. For example, the variance of a set of heights measured
in centimeters will be given in centimeters squared. Since the variance is
expressed in squared units, it is not directly comparable with the mean or the
data themselves. In the next section, we will describe a different measure of
dispersion, the standard deviation, which has the same units as the data.
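Both properties can be checked numerically. The sketch below uses made-up heights and the standard library's `statistics.pvariance`. Converting the data from centimeters to meters divides every value by 100, so the variance is divided by 100 squared, which is what "squared units" means in practice.

```python
import statistics

# Hypothetical heights, first in centimeters and then converted to meters.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

variance_cm2 = statistics.pvariance(heights_cm)  # in centimeters squared
variance_m2 = statistics.pvariance(heights_m)    # in meters squared

# The variance is a sum of squares divided by a count, so it cannot be negative.
print(variance_cm2 >= 0, variance_m2 >= 0)

# Rescaling the data by 1/100 rescales the variance by (1/100) ** 2,
# so this ratio is approximately 10000 (up to floating-point rounding).
ratio = variance_cm2 / variance_m2
print(ratio)
```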
Standard Deviation:
Since the variance is a squared quantity, it cannot be directly compared to the
data values or the mean value of a data set. It is therefore more useful to
have a quantity that is the square root of the variance. This quantity is known
as the standard deviation.
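As a short sketch of this relationship, the standard deviation can be obtained by taking the square root of the variance. The height values are hypothetical. `statistics.pvariance` and `statistics.pstdev` are the standard library's population versions of the two measures.

```python
import math
import statistics

# Hypothetical heights in centimeters, chosen only for illustration.
heights_cm = [150, 160, 170, 180, 190]

variance_cm2 = statistics.pvariance(heights_cm)  # units: centimeters squared
std_dev_cm = math.sqrt(variance_cm2)             # units: centimeters

print(variance_cm2)
print(std_dev_cm)

# The library function gives the same result without the intermediate step.
print(statistics.pstdev(heights_cm))
```

Because the standard deviation is back in centimeters, it can be read against the data directly: here it is roughly 14 cm for heights centered at 170 cm.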
In statistics, the standard deviation is a very common measure of dispersion.
It measures how spread out the values in a data set are around the mean.
Roughly speaking, it is a measure of the typical distance between the values of
the data in the set and the mean. If the data values are all similar, then the
standard deviation will be low (closer to zero). If the data values are highly
variable, then the standard deviation is high (further from zero).
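To make the contrast concrete, the sketch below compares two made-up data sets that share the same mean but differ in spread. The variable names and values are illustrative only.

```python
import statistics

# Two hypothetical data sets with the same mean (50) but different spread.
similar_values = [49, 50, 50, 50, 51]
variable_values = [10, 30, 50, 70, 90]

print(statistics.mean(similar_values), statistics.mean(variable_values))

std_similar = statistics.pstdev(similar_values)    # small: values hug the mean
std_variable = statistics.pstdev(variable_values)  # large: values are far apart
print(std_similar, std_variable)

# When every value is identical there is no spread at all.
std_constant = statistics.pstdev([50, 50, 50, 50, 50])
print(std_constant)
```

The mean alone cannot tell these data sets apart; the standard deviation can.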
The standard deviation is never negative (it is zero only when all the data
values are identical) and is always measured in the same units as the original
data. For example, if the data are weight measurements in kilograms, the
standard deviation will also be measured in kilograms.
The mean and the standard deviation of a set of data
are usually reported together. In a certain sense, the standard deviation is a
natural measure of dispersion if the center of the data is taken as the mean.

