The basics
The mean is found by adding all of the values together and dividing by the number of values in the set:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
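As a minimal sketch of that definition, the helper below (the name `mean` and the sample data are illustrative, not taken from the original) sums the values and divides by their count.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty set is undefined")
    return sum(values) / len(values)


data = [1, 2, 3, 4]  # illustrative data
print(mean(data))    # (1 + 2 + 3 + 4) / 4 = 2.5
```

Python's standard library offers the same computation as `statistics.mean`.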
The median depends on whether the number of terms in the distribution is odd or even. It is read off once the values are sorted.
If the given number of terms is odd:
It's the value in the middle.
1, 3, 3, 6, 7, 8, 9 → 6
If the given number of terms is even:
It's the average of the two terms in the middle.
1, 2, 3, 4, 5, 6, 8, 9 → (4 + 5) / 2 = 4.5
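The odd and even cases above can be sketched in a few lines. The function name `median` is illustrative; the two calls reuse the example sets given above.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    if not values:
        raise ValueError("the median of an empty set is undefined")
    ordered = sorted(values)          # the values must be sorted first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two


print(median([1, 3, 3, 6, 7, 8, 9]))     # 6
print(median([1, 2, 3, 4, 5, 6, 8, 9]))  # 4.5
```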
The mode is the most repeated value within the distribution. Example:
1, 2, 2, 3, 4, 7, 9 → 2
It's quite common to find more than one mode, especially if there aren't many terms. A distribution with two modes is called bimodal. Likewise, one with three modes is called trimodal.
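Because a distribution can have several modes, a sketch that returns every most-frequent value is more useful than one that returns a single number. The helper name `modes` and the second data set are illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3, 4, 7, 9]))  # [2]
print(modes([1, 1, 2, 3, 3]))        # [1, 3], a bimodal set
```

The standard library's `statistics.multimode` behaves similarly, returning the modes in order of first appearance.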
The range is the difference between the maximum value and the minimum value. Example:
1, 2, 2, 3, 4, 7, 9 → 9 - 1 = 8
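In code the range is a one-liner. The function name `value_range` is illustrative (chosen to avoid shadowing Python's built-in `range`).

```python
def value_range(values):
    """Difference between the largest and the smallest value."""
    if not values:
        raise ValueError("the range of an empty set is undefined")
    return max(values) - min(values)


print(value_range([1, 2, 2, 3, 4, 7, 9]))  # 9 - 1 = 8
```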
The variance measures how far a set of numbers is spread out from its mean: it is the average of the squared differences from the mean.
The standard deviation is the square root of the variance. A low standard deviation indicates that the data points tend to be close to the mean.
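The sketch below computes both quantities. It assumes the population form, which divides by the number of values n; the sample form divides by n - 1 instead, and the text above does not say which one is intended. Function names are illustrative.

```python
import math


def population_variance(values):
    """Average squared distance of the values from their mean (divides by n)."""
    if not values:
        raise ValueError("the variance of an empty set is undefined")
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_std(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))


data = [1, 2, 2, 3, 4, 7, 9]      # mean is 4, squared deviations sum to 52
print(population_variance(data))  # 52 / 7
print(population_std(data))       # sqrt(52 / 7)
```

The standard library exposes the same calculations as `statistics.pvariance` and `statistics.pstdev`, with `statistics.variance` and `statistics.stdev` for the sample form.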
The covariance is a measure of the joint variability of two random variables: the extent to which corresponding elements from two sets of ordered data move in the same direction. It is similar to variance, but where variance tells you how a single variable varies, covariance tells you how two variables vary together.
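A small sketch makes the sign of the covariance concrete. As with the variance, this assumes the population form (dividing by n). The function name and data are illustrative.

```python
def covariance(xs, ys):
    """Population covariance: mean product of paired deviations from each mean."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


xs = [1, 2, 3, 4, 5]
print(covariance(xs, [2, 4, 6, 8, 10]))  # 4.0, the variables rise together
print(covariance(xs, [10, 8, 6, 4, 2]))  # -4.0, one falls as the other rises
print(covariance(xs, xs))                # 2.0, the variance of xs
```

The last call shows the link to variance: the covariance of a variable with itself is its variance.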
The median is much less sensitive to outliers than the mean.
However, almost all analytic calculations on sets of data are more natural in terms of the mean than the median.
The difference between the median and the mean is useful to represent how skewed the data is.
The real use of the median comes when the data set may contain extreme outliers. Then, describing the distribution in terms of quartiles can be more informative than quoting the mean and the standard deviation.
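To illustrate, the sketch below summarises a small made-up data set containing one extreme outlier, once with the mean and standard deviation and once with quartiles. It relies on `statistics.quantiles` (Python 3.8+) with its default interpolation method; other quartile conventions give slightly different cut points.

```python
import statistics

# Illustrative data: nine ordinary values and one extreme outlier.
data = [2, 3, 3, 4, 5, 5, 6, 7, 8, 120]

mu = statistics.mean(data)
sigma = statistics.pstdev(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points

print(f"mean = {mu:.2f}, standard deviation = {sigma:.2f}")
print(f"quartiles = {q1}, {q2}, {q3}, interquartile range = {q3 - q1}")
```

The outlier inflates both the mean and the standard deviation, while the quartiles still describe where most of the values lie.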
For skewed distributions, the mean is not necessarily the same as the median or the mode. For example, mean income is typically skewed upwards by a small number of people with very large incomes, so that the majority have an income lower than the mean. By contrast, the median income is the level at which half the population is below and half is above. The mode income is the most likely income and favors the larger number of people with lower incomes. Median and mode are often more intuitive measures for such skewed data, but many skewed distributions are in fact best described by their mean, including the Exponential and Poisson distributions.
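The income example can be reproduced with a handful of invented figures (in thousands; the numbers are purely illustrative, not real data). One very large income pulls the mean well above what most people earn, while the median and mode stay near the bulk of the data.

```python
import statistics

# Hypothetical incomes in thousands: nine modest values and one very large one.
incomes = [20, 22, 25, 25, 28, 30, 32, 35, 40, 500]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
mode_income = statistics.mode(incomes)
below_mean = sum(1 for x in incomes if x < mean_income)

print(mean_income)    # 75.7
print(median_income)  # 29.0
print(mode_income)    # 25
print(below_mean)     # 9 of the 10 incomes are below the mean
```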
While the standard deviation expresses how dispersed the data are with respect to the mean, the standard error of a statistic measures the standard deviation of its sampling distribution.
The sampling distribution of the sample mean is generated by repeatedly drawing samples from the population and recording the mean of each one. This forms a distribution of different means, and this distribution has its own mean and variance.
Given the standard deviation of the population, $\sigma$, and the size of a sample, $n$, the standard error of the mean of a sample from this population is expressed as:
$$SE = \frac{\sigma}{\sqrt{n}}$$
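The relationship between the population standard deviation, the sample size and the standard error of the mean can be checked by simulation. The sketch below assumes a normally distributed population. The parameter values, function names and seed are illustrative choices, not part of the original text.

```python
import math
import random
import statistics


def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    return sigma / math.sqrt(n)


def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from a normal population; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]


means = simulate_sample_means(mu=50, sigma=10, n=25, trials=5000)
print(standard_error(10, 25))    # 2.0, the theoretical value
print(statistics.pstdev(means))  # empirical spread of the sample means, close to 2
print(statistics.fmean(means))   # mean of the sampling distribution, close to 50
```

Quadrupling the sample size halves the standard error, since it shrinks with the square root of n.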