The literature on outlier detection and treatment is extensive, and a rigorous mathematical discussion is well beyond the scope of this section. Moreover, in practice the treatment of analytical results is usually simplified, because the number of observations is often small. The two methods most commonly used by analysts to detect outliers in measured data are versions of the Q-test (Refs. 1–3, 6) and Chauvenet’s criterion (Refs. 4–6), both of which assume that the data are sampled from a normally distributed population.
To perform the Q-test, one calculates the Q value given by:
Q = Qgap/R
where Qgap is the difference between the suspected outlier and the measured value closest to it, and R is the range of all the measured values in the data set. One then compares the calculated Q value with the critical Q values in Table 1.
Number of observations | Qcrit, 90% Confidence level | Qcrit, 95% Confidence level | Qcrit, 99% Confidence level |
3 | 0.941 | 0.970 | 0.994 |
4 | 0.765 | 0.829 | 0.926 |
5 | 0.642 | 0.710 | 0.821 |
6 | 0.560 | 0.625 | 0.740 |
7 | 0.507 | 0.568 | 0.680 |
8 | 0.468 | 0.526 | 0.634 |
9 | 0.437 | 0.493 | 0.598 |
10 | 0.412 | 0.466 | 0.568 |
If the calculated value of Q is greater than the value of Qcrit for the number of observations and the chosen confidence level, then the value in question is a suspected outlier at that confidence level.
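The Q-test procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name q_test, its return format, and the sample data are assumptions made for this example, and the critical values are simply copied from Table 1. The sketch tests whichever extreme value (lowest or highest) has the larger gap to its nearest neighbor.

```python
# Critical Q values copied from Table 1, keyed by number of observations
# and then by confidence level in percent.
Q_CRIT = {
    3: {90: 0.941, 95: 0.970, 99: 0.994},
    4: {90: 0.765, 95: 0.829, 99: 0.926},
    5: {90: 0.642, 95: 0.710, 99: 0.821},
    6: {90: 0.560, 95: 0.625, 99: 0.740},
    7: {90: 0.507, 95: 0.568, 99: 0.680},
    8: {90: 0.468, 95: 0.526, 99: 0.634},
    9: {90: 0.437, 95: 0.493, 99: 0.598},
    10: {90: 0.412, 95: 0.466, 99: 0.568},
}


def q_test(values, confidence=95):
    """Return (suspect, Q, flagged) for the most extreme value in `values`.

    Hypothetical helper for illustration; only the sample sizes and
    confidence levels listed in Table 1 are supported.
    """
    data = sorted(values)
    n = len(data)
    if n not in Q_CRIT:
        raise ValueError("Table 1 covers 3 to 10 observations only")
    if confidence not in Q_CRIT[n]:
        raise ValueError("confidence must be 90, 95, or 99")

    data_range = data[-1] - data[0]  # R: range of all measured values
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")

    # Qgap: distance from each extreme value to its nearest neighbor.
    gap_low = data[1] - data[0]
    gap_high = data[-1] - data[-2]
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low

    q = gap / data_range
    return suspect, q, q > Q_CRIT[n][confidence]


# Made-up illustrative data (ten replicate measurements).
data = [0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177]
for level in (90, 95, 99):
    print(level, q_test(data, confidence=level))
```

With these illustrative data the lowest reading is the suspect value; it is flagged at the 90% level but not at the 95% or 99% levels, which shows how the conclusion depends on the confidence level chosen.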
To perform Chauvenet’s test on a set of measurements, one first must calculate the mean and standard deviation of the data. Then one calculates:
τ = |xi – xave|/σ
where xi is the suspected outlier, xave is the mean of all the measurements, and σ is the standard deviation. The absolute value is taken so that suspect values below the mean are treated in the same way as those above it. One then compares the calculated value of τ with τcrit in Table 2.
Number of observations, N | τcrit |
5 | 1.65 |
6 | 1.73 |
7 | 1.81 |
8 | 1.86 |
9 | 1.91 |
10 | 1.96 |
15 | 2.12 |
20 | 2.24 |
25 | 2.33 |
50 | 2.57 |
100 | 2.81 |
150 | 2.93 |
200 | 3.02 |
500 | 3.29 |
1000 | 3.48 |
If the calculated value of τ is greater than the value of τcrit, then the value is a suspected outlier.
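The steps of Chauvenet’s test can be sketched as follows. This is an illustrative sketch only: the function name chauvenet_test, the choice of the sample standard deviation (N – 1 in the denominator), and the sample data are assumptions made for this example, since the text does not specify which form of the standard deviation is intended. The critical values are copied from the table above, so only the tabulated values of N are supported.

```python
from statistics import mean, stdev

# Critical tau values copied from the table above, keyed by N.
TAU_CRIT = {
    5: 1.65, 6: 1.73, 7: 1.81, 8: 1.86, 9: 1.91, 10: 1.96,
    15: 2.12, 20: 2.24, 25: 2.33, 50: 2.57, 100: 2.81,
    150: 2.93, 200: 3.02, 500: 3.29, 1000: 3.48,
}


def chauvenet_test(values):
    """Return (suspect, tau, flagged) for the value farthest from the mean.

    Hypothetical helper for illustration; N must be a tabulated value.
    """
    n = len(values)
    if n not in TAU_CRIT:
        raise ValueError("no tabulated critical value for this N")

    x_ave = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (N - 1)
    if sigma == 0:
        raise ValueError("all values are identical; tau is undefined")

    # The suspected outlier is taken to be the value farthest from the mean.
    suspect = max(values, key=lambda x: abs(x - x_ave))
    tau = abs(suspect - x_ave) / sigma
    return suspect, tau, tau > TAU_CRIT[n]


# Made-up illustrative data (ten replicate measurements).
data = [10.1, 10.2, 10.0, 10.1, 10.2, 10.1, 10.0, 10.2, 10.1, 12.5]
print(chauvenet_test(data))
```

Note that the suspect value is included when the mean and standard deviation are calculated, as the procedure above describes; a gross outlier therefore inflates σ and makes the test somewhat conservative for small data sets.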
For numbers of observations between those given in Table 2, especially for a large number of observations, one may use the following plot to estimate the value of Chauvenet’s τcrit.
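As an alternative to graphical estimation, τcrit can be computed directly. The sketch below assumes the usual statement of Chauvenet’s criterion, in which a value is suspect when the two-tailed normal probability of a deviation that large is less than 1/(2N); this gives τcrit as the standard normal quantile at 1 – 1/(4N). That formulation is an assumption about how the tabulated values were generated, and the function name chauvenet_tau_crit is hypothetical. The computed values agree with the tabulated ones to within about 0.01.

```python
from statistics import NormalDist


def chauvenet_tau_crit(n):
    """Estimate Chauvenet's critical tau for n observations.

    Assumes the criterion P(|z| > tau_crit) = 1 / (2 n) for a standard
    normal variable z, i.e. each tail holds a probability of 1 / (4 n).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))


# Compare a few computed values with tabulated ones, then estimate
# values for N not listed in the table.
for n in (10, 50, 1000, 30, 75, 300):
    print(n, round(chauvenet_tau_crit(n), 2))
```

Because the quantile is evaluated directly, this approach works for any N and avoids interpolation error, at the cost of assuming the formulation stated above.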