Suppose you want to estimate the weight of a piece of gold as precisely as possible by combining the readings of 10 different weight scales. The results of your measurements (in grams) are as follows:

measurements <- c(89, 84, 87, 90, 88, 85, 86, 86, 92, 85)

We will examine two questions:

Question 1: How do we measure the error of our measurements?

Question 2: What is our best guess given the above measurements?

Suppose that the true, unknown value of the weight is \(\omega\). The most intuitive approach for measuring error is to use the absolute differences \(|x_i - \omega|\), where \(x_i\) is our \(i^{th}\) measurement. The average error is then:

\[ \bar{e} = \frac{1}{n}\sum_{i=1}^n|x_i - \omega| \]

Our goal is to estimate the unknown parameter \(\omega\). It is reasonable to assume that the best guess \(\hat\omega\) is the one that minimizes the error: the lower the error, the better our estimate is.

Which statistic minimizes this error? I will not provide a proof here, but one can show that it is the median, not the mean. It is easy to verify this for the sample above:

# Mean absolute error around the mean and around the median
error.mean <- mean(abs(measurements - mean(measurements)))
error.median <- mean(abs(measurements - median(measurements)))

cat("Error(mean) =", error.mean, "\nError(median) =", error.median)
## Error(mean) = 2.04 
## Error(median) = 2
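We can also check this claim numerically. The sketch below uses base R's `optimize()` to search for the \(\omega\) that minimizes the mean absolute error over the range of the data; the minimizer should land at (or very near) the median:

```r
measurements <- c(89, 84, 87, 90, 88, 85, 86, 86, 92, 85)

# Mean absolute error as a function of a candidate value omega
mae <- function(omega) mean(abs(measurements - omega))

# One-dimensional numerical minimization over the range of the data
best <- optimize(mae, interval = range(measurements))
best$minimum   # approximately 86, the sample median
```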

But what about the mean, which is the most common statistic of central tendency? We can show that the mean minimizes a different error formulation:

\[ \bar{e} = \frac{1}{n}\sum_{i=1}^n(x_i - \omega)^2 \]

Proof: Take the derivative \(\frac{d\bar{e}(\omega)}{d\omega}\), set it to zero, and solve for \(\omega\); the solution is the mean.
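Spelling this out, the derivative of the squared-error criterion is

\[ \frac{d\bar{e}}{d\omega} = \frac{1}{n}\sum_{i=1}^n -2(x_i - \omega) = \frac{2}{n}\left(n\omega - \sum_{i=1}^n x_i\right) \]

Setting this to zero gives \(\omega = \frac{1}{n}\sum_{i=1}^n x_i\), i.e. the sample mean. (The second derivative is \(2 > 0\), so this is indeed a minimum.)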

We can compare the error of the mean and the median on our dataset as follows:

# Mean squared error around the mean and around the median
error.mean <- mean((measurements - mean(measurements))^2)
error.median <- mean((measurements - median(measurements))^2)

cat("Error(mean) =", error.mean, "\nError(median) =", error.median)
## Error(mean) = 5.76 
## Error(median) = 6.25
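The same numerical check works for the squared-error criterion; this time `optimize()` should recover the sample mean (again a sketch, searching over the range of the data):

```r
measurements <- c(89, 84, 87, 90, 88, 85, 86, 86, 92, 85)

# Mean squared error as a function of a candidate value omega
mse <- function(omega) mean((measurements - omega)^2)

best <- optimize(mse, interval = range(measurements))
best$minimum   # approximately 87.2, the sample mean
```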

Now, if we replace \(\omega\) with the mean, this second error formulation coincides with the well-known (population) variance. Since the mean has several advantages and is the most commonly used measure of central tendency, the choice of variance (and standard deviation) as a measure of error (and dispersion) becomes natural. However, in cases where the median is more meaningful than the mean, the mean absolute deviation might be the better choice.
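One caveat worth noting: R's built-in `var()` divides by \(n - 1\) (the unbiased sample variance), not by \(n\) as in the error formula above, so the two values differ slightly on our data:

```r
measurements <- c(89, 84, 87, 90, 88, 85, 86, 86, 92, 85)

# Variance as defined by the error formula above (divide by n)
mean((measurements - mean(measurements))^2)   # 5.76

# R's var() divides by n - 1 instead (unbiased sample variance)
var(measurements)                             # 6.4
```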

Note: The source code of this page can be downloaded from here: variance.Rmd