In statistics and optimization, the concepts of statistical error and residual are easily confused with each other.

A statistical error is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. The expected value, being for instance the mean of the entire population, is typically unobservable. If the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the "error" is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the "error" is −0.05 meters. The nomenclature arose from random measurement errors in astronomy: it is as if the measurement of the man's height were an attempt to measure the population mean, so that any difference between the man's height and the mean would be a measurement error.

A residual (or fitting error), on the other hand, is an observable estimate of the unobservable statistical error. The simplest case involves a random sample of n men whose heights are measured. The sample mean is used as an estimate of the population mean. Then we have:

• The difference between the height of each man in the sample and the unobservable population mean is a statistical error, and
• The difference between the height of each man in the sample and the observable sample mean is a residual.

Note that the sum of the residuals within a random sample is necessarily zero, and thus the residuals are necessarily not independent. The sum of the statistical errors within a random sample need not be zero; the statistical errors are independent random variables if the individuals are chosen from the population independently.
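This zero-sum property follows directly from the definition of the sample mean; writing $\overline{X}$ for the sample mean of $X_1, \dots, X_n$ (defined formally below), we have

$\sum_{i=1}^n \left(X_i-\overline{X}\right)=\sum_{i=1}^n X_i - n\overline{X}=n\overline{X}-n\overline{X}=0.$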

In sum:

• Residuals are observable; statistical errors are not.
• Statistical errors are often independent of each other; residuals are not (at least in the simple situation described above, and in most others).
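A minimal numeric sketch of the distinction, assuming NumPy; the population parameters and sample size below are purely illustrative:

    import numpy as np

    rng = np.random.default_rng(seed=42)
    mu, sigma, n = 1.75, 0.07, 10         # hypothetical population mean, sd, and sample size

    heights = rng.normal(mu, sigma, n)    # a random sample of n heights

    errors = heights - mu                 # statistical errors: require the unobservable mu
    residuals = heights - heights.mean()  # residuals: require only the observable sample mean

    print(errors.sum())                   # generally nonzero
    print(residuals.sum())                # zero, up to floating-point rounding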

Example with some mathematical theory

If we assume a normally distributed population with mean μ and standard deviation σ, and choose individuals independently, then we have

$X_1, \dots, X_n\sim N(\mu,\sigma^2)\,$

and the sample mean

$\overline{X}={X_1 + \cdots + X_n \over n}$

is a random variable distributed thus:

$\overline{X}\sim N(\mu, \sigma^2/n).$
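This follows from the standard rules for linear combinations of independent normal random variables: averaging preserves the mean, while the variance of the average of n independent copies shrinks by a factor of n,

$\operatorname{E}\left(\overline{X}\right)=\mu,\qquad \operatorname{Var}\left(\overline{X}\right)=\frac{1}{n^2}\sum_{i=1}^n \operatorname{Var}\left(X_i\right)=\frac{\sigma^2}{n}.$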

The statistical errors are then

$\varepsilon_i=X_i-\mu,\,$

whereas the residuals are

$\widehat{\varepsilon}_i=X_i-\overline{X}.$

(As is often done, the "hat" over the letter ε indicates an observable estimate of an unobservable quantity called ε.)

The sum of squares of the statistical errors, divided by σ², has a chi-square distribution with n degrees of freedom:

$\sum_{i=1}^n \left(X_i-\mu\right)^2/\sigma^2\sim\chi^2_n.$
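This is simply the definition of the chi-square distribution at work: each standardized error

$\frac{X_i-\mu}{\sigma}\sim N(0,1)$

is a standard normal variable, and a chi-square variable with n degrees of freedom is by definition a sum of n independent squared standard normals.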

This quantity, however, is not observable, since the population mean μ is unknown. The sum of squares of the residuals, on the other hand, is observable. The quotient of that sum by σ² has a chi-square distribution with only n − 1 degrees of freedom:

$\sum_{i=1}^n \left(\,X_i-\overline{X}\,\right)^2/\sigma^2\sim\chi^2_{n-1}.$
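The loss of one degree of freedom can be checked by simulation. In the sketch below (same illustrative parameters as before), the average of the scaled sum of squared residuals should come out near n − 1 = 9 rather than n, matching the mean of a chi-square distribution with n − 1 degrees of freedom:

    import numpy as np

    rng = np.random.default_rng(seed=42)
    mu, sigma, n, reps = 1.75, 0.07, 10, 100_000

    samples = rng.normal(mu, sigma, (reps, n))                 # reps samples of size n
    residuals = samples - samples.mean(axis=1, keepdims=True)  # residuals within each sample

    scaled = (residuals ** 2).sum(axis=1) / sigma ** 2
    print(scaled.mean())                                       # approximately n - 1 = 9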

It is remarkable that the sum of squares of the residuals and the sample mean can be shown to be independent of each other. That fact and the normal and chi-square distributions given above form the basis of confidence interval calculations relying on Student's t-distribution. In those calculations one encounters the quotient

${\overline{X} - \mu \over S_n/\sqrt{n}},$

in which $S_n$ is the sample standard deviation, so that σ appears in both the numerator and the denominator and cancels. That is fortunate because in practice one would not know the value of σ².
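To see the cancellation explicitly, take $S_n^2=\frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}\right)^2$ (the usual sample variance) and divide numerator and denominator by σ:

${\overline{X} - \mu \over S_n/\sqrt{n}} = {\left(\overline{X}-\mu\right)/\left(\sigma/\sqrt{n}\right) \over S_n/\sigma},$

in which the numerator is a standard normal variable and the denominator is $\sqrt{\chi^2_{n-1}/(n-1)}$ by the result above; their independence is exactly what makes the quotient follow Student's t-distribution with n − 1 degrees of freedom.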
