Chi-square distribution
For any positive integer $k$, the chi-square distribution with $k$ degrees of freedom is the probability distribution of the random variable
$$X = Z_1^2 + \cdots + Z_k^2,$$
where $Z_1, \ldots, Z_k$ are independent normal random variables, each having expected value 0 and variance 1. This distribution is usually written
$$X \sim \chi^2_k.$$
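The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the helper name `chi_square_sample`, the seed, the choice of 5 degrees of freedom and the sample size are illustrative assumptions, not part of the definition. It sums squared standard normal draws and compares the sample mean and variance with the theoretical values, which are the number of degrees of freedom and twice that number.

```python
import random
import statistics


def chi_square_sample(k, rng):
    """Draw one value of Z_1^2 + ... + Z_k^2 with independent standard normal Z_i."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(12345)  # fixed seed (arbitrary) so the run is reproducible
k = 5                       # degrees of freedom, chosen only for illustration
draws = [chi_square_sample(k, rng) for _ in range(20_000)]

sample_mean = statistics.fmean(draws)     # should be close to k
sample_var = statistics.variance(draws)   # should be close to 2k
print(f"k = {k}: sample mean {sample_mean:.3f}, sample variance {sample_var:.3f}")
```

With 20,000 draws the sample mean and variance should land near 5 and 10 respectively, up to sampling noise.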
If $p$ independent linear homogeneous constraints are imposed on these variables, the distribution of $X$ conditional on these constraints is $\chi^2_{k-p}$, justifying the term "degrees of freedom". The characteristic function of the chi-square distribution is
$$\varphi(t) = (1 - 2it)^{-k/2}.$$
The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom.
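Its probability density function is
$$p_k(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)}\, x^{k/2 - 1} e^{-x/2} \quad \text{for } x > 0,$$
and $p_k(x) = 0$ for $x \le 0$. Here $\Gamma$ denotes the gamma function.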
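As a sketch of how the density can be evaluated in practice, the code below implements p_k(x) = (1/2)^(k/2) / Γ(k/2) · x^(k/2 − 1) · e^(−x/2) using the standard library's `math.lgamma`. The function names `chi_square_pdf` and `integrate`, the midpoint rule, the integration range [0, 60] and the choice of 4 degrees of freedom are assumptions made for illustration. The numerical integrals serve as a sanity check that the density integrates to 1 and has mean equal to the number of degrees of freedom.

```python
import math


def chi_square_pdf(x, k):
    """Density p_k(x) of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    # Work in log space so that large k does not overflow the gamma function.
    log_p = (-half_k * math.log(2) - math.lgamma(half_k)
             + (half_k - 1) * math.log(x) - x / 2)
    return math.exp(log_p)


def integrate(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


k = 4
# The upper limit 60 is an arbitrary cutoff; the tail beyond it is negligible for k = 4.
total = integrate(lambda x: chi_square_pdf(x, k), 0.0, 60.0)
mean = integrate(lambda x: x * chi_square_pdf(x, k), 0.0, 60.0)
print(f"integral of p_k: {total:.6f}, mean: {mean:.6f}")
```

Evaluating in log space is a design choice, not a requirement: calling `math.gamma` directly works for moderate k but overflows once k/2 exceeds roughly 171.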
The normal approximation
If $X \sim \chi^2_k$, then as $k$ tends to infinity, the distribution of $X$ tends to normality. However, the tendency is slow (the skewness is $\sqrt{8/k}$ and the excess kurtosis is $12/k$), and two transformations are commonly considered, each of which approaches normality faster than $X$ itself:
Fisher showed that $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k-1}$ and unit variance.
Wilson and Hilferty showed in 1931 that $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1 - 2/(9k)$ and variance $2/(9k)$.
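The quality of these approximations can be compared numerically. The sketch below is illustrative only: it assumes an even number of degrees of freedom so that the exact distribution function has the closed form 1 − e^(−x/2) Σ_{j<k/2} (x/2)^j / j!, and all function names, the choice of 10 degrees of freedom and the evaluation points are hypothetical. It evaluates the plain normal approximation (mean k, variance 2k), Fisher's square-root transformation and the Wilson-Hilferty cube-root transformation against the exact value.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def cdf_exact_even(x, k):
    """Exact P(X <= x) for even k, using the closed-form Erlang distribution function."""
    if k % 2 != 0:
        raise ValueError("this closed form requires an even k")
    if x <= 0:
        return 0.0
    term, partial = 1.0, 0.0
    for j in range(k // 2):
        partial += term            # here term == (x/2)**j / j!
        term *= (x / 2) / (j + 1)
    return 1.0 - math.exp(-x / 2) * partial


def cdf_plain_normal(x, k):
    """Treat X itself as normal with mean k and variance 2k."""
    return STD_NORMAL.cdf((x - k) / math.sqrt(2 * k))


def cdf_fisher(x, k):
    """Fisher: sqrt(2X) is roughly normal with mean sqrt(2k - 1) and unit variance."""
    return STD_NORMAL.cdf(math.sqrt(2 * x) - math.sqrt(2 * k - 1))


def cdf_wilson_hilferty(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) is roughly normal, mean 1 - 2/(9k), variance 2/(9k)."""
    v = 2 / (9 * k)
    return STD_NORMAL.cdf(((x / k) ** (1 / 3) - (1 - v)) / math.sqrt(v))


k = 10
for x in (4.0, 10.0, 18.0):
    print(f"x={x:5.1f} exact={cdf_exact_even(x, k):.4f} "
          f"normal={cdf_plain_normal(x, k):.4f} "
          f"fisher={cdf_fisher(x, k):.4f} "
          f"wilson_hilferty={cdf_wilson_hilferty(x, k):.4f}")
```

At these three points with 10 degrees of freedom, the Wilson-Hilferty value is the closest to the exact one, followed by Fisher's, with the untransformed normal approximation furthest away.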
The expected value of a random variable having a chi-square distribution with $k$ degrees of freedom is $k$ and the variance is $2k$. The median is given approximately by
$$k - \frac{2}{3} + \frac{4}{27k} - \frac{8}{729k^2}.$$
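A common approximation for the median is k − 2/3 + 4/(27k) − 8/(729k²), which is the expansion of k(1 − 2/(9k))³. The sketch below compares it with a median found by bisection on the exact distribution function. It assumes an even number of degrees of freedom so that a closed form is available; the function names, the bracketing interval and the fixed iteration count are hypothetical choices for illustration.

```python
import math


def cdf_even(x, k):
    """Exact P(X <= x) for even k (closed-form Erlang distribution function)."""
    if x <= 0:
        return 0.0
    term, partial = 1.0, 0.0
    for j in range(k // 2):
        partial += term            # here term == (x/2)**j / j!
        term *= (x / 2) / (j + 1)
    return 1.0 - math.exp(-x / 2) * partial


def median_approx(k):
    """Series approximation to the median, equal to k * (1 - 2/(9k))**3 expanded."""
    return k - 2 / 3 + 4 / (27 * k) - 8 / (729 * k ** 2)


def median_bisect(k, iterations=100):
    """Solve cdf_even(x, k) = 1/2 by bisection on a generously wide bracket."""
    lo, hi = 0.0, 2.0 * k + 10.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if cdf_even(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for k in (2, 10, 50):
    print(f"k={k:3d} bisection={median_bisect(k):.5f} approximation={median_approx(k):.5f}")
```

For 2 degrees of freedom the bisection result can be checked against the exact median of the corresponding exponential distribution, 2 ln 2.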
Note that the chi-square distribution with 2 degrees of freedom is an exponential distribution (with mean 2).
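To see this, one can substitute 2 degrees of freedom into the density $p_k(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2-1} e^{-x/2}$ for $x > 0$. A short worked version, using $\Gamma(1) = 1$:

```latex
p_2(x) \;=\; \frac{(1/2)^{1}}{\Gamma(1)}\, x^{0}\, e^{-x/2}
       \;=\; \tfrac{1}{2}\, e^{-x/2}, \qquad x > 0,
```

which is the density of an exponential distribution with rate $1/2$, that is, with mean 2.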
The chi-square distribution with $k$ degrees of freedom is a special case of the gamma distribution, with shape parameter $k/2$ and scale parameter 2.
See Cochran's theorem.