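We examine a continuous random variable. Let $\hat{f}$ be the characteristic function of its distribution, whose density function is $f$, and let $\kappa_r$ be its cumulants. We expand in terms of a known distribution with probability density function $\psi$, characteristic function $\hat{\psi}$, and cumulants $\gamma_r$. The density $\psi$ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have (see Wallace, 1958)
$$\hat{f}(t) = \exp\left[\sum_{r=1}^{\infty} \kappa_r \frac{(it)^r}{r!}\right] \quad\text{and}\quad \hat{\psi}(t) = \exp\left[\sum_{r=1}^{\infty} \gamma_r \frac{(it)^r}{r!}\right],$$
which gives the following formal identity:
$$\hat{f}(t) = \exp\left[\sum_{r=1}^{\infty} (\kappa_r - \gamma_r) \frac{(it)^r}{r!}\right] \hat{\psi}(t).$$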
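By the properties of the Fourier transform, $(it)^r \hat{\psi}(t)$ is the Fourier transform of $(-1)^r [D^r \psi](-x)$, where $D$ is the differential operator with respect to $x$. Thus, after replacing $x$ with $-x$ on both sides of the equation, we find for $f$ the formal expansion
$$f(x) = \exp\left[\sum_{r=1}^{\infty} (\kappa_r - \gamma_r) \frac{(-D)^r}{r!}\right] \psi(x).$$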
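If $\psi$ is chosen as the normal density
$$\phi(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$
with mean and variance as given by $f$, that is, mean $\mu = \kappa_1$ and variance $\sigma^2 = \kappa_2$, then the expansion becomes
$$f(x) = \exp\left[\sum_{r=3}^{\infty} \kappa_r \frac{(-D)^r}{r!}\right] \phi(x),$$
since $\gamma_r = 0$ for all $r > 2$, as the higher cumulants of the normal distribution are 0. By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. Such an expansion can be written compactly in terms of Bell polynomials as
$$\exp\left[\sum_{r=3}^{\infty} \kappa_r \frac{(-D)^r}{r!}\right] = \sum_{n=0}^{\infty} B_n(0, 0, \kappa_3, \ldots, \kappa_n) \frac{(-D)^n}{n!}.$$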
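Since the $n$-th derivative of the Gaussian function $\phi$ is given in terms of the Hermite polynomial as
$$\phi^{(n)}(x) = \frac{(-1)^n}{\sigma^n} He_n\left(\frac{x-\mu}{\sigma}\right) \phi(x),$$
this gives us the final expression of the Gram–Charlier A series as
$$f(x) = \phi(x) \sum_{n=0}^{\infty} \frac{1}{n!\,\sigma^n} B_n(0, 0, \kappa_3, \ldots, \kappa_n)\, He_n\left(\frac{x-\mu}{\sigma}\right).$$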
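If we include only the first two correction terms to the normal distribution, we obtain
$$f(x) \approx \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \left[1 + \frac{\kappa_3}{3!\,\sigma^3} He_3\left(\frac{x-\mu}{\sigma}\right) + \frac{\kappa_4}{4!\,\sigma^4} He_4\left(\frac{x-\mu}{\sigma}\right)\right],$$
with $He_3(x) = x^3 - 3x$ and $He_4(x) = x^4 - 6x^2 + 3$.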
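The two-term approximation can be evaluated directly. The following is a minimal sketch using only the standard library; the function names (`gram_charlier_pdf`, `hermite_he3`, `hermite_he4`) are illustrative choices rather than part of any library, and the third and fourth cumulants are assumed to be supplied by the caller.

```python
import math


def hermite_he3(z):
    """Probabilists' Hermite polynomial He_3(z) = z^3 - 3z."""
    return z**3 - 3.0 * z


def hermite_he4(z):
    """Probabilists' Hermite polynomial He_4(z) = z^4 - 6z^2 + 3."""
    return z**4 - 6.0 * z**2 + 3.0


def gram_charlier_pdf(x, mu, sigma, kappa3, kappa4):
    """Normal density times the first two Gram-Charlier A correction terms.

    mu and sigma are the mean and standard deviation of the target density;
    kappa3 and kappa4 are its third and fourth cumulants.
    """
    z = (x - mu) / sigma
    normal = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)
    correction = (
        1.0
        + kappa3 / (6.0 * sigma**3) * hermite_he3(z)
        + kappa4 / (24.0 * sigma**4) * hermite_he4(z)
    )
    return normal * correction
```

With `kappa3 = kappa4 = 0` the function reduces to the normal density. For sufficiently large cumulants the bracketed correction factor can become negative, so the returned value is not guaranteed to be a valid density.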
Note that this expression is not guaranteed to be positive, and is therefore not a valid probability distribution. The Gram–Charlier A series diverges in many cases of interest: it converges only if $f(x)$ falls off faster than $\exp(-x^2/4)$ at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.
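Let $Z_1, \ldots, Z_n$ be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$, and let $X_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{Z_i - \mu}{\sigma}$ be their standardized sum, with cumulative distribution function $F_n$. By the central limit theorem,
$$\lim_{n \to \infty} F_n(x) = \Phi(x)$$
for every $x$, as long as the mean and variance are finite, where $\Phi$ is the standard normal distribution function.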
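Now assume that, in addition to having mean $\mu$ and variance $\sigma^2$, the i.i.d. random variables $Z_i$ have higher cumulants $\kappa_r$. From the additivity and homogeneity properties of cumulants, the cumulants of the standardized sum $X_n$ (with distribution function $F_n$) in terms of the cumulants of $Z_i$ are, for $r \geq 2$,
$$\kappa_r^{F_n} = \frac{n \kappa_r}{\sigma^r n^{r/2}} = \frac{\lambda_r}{n^{r/2-1}}, \quad\text{where}\quad \lambda_r = \frac{\kappa_r}{\sigma^r}.$$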
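The scaling of the cumulants with the sample size can be checked numerically. The sketch below assumes the standardized sum is formed by subtracting the mean, dividing by the standard deviation, and dividing the sum by the square root of n; the function name `standardized_sum_cumulant` is a hypothetical helper.

```python
def standardized_sum_cumulant(kappa_r, sigma, r, n):
    """r-th cumulant (r >= 2) of the standardized sum of n i.i.d. variables.

    kappa_r is the r-th cumulant of a single variable and sigma its standard
    deviation. Additivity gives a factor n, and homogeneity gives a factor
    (sigma * sqrt(n))**(-r), so the result is lambda_r / n**(r/2 - 1)
    with lambda_r = kappa_r / sigma**r.
    """
    if r < 2:
        raise ValueError("the scaling rule is stated for r >= 2")
    lambda_r = kappa_r / sigma**r
    return lambda_r / n ** (r / 2.0 - 1.0)
```

For `r = 2` the result is 1 for every n, since the second cumulant of a single variable is the variance. For `r >= 3` the cumulants decay as powers of n, which is what drives the convergence to the normal distribution.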
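If we expand in terms of the standard normal distribution, that is, if we set
$$\phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} x^2\right),$$
then the cumulant differences in the formal expression of the characteristic function $\hat{f}_n(t)$ of $F_n$ are
$$\kappa_1^{F_n} - \gamma_1 = 0, \qquad \kappa_2^{F_n} - \gamma_2 = 0, \qquad \kappa_r^{F_n} - \gamma_r = \frac{\lambda_r}{n^{r/2-1}}, \quad r \geq 3.$$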
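The Gram–Charlier A series for the density function of $X_n$ is now
$$f_n(x) = \phi(x) \sum_{r=0}^{\infty} \frac{1}{r!} B_r\left(0, 0, \frac{\lambda_3}{n^{1/2}}, \ldots, \frac{\lambda_r}{n^{r/2-1}}\right) He_r(x).$$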
The Edgeworth series is developed similarly to the Gram–Charlier A series, except that terms are now collected according to powers of $n$. The coefficient of the $n^{-m/2}$ term can be obtained by collecting the monomials of the Bell polynomials corresponding to the integer partitions of $m$. Thus, we have the characteristic function as
$$\hat{f}_n(t) = \left[1 + \sum_{j=1}^{\infty} \frac{P_j(it)}{n^{j/2}}\right] \exp(-t^2/2),$$
where $P_j(x)$ is a polynomial of degree $3j$. Again, after inverse Fourier transform, the density function $f_n$ follows as
$$f_n(x) = \phi(x) + \sum_{j=1}^{\infty} \frac{P_j(-D)}{n^{j/2}} \phi(x).$$
Likewise, integrating the series, we obtain the distribution function
$$F_n(x) = \Phi(x) + \sum_{j=1}^{\infty} \frac{1}{n^{j/2}} \frac{P_j(-D)}{D} \phi(x).$$
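As a rough illustration, the distribution function truncated after the $n^{-1}$ term can be coded directly. The sketch assumes the standardized cumulants $\lambda_3$ and $\lambda_4$ are known, and uses $\phi^{(j)}(x) = (-1)^j He_j(x)\phi(x)$ to write the truncation as $\Phi(x) - \phi(x)\left[\frac{\lambda_3}{6\sqrt{n}} He_2(x) + \frac{1}{n}\left(\frac{\lambda_4}{24} He_3(x) + \frac{\lambda_3^2}{72} He_5(x)\right)\right]$. The function name `edgeworth_cdf` is illustrative.

```python
import math


def edgeworth_cdf(x, lam3, lam4, n):
    """Edgeworth approximation to the CDF of a standardized sum, up to 1/n.

    lam3 and lam4 are the standardized third and fourth cumulants of a single
    summand (kappa_r / sigma**r); n is the number of summands.
    """
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # normal density
    Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))         # normal CDF
    he2 = x**2 - 1.0
    he3 = x**3 - 3.0 * x
    he5 = x**5 - 10.0 * x**3 + 15.0 * x
    term_half = lam3 * he2 / (6.0 * math.sqrt(n))             # order n^(-1/2)
    term_one = (lam4 * he3 / 24.0 + lam3**2 * he5 / 72.0) / n  # order n^(-1)
    return Phi - phi * (term_half + term_one)
```

When `lam3` and `lam4` are both zero the function returns the normal distribution function unchanged.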
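We can explicitly write the polynomial $P_m(-D)$ as
$$P_m(-D) = \sum \prod_i \frac{1}{k_i!} \left(\frac{\lambda_{l_i}}{l_i!}\right)^{k_i} (-D)^s,$$
where the summation is over all the integer partitions of $m$ such that $\sum_i i k_i = m$, with $l_i = i + 2$ and $s = \sum_i k_i l_i$.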
For example, if m = 3, then there are three ways to partition this number: 1 + 1 + 1 = 2 + 1 = 3. As such we need to examine three cases:
1 + 1 + 1 = 1 · k1, so we have k1 = 3, l1 = 3, and s = 9.
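1 + 2 = 1 · k1 + 2 · k2, so we have k1 = 1, k2 = 1, l1 = 3, l2 = 4, and s = 7.
3 = 3 · k3, so we have k3 = 1, l3 = 5, and s = 5.
Thus, the required polynomial is
$$P_3(-D) = \frac{1}{3!}\left(\frac{\lambda_3}{3!}\right)^3 (-D)^9 + \frac{1}{1!\,1!}\left(\frac{\lambda_3}{3!}\right)\left(\frac{\lambda_4}{4!}\right)(-D)^7 + \frac{1}{1!}\left(\frac{\lambda_5}{5!}\right)(-D)^5 = -\frac{\lambda_3^3}{1296} D^9 - \frac{\lambda_3 \lambda_4}{144} D^7 - \frac{\lambda_5}{120} D^5.$$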
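The partition bookkeeping above can be automated. The following sketch enumerates the integer partitions of m as multiplicities k_i and, for each, returns the rational coefficient $\prod_i 1/(k_i!\,(l_i!)^{k_i})$, the powers of the $\lambda_{l_i}$, and the derivative order s. The function names are hypothetical, and this is not the algorithm of Blinnikov and Moessner, only a direct transcription of the partition rule.

```python
from fractions import Fraction
from math import factorial


def partitions_as_multiplicities(m, max_part=None):
    """Yield each integer partition of m as a dict {part i: multiplicity k_i}."""
    if max_part is None:
        max_part = m
    if m == 0:
        yield {}
        return
    for part in range(min(m, max_part), 0, -1):
        for k in range(m // part, 0, -1):
            # Use `part` exactly k times, then only strictly smaller parts.
            for rest in partitions_as_multiplicities(m - part * k, part - 1):
                result = {part: k}
                result.update(rest)
                yield result


def edgeworth_polynomial_terms(m):
    """Return (coefficient, {l_i: k_i}, s) for each monomial of P_m(-D).

    Each monomial is coefficient * prod(lambda_{l_i} ** k_i) * (-D) ** s.
    """
    terms = []
    for partition in partitions_as_multiplicities(m):
        coefficient = Fraction(1)
        lambda_powers = {}
        s = 0
        for i, k in partition.items():
            l = i + 2
            coefficient /= factorial(k) * factorial(l) ** k
            lambda_powers[l] = k
            s += k * l
        terms.append((coefficient, lambda_powers, s))
    return terms


for coefficient, lambda_powers, s in edgeworth_polynomial_terms(3):
    print(coefficient, lambda_powers, s)
```

The sign of each monomial comes from $(-D)^s$ and is left to the caller, so the coefficients returned here are all positive.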
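The first few terms of the expansion are
$$f_n(x) = \phi(x) - \frac{1}{n^{1/2}} \frac{\lambda_3}{6} \phi^{(3)}(x) + \frac{1}{n}\left[\frac{\lambda_4}{24} \phi^{(4)}(x) + \frac{\lambda_3^2}{72} \phi^{(6)}(x)\right] - \frac{1}{n^{3/2}}\left[\frac{\lambda_5}{120} \phi^{(5)}(x) + \frac{\lambda_3 \lambda_4}{144} \phi^{(7)}(x) + \frac{\lambda_3^3}{1296} \phi^{(9)}(x)\right] + \cdots$$
Here, $\phi^{(j)}(x)$ is the $j$-th derivative of $\phi(\cdot)$ at the point $x$. Because the derivatives of the standard normal density are related to the density itself by $\phi^{(n)}(x) = (-1)^n He_n(x) \phi(x)$ (where $He_n$ is the Hermite polynomial of order $n$), the expansion can equivalently be written in terms of the density function. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.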
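The relation between the derivatives of the normal density and the Hermite polynomials is easy to check numerically. The sketch below assumes the standard three-term recurrence $He_{k+1}(x) = x\,He_k(x) - k\,He_{k-1}(x)$ for the probabilists' Hermite polynomials. The helper names are illustrative.

```python
import math


def hermite_he(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term recurrence."""
    previous, current = 1.0, x  # He_0 and He_1
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, x * current - k * previous
    return current


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_pdf_derivative(j, x):
    """j-th derivative of the standard normal density at x."""
    return (-1) ** j * hermite_he(j, x) * normal_pdf(x)


# Compare the closed form with a central finite difference for j = 2.
x0, h = 0.5, 1e-4
finite_difference = (normal_pdf(x0 + h) - 2.0 * normal_pdf(x0) + normal_pdf(x0 - h)) / h**2
print(normal_pdf_derivative(2, x0), finite_difference)
```

The two printed values should agree to several decimal places. The finite difference is only a sanity check and loses accuracy for higher derivative orders.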
Note that in the case of lattice distributions (which take discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.
Illustration: density of the sample mean of three χ² variables
Density of the sample mean of three χ² variables. The chart compares the true density, the normal approximation, and two Edgeworth expansions.
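A comparison of this kind can be reproduced numerically. The sketch below is not the code behind the chart: it ASSUMES each χ² variable has 2 degrees of freedom (the text does not state the value), works on the standardized scale, and uses the known cumulants of a χ² variable with k degrees of freedom, $\kappa_r = 2^{r-1}(r-1)!\,k$. All names are illustrative.

```python
import math

K = 2  # ASSUMPTION: degrees of freedom of each chi-square variable
N = 3  # number of variables averaged, as in the caption

# Standardized cumulants lambda_r = kappa_r / sigma**r of one chi-square(K).
LAM3 = 8.0 * K / (2.0 * K) ** 1.5
LAM4 = 48.0 * K / (2.0 * K) ** 2


def normal_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def true_density(x):
    """Exact density of the standardized sum (equivalently, standardized mean)."""
    dof = N * K                      # the sum is chi-square with N*K degrees of freedom
    std = math.sqrt(2.0 * dof)
    s = dof + std * x                # map back to the scale of the sum
    if s <= 0.0:
        return 0.0
    half = dof / 2.0
    log_pdf = (half - 1.0) * math.log(s) - s / 2.0 - half * math.log(2.0) - math.lgamma(half)
    return std * math.exp(log_pdf)


def edgeworth_density(x, order):
    """Edgeworth density including terms up to n^(-order/2), for order 0, 1 or 2."""
    he3 = x**3 - 3.0 * x
    he4 = x**4 - 6.0 * x**2 + 3.0
    he6 = x**6 - 15.0 * x**4 + 45.0 * x**2 - 15.0
    factor = 1.0
    if order >= 1:
        factor += LAM3 * he3 / (6.0 * math.sqrt(N))
    if order >= 2:
        factor += (LAM4 * he4 / 24.0 + LAM3**2 * he6 / 72.0) / N
    return normal_density(x) * factor


for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f} true={true_density(x):.4f} normal={normal_density(x):.4f} "
          f"edgeworth1={edgeworth_density(x, 1):.4f} edgeworth2={edgeworth_density(x, 2):.4f}")
```

Under these assumptions the second-order expansion is much closer to the exact density near the center than the normal approximation is. Agreement in the tails is not guaranteed.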
Edgeworth expansions have several known limitations:
The approximate density need not integrate to 1.
Probabilities can be negative.
They can be inaccurate, especially in the tails, for two main reasons:
They are obtained from a Taylor expansion around the mean.
They guarantee (asymptotically) a bound on the absolute error, not on the relative error. This matters when approximating very small quantities, for which the absolute error may be small while the relative error is large.
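The possibility of negative values is easy to exhibit. The sketch below evaluates the Edgeworth density truncated after the $n^{-1/2}$ term, $\phi(x)\left[1 + \frac{\lambda_3}{6\sqrt{n}} He_3(x)\right]$. The values $\lambda_3 = 2$ and $n = 3$ are hypothetical, chosen only to make the effect visible.

```python
import math


def edgeworth_density_first_order(x, lam3, n):
    """Normal density times the first Edgeworth correction factor."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    he3 = x**3 - 3.0 * x
    return phi * (1.0 + lam3 * he3 / (6.0 * math.sqrt(n)))


# Hypothetical inputs: a strongly skewed summand and a small sample size.
lam3, n = 2.0, 3
for x in (-3.0, -2.5, -2.0, 0.0, 2.0):
    print(f"x={x:+.1f} approx_density={edgeworth_density_first_order(x, lam3, n):+.5f}")
```

In the left tail the correction factor drops below zero for these inputs, so the approximation returns a negative "density" there. The magnitude is small in absolute terms, which is why a bound on absolute error says little about tail probabilities.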