This page uses content from Wikipedia and is licensed under CC BY-SA.

A **whitening transformation** or **sphering transformation** is a linear transformation that transforms a vector of random variables with a known covariance matrix into a set of new variables whose covariance is the identity matrix, meaning that they are uncorrelated and each have variance 1.^{[1]} The transformation is called "whitening" because it changes the input vector into a white noise vector.

Several other transformations are closely related to whitening:

- the
**decorrelation transform**removes only the correlations but leaves variances intact, - the
**standardization transform**sets variances to 1 but leaves correlations intact, - a
**coloring transformation**transforms a vector of white random variables into a random vector with a specified covariance matrix.^{[2]}

Suppose is a random (column) vector with non-singular covariance matrix and mean . Then the transformation with
a **whitening matrix** satisfying the condition yields the whitened random vector with unit diagonal covariance.

There are infinitely many possible whitening matrices that all satisfy the above condition. Commonly used choices are (Mahalanobis or ZCA whitening), the Cholesky decomposition of (Cholesky whitening), or the eigen-system of (PCA whitening).^{[3]}

Optimal whitening transforms can be singled out by investigating the cross-covariance and cross-correlation of and .^{[4]} For example, the unique optimal whitening transformation achieving maximal component-wise correlation between original and whitened is produced by the whitening matrix where is the correlation matrix and the variance matrix.

Whitening a data matrix follows the same transformation as for random variables. An empirical whitening transform is obtained by estimating the covariance (e.g. by maximum likelihood) and subsequently constructing a corresponding estimated whitening matrix (e.g. by Cholesky decomposition).

An implementation of several whitening procedures in R, including ZCA-whitening and PCA whitening but also CCA whitening, is available in the "whitening" R package ^{[5]} published on CRAN.

An example of Python implementation of ZCA whitening ^{[6]}

```
import numpy as np
def zca_whitening_matrix(X):
"""
Function to compute ZCA whitening matrix (aka Mahalanobis whitening).
INPUT: X: [M x N] matrix.
Rows: Variables
Columns: Observations
OUTPUT: ZCAMatrix: [M x M] matrix
"""
# Covariance matrix [column-wise variables]: Sigma = (X-mu)' * (X-mu) / N
sigma = np.cov(X, rowvar=True) # [M x M]
# Singular Value Decomposition. X = U * np.diag(S) * V
U,S,V = np.linalg.svd(sigma)
# U: [M x M] eigenvectors of sigma.
# S: [M x 1] eigenvalues of sigma.
# V: [M x M] transpose of U
# Whitening constant: prevents division by zero
epsilon = 1e-5
# ZCA Whitening matrix: U * Lambda * U'
ZCAMatrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T)) # [M x M]
return ZCAMatrix
```

**^**Koivunen, A.C.; Kostinski, A.B. (1999). "The Feasibility of Data Whitening to Improve Performance of Weather Radar".*American Meteorological Society*. doi:10.1175/1520-0450(1999)038<0741:TFODWT>2.0.CO;2.**^**Hossain, Miliha. "Whitening and Coloring Transforms for Multivariate Gaussian Random Variables". Project Rhea. Retrieved 21 March 2016.**^**Friedman, J. (1987). "Exploratory Projection Pursuit". ISSN 0162-1459. JSTOR 2289161.**^**Kessy, A.; Lewin, A.; Strimmer, K. (2018). "Optimal whitening and decorrelation".*The American Statistician*.**72**: 309–314. arXiv:1512.00809. doi:10.1080/00031305.2016.1277159.**^**"whitening R package". Retrieved 2018-11-25.**^**[stackoverflow.com]

- [courses.media.mit.edu]
- The ZCA whitening transformation. Appendix A of
*Learning Multiple Layers of Features from Tiny Images*by A. Krizhevsky.