Centering matrix
Kind of matrix

In mathematics and multivariate statistics, the centering matrix is a symmetric and idempotent matrix which, when multiplied with a vector, has the same effect as subtracting the mean of the vector's components from every component of that vector.

Definition

The centering matrix of size n is defined as the n-by-n matrix

$$C_n = I_n - \tfrac{1}{n} J_n$$

where $I_n$ is the identity matrix of size n and $J_n$ is an n-by-n matrix of all 1's.

For example

$$C_1 = \begin{bmatrix} 0 \end{bmatrix}, \qquad C_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix},$$

$$C_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix}$$
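The definition can be reproduced with a minimal Python sketch. The function name `centering_matrix` is an illustrative choice, not an established API, and exact fractions are used only so that the entries print as 2/3 and -1/3 rather than rounded decimals.

```python
from fractions import Fraction

def centering_matrix(n):
    """Return C_n = I_n - (1/n) J_n as a list of rows of exact Fractions."""
    diag = Fraction(n - 1, n)   # diagonal entries: 1 - 1/n
    off = Fraction(-1, n)       # off-diagonal entries: -1/n
    return [[diag if i == j else off for j in range(n)] for i in range(n)]

# Print the three examples given above.
for n in (1, 2, 3):
    print(f"C_{n}:")
    for row in centering_matrix(n):
        print("  ", [str(x) for x in row])
```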

Properties

Given a column vector $\mathbf{v}$ of size n, the centering property of $C_n$ can be expressed as

$$C_n \mathbf{v} = \mathbf{v} - \left(\tfrac{1}{n} J_{n,1}^{\mathrm{T}} \mathbf{v}\right) J_{n,1}$$

where $J_{n,1}$ is a column vector of ones and $\tfrac{1}{n} J_{n,1}^{\mathrm{T}} \mathbf{v}$ is the mean of the components of $\mathbf{v}$.
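As a small illustration of the centering property, the following sketch compares the matrix-vector product with direct subtraction of the mean. The function names and the sample vector are assumptions made for this example.

```python
from fractions import Fraction

def center_by_matrix(v):
    """Compute C_n v from the entries of C_n = I_n - (1/n) J_n: O(n^2) work."""
    n = len(v)
    return [
        sum((Fraction(int(i == j)) - Fraction(1, n)) * v[j] for j in range(n))
        for i in range(n)
    ]

def center_directly(v):
    """Subtract the mean from every component: O(n) work."""
    mean = Fraction(sum(v), len(v))
    return [x - mean for x in v]

v = [2, 4, 9]  # the mean of the components is 5
print([str(x) for x in center_by_matrix(v)])
print([str(x) for x in center_directly(v)])
```

Both functions return the same centered vector, whose components sum to zero. The direct form needs only a number of operations proportional to n, whereas the explicit matrix product needs a number proportional to n squared.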

$C_n$ is symmetric positive semi-definite.

$C_n$ is idempotent, so that $C_n^k = C_n$ for $k = 1, 2, \ldots$. Once the mean has been removed, it is zero and removing it again has no effect.

$C_n$ is singular. The effects of applying the transformation $C_n \mathbf{v}$ cannot be reversed.

$C_n$ has the eigenvalue 1 of multiplicity n − 1 and eigenvalue 0 of multiplicity 1.

$C_n$ has a nullspace of dimension 1, along the vector $J_{n,1}$.

$C_n$ is an orthogonal projection matrix. That is, $C_n \mathbf{v}$ is a projection of $\mathbf{v}$ onto the (n − 1)-dimensional subspace that is orthogonal to the nullspace spanned by $J_{n,1}$. (This is the subspace of all n-vectors whose components sum to zero.)

The trace of $C_n$ is $n(n-1)/n = n-1$.
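Several of these properties can be checked exactly for a particular size. The sketch below uses n = 4; the helper names (`centering_matrix`, `matmul`, `matvec`) are illustrative. Rather than calling an eigenvalue routine, it checks the eigenvalue claims through eigenvectors: the all-ones vector is mapped to zero, and n − 1 linearly independent zero-sum vectors are left unchanged.

```python
from fractions import Fraction

def centering_matrix(n):
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

n = 4
C = centering_matrix(n)

symmetric = C == [list(col) for col in zip(*C)]      # C equals its transpose
idempotent = matmul(C, C) == C                       # C C = C
trace = sum(C[i][i] for i in range(n))               # expected: n - 1
ones_in_nullspace = matvec(C, [1] * n) == [0] * n    # eigenvalue 0

# e_i - e_n for i < n: n - 1 linearly independent vectors that sum to zero.
# Each is left unchanged by C, so each is an eigenvector with eigenvalue 1.
zero_sum_basis = [[int(j == i) - int(j == n - 1) for j in range(n)]
                  for i in range(n - 1)]
fixed = all(matvec(C, w) == w for w in zero_sum_basis)

print(symmetric, idempotent, trace == n - 1, ones_in_nullspace, fixed)
```

Positive semi-definiteness follows from the first two checks: for a symmetric idempotent matrix, $\mathbf{v}^{\mathrm{T}} C_n \mathbf{v} = \lVert C_n \mathbf{v} \rVert^2 \geq 0$.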

Application

Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is a convenient analytical tool. It can be used not only to remove the mean of a single vector, but also of multiple vectors stored in the rows or columns of an m-by-n matrix $X$.

The left multiplication by $C_m$ subtracts a corresponding mean value from each of the n columns, so that each column of the product $C_m X$ has a zero mean. Similarly, the multiplication by $C_n$ on the right subtracts a corresponding mean value from each of the m rows, and each row of the product $X C_n$ has a zero mean. The multiplication on both sides creates a doubly centred matrix $C_m X C_n$, whose row and column means are equal to zero.
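The three products can be illustrated on a small matrix. The 2-by-3 matrix below and the helper names are made up for this sketch.

```python
from fractions import Fraction

def centering_matrix(n):
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def row_sums(a):
    return [sum(row) for row in a]

def col_sums(a):
    return [sum(col) for col in zip(*a)]

X = [[1, 2, 6],
     [3, 8, 4]]                  # m = 2 rows, n = 3 columns
m, n = len(X), len(X[0])

col_centered = matmul(centering_matrix(m), X)                 # C_m X
row_centered = matmul(X, centering_matrix(n))                 # X C_n
doubly_centered = matmul(col_centered, centering_matrix(n))   # C_m X C_n

print(col_sums(col_centered))      # every column sums to zero
print(row_sums(row_centered))      # every row sums to zero
print(row_sums(doubly_centered), col_sums(doubly_centered))
```

Here the column means of X are 2, 5 and 5, and the row means are 3 and 5, so the centered entries are all integers.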

The centering matrix provides in particular a succinct way to express the scatter matrix, $S = (X - \mu J_{n,1}^{\mathrm{T}})(X - \mu J_{n,1}^{\mathrm{T}})^{\mathrm{T}}$ of a data sample $X$, where $\mu = \tfrac{1}{n} X J_{n,1}$ is the sample mean. The centering matrix allows us to express the scatter matrix more compactly as

$$S = X C_n (X C_n)^{\mathrm{T}} = X C_n C_n X^{\mathrm{T}} = X C_n X^{\mathrm{T}}.$$
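The agreement between the two expressions for the scatter matrix can be checked on a small example. The sketch assumes, as the formula for $\mu$ implies, that the n observations are stored as the columns of $X$; the data values and helper names are illustrative.

```python
from fractions import Fraction

def centering_matrix(n):
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

X = [[1, 2, 6],
     [3, 8, 4]]      # 2 variables (rows), n = 3 observations (columns)
n = len(X[0])

# Definition: subtract the sample mean mu from every column, then form D D^T.
mu = [Fraction(sum(row), n) for row in X]
D = [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]
S_definition = matmul(D, transpose(D))

# Compact form: S = X C_n X^T.
S_compact = matmul(matmul(X, centering_matrix(n)), transpose(X))

print(S_definition == S_compact)
```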

$C_n$ is the covariance matrix of the multinomial distribution, in the special case where the parameters of that distribution are $k = n$ and $p_1 = p_2 = \cdots = p_n = \frac{1}{n}$.
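A short derivation makes this explicit. It reads $k$ as the number of trials and writes $\mathbf{p} = (p_1, \ldots, p_n)^{\mathrm{T}}$ for the probability vector; both the symbol $\mathbf{p}$ and the symbol $\Sigma$ for the covariance matrix are notation introduced here. The multinomial covariance matrix has entries $k\,p_i(1 - p_i)$ on the diagonal and $-k\,p_i p_j$ off the diagonal, so substituting $k = n$ and $\mathbf{p} = \tfrac{1}{n} J_{n,1}$ gives

$$\Sigma = k\left(\operatorname{diag}(\mathbf{p}) - \mathbf{p}\,\mathbf{p}^{\mathrm{T}}\right) = n\left(\tfrac{1}{n} I_n - \tfrac{1}{n^2} J_{n,1} J_{n,1}^{\mathrm{T}}\right) = I_n - \tfrac{1}{n} J_n = C_n.$$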

References

  1. John I. Marden, Analyzing and Modeling Rank Data, Chapman & Hall, 1995, ISBN 0-412-99521-2, page 59.