Squared deviations from the mean (SDM) result from squaring deviations, that is, the differences between individual values and their mean. In probability theory and statistics, the definition of variance is either the expected value of the SDM (when considering a theoretical distribution) or its average value (for actual experimental data). Computations for analysis of variance involve the partitioning of a sum of SDM.
Background
An understanding of the computations involved is greatly enhanced by a study of the statistical value E(X²), where E is the expected value operator.

For a random variable X with mean μ and variance σ²,

\sigma^{2} = \operatorname{E}(X^{2}) - \mu^{2}.[1]

(This follows from expanding E((X − μ)²) = E(X²) − 2μE(X) + μ².) Therefore,

\operatorname{E}(X^{2}) = \sigma^{2} + \mu^{2}.

From the above, the following can be derived:

\operatorname{E}\left(\sum\left(X^{2}\right)\right) = n\sigma^{2} + n\mu^{2},

\operatorname{E}\left(\left(\sum X\right)^{2}\right) = n\sigma^{2} + n^{2}\mu^{2}.
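As a quick numerical check of the two derived expectations, the following minimal Python sketch (standard library only) averages Σ X² and (Σ X)² over many simulated samples and compares the results with nσ² + nμ² and nσ² + n²μ². The normal distribution and the particular values of μ, σ, n and the trial count are arbitrary illustrative choices.

```python
import random

# Arbitrary illustrative parameters (any distribution with finite mean and
# variance would do; a normal distribution is used here for convenience).
mu, sigma, n, trials = 2.0, 3.0, 5, 200_000

sum_sq_total = 0.0   # running total of sum(X^2) across trials
sq_sum_total = 0.0   # running total of (sum X)^2 across trials
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    sum_sq_total += sum(x * x for x in xs)
    sq_sum_total += sum(xs) ** 2

# Monte Carlo averages versus the theoretical values.
print(sum_sq_total / trials, n * sigma**2 + n * mu**2)      # both ≈ 65
print(sq_sum_total / trials, n * sigma**2 + n**2 * mu**2)   # both ≈ 145
```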
Sample variance
Main article: Sample variance
The sum of squared deviations needed to calculate sample variance (before deciding whether to divide by n or n − 1) is most easily calculated as
S = \sum x^{2} - \frac{\left(\sum x\right)^{2}}{n}

From the two derived expectations above, the expected value of this sum is

\operatorname{E}(S) = n\sigma^{2} + n\mu^{2} - \frac{n\sigma^{2} + n^{2}\mu^{2}}{n},

which implies

\operatorname{E}(S) = (n - 1)\sigma^{2}.

This effectively proves the use of the divisor n − 1 in the calculation of an unbiased sample estimate of σ².
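The shortcut formula can be checked directly. The following minimal Python sketch computes S both as Σx² − (Σx)²/n and as the sum of squared deviations from the sample mean, using the five observations that appear in the example further below, and compares S/(n − 1) with the standard library's unbiased sample variance.

```python
import statistics

x = [1, 2, 3, 4, 6]   # the five observations used in the example below
n = len(x)

# Shortcut form: S = sum(x^2) - (sum x)^2 / n
S_shortcut = sum(v * v for v in x) - sum(x) ** 2 / n

# Direct form: sum of squared deviations from the sample mean
mean = sum(x) / n
S_direct = sum((v - mean) ** 2 for v in x)

print(S_shortcut, S_direct)      # both 14.8 (up to floating-point rounding)
print(S_shortcut / (n - 1))      # ≈ 3.7, the unbiased sample variance
print(statistics.variance(x))    # 3.7, also uses the divisor n - 1
```

The two forms agree algebraically, so any difference in the printed values is only floating-point rounding.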
Partition — analysis of variance
Main article: Partition of sums of squares
When data is available for k different treatment groups of size n_i, where i varies from 1 to k, it is assumed that the expected mean of each group is

\operatorname{E}(\mu_{i}) = \mu + T_{i}

and that the variance of each treatment group is unchanged from the population variance σ².

Under the null hypothesis that the treatments have no effect, each of the T_i will be zero.
It is now possible to calculate three sums of squares:
Individual:

I = \sum x^{2}

\operatorname{E}(I) = n\sigma^{2} + n\mu^{2}

Treatments (where the inner sum runs over the observations in treatment group i):

T = \sum_{i=1}^{k}\left(\left(\sum x\right)^{2}/n_{i}\right)

\operatorname{E}(T) = k\sigma^{2} + \sum_{i=1}^{k}n_{i}(\mu + T_{i})^{2}

\operatorname{E}(T) = k\sigma^{2} + n\mu^{2} + 2\mu\sum_{i=1}^{k}(n_{i}T_{i}) + \sum_{i=1}^{k}n_{i}(T_{i})^{2}

Under the null hypothesis that the treatments cause no differences and all the T_i are zero, the expectation simplifies to

\operatorname{E}(T) = k\sigma^{2} + n\mu^{2}.

Combination:

C = \left(\sum x\right)^{2}/n

\operatorname{E}(C) = \sigma^{2} + n\mu^{2}

Sums of squared deviations
Under the null hypothesis, the difference of any pair of I, T, and C does not contain any dependency on μ, only on σ².

\operatorname{E}(I - C) = (n - 1)\sigma^{2}  (total squared deviations, also known as the total sum of squares)

\operatorname{E}(T - C) = (k - 1)\sigma^{2}  (treatment squared deviations, also known as the explained sum of squares)

\operatorname{E}(I - T) = (n - k)\sigma^{2}  (residual squared deviations, also known as the residual sum of squares)

The constants (n − 1), (k − 1), and (n − k) are normally referred to as the number of degrees of freedom.
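The following minimal Python sketch (standard library only) simulates repeated samples under the null hypothesis and checks that the averages of I − C, T − C and I − T come out close to (n − 1)σ², (k − 1)σ² and (n − k)σ². The group sizes, μ, σ and the number of trials are arbitrary illustrative choices.

```python
import random

sizes = [3, 2]                       # group sizes n_i, so n = 5 and k = 2
mu, sigma, trials = 2.0, 3.0, 100_000
n, k = sum(sizes), len(sizes)

tot_ic = tot_tc = tot_it = 0.0
for _ in range(trials):
    # Under the null hypothesis every group shares the same mean mu.
    groups = [[random.gauss(mu, sigma) for _ in range(m)] for m in sizes]
    all_x = [v for g in groups for v in g]
    I = sum(v * v for v in all_x)                    # individual sum of squares
    T = sum(sum(g) ** 2 / len(g) for g in groups)    # treatment sum of squares
    C = sum(all_x) ** 2 / n                          # combination
    tot_ic += I - C
    tot_tc += T - C
    tot_it += I - T

print(tot_ic / trials, (n - 1) * sigma**2)   # both ≈ 36
print(tot_tc / trials, (k - 1) * sigma**2)   # both ≈ 9
print(tot_it / trials, (n - k) * sigma**2)   # both ≈ 27
```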
Example
In a very simple example, 5 observations arise from two treatments. The first treatment gives the three values 1, 2, and 3, and the second treatment gives the two values 4 and 6.
I = \frac{1^{2}}{1} + \frac{2^{2}}{1} + \frac{3^{2}}{1} + \frac{4^{2}}{1} + \frac{6^{2}}{1} = 66

T = \frac{(1+2+3)^{2}}{3} + \frac{(4+6)^{2}}{2} = 12 + 50 = 62

C = \frac{(1+2+3+4+6)^{2}}{5} = 256/5 = 51.2

Giving

Total squared deviations = 66 − 51.2 = 14.8 with 4 degrees of freedom.

Treatment squared deviations = 62 − 51.2 = 10.8 with 1 degree of freedom.

Residual squared deviations = 66 − 62 = 4 with 3 degrees of freedom.
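The same arithmetic can be reproduced with a short Python sketch; the group layout matches the example above, and the variable names are chosen only for illustration.

```python
groups = [[1, 2, 3], [4, 6]]       # the two treatments from the example
all_x = [v for g in groups for v in g]
n, k = len(all_x), len(groups)

I = sum(v * v for v in all_x)                    # 66
T = sum(sum(g) ** 2 / len(g) for g in groups)    # 62.0
C = sum(all_x) ** 2 / n                          # 51.2

print(I - C, "total,     df =", n - 1)   # ≈ 14.8, df = 4
print(T - C, "treatment, df =", k - 1)   # ≈ 10.8, df = 1
print(I - T, "residual,  df =", n - k)   # 4.0,   df = 3
```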
Two-way analysis of variance

This section is an excerpt from Two-way analysis of variance.
In statistics, the two-way analysis of variance (ANOVA) is an extension of the one-way ANOVA that examines the influence of two different categorical independent variables on one continuous dependent variable. The two-way ANOVA not only aims to assess the main effect of each independent variable but also to determine whether there is any interaction between them.

See also
- Absolute deviation
- Algorithms for calculating variance
- Errors and residuals
- Least squares
- Mean squared error
- Residual sum of squares
- Root mean square deviation
- Variance decomposition of forecast errors
References
1. Mood & Graybill, An Introduction to the Theory of Statistics, McGraw-Hill.