In a model with a single explanatory variable, RSS is given by:[1]

$$\operatorname{RSS} = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2,$$
where $y_i$ is the $i$th value of the variable to be predicted, $x_i$ is the $i$th value of the explanatory variable, and $f(x_i)$ is the predicted value of $y_i$ (also termed $\hat{y_i}$). In the standard simple linear regression model, $y_i = \alpha + \beta x_i + \varepsilon_i$, where $\alpha$ and $\beta$ are coefficients, $y$ and $x$ are the regressand and the regressor, respectively, and $\varepsilon$ is the error term. The sum of squares of residuals is the sum of squares of the estimated residuals $\hat{\varepsilon}_i$; that is,

$$\operatorname{RSS} = \sum_{i=1}^{n} \hat{\varepsilon}_i^{\,2} = \sum_{i=1}^{n} \left(y_i - (\hat{\alpha} + \hat{\beta} x_i)\right)^2,$$
where $\hat{\alpha}$ is the estimated value of the constant term $\alpha$ and $\hat{\beta}$ is the estimated value of the slope coefficient $\beta$.
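As a concrete illustration, the following is a minimal Python/NumPy sketch that fits a simple linear regression by ordinary least squares and computes the RSS as the sum of squared estimated residuals. The data and the variable names (`x`, `y`, `alpha_hat`, `beta_hat`) are hypothetical and not taken from the source.

```python
import numpy as np

# Hypothetical data: y depends roughly linearly on x.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)

# OLS estimates of the intercept and slope in y_i = alpha + beta*x_i + eps_i.
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# RSS: sum of squared estimated residuals.
residuals = y - (alpha_hat + beta_hat * x)
rss = np.sum(residuals ** 2)
print(rss)
```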
The general regression model with $n$ observations and $k$ explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is

$$y = X\beta + e,$$
where $y$ is an $n \times 1$ vector of dependent variable observations, each column of the $n \times k$ matrix $X$ is a vector of observations on one of the $k$ explanators, $\beta$ is a $k \times 1$ vector of true coefficients, and $e$ is an $n \times 1$ vector of the true underlying errors. The ordinary least squares estimator for $\beta$ is

$$\hat{\beta} = (X^{\operatorname{T}} X)^{-1} X^{\operatorname{T}} y.$$
The residual vector is $\hat{e} = y - X\hat{\beta} = y - X(X^{\operatorname{T}} X)^{-1} X^{\operatorname{T}} y$, so the residual sum of squares is

$$\operatorname{RSS} = \hat{e}^{\operatorname{T}} \hat{e} = \|\hat{e}\|^2$$
(equivalent to the square of the norm of residuals). In full:

$$\operatorname{RSS} = y^{\operatorname{T}} y - y^{\operatorname{T}} X (X^{\operatorname{T}} X)^{-1} X^{\operatorname{T}} y = y^{\operatorname{T}} \left[I - X (X^{\operatorname{T}} X)^{-1} X^{\operatorname{T}}\right] y = y^{\operatorname{T}} (I - H)\, y,$$
where $H = X(X^{\operatorname{T}} X)^{-1} X^{\operatorname{T}}$ is the hat matrix, or the projection matrix, in linear regression.
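As a sketch of the matrix form, the following Python/NumPy snippet computes the OLS estimator and checks that the squared norm of the residual vector agrees with the quadratic form $y^{\operatorname{T}}(I - H)\,y$. The design matrix and data are hypothetical, introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3

# Design matrix whose first column is the constant unit vector.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# OLS estimator beta_hat = (X^T X)^{-1} X^T y, computed via a linear solve.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# RSS as the squared norm of the residual vector e_hat = y - X beta_hat.
e_hat = y - X @ beta_hat
rss_residual = e_hat @ e_hat

# RSS via the hat (projection) matrix: y^T (I - H) y.
H = X @ np.linalg.solve(X.T @ X, X.T)
rss_hat = y @ (np.eye(n) - H) @ y

print(rss_residual, rss_hat)  # the two expressions agree up to floating-point error
```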
The least-squares regression line is given by

$$y = ax + b,$$
where $b = \bar{y} - a\bar{x}$ and $a = \frac{S_{xy}}{S_{xx}}$, with $S_{xy} = \sum_{i=1}^{n} (\bar{x} - x_i)(\bar{y} - y_i)$ and $S_{xx} = \sum_{i=1}^{n} (\bar{x} - x_i)^2$.
Therefore,

$$\operatorname{RSS} = \sum_{i=1}^{n} \left(y_i - (a x_i + b)\right)^2 = \sum_{i=1}^{n} \left(a(\bar{x} - x_i) - (\bar{y} - y_i)\right)^2 = a^2 S_{xx} - 2a S_{xy} + S_{yy} = S_{yy} - a S_{xy} = S_{yy}\left(1 - \frac{S_{xy}^2}{S_{xx} S_{yy}}\right),$$
where $S_{yy} = \sum_{i=1}^{n} (\bar{y} - y_i)^2$.
The Pearson product-moment correlation coefficient is given by $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$; therefore, $\operatorname{RSS} = S_{yy}(1 - r^2)$.
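This identity can be checked numerically. The following Python/NumPy sketch, using hypothetical data, computes $S_{xy}$, $S_{xx}$, $S_{yy}$ and the fitted line, then compares the directly computed RSS with $S_{yy}(1 - r^2)$.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 3.0 - 1.5 * x + rng.normal(scale=0.5, size=200)

# Sums of squares and cross-products about the means.
Sxy = np.sum((x.mean() - x) * (y.mean() - y))
Sxx = np.sum((x.mean() - x) ** 2)
Syy = np.sum((y.mean() - y) ** 2)

# Least-squares regression line y = a*x + b.
a = Sxy / Sxx
b = y.mean() - a * x.mean()

# Direct RSS versus the identity RSS = Syy * (1 - r^2).
rss_direct = np.sum((y - (a * x + b)) ** 2)
r = Sxy / np.sqrt(Sxx * Syy)
print(rss_direct, Syy * (1 - r ** 2))  # agree up to floating-point error
```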
1. Archdeacon, Thomas J. (1994). Correlation and Regression Analysis: A Historian's Guide. University of Wisconsin Press. pp. 161–162. ISBN 0-299-13650-7. OCLC 27266095.