Given a data set $\{y_i,\, x_{i1}, \ldots, x_{ip}\}_{i=1}^{n}$ of $n$ statistical units, where $\{x_{i1}, \ldots, x_{ip}\}_{i=1}^{n}$ represent predictors and $y_i$ is the outcome, the additive model takes the form

$$\mathrm{E}[y_i \mid x_{i1}, \ldots, x_{ip}] = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij})$$

or

$$Y = \beta_0 + \sum_{j=1}^{p} f_j(X_j) + \epsilon,$$

where $\mathrm{E}[\epsilon] = 0$, $\mathrm{Var}(\epsilon) = \sigma^2$ and $\mathrm{E}[f_j(X_j)] = 0$. The functions $f_j(x_{ij})$ are unknown smooth functions to be estimated from the data. Fitting the AM (i.e. the functions $f_j(x_{ij})$) can be done using the backfitting algorithm proposed by Andreas Buja, Trevor Hastie and Robert Tibshirani (1989).[2]
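Backfitting estimates each $f_j$ by repeatedly smoothing the partial residuals of $y$ against $x_j$ while holding the other components fixed. The following is a minimal sketch in Python, not the cited authors' implementation: it uses a simple $k$-nearest-neighbour running-mean smoother as a stand-in for the spline or LOESS smoothers used in practice, and the names `backfit` and `running_mean` are illustrative.

```python
import numpy as np

def backfit(X, y, smoother, n_iter=20, tol=1e-6):
    """Backfitting for an additive model y = beta0 + sum_j f_j(x_j) + eps.

    X : (n, p) array of predictors
    y : (n,) array of outcomes
    smoother : 1-D scatterplot smoother, (x, r) -> fitted values
    """
    n, p = X.shape
    beta0 = y.mean()              # intercept: mean of the outcome
    f = np.zeros((n, p))          # current estimates of f_j(x_ij)
    for _ in range(n_iter):
        f_old = f.copy()
        for j in range(p):
            # partial residuals: remove intercept and all other components
            r = y - beta0 - f[:, np.arange(p) != j].sum(axis=1)
            fj = smoother(X[:, j], r)
            f[:, j] = fj - fj.mean()   # center to enforce E[f_j] = 0
        if np.max(np.abs(f - f_old)) < tol:
            break
    return beta0, f

def running_mean(x, r, k=15):
    """Crude k-nearest-neighbour running-mean smoother (illustrative only)."""
    order = np.argsort(x)
    out = np.empty_like(r)
    for rank, i in enumerate(order):
        lo, hi = max(0, rank - k // 2), min(len(x), rank + k // 2 + 1)
        out[i] = r[order[lo:hi]].mean()
    return out

# Example: recover two additive components from simulated data
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))
y = 1.0 + np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.3, 500)
beta0, f = backfit(X, y, running_mean)
```

Centering each fitted $\hat{f}_j$ after every update enforces the identifiability constraint $\mathrm{E}[f_j(X_j)] = 0$, which is why the intercept can be estimated once as the sample mean of $y$.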
Friedman, J.H. and Stuetzle, W. (1981). "Projection Pursuit Regression", Journal of the American Statistical Association 76:817–823. doi:10.1080/01621459.1981.10477729
Buja, A., Hastie, T., and Tibshirani, R. (1989). "Linear Smoothers and Additive Models", The Annals of Statistics 17(2):453–555. JSTOR 2241560