Many different semiparametric regression methods have been proposed and developed. The most popular methods are the partially linear, single index and varying coefficient models.
A partially linear model is given by

$$Y_i = X_i'\beta + g(Z_i) + u_i, \qquad i = 1, \ldots, n,$$
where $Y_i$ is the dependent variable, $X_i$ is a $p \times 1$ vector of explanatory variables, $\beta$ is a $p \times 1$ vector of unknown parameters and $Z_i \in \mathbb{R}^q$. The parametric part of the partially linear model is given by the parameter vector $\beta$, while the nonparametric part is the unknown function $g(Z_i)$. The data are assumed to be i.i.d. with $E(u_i \mid X_i, Z_i) = 0$, and the model allows for a conditionally heteroskedastic error process $E(u_i^2 \mid x, z) = \sigma^2(x, z)$ of unknown form. This type of model was proposed by Robinson (1988) and extended to handle categorical covariates by Racine and Li (2007).
This method is implemented by obtaining a $\sqrt{n}$-consistent estimator of $\beta$ and then deriving an estimator of $g(Z_i)$ from the nonparametric regression of $Y_i - X_i'\hat{\beta}$ on $z$ using an appropriate nonparametric regression method.
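As an illustration, the two-step idea above can be sketched with a simple Gaussian-kernel Nadaraya–Watson smoother: partial $Z$ out of both $Y$ and $X$, estimate $\beta$ from the residual-on-residual regression, then recover $g$. The data-generating process, bandwidth, and function names below are illustrative assumptions, not part of the method's specification:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: Y_i = X_i * beta + g(Z_i) + u_i with beta = 2, g(z) = sin(2*pi*z)
n = 500
Z = rng.uniform(0.0, 1.0, n)
X = Z + rng.normal(0.0, 1.0, n)   # X is correlated with Z
Y = 2.0 * X + np.sin(2.0 * np.pi * Z) + rng.normal(0.0, 0.1, n)

def nw(z0, z, v, h):
    """Nadaraya-Watson kernel regression of v on z, evaluated at points z0."""
    w = np.exp(-0.5 * ((z[None, :] - z0[:, None]) / h) ** 2)
    return (w @ v) / w.sum(axis=1)

h = 0.05
# Partial Z out of both Y and X, then regress residual on residual;
# the slope is a consistent estimate of beta.
eY = Y - nw(Z, Z, Y, h)
eX = X - nw(Z, Z, X, h)
beta_hat = (eX @ eY) / (eX @ eX)

# Recover g from the nonparametric regression of Y - X * beta_hat on Z
g_hat = nw(Z, Z, Y - X * beta_hat, h)
```

With these simulated data, `beta_hat` should land close to the true coefficient 2; the leave-in smoother and fixed bandwidth keep the sketch short at the cost of some finite-sample bias.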
A single index model takes the form

$$Y = g(X'\beta_0) + u,$$
where $Y$, $X$ and $\beta_0$ are defined as earlier and the error term $u$ satisfies $E(u \mid X) = 0$. The single index model takes its name from the parametric part of the model, $x'\beta$, which is a scalar single index. The nonparametric part is the unknown function $g(\cdot)$.
The single index model method developed by Ichimura (1993) is as follows. Consider the situation in which $y$ is continuous. Given a known form for the function $g(\cdot)$, $\beta_0$ could be estimated using the nonlinear least squares method to minimize the function

$$\sum_{i=1}^{n} \left(Y_i - g(X_i'\beta)\right)^2.$$
Since the functional form of $g(\cdot)$ is not known, it must be estimated. For a given value of $\beta$, an estimate of the function

$$G(X_i'\beta) = E(Y_i \mid X_i'\beta)$$

can be obtained using kernel methods. Ichimura (1993) proposes estimating $g(X_i'\beta)$ with

$$\hat{G}_{-i}(X_i'\beta) = \frac{\sum_{j \neq i} Y_j \, K\!\left((X_j - X_i)'\beta / h\right)}{\sum_{j \neq i} K\!\left((X_j - X_i)'\beta / h\right)},$$

the leave-one-out nonparametric kernel estimator of $G(X_i'\beta)$.
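A rough numerical sketch of this estimation strategy, with a quadratic link function and a grid search over a single free coefficient standing in for a proper optimizer; the data-generating process, bandwidth, and scale normalization are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated single index data: Y = g(X'beta0) + u with g(t) = t**2 (chosen for illustration)
n = 400
X = rng.normal(0.0, 1.0, (n, 2))
beta0 = np.array([1.0, 0.5])      # first coefficient normalized to 1 for identification
Y = (X @ beta0) ** 2 + rng.normal(0.0, 0.1, n)

def loo_kernel(index, Y, h):
    """Leave-one-out Nadaraya-Watson estimate of E[Y | index] at each index value."""
    d = (index[:, None] - index[None, :]) / h
    K = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(K, 0.0)      # drop observation i from its own estimate
    return (K @ Y) / K.sum(axis=1)

def objective(b2, h=0.3):
    """Ichimura's least-squares criterion as a function of the free coefficient."""
    G = loo_kernel(X @ np.array([1.0, b2]), Y, h)
    return np.mean((Y - G) ** 2)

# Grid search over the free coefficient (real implementations use numerical
# optimization and trim observations with sparse neighborhoods)
grid = np.linspace(-1.0, 2.0, 301)
b2_hat = grid[np.argmin([objective(b) for b in grid])]
```

Minimizing the criterion over the grid should recover the free coefficient near its true value of 0.5, despite $g(\cdot)$ never being specified to the estimator.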
If the dependent variable $y$ is binary and $X_i$ and $u_i$ are assumed to be independent, Klein and Spady (1993) propose a technique for estimating $\beta$ using maximum likelihood methods. The log-likelihood function is given by

$$L(\beta) = \sum_{i} Y_i \ln \hat{g}_{-i}(X_i'\beta) + \sum_{i} \left(1 - Y_i\right) \ln\!\left(1 - \hat{g}_{-i}(X_i'\beta)\right),$$

where $\hat{g}_{-i}(X_i'\beta)$ is the leave-one-out estimator.
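A minimal sketch of this likelihood in code, again with a grid search in place of a real optimizer; the logistic data-generating process, bandwidth, clipping constant, and scale normalization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated binary-choice data with P(Y=1 | X) = logistic(X'beta0)
n = 600
X = rng.normal(0.0, 1.0, (n, 2))
beta0 = np.array([1.0, -0.7])     # first coefficient normalized to 1
p = 1.0 / (1.0 + np.exp(-(X @ beta0)))
Y = (rng.uniform(size=n) < p).astype(float)

def loo_prob(index, Y, h):
    """Leave-one-out kernel estimate of P(Y=1 | X'beta) at each observation."""
    d = (index[:, None] - index[None, :]) / h
    K = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(K, 0.0)
    g = (K @ Y) / K.sum(axis=1)
    return np.clip(g, 1e-3, 1.0 - 1e-3)   # keep the log-likelihood finite

def loglik(b2, h=0.3):
    g = loo_prob(X @ np.array([1.0, b2]), Y, h)
    return np.sum(Y * np.log(g) + (1.0 - Y) * np.log(1.0 - g))

grid = np.linspace(-2.0, 1.0, 301)
b2_hat = grid[np.argmax([loglik(b) for b in grid])]
```

Maximizing the semiparametric likelihood should place the free coefficient near its true value of $-0.7$, without assuming the logistic link used to generate the data.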
Hastie and Tibshirani (1993) propose a smooth coefficient model given by

$$Y_i = \alpha(Z_i) + X_i'\beta(Z_i) + u_i,$$
where $X_i$ is a $k \times 1$ vector and $\beta(z)$ is a vector of unspecified smooth functions of $z$.
Defining $W_i = (1, X_i')'$ and $\gamma(z) = \left(\alpha(z), \beta(z)'\right)'$, $\gamma(\cdot)$ may be expressed as

$$\gamma(z) = \left(E\left[W_i W_i' \mid Z_i = z\right]\right)^{-1} E\left[W_i Y_i \mid Z_i = z\right].$$
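This conditional-moment representation suggests a local least squares estimator: replace the conditional expectations with kernel-weighted sample averages around $z$. A sketch under assumed data (the functions $\alpha(z) = \sin z$ and $\beta(z) = 1 + z$, the bandwidth, and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data: Y_i = alpha(Z_i) + X_i * beta(Z_i) + u_i
# with alpha(z) = sin(z) and beta(z) = 1 + z
n = 800
Z = rng.uniform(0.0, 1.0, n)
X = rng.normal(0.0, 1.0, n)
Y = np.sin(Z) + X * (1.0 + Z) + rng.normal(0.0, 0.1, n)

W = np.column_stack([np.ones(n), X])   # W_i = (1, X_i')'

def gamma_hat(z0, h=0.1):
    """Kernel-weighted least squares estimate of gamma(z0) = (alpha(z0), beta(z0))'."""
    k = np.exp(-0.5 * ((Z - z0) / h) ** 2)
    A = (W * k[:, None]).T @ W         # ~ E[W W' | Z = z0], up to normalization
    b = (W * k[:, None]).T @ Y         # ~ E[W Y  | Z = z0], up to normalization
    return np.linalg.solve(A, b)

alpha_hat, beta_hat = gamma_hat(0.5)
```

Evaluated at $z = 0.5$, the estimates should be close to $\alpha(0.5) = \sin(0.5)$ and $\beta(0.5) = 1.5$; sweeping `gamma_hat` over a grid of $z$ values traces out the full coefficient curves.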
See Li and Racine (2007) for an in-depth look at nonparametric regression methods.