L1-norm principal component analysis

L1-norm principal component analysis (L1-PCA) is a general method for multivariate data analysis.
L1-PCA is often preferred over standard L2-norm <a href="/facts/Principal_component_analysis/pCbVTqdR">principal component analysis</a> (PCA) when the analyzed data may contain <a href="/facts/Outlier/VBhIzGbx">outliers</a> (faulty values or corruptions), as it is regarded as more <a href="/facts/Robust_statistics/RwIMvtD2">robust</a> to them.
Both L1-PCA and standard PCA seek a collection of <a href="/facts/Orthogonality/JVIjYPog">orthogonal</a> directions (principal components) that define a <a href="/facts/Linear_subspace/2wtKTgHL">subspace</a> wherein data representation is maximized according to the selected criterion.
Standard PCA quantifies data representation as the aggregate of the squared <a href="/facts/Norm_(mathematics)/xIbR4uE1">L2-norm</a> of the data point <a href="/facts/Projection_(linear_algebra)/sElXGkxD">projections</a> into the subspace; maximizing it is equivalent to minimizing the aggregate squared <a href="/facts/Euclidean_distance/9qDoQKQe">Euclidean distance</a> of the original points from their subspace-projected representations.
L1-PCA instead uses the aggregate of the L1-norm of the data point projections into the subspace. In both PCA and L1-PCA, the number of principal components (PCs) is chosen lower than the <a href="/facts/Rank_(linear_algebra)/Yp9b5LK3">rank</a> of the analyzed matrix, which coincides with the dimensionality of the space defined by the original data points.
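The two criteria can be written side by side. The notation here is an assumption introduced for illustration and does not come from the text above: the N data points are the columns of a matrix X of size D × N, K is the number of components sought, and Q holds the K orthonormal components as its columns.

```latex
\begin{aligned}
\text{PCA:}\quad
  & \mathbf{Q}_{L2}
    = \underset{\mathbf{Q}\in\mathbb{R}^{D\times K},\ \mathbf{Q}^{\top}\mathbf{Q}=\mathbf{I}_K}{\arg\max}\ \bigl\|\mathbf{X}^{\top}\mathbf{Q}\bigr\|_F^{2}
    = \underset{\mathbf{Q}\in\mathbb{R}^{D\times K},\ \mathbf{Q}^{\top}\mathbf{Q}=\mathbf{I}_K}{\arg\min}\ \bigl\|\mathbf{X}-\mathbf{Q}\mathbf{Q}^{\top}\mathbf{X}\bigr\|_F^{2} \\[4pt]
\text{L1-PCA:}\quad
  & \mathbf{Q}_{L1}
    = \underset{\mathbf{Q}\in\mathbb{R}^{D\times K},\ \mathbf{Q}^{\top}\mathbf{Q}=\mathbf{I}_K}{\arg\max}\ \bigl\|\mathbf{X}^{\top}\mathbf{Q}\bigr\|_{1}
\end{aligned}
```

Here the Frobenius norm squared sums the squares of all entries, while the L1-norm sums their absolute values. The two forms of the PCA problem agree because, for any orthonormal Q, the squared Frobenius norm of X splits into the projected part plus the residual part.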
Therefore, PCA and L1-PCA are commonly employed for <a href="/facts/Dimensionality_reduction/XoRpcrB0">dimensionality reduction</a> for the purpose of data denoising or compression.
Among the advantages of standard PCA that contributed to its high popularity are <a href="/facts/Computational_complexity/Qenh22Ja">low-cost</a> computational implementation by means of <a href="/facts/Singular-value_decomposition/8t3v6wk3">singular-value decomposition</a> (SVD) and statistical optimality when the data set is generated by a true <a href="/facts/Multivariate_normal_distribution/2Xfegqz2">multivariate normal</a> data source.
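As a minimal sketch of the SVD route, the following NumPy snippet computes the leading principal components of a small synthetic data set. The function name `pca_svd`, the column-per-point layout, and the synthetic data are assumptions made for illustration, not part of any particular library.

```python
import numpy as np

def pca_svd(X, k):
    """Return the k leading principal components of X (D x N, one point per column).

    Hypothetical helper for illustration: centers the data, then takes the
    top-k left singular vectors of the centered matrix.
    """
    Xc = X - X.mean(axis=1, keepdims=True)             # zero-center each coordinate
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)   # singular values come sorted, largest first
    return U[:, :k], s[:k]

# Synthetic Gaussian data whose variance is largest along the first coordinate.
rng = np.random.default_rng(0)
scales = np.array([[5.0], [1.0], [0.2]])
X = rng.normal(size=(3, 200)) * scales

Q, s = pca_svd(X, 2)
```

The columns of `Q` are orthonormal, and in this setup the first one should align closely with the first coordinate axis, where the data varies most.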
However, modern large data sets often include corrupted or faulty points, commonly referred to as outliers.
Standard PCA is known to be sensitive to outliers, even when they appear as a small fraction of the processed data.
The reason is that the L2-norm formulation of standard PCA places squared emphasis on the magnitude of each coordinate of each data point, ultimately overemphasizing peripheral points, such as outliers.
On the other hand, following an L1-norm formulation, L1-PCA places linear emphasis on the coordinates of each data point, effectively restraining outliers.
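The contrast can be illustrated on a toy data set. The sketch below finds a single L1 principal component by exhaustive search, relying on the known identity that maximizing the L1-norm of the projections over unit vectors q equals maximizing the L2-norm of Xb over sign vectors b in {-1, +1}^N, with the optimal q given by Xb normalized. The function names, the data, and the choice to skip centering are assumptions for illustration; the search costs 2^(N-1) evaluations and is only practical for very small N.

```python
import itertools
import numpy as np

def l2_pc(X):
    """Leading standard (L2) principal component: top left singular vector of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, 0]

def l1_pc_exhaustive(X):
    """Leading L1 principal component of X (D x N) by search over sign vectors.

    Hypothetical helper for illustration; exponential in N.
    """
    n = X.shape[1]
    best_val, best_q = -1.0, None
    # Fix the first sign to +1: b and -b describe the same direction.
    for tail in itertools.product((-1.0, 1.0), repeat=n - 1):
        b = np.array((1.0,) + tail)
        v = X @ b
        val = np.linalg.norm(v)
        if val > best_val:
            best_val, best_q = val, v / val
    return best_q

# Eight inliers spread along the horizontal axis with slight vertical jitter,
# plus one outlier far away along the vertical axis. No centering is applied.
x = np.array([-4, -3, -2, -1, 1, 2, 3, 4], dtype=float)
y = 0.1 * np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
X = np.hstack([np.vstack([x, y]), np.array([[0.0], [9.0]])])

q_l2 = l2_pc(X)             # pulled toward the outlier
q_l1 = l1_pc_exhaustive(X)  # stays mostly along the inliers
```

In this example the single outlier contributes 81 to the squared-norm criterion, more than all inliers together (60), so the L2 component turns almost entirely toward it. Under the L1 criterion the same outlier contributes only 9 against the inliers' 20, and the L1 component remains much closer to the horizontal axis.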
