In machine learning, normalization is a statistical technique with various applications. There are two main forms of normalization, namely data normalization and activation normalization. Data normalization (or feature scaling) includes methods that rescale input data so that the features have the same range, mean, variance, or other statistical properties. For instance, a popular choice of feature scaling method is min-max normalization, where each feature is transformed to have the same range (typically $[0, 1]$ or $[-1, 1]$). This solves the problem of different features having vastly different scales, for example if one feature is measured in kilometers and another in nanometers.
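As an illustration, the following is a minimal NumPy sketch of min-max normalization applied per feature; the function name `min_max_normalize` and the `eps` guard against constant features are assumptions made for this example, not part of any standard API.

```python
import numpy as np

def min_max_normalize(X: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Rescale each feature (column) of X to the range [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # eps prevents division by zero when a feature is constant
    return (X - x_min) / (x_max - x_min + eps)

# Example: one feature in kilometers, another in nanometers
X = np.array([[1200.0, 3.5],
              [ 450.0, 9.1],
              [ 980.0, 6.2]])
print(min_max_normalize(X))  # every column now lies in [0, 1]
```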
Activation normalization, on the other hand, is specific to deep learning, and includes methods that rescale the activations of hidden neurons inside neural networks.
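A minimal sketch of one common activation-normalization method, layer normalization, is shown below; the plain-NumPy implementation and the function name `layer_norm` are assumptions for this illustration, and deep learning frameworks provide equivalent built-in layers.

```python
import numpy as np

def layer_norm(h: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    """Normalize each sample's activations to zero mean and unit
    variance, then apply a learned affine transform (gamma, beta)."""
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    h_hat = (h - mean) / np.sqrt(var + eps)  # standardized activations
    return gamma * h_hat + beta

# Example: a batch of 2 samples with 4 hidden activations each
h = np.random.randn(2, 4)
gamma, beta = np.ones(4), np.zeros(4)
print(layer_norm(h, gamma, beta))
```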
Normalization is often used to:

- increase the speed of training convergence,
- reduce sensitivity to variations and feature scales in input data,
- reduce overfitting,
- and produce better model generalization to unseen data.
Normalization techniques are often theoretically justified as reducing internal covariate shift, smoothing the optimization landscape, and providing an implicit regularizing effect, though their adoption is mainly driven by empirical success.