In the context of artificial neural networks, the rectifier or ReLU (rectified linear unit) activation function is defined as the non-negative part of its argument, i.e., the ramp function:

$$\operatorname{ReLU}(x) = \max(0, x) = \begin{cases} x & \text{if } x > 0, \\ 0 & \text{otherwise,} \end{cases}$$
where $x$ is the input to a neuron. This is analogous to half-wave rectification in electrical engineering.
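As a concrete illustration of the definition above, the following is a minimal sketch of the ramp function in NumPy (the function name and sample inputs are illustrative, not taken from any particular library):

```python
import numpy as np

def relu(x):
    """Ramp function: the non-negative part of x, applied elementwise."""
    return np.maximum(0.0, x)

# ReLU passes positive inputs through unchanged and maps negatives to zero.
inputs = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(inputs))  # -> [0.  0.  0.  1.5 3. ]
```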
ReLU is one of the most popular activation functions for artificial neural networks. It finds application in computer vision and speech recognition using deep neural nets, as well as in computational neuroscience.
It was first used by Alston Householder in 1941 as a mathematical abstraction of biological neural networks. Kunihiko Fukushima introduced it in 1969 in the context of visual feature extraction in hierarchical neural networks. It was later argued to have strong biological motivations and mathematical justifications. In 2011, ReLU activation was found to enable the training of deep supervised neural networks without unsupervised pre-training, in contrast to the activation functions widely used before then, e.g., the logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical counterpart, the hyperbolic tangent.