Hinge loss

In <a href="/facts/Machine_learning/e0w0XJTu">machine learning</a>, the hinge loss is a <a href="/facts/Loss_function/xv5ozuhl">loss function</a> used for training <a href="/facts/Statistical_classification/jXXHRkXR">classifiers</a>. The hinge loss is used for "maximum-margin" classification, most notably for <a href="/facts/Support_vector_machine/XobxpdBG">support vector machines</a> (SVMs).
For an intended output t = ±1 and a classifier score y, the hinge loss of the prediction y is defined as

$$\ell(y) = \max(0, 1 - t \cdot y)$$

Note that $y$ should be the "raw" output of the classifier's decision function, not the predicted class label. For instance, in linear SVMs, $y = \mathbf{w} \cdot \mathbf{x} + b$, where $(\mathbf{w}, b)$ are the parameters of the <a href="/facts/Hyperplane/2s2ycYYd">hyperplane</a> and $\mathbf{x}$ is the input variable(s).
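
As a minimal sketch of the definition, the following Python computes the raw linear score and the resulting hinge loss. The function names `linear_score` and `hinge_loss`, and the values of `w`, `b`, and the sample points, are hypothetical and chosen only for illustration; they do not come from any particular library.

```python
def linear_score(w, b, x):
    """Raw decision value y = w . x + b (a real number, not a class label)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hinge_loss(t, y):
    """Hinge loss max(0, 1 - t*y) for an intended output t = +1 or -1 and a raw score y."""
    if t not in (-1, 1):
        raise ValueError("t must be +1 or -1")
    return max(0.0, 1.0 - t * y)


# Hypothetical hyperplane parameters and inputs (illustrative assumptions).
w = [2.0, -1.0]
b = 0.5
samples = [
    ([1.0, 0.0], 1),    # correct side, margin at least 1
    ([0.0, 1.0], 1),    # wrong side of the hyperplane
    ([0.25, 0.5], -1),  # wrong side, small score
]

scores = [linear_score(w, b, x) for x, _ in samples]
losses = [hinge_loss(t, y) for (_, t), y in zip(samples, scores)]

for (x, t), y, loss in zip(samples, scores, losses):
    print(f"x={x}, t={t:+d}, y={y:+.2f}, loss={loss:.2f}")
```

Passing the predicted label (the sign of the score) instead of the score itself would discard the margin information that the loss depends on, which is why the sketch keeps `linear_score` separate from any thresholding step.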
When $t$ and $y$ have the same sign (meaning $y$ predicts the right class) and $|y| \geq 1$, the hinge loss $\ell(y) = 0$. When they have opposite signs, $\ell(y) = 1 + |y|$ increases linearly with $|y|$. The loss is also positive when $|y| < 1$ even if $y$ has the same sign as $t$ (correct prediction, but not by enough margin); in that case $\ell(y) = 1 - |y|$.
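
These cases can be checked with a few concrete scores. As an illustrative assumption, take $t = +1$; the values of $y$ below are arbitrary:

```latex
\begin{aligned}
y = 2:   &\quad \ell(y) = \max(0,\, 1 - 2) = 0   &&\text{(correct, margin at least 1)} \\
y = 0.5: &\quad \ell(y) = \max(0,\, 1 - 0.5) = 0.5 &&\text{(correct, margin too small)} \\
y = -1:  &\quad \ell(y) = \max(0,\, 1 + 1) = 2   &&\text{(wrong sign)} \\
y = -3:  &\quad \ell(y) = \max(0,\, 1 + 3) = 4   &&\text{(wrong sign, larger } |y| \text{)}
\end{aligned}
```

The last two rows show the linear growth: moving the score from $-1$ to $-3$ raises the loss by exactly the same amount, from 2 to 4.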
