Mean absolute percentage error

<h2 id="mape-in-regression-problems">MAPE in regression problems</h2>
Mean absolute percentage error is commonly used as a loss function for <a href="/facts/Regression_analysis/n6z5Tf7K">regression problems</a> and in model evaluation, because of its very intuitive interpretation in terms of relative error.

<h3>Definition</h3>
Consider a standard regression setting in which the data are fully described by a random pair \(Z=(X,Y)\) with values in \(\mathbb{R}^{d}\times \mathbb{R}\), and \(n\) i.i.d. copies \((X_{1},Y_{1}),\ldots ,(X_{n},Y_{n})\) of \((X,Y)\). Regression models aim at finding a good model for the pair, that is, a <a href="/facts/Measurable_function/NgIHnb1b">measurable function</a> \(g\) from \(\mathbb{R}^{d}\) to \(\mathbb{R}\) such that \(g(X)\) is close to \(Y\).
In the classical regression setting, the closeness of \(g(X)\) to \(Y\) is measured via the \(L_{2}\) risk, also called the <a href="/facts/Mean_squared_error/kz3TR7bv">mean squared error</a> (MSE). In the MAPE regression context,<a class="footnote-ref" id="fnref:1" href="#fn:1">1</a> the closeness of \(g(X)\) to \(Y\) is measured via the MAPE, and the aim of MAPE regression is to find a model \(g_{\text{MAPE}}\) such that:

\[ g_{\mathrm{MAPE}}(x)=\arg\min_{g\in\mathcal{G}}\mathbb{E}\left[\left|\frac{g(X)-Y}{Y}\right| \,\Big|\, X=x\right] \]

where \(\mathcal{G}\) is the class of models considered (e.g. linear models).
<h3>In practice</h3>
In practice, \(g_{\text{MAPE}}(x)\) can be estimated by the <a href="/facts/Empirical_risk_minimization/QIm8CRMC">empirical risk minimization</a> strategy, leading to

\[ \widehat{g}_{\text{MAPE}}(x)=\arg\min_{g\in\mathcal{G}}\sum_{i=1}^{n}\left|\frac{g(X_{i})-Y_{i}}{Y_{i}}\right| \]

From a practical point of view, using the MAPE as a quality function for a regression model is equivalent to doing weighted <a href="/facts/Mean_absolute_error/kzu4tbtv">mean absolute error</a> (MAE) regression, a special case of <a href="/facts/Quantile_regression/5J05lVIx">quantile regression</a> (median regression). This property is immediate, since

\[ \widehat{g}_{\text{MAPE}}(x)=\arg\min_{g\in\mathcal{G}}\sum_{i=1}^{n}\omega(Y_{i})\left|g(X_{i})-Y_{i}\right| \quad\text{with}\quad \omega(Y_{i})=\left|\frac{1}{Y_{i}}\right| \]

As a consequence, the use of the MAPE is very easy in practice, for example using existing libraries for quantile regression allowing weights.
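As a minimal illustration of this equivalence, consider the simplest model class, a constant prediction. Minimizing the empirical MAPE then amounts to computing a weighted median of the observations with weights 1/|Y_i|. The sketch below uses only the Python standard library; the function names <code>weighted_median</code>, <code>mape</code> and <code>fit_constant_mape</code> are illustrative and do not come from any particular library, and the code assumes that no observation is zero.

```python
def weighted_median(values, weights):
    """Return a value c that minimizes sum(w * |c - v|) over the (v, w) pairs."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        # First point at which at least half of the total weight is covered.
        if cumulative >= half:
            return value
    return pairs[-1][0]


def mape(actuals, predictions):
    """Empirical MAPE as a fraction (not multiplied by 100)."""
    errors = [abs((p - a) / a) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def fit_constant_mape(actuals):
    """Best constant prediction under MAPE: weighted median with weights 1/|y|.

    Assumes every actual value is non-zero.
    """
    weights = [1.0 / abs(a) for a in actuals]
    return weighted_median(actuals, weights)


y = [1.0, 2.0, 10.0]
c = fit_constant_mape(y)
print(c, mape(y, [c] * len(y)))
```

For richer model classes the same idea carries over: any weighted MAE or median regression routine that accepts per-observation weights can be given the weights 1/|Y_i|.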

<h3>Consistency</h3>
The use of the MAPE as a loss function for regression analysis is feasible from both a practical and a theoretical point of view, since the existence of an optimal model and the <a href="/facts/Consistency_(statistics)/rRHIzC06">consistency</a> of the empirical risk minimization can be proved.<a class="footnote-ref" id="fnref:2" href="#fn:2">2</a>

<h2 id="wmape">WMAPE</h2>
WMAPE (sometimes spelled wMAPE) stands for weighted mean absolute percentage error.<a class="footnote-ref" id="fnref:3" href="#fn:3">3</a> It is a measure used to evaluate the performance of regression or forecasting models. It is a variant of MAPE in which the absolute percent errors are combined as a weighted arithmetic mean. Most commonly the absolute percent errors are weighted by the actuals (e.g. in sales forecasting, errors are weighted by sales volume).<a class="footnote-ref" id="fnref:4" href="#fn:4">4</a> Weighting by the actuals effectively overcomes the 'infinite error' issue that arises when individual actual values are zero.<a class="footnote-ref" id="fnref:5" href="#fn:5">5</a>
Its formula is:<a class="footnote-ref" id="fnref:6" href="#fn:6">6</a>

\[ \text{wMAPE}=\frac{\displaystyle\sum_{i=1}^{n}\left(w_{i}\cdot\frac{\left|A_{i}-F_{i}\right|}{\left|A_{i}\right|}\right)}{\displaystyle\sum_{i=1}^{n}w_{i}}=\frac{\displaystyle\sum_{i=1}^{n}\left(\left|A_{i}\right|\cdot\frac{\left|A_{i}-F_{i}\right|}{\left|A_{i}\right|}\right)}{\displaystyle\sum_{i=1}^{n}\left|A_{i}\right|} \]

where \(w_{i}\) is the weight, \(A\) is the vector of actual values and \(F\) is the forecast or prediction.
Since the factors \(|A_{i}|\) cancel in the numerator, this reduces to a much simpler formula:

\[ \text{wMAPE}=\frac{\displaystyle\sum_{i=1}^{n}\left|A_{i}-F_{i}\right|}{\displaystyle\sum_{i=1}^{n}\left|A_{i}\right|} \]
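The equivalence of the two forms can be checked numerically. The following sketch computes wMAPE both as a weighted mean of absolute percentage errors (with weights |A_i|) and through the simplified ratio; the function names and the sample numbers are made up for illustration.

```python
def wmape_weighted(actuals, forecasts):
    """wMAPE as a weighted mean of absolute percentage errors, weights |A_i|.

    Requires every actual value to be non-zero, because each percentage
    error divides by |A_i|.
    """
    weights = [abs(a) for a in actuals]
    apes = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return sum(w * e for w, e in zip(weights, apes)) / sum(weights)


def wmape(actuals, forecasts):
    """Simplified wMAPE: total absolute error over total absolute actuals."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    total_actual = sum(abs(a) for a in actuals)
    return total_error / total_actual


actuals = [100.0, 50.0, 10.0]
forecasts = [90.0, 60.0, 15.0]
print(wmape_weighted(actuals, forecasts), wmape(actuals, forecasts))

# The simplified form stays finite when a single actual is zero,
# as long as the actuals do not all vanish.
print(wmape([0.0, 100.0], [5.0, 100.0]))
```

The simplified form is usually the one to implement, since it avoids the per-observation division altogether.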

Confusingly, wMAPE sometimes refers to a different measure, in which the numerator and denominator of the wMAPE formula above are weighted again by another set of custom weights \(w_{i}\). It would perhaps be more accurate to call this the double weighted MAPE (wwMAPE). Its formula is:

\[ \text{wwMAPE}=\frac{\displaystyle\sum_{i=1}^{n}w_{i}\left|A_{i}-F_{i}\right|}{\displaystyle\sum_{i=1}^{n}w_{i}\left|A_{i}\right|} \]
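A short sketch of the double weighted variant follows. The function name <code>wwmape</code> and the example weights are hypothetical; with equal weights the result coincides with the simplified wMAPE, which gives a quick sanity check.

```python
def wwmape(actuals, forecasts, weights):
    """Double weighted MAPE: custom weights applied to numerator and denominator."""
    numerator = sum(w * abs(a - f) for a, f, w in zip(actuals, forecasts, weights))
    denominator = sum(w * abs(a) for a, w in zip(actuals, weights))
    return numerator / denominator


actuals = [100.0, 50.0, 10.0]
forecasts = [90.0, 60.0, 15.0]

# Equal weights: identical to sum|A - F| / sum|A|.
equal = wwmape(actuals, forecasts, [1.0, 1.0, 1.0])

# Emphasizing the third item (illustrative weights only) raises the score,
# because that item has the largest relative error.
emphasized = wwmape(actuals, forecasts, [1.0, 1.0, 4.0])
print(equal, emphasized)
```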

<h2 id="issues">Issues</h2>
Although the concept of MAPE sounds very simple and convincing, it has major drawbacks in practical application,<a class="footnote-ref" id="fnref:7" href="#fn:7">7</a> and there are many studies on shortcomings and misleading results from MAPE.<a class="footnote-ref" id="fnref:8" href="#fn:8">8</a><a class="footnote-ref" id="fnref:9" href="#fn:9">9</a>

<ul><li>It cannot be used if there are zero or close-to-zero values (which sometimes happens, for example in demand data) because there would be a division by zero or values of MAPE tending to infinity.<a class="footnote-ref" id="fnref:10" href="#fn:10">10</a></li>
<li>For forecasts which are too low the percentage error cannot exceed 100%, but for forecasts which are too high there is no upper limit to the percentage error.</li>
<li>MAPE puts a heavier penalty on negative errors, \(A_{t}&lt;F_{t}\), than on positive errors.<a class="footnote-ref" id="fnref:11" href="#fn:11">11</a> As a consequence, when MAPE is used to compare the accuracy of prediction methods it is biased in that it will systematically select a method whose forecasts are too low. This little-known but serious issue can be overcome by using an accuracy measure based on the logarithm of the accuracy ratio (the ratio of the predicted to actual value), given by \(\log\left(\frac{\text{predicted}}{\text{actual}}\right)\). This approach leads to superior statistical properties and also leads to predictions which can be interpreted in terms of the geometric mean.<a class="footnote-ref" id="fnref:12" href="#fn:12">12</a></li>
<li>People often think the MAPE will be optimized at the median. But, for example, a log-normal distribution has a median of \(e^{\mu}\), whereas its MAPE is optimized at \(e^{\mu-\sigma^{2}}\).</li></ul>
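The log-normal example in the last item can be checked by simulation. The sketch below draws log-normal samples with the standard library, then compares the ordinary sample median with the constant that minimizes the empirical MAPE, which is a weighted median with weights 1/y. The parameter values, sample size and seed are arbitrary choices made for illustration.

```python
import math
import random


def weighted_median(values, weights):
    """Return a value c that minimizes sum(w * |c - v|) over the (v, w) pairs."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]


random.seed(0)
mu, sigma = 1.0, 0.5
samples = [random.lognormvariate(mu, sigma) for _ in range(50_000)]

# Ordinary median: should be close to exp(mu).
sample_median = weighted_median(samples, [1.0] * len(samples))

# MAPE-optimal constant: weighted median with weights 1/y,
# which should be close to exp(mu - sigma**2).
mape_optimum = weighted_median(samples, [1.0 / y for y in samples])

print(sample_median, math.exp(mu))
print(mape_optimum, math.exp(mu - sigma ** 2))
```

The MAPE-optimal constant lands clearly below the median, which is the downward bias described in the list above.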
To overcome these issues with MAPE, other measures have been proposed in the literature:

<ul><li><a href="/facts/Mean_Absolute_Scaled_Error/GVRm4yID">Mean Absolute Scaled Error</a> (MASE)</li>
<li><a href="/facts/Symmetric_Mean_Absolute_Percentage_Error/FQFKFaH1">Symmetric Mean Absolute Percentage Error</a> (sMAPE)</li>
<li><a href="/facts/Mean_Directional_Accuracy_(MDA)/vDCM6uKT">Mean Directional Accuracy (MDA)</a></li>
<li>Mean Arctangent Absolute Percentage Error (MAAPE): MAAPE can be considered a slope as an angle, while MAPE is a slope as a ratio.<a class="footnote-ref" id="fnref:13" href="#fn:13">13</a></li></ul>
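To make the "slope as an angle" reading of MAAPE concrete, here is a small sketch. It assumes the usual definition of MAAPE as the mean of arctan(|A − F| / |A|), and it uses <code>math.atan2</code> so that a zero actual with a non-zero error maps to the bounded value π/2 instead of raising a division error. Treating the case where both the actual and the error are zero as an angle of 0 is a consequence of how <code>atan2</code> behaves, not something stated in the text above.

```python
import math


def maape(actuals, forecasts):
    """Mean arctangent absolute percentage error, in radians (0 to pi/2)."""
    # atan2(|A - F|, |A|) is the angle whose slope is the absolute
    # percentage error; it stays finite when A is zero.
    angles = [math.atan2(abs(a - f), abs(a)) for a, f in zip(actuals, forecasts)]
    return sum(angles) / len(angles)


print(maape([100.0], [100.0]))  # perfect forecast
print(maape([100.0], [200.0]))  # 100% error, i.e. slope 1
print(maape([0.0], [5.0]))      # zero actual, bounded result
```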
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Least_absolute_deviations/fcqbHnEt">Least absolute deviations</a></li>
<li><a href="/facts/Mean_absolute_error/kzu4tbtv">Mean absolute error</a></li>
<li><a href="/facts/Mean_percentage_error/Sn5llElL">Mean percentage error</a></li>
<li><a href="/facts/Symmetric_mean_absolute_percentage_error/FQFKFaH1">Symmetric mean absolute percentage error</a></li></ul>

<h2 id="external-links">External links</h2>
<ul><li><a href="https://arxiv.org/abs/1605.02541">Mean Absolute Percentage Error for Regression Models</a></li>
<li><a href="http://www.gestiondeoperaciones.net/proyeccion-de-demanda/error-porcentual-absoluto-medio-mape-en-un-pronostico-de-demanda/">Mean Absolute Percentage Error (MAPE)</a></li>
<li><a href="http://robjhyndman.com/hyndsight/smape">Errors on percentage errors</a> - variants of MAPE</li>
<li><a href="https://www.sciencedirect.com/science/article/pii/S0169207016000121/">Mean Arctangent Absolute Percentage Error (MAAPE)</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">de Myttenaere, B Golden, B Le Grand, F Rossi (2015). "Mean absolute percentage error for regression models", Neurocomputing 2016. arXiv:1605.02541 <a href="https://arxiv.org/abs/1605.02541" target="_blank">https://arxiv.org/abs/1605.02541</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">de Myttenaere, B Golden, B Le Grand, F Rossi (2015). "Mean absolute percentage error for regression models", Neurocomputing 2016. arXiv:1605.02541 <a href="https://arxiv.org/abs/1605.02541" target="_blank">https://arxiv.org/abs/1605.02541</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">"Understanding Forecast Accuracy: MAPE, WAPE, WMAPE". <a href="https://www.baeldung.com/cs/mape-vs-wape-vs-wmape" target="_blank">https://www.baeldung.com/cs/mape-vs-wape-vs-wmape</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">"WMAPE: Weighted Mean Absolute Percentage Error". <a href="https://ibf.org/knowledge/glossary/weighted-mean-absolute-percentage-error-wmape-299" target="_blank">https://ibf.org/knowledge/glossary/weighted-mean-absolute-percentage-error-wmape-299</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
<li id="fn:5">"Statistical Forecast Errors". <a href="https://blog.olivehorse.com/statistical-forecast-errors" target="_blank">https://blog.olivehorse.com/statistical-forecast-errors</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></li>
<li id="fn:6">"Statistical Forecast Errors". <a href="https://blog.olivehorse.com/statistical-forecast-errors" target="_blank">https://blog.olivehorse.com/statistical-forecast-errors</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></li>
<li id="fn:7">Tofallis (2015). "A Better Measure of Relative Prediction Accuracy for Model Selection and Model Estimation", Journal of the Operational Research Society, 66(8):1352-1362. archived preprint <a href="https://ssrn.com/abstract=2635088" target="_blank">https://ssrn.com/abstract=2635088</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></li>
<li id="fn:8">Hyndman, Rob J., and Anne B. Koehler (2006). "Another look at measures of forecast accuracy." International Journal of Forecasting, 22(4):679-688 doi:10.1016/j.ijforecast.2006.03.001. <a href="//doi.org/10.1016/j.ijforecast.2006.03.001" target="_blank">//doi.org/10.1016/j.ijforecast.2006.03.001</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></li>
<li id="fn:9">Kim, Sungil and Heeyoung Kim (2016). "A new metric of absolute percentage error for intermittent demand forecasts." International Journal of Forecasting, 32(3):669-679 doi:10.1016/j.ijforecast.2015.12.003. <a href="//doi.org/10.1016/j.ijforecast.2015.12.003" target="_blank">//doi.org/10.1016/j.ijforecast.2015.12.003</a> <a href="#fnref:9" class="footnote-back-ref">↩</a></li>
<li id="fn:10">Kim, Sungil; Kim, Heeyoung (1 July 2016). "A new metric of absolute percentage error for intermittent demand forecasts". International Journal of Forecasting. 32 (3): 669–679. doi:10.1016/j.ijforecast.2015.12.003. <a href="https://doi.org/10.1016%2Fj.ijforecast.2015.12.003" target="_blank">https://doi.org/10.1016%2Fj.ijforecast.2015.12.003</a> <a href="#fnref:10" class="footnote-back-ref">↩</a></li>
<li id="fn:11">Makridakis, Spyros (1993) "Accuracy measures: theoretical and practical concerns." International Journal of Forecasting, 9(4):527-529 doi:10.1016/0169-2070(93)90079-3 <a href="//doi.org/10.1016/0169-2070(93)90079-3" target="_blank">//doi.org/10.1016/0169-2070(93)90079-3</a> <a href="#fnref:11" class="footnote-back-ref">↩</a></li>
<li id="fn:12">Tofallis (2015). "A Better Measure of Relative Prediction Accuracy for Model Selection and Model Estimation", Journal of the Operational Research Society, 66(8):1352-1362. archived preprint <a href="https://ssrn.com/abstract=2635088" target="_blank">https://ssrn.com/abstract=2635088</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></li>
<li id="fn:13">Kim, Sungil and Heeyoung Kim (2016). "A new metric of absolute percentage error for intermittent demand forecasts." International Journal of Forecasting, 32(3):669-679 doi:10.1016/j.ijforecast.2015.12.003. <a href="//doi.org/10.1016/j.ijforecast.2015.12.003" target="_blank">//doi.org/10.1016/j.ijforecast.2015.12.003</a> <a href="#fnref:13" class="footnote-back-ref">↩</a></li>
</ol>
