Proximal gradient methods are a class of algorithms that generalize projection methods in order to solve non-differentiable convex optimization problems.
Many interesting problems can be formulated as convex optimization problems of the form
$\min_{x\in\mathbb{R}^{N}}\ \sum_{i=1}^{n}f_{i}(x)$
where $f_{i}:\mathbb{R}^{N}\to\mathbb{R}$, $i=1,\dots,n$, are possibly non-differentiable convex functions. The lack of differentiability rules out conventional smooth optimization techniques like the steepest descent method and the conjugate gradient method, but proximal gradient methods can be used instead.
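As an illustration, the well-known lasso ($\ell_{1}$-regularized least squares) problem is an instance of this template with $n=2$; here $A$, $b$, and $\lambda$ are the usual (assumed) data matrix, observation vector, and regularization parameter:

```latex
% Lasso written as a sum of two convex functions:
% f_1 is smooth, f_2 is convex but non-differentiable at 0.
\min_{x \in \mathbb{R}^{N}} \;
\underbrace{\tfrac{1}{2}\,\|Ax - b\|_{2}^{2}}_{f_{1}(x)\ \text{(smooth)}}
\;+\;
\underbrace{\lambda\,\|x\|_{1}}_{f_{2}(x)\ \text{(non-differentiable)}}
```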
Proximal gradient methods start with a splitting step, in which the functions $f_{1},\dots,f_{n}$ are handled individually so as to yield an easily implementable algorithm. They are called proximal because each non-differentiable function among $f_{1},\dots,f_{n}$ is involved via its proximity operator. The iterative shrinkage-thresholding algorithm, projected Landweber, projected gradient, alternating projections, the alternating-direction method of multipliers, and the alternating split Bregman method are special instances of proximal algorithms.
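The splitting idea can be sketched concretely for the lasso case above, where the smooth part $f_{1}$ is handled through its gradient and the non-differentiable part $\lambda\|x\|_{1}$ through its proximity operator (soft-thresholding). This is a minimal illustrative implementation of the iterative shrinkage-thresholding scheme, not a reference implementation; the function names and the fixed iteration count are choices made here for clarity:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t * ||.||_1: shrinks each entry toward 0 by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=500):
    # Proximal gradient (ISTA) for: min_x 0.5*||Ax - b||_2^2 + lam*||x||_1.
    # Step size 1/L, where L = ||A||_2^2 is a Lipschitz constant of the
    # gradient of the smooth part.
    L = np.linalg.norm(A, 2) ** 2
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)            # gradient of the smooth term
        x = soft_threshold(x - step * grad, # forward (gradient) step
                           step * lam)      # backward (proximal) step
    return x
```

For $A = I$ the iteration reaches the exact solution $x^{*} = \operatorname{soft\_threshold}(b, \lambda)$ in one step, which makes the splitting structure easy to verify by hand.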
For the theory of proximal gradient methods from the perspective of statistical learning theory, and for their applications in that setting, see proximal gradient methods for learning.