Let A be an n × n Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient $R_A : \mathbb{C}^n \setminus \{0\} \to \mathbb{R}$ defined by

$$R_A(x) = \frac{(Ax, x)}{(x, x)},$$
where $(\cdot, \cdot)$ denotes the Euclidean inner product on $\mathbb{C}^n$. Equivalently, the Rayleigh–Ritz quotient can be replaced by

$$f(x) = (Ax, x), \qquad \|x\| = 1.$$
The Rayleigh quotient of an eigenvector $v$ is its associated eigenvalue $\lambda$, because $R_A(v) = (\lambda v, v)/(v, v) = \lambda$. For a Hermitian matrix A, the range of the continuous functions $R_A(x)$ and $f(x)$ is a compact interval $[a, b]$ of the real line. The maximum $b$ and the minimum $a$ are the largest and smallest eigenvalues of A, respectively. The min-max theorem is a refinement of this fact.
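As a quick numerical illustration (a minimal NumPy sketch of the fact just stated, using a randomly generated test matrix rather than anything from the text), the Rayleigh quotient of a Hermitian matrix never leaves the interval between its smallest and largest eigenvalues:

```python
# Sketch: sample the Rayleigh quotient of a random Hermitian matrix and check
# that its values stay inside [lambda_min, lambda_max].
import numpy as np

rng = np.random.default_rng(0)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                       # random Hermitian matrix

def rayleigh(A, x):
    """Rayleigh-Ritz quotient R_A(x) = (Ax, x) / (x, x)."""
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

lam = np.linalg.eigvalsh(A)                    # real eigenvalues, ascending
for _ in range(10_000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    assert lam[0] - 1e-12 <= rayleigh(A, x) <= lam[-1] + 1e-12
```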
Let $A$ be a Hermitian operator on an inner product space $V$ of dimension $n$, with spectrum ordered in descending order $\lambda_1 \geq \cdots \geq \lambda_n$.
Let $v_1, \ldots, v_n$ be the corresponding unit-length, mutually orthogonal eigenvectors.
Reverse the spectrum ordering, so that $\xi_1 = \lambda_n, \ldots, \xi_n = \lambda_1$.
(Poincaré's inequality) — Let $M$ be a subspace of $V$ with dimension $k$. Then there exist unit vectors $x, y \in M$ such that

$$\langle x, Ax \rangle \leq \lambda_k \quad \text{and} \quad \langle y, Ay \rangle \geq \xi_k.$$
Part 2 is a corollary of part 1, applied to $-A$.
Since $M$ is a $k$-dimensional subspace and $N := \operatorname{span}(v_k, \ldots, v_n)$ has dimension $n - k + 1$, the dimensions of $M$ and $N$ sum to $n + 1 > n$, so the two subspaces must intersect in at least a line.
Take a unit vector $x \in M \cap N$ and write $x = \sum_{i=k}^{n} c_i v_i$; then $\langle x, Ax \rangle = \sum_{i=k}^{n} |c_i|^2 \lambda_i \leq \lambda_k$, which is what we need.
min-max theorem —

$$\lambda_k = \max_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = k}} \; \min_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle = \min_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = n - k + 1}} \; \max_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle.$$
Part 2 is a corollary of part 1, applied to $-A$.
By Poincaré's inequality, every $k$-dimensional subspace $\mathcal{M}$ contains a unit vector $x$ with $\langle x, Ax \rangle \leq \lambda_k$, so $\lambda_k$ is an upper bound for the max-min expression.
Setting $\mathcal{M} = \operatorname{span}(v_1, \ldots, v_k)$ achieves this upper bound: for unit $x = \sum_{i=1}^{k} c_i v_i$ we have $\langle x, Ax \rangle = \sum_{i=1}^{k} |c_i|^2 \lambda_i \geq \lambda_k$, with equality at $x = v_k$.
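To make the statement concrete, here is a hedged numerical sketch (random test matrix; `min_over_subspace` is our own illustrative helper): the inner minimization over a subspace with orthonormal basis $Q$ reduces to the smallest eigenvalue of the compression $Q^* A Q$, and the max-min is attained on the span of the top $k$ eigenvectors:

```python
# Sketch of the min-max theorem for a random real symmetric matrix.
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3
X = rng.standard_normal((n, n)); A = (X + X.T) / 2    # random symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)                   # ascending order
lam = eigvals[::-1]                                    # descending: lam[0] >= ...
V = eigvecs[:, ::-1]                                   # columns match lam

def min_over_subspace(A, Q):
    """min of <x, Ax> over unit x in range(Q), where Q has orthonormal columns."""
    return np.linalg.eigvalsh(Q.T @ A @ Q)[0]

# The max-min is attained at M = span(v_1, ..., v_k):
assert np.isclose(min_over_subspace(A, V[:, :k]), lam[k - 1])
# No random k-dimensional subspace does better (Poincaré's inequality):
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
assert min_over_subspace(A, Q) <= lam[k - 1] + 1e-12
```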
Define the partial trace $\operatorname{tr}_V(A)$ to be the trace of the projection of $A$ to $V$. It is equal to $\sum_i v_i^* A v_i$ given an orthonormal basis $\{v_i\}$ of $V$.
Wielandt minimax formula[1]: 44 — Let $1 \leq i_1 < \cdots < i_k \leq n$ be integers. Define a partial flag to be a nested collection $V_1 \subset \cdots \subset V_k$ of subspaces of $\mathbb{C}^n$ such that $\dim(V_j) = i_j$ for all $1 \leq j \leq k$.
Define the associated Schubert variety $X(V_1, \ldots, V_k)$ to be the collection of all $k$-dimensional subspaces $W$ such that $\dim(W \cap V_j) \geq j$.
$$\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) = \sup_{V_1, \ldots, V_k} \; \inf_{W \in X(V_1, \ldots, V_k)} \operatorname{tr}_W(A)$$
This has some corollaries:[2]: 44
Extremal partial trace —

$$\lambda_1(A) + \cdots + \lambda_k(A) = \sup_{\dim(V) = k} \operatorname{tr}_V(A)$$
$$\xi_1(A) + \cdots + \xi_k(A) = \inf_{\dim(V) = k} \operatorname{tr}_V(A)$$
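A small NumPy sketch of the extremal partial trace (random symmetric matrix; `partial_trace` is our own illustrative helper, not a standard routine): $\operatorname{tr}_V(A)$ over $k$-dimensional subspaces is maximized and minimized on spans of eigenvectors:

```python
# Sketch: tr_V(A) over random k-dimensional subspaces lies between the sum of
# the k smallest and the sum of the k largest eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
eigvals, eigvecs = np.linalg.eigh(A)                   # ascending order

def partial_trace(A, Q):
    """tr_V(A) = sum_i v_i* A v_i for an orthonormal basis Q of V."""
    return np.trace(Q.T @ A @ Q)

top = partial_trace(A, eigvecs[:, -k:])                # lambda_1 + ... + lambda_k
bottom = partial_trace(A, eigvecs[:, :k])              # xi_1 + ... + xi_k
for _ in range(1000):
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))   # random k-dim subspace
    assert bottom - 1e-10 <= partial_trace(A, Q) <= top + 1e-10
```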
Corollary — The sum $\lambda_1(A) + \cdots + \lambda_k(A)$ is a convex function of $A$, and $\xi_1(A) + \cdots + \xi_k(A)$ is concave.
(Schur-Horn inequality)

$$\xi_1(A) + \cdots + \xi_k(A) \leq a_{i_1 i_1} + \cdots + a_{i_k i_k} \leq \lambda_1(A) + \cdots + \lambda_k(A)$$

for any set of indices $1 \leq i_1 < \cdots < i_k \leq n$, where $a_{ii}$ denote the diagonal entries of $A$.
Equivalently, this states that the diagonal vector of $A$ is majorized by its eigenspectrum.
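The Schur-Horn inequality can likewise be spot-checked numerically; in this hedged sketch we test every index subset of a small random symmetric matrix:

```python
# Sketch: any k diagonal entries of A sum to a value between the sum of the
# k smallest and the sum of the k largest eigenvalues.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n = 5
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
eigvals = np.linalg.eigvalsh(A)                        # ascending order
d = np.diag(A)
for k in range(1, n + 1):
    for idx in combinations(range(n), k):
        s = d[list(idx)].sum()
        assert eigvals[:k].sum() - 1e-10 <= s <= eigvals[-k:].sum() + 1e-10
```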
Schatten-norm Hölder inequality — Given Hermitian $A, B$ and a Hölder pair $1/p + 1/q = 1$,

$$|\operatorname{tr}(AB)| \leq \|A\|_{S^p} \|B\|_{S^q}$$
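A hedged numerical check of this inequality, computing Schatten norms as $\ell^p$ norms of singular values (the choice $p = 3$, $q = 3/2$ below is arbitrary, our own):

```python
# Sketch: verify |tr(AB)| <= ||A||_{S^p} ||B||_{S^q} for random Hermitian A, B.
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 3.0
q = p / (p - 1)                                        # Hölder conjugate
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2

def schatten(M, p):
    """Schatten p-norm: the l^p norm of the singular values."""
    return np.linalg.norm(np.linalg.svd(M, compute_uv=False), ord=p)

assert abs(np.trace(A @ B)) <= schatten(A, p) * schatten(B, q) + 1e-10
```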
Let N be the nilpotent matrix

$$N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
Define the Rayleigh quotient $R_N(x)$ exactly as above in the Hermitian case. Then the only eigenvalue of $N$ is zero, while the maximum value of the Rayleigh quotient is $1/2$. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
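The computation is short enough to run directly (the test vector below is our own choice; it attains the maximum):

```python
# Sketch: N has only the eigenvalue 0, but its Rayleigh quotient reaches 1/2.
import numpy as np

N = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.linalg.eigvals(N))                 # [0. 0.]
x = np.array([1.0, 1.0]) / np.sqrt(2)       # unit vector
print(x @ N @ x)                            # 0.5 > max eigenvalue
```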
The singular values $\{\sigma_k\}$ of a square matrix $M$ are the square roots of the eigenvalues of $M^*M$ (equivalently $MM^*$). An immediate consequence of the first equality in the min-max theorem is

$$\sigma_k^{\downarrow} = \max_{\substack{S: \dim(S) = k}} \; \min_{\substack{x \in S, \, \|x\| = 1}} \|Mx\|.$$
Similarly,

$$\sigma_k^{\downarrow} = \min_{\substack{S: \dim(S) = n - k + 1}} \; \max_{\substack{x \in S, \, \|x\| = 1}} \|Mx\|.$$
Here $\sigma_k^{\downarrow}$ denotes the $k$th entry in the decreasing sequence of the singular values, so that $\sigma_1^{\downarrow} \geq \sigma_2^{\downarrow} \geq \cdots$.
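A two-line sketch confirming this characterization numerically against NumPy's own SVD (random test matrix):

```python
# Sketch: singular values = square roots of the eigenvalues of M^T M.
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
sigma = np.sqrt(np.linalg.eigvalsh(M.T @ M))[::-1]     # descending order
assert np.allclose(sigma, np.linalg.svd(M, compute_uv=False))
```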
Main article: Poincaré separation theorem
Let A be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of A if there exists an orthogonal projection P onto a subspace of dimension m such that PAP* = B. The Cauchy interlacing theorem states:

Theorem — If the eigenvalues of $A$ are $\alpha_1 \leq \cdots \leq \alpha_n$, and those of $B$ are $\beta_1 \leq \cdots \leq \beta_m$, then for all $j \leq m$,

$$\alpha_j \leq \beta_j \leq \alpha_{n - m + j}.$$
This can be proven using the min-max principle. Let $\beta_i$ have corresponding eigenvector $b_i$, and let $S_j$ be the $j$-dimensional subspace $S_j = \operatorname{span}\{b_1, \ldots, b_j\}$; then

$$\beta_j = \max_{\substack{x \in S_j, \, \|x\| = 1}} (Bx, x) = \max_{\substack{x \in S_j, \, \|x\| = 1}} (PAP^* x, x) = \max_{\substack{x \in S_j, \, \|x\| = 1}} (A(P^* x), P^* x).$$
According to the first part of min-max, $\alpha_j \leq \beta_j$, since $P^* S_j$ is a $j$-dimensional subspace of $\mathbb{C}^n$. On the other hand, if we define $S_{m-j+1} = \operatorname{span}\{b_j, \ldots, b_m\}$, then

$$\beta_j = \min_{\substack{x \in S_{m-j+1}, \, \|x\| = 1}} (Bx, x) = \min_{\substack{x \in S_{m-j+1}, \, \|x\| = 1}} (A(P^* x), P^* x) \leq \alpha_{n - m + j},$$
where the last inequality is given by the second part of min-max.
When $n - m = 1$, we have $\alpha_j \leq \beta_j \leq \alpha_{j+1}$, hence the name interlacing theorem.
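The following hedged sketch compresses a random symmetric matrix to a random $m$-dimensional subspace and verifies the interlacing inequalities (eigenvalues in ascending order, matching the statement above):

```python
# Sketch of Cauchy interlacing: B = Q^T A Q for an orthonormal basis Q of a
# random m-dimensional subspace, then alpha_j <= beta_j <= alpha_{n-m+j}.
import numpy as np

rng = np.random.default_rng(6)
n, m = 6, 4
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))       # orthonormal columns
B = Q.T @ A @ Q                                        # compression of A

alpha = np.linalg.eigvalsh(A)                          # ascending order
beta = np.linalg.eigvalsh(B)
for j in range(m):
    assert alpha[j] - 1e-10 <= beta[j] <= alpha[n - m + j] + 1e-10
```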
Main article: Trace class § Lidskii's theorem
Lidskii inequality — If $1 \leq i_1 < \cdots < i_k \leq n$, then

$$\lambda_{i_1}(A + B) + \cdots + \lambda_{i_k}(A + B) \leq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \lambda_1(B) + \cdots + \lambda_k(B)$$
and

$$\lambda_{i_1}(A + B) + \cdots + \lambda_{i_k}(A + B) \geq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \xi_1(B) + \cdots + \xi_k(B).$$
Note that $\sum_i \lambda_i(A + B) = \operatorname{tr}(A + B) = \sum_i \lambda_i(A) + \lambda_i(B)$. Combined with the Lidskii inequality, this shows that $\lambda(A + B) - \lambda(A) \preceq \lambda(B)$, where $\preceq$ denotes majorization. By the Schur convexity theorem, we then have
p-Wielandt-Hoffman inequality — $\|\lambda(A + B) - \lambda(A)\|_{\ell^p} \leq \|B\|_{S^p}$, where $\|\cdot\|_{S^p}$ stands for the $p$-Schatten norm.
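A hedged check of this inequality for $p = 2$ (the classical Hoffman-Wielandt case; the test matrices are random, and the Schatten norm is computed from singular values):

```python
# Sketch: ||lambda(A+B) - lambda(A)||_p <= ||B||_{S^p} for random symmetric A, B.
import numpy as np

rng = np.random.default_rng(7)
n, p = 6, 2
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2

def spec(M):
    """Spectrum in descending order."""
    return np.linalg.eigvalsh(M)[::-1]

lhs = np.linalg.norm(spec(A + B) - spec(A), ord=p)
rhs = np.linalg.norm(np.linalg.svd(B, compute_uv=False), ord=p)  # Schatten p-norm
assert lhs <= rhs + 1e-10
```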
Let A be a compact, Hermitian operator on a Hilbert space H. Recall that the spectrum of such an operator (the set of eigenvalues) is a set of real numbers whose only possible cluster point is zero. It is thus convenient to list the positive eigenvalues of A as

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k \geq \cdots,$$
where entries are repeated with multiplicity, as in the matrix case. (To emphasize that the sequence is decreasing, we may write $\lambda_k = \lambda_k^{\downarrow}$.) When H is infinite-dimensional, the above sequence of eigenvalues is necessarily infinite. We now apply the same reasoning as in the matrix case. Letting $S_k \subset H$ be a $k$-dimensional subspace, we obtain the following theorem.

Theorem (min-max) — Let $A$ be a compact, self-adjoint operator on a Hilbert space $H$, whose positive eigenvalues are listed in decreasing order $\lambda_1 \geq \cdots \geq \lambda_k \geq \cdots$. Then:

$$\max_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}, \qquad \min_{S_{k-1}} \max_{\substack{x \perp S_{k-1}, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
A similar pair of equalities holds for negative eigenvalues.
Let $S'$ be the closure of the linear span $S' = \operatorname{span}\{u_k, u_{k+1}, \ldots\}$, where $u_i$ denotes an eigenvector corresponding to $\lambda_i$. The subspace $S'$ has codimension $k - 1$. By the same dimension-count argument as in the matrix case, $S' \cap S_k$ has positive dimension. So there exists $x \in S' \cap S_k$ with $\|x\| = 1$. Since it is an element of $S'$, such an $x$ necessarily satisfies

$$(Ax, x) \leq \lambda_k.$$
Therefore, for all $S_k$,

$$\inf_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
But A is compact, therefore the function $f(x) = (Ax, x)$ is weakly continuous. Furthermore, any bounded set in H is weakly compact. This lets us replace the infimum by a minimum:

$$\min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
So

$$\sup_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
Because equality is achieved when $S_k = \operatorname{span}\{u_1, \ldots, u_k\}$,

$$\max_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
This is the first part of the min-max theorem for compact self-adjoint operators.
Analogously, consider now a $(k - 1)$-dimensional subspace $S_{k-1}$, whose orthogonal complement is denoted by $S_{k-1}^{\perp}$. If $S' = \operatorname{span}\{u_1, \ldots, u_k\}$, then $S' \cap S_{k-1}^{\perp} \neq \{0\}$, so there exists $x \in S' \cap S_{k-1}^{\perp}$ with $\|x\| = 1$ and $(Ax, x) \geq \lambda_k$.
This implies

$$\max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) \geq \lambda_k,$$
where the compactness of A was applied. Taking the infimum over all $(k - 1)$-dimensional subspaces gives

$$\inf_{S_{k-1}} \max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) \geq \lambda_k.$$
Picking $S_{k-1} = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$, we deduce

$$\min_{S_{k-1}} \max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
The min-max theorem also applies to (possibly unbounded) self-adjoint operators.[3][4] Recall that the essential spectrum is the spectrum without isolated eigenvalues of finite multiplicity. Sometimes we have some eigenvalues below the essential spectrum, and we would like to approximate these eigenvalues and eigenfunctions. Let $A$ be self-adjoint, and let $E_1 \leq E_2 \leq E_3 \leq \cdots$ be the eigenvalues of $A$ below the essential spectrum. Then
$$E_n = \min_{\psi_1, \ldots, \psi_n} \max \{\langle \psi, A\psi \rangle : \psi \in \operatorname{span}(\psi_1, \ldots, \psi_n), \ \|\psi\| = 1\}.$$
If we only have $N$ eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing min-max with inf-sup.
$$E_n = \max_{\psi_1, \ldots, \psi_{n-1}} \min \{\langle \psi, A\psi \rangle : \psi \perp \psi_1, \ldots, \psi_{n-1}, \ \|\psi\| = 1\}.$$
If we only have $N$ eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing max-min with sup-inf.
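As a hedged illustration of how these formulas are used in practice (the operator, grid, and finite-difference discretization below are our own choices, not from the text), restricting the maximization in the first formula to a finite-dimensional grid space turns it into a matrix eigenvalue problem; for the harmonic oscillator $A = -\mathrm{d}^2/\mathrm{d}x^2 + x^2$ this recovers the known eigenvalues $E_n = 2n - 1$:

```python
# Sketch: finite-difference discretization of A = -d^2/dx^2 + x^2 on [-L, L]
# with Dirichlet boundary conditions; the lowest eigenvalues of the resulting
# matrix approximate E_n = 2n - 1.
import numpy as np

L, m = 10.0, 800                          # domain half-width and grid size
x = np.linspace(-L, L, m)
h = x[1] - x[0]
H = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2 + np.diag(x**2)
print(np.linalg.eigvalsh(H)[:4])          # approximately [1, 3, 5, 7]
```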
The proofs[5][6] use the following results about self-adjoint operators:
$$\inf \sigma(A) = \inf_{\substack{\psi \in \mathfrak{D}(A), \, \|\psi\| = 1}} \langle \psi, A\psi \rangle$$
and
$$\sup \sigma(A) = \sup_{\substack{\psi \in \mathfrak{D}(A), \, \|\psi\| = 1}} \langle \psi, A\psi \rangle.$$[8]: 77
Tao, Terence (2012). Topics in Random Matrix Theory. Graduate Studies in Mathematics. Providence, RI: American Mathematical Society. ISBN 978-0-8218-7430-1.
Teschl, G. Mathematical Methods in Quantum Mechanics (GSM 99). https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf
Lieb, Elliott H.; Loss, Michael (2001). Analysis. Graduate Studies in Mathematics. Vol. 14 (2nd ed.). Providence, RI: American Mathematical Society. ISBN 0-8218-2783-9.