Let A be an n × n Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient $R_A : \mathbb{C}^n \setminus \{0\} \to \mathbb{R}$ defined by

$$R_A(x) = \frac{(Ax, x)}{(x, x)},$$
where $(\cdot, \cdot)$ denotes the Euclidean inner product on $\mathbb{C}^n$. Equivalently, the Rayleigh–Ritz quotient can be replaced by

$$f(x) = (Ax, x), \qquad \|x\| = 1.$$
The Rayleigh quotient of an eigenvector $v$ is its associated eigenvalue $\lambda$, because $R_A(v) = (\lambda v, v)/(v, v) = \lambda$. For a Hermitian matrix A, the range of the continuous functions $R_A(x)$ and $f(x)$ is a compact interval $[a, b]$ of the real line. The maximum $b$ and the minimum $a$ are the largest and smallest eigenvalues of A, respectively. The min-max theorem is a refinement of this fact.
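As a quick numerical illustration (a minimal NumPy sketch of the fact just stated, using a randomly generated test matrix rather than anything from the text), the Rayleigh quotient of a Hermitian matrix never leaves the interval between its smallest and largest eigenvalues:

```python
# Sketch: sample the Rayleigh quotient of a random Hermitian matrix and check
# that its values stay inside [lambda_min, lambda_max].
import numpy as np

rng = np.random.default_rng(0)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                       # random Hermitian matrix

def rayleigh(A, x):
    """Rayleigh-Ritz quotient R_A(x) = (Ax, x) / (x, x)."""
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

lam = np.linalg.eigvalsh(A)                    # real eigenvalues, ascending
for _ in range(10_000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    assert lam[0] - 1e-12 <= rayleigh(A, x) <= lam[-1] + 1e-12
```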
Let $A$ be a Hermitian operator on an inner product space $V$ of dimension $n$, with spectrum ordered in descending order $\lambda_1 \geq \cdots \geq \lambda_n$.
Let $v_1, \ldots, v_n$ be the corresponding unit-length, mutually orthogonal eigenvectors.
Reverse the spectrum ordering, so that $\xi_1 = \lambda_n, \ldots, \xi_n = \lambda_1$.
(Poincaré's inequality) — Let $M$ be a subspace of $V$ with dimension $k$. Then there exist unit vectors $x, y \in M$ such that

$$\langle x, Ax \rangle \leq \lambda_k \quad \text{and} \quad \langle y, Ay \rangle \geq \xi_k.$$
Part 2 is a corollary of part 1, applied to $-A$.
Since $M$ is a $k$-dimensional subspace and $N := \operatorname{span}(v_k, \ldots, v_n)$ has dimension $n - k + 1$, the dimensions of $M$ and $N$ sum to $n + 1 > n$, so the two subspaces must intersect in at least a line.
Take a unit vector $x \in M \cap N$ and write $x = \sum_{i=k}^{n} c_i v_i$; then $\langle x, Ax \rangle = \sum_{i=k}^{n} |c_i|^2 \lambda_i \leq \lambda_k$, which is what we need.
min-max theorem —

$$\lambda_k = \max_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = k}} \; \min_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle = \min_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = n - k + 1}} \; \max_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle.$$
Part 2 is a corollary of part 1, applied to $-A$.
By Poincaré's inequality, every $k$-dimensional subspace $\mathcal{M}$ contains a unit vector $x$ with $\langle x, Ax \rangle \leq \lambda_k$, so $\lambda_k$ is an upper bound for the max-min expression.
Setting $\mathcal{M} = \operatorname{span}(v_1, \ldots, v_k)$ achieves this upper bound: for unit $x = \sum_{i=1}^{k} c_i v_i$ we have $\langle x, Ax \rangle = \sum_{i=1}^{k} |c_i|^2 \lambda_i \geq \lambda_k$, with equality at $x = v_k$.
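To make the statement concrete, here is a hedged numerical sketch (random test matrix; `min_over_subspace` is our own illustrative helper): the inner minimization over a subspace with orthonormal basis $Q$ reduces to the smallest eigenvalue of the compression $Q^* A Q$, and the max-min is attained on the span of the top $k$ eigenvectors:

```python
# Sketch of the min-max theorem for a random real symmetric matrix.
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 3
X = rng.standard_normal((n, n)); A = (X + X.T) / 2    # random symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)                   # ascending order
lam = eigvals[::-1]                                    # descending: lam[0] >= ...
V = eigvecs[:, ::-1]                                   # columns match lam

def min_over_subspace(A, Q):
    """min of <x, Ax> over unit x in range(Q), where Q has orthonormal columns."""
    return np.linalg.eigvalsh(Q.T @ A @ Q)[0]

# The max-min is attained at M = span(v_1, ..., v_k):
assert np.isclose(min_over_subspace(A, V[:, :k]), lam[k - 1])
# No random k-dimensional subspace does better (Poincaré's inequality):
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
assert min_over_subspace(A, Q) <= lam[k - 1] + 1e-12
```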
Define the partial trace $\operatorname{tr}_V(A)$ to be the trace of the projection of $A$ to $V$. It is equal to $\sum_i v_i^* A v_i$ given an orthonormal basis $\{v_i\}$ of $V$.
Wielandt minimax formula[1]: 44 — Let $1 \leq i_1 < \cdots < i_k \leq n$ be integers. Define a partial flag to be a nested collection $V_1 \subset \cdots \subset V_k$ of subspaces of $\mathbb{C}^n$ such that $\dim(V_j) = i_j$ for all $1 \leq j \leq k$.
Define the associated Schubert variety $X(V_1, \ldots, V_k)$ to be the collection of all $k$-dimensional subspaces $W$ such that $\dim(W \cap V_j) \geq j$.
$$\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) = \sup_{V_1, \ldots, V_k} \; \inf_{W \in X(V_1, \ldots, V_k)} \operatorname{tr}_W(A)$$
This has some corollaries:[2]: 44
Extremal partial trace —

$$\lambda_1(A) + \cdots + \lambda_k(A) = \sup_{\dim(V) = k} \operatorname{tr}_V(A)$$
$$\xi_1(A) + \cdots + \xi_k(A) = \inf_{\dim(V) = k} \operatorname{tr}_V(A)$$
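A small NumPy sketch of the extremal partial trace (random symmetric matrix; `partial_trace` is our own illustrative helper, not a standard routine): $\operatorname{tr}_V(A)$ over $k$-dimensional subspaces is maximized and minimized on spans of eigenvectors:

```python
# Sketch: tr_V(A) over random k-dimensional subspaces lies between the sum of
# the k smallest and the sum of the k largest eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
eigvals, eigvecs = np.linalg.eigh(A)                   # ascending order

def partial_trace(A, Q):
    """tr_V(A) = sum_i v_i* A v_i for an orthonormal basis Q of V."""
    return np.trace(Q.T @ A @ Q)

top = partial_trace(A, eigvecs[:, -k:])                # lambda_1 + ... + lambda_k
bottom = partial_trace(A, eigvecs[:, :k])              # xi_1 + ... + xi_k
for _ in range(1000):
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))   # random k-dim subspace
    assert bottom - 1e-10 <= partial_trace(A, Q) <= top + 1e-10
```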
Corollary — The sum $\lambda_1(A) + \cdots + \lambda_k(A)$ is a convex function of $A$, and $\xi_1(A) + \cdots + \xi_k(A)$ is concave.
(Schur-Horn inequality)

$$\xi_1(A) + \cdots + \xi_k(A) \leq a_{i_1 i_1} + \cdots + a_{i_k i_k} \leq \lambda_1(A) + \cdots + \lambda_k(A)$$

for any set of indices $1 \leq i_1 < \cdots < i_k \leq n$, where $a_{ii}$ denote the diagonal entries of $A$.
Equivalently, this states that the diagonal vector of $A$ is majorized by its eigenspectrum.
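The Schur-Horn inequality can likewise be spot-checked numerically; in this hedged sketch we test every index subset of a small random symmetric matrix:

```python
# Sketch: any k diagonal entries of A sum to a value between the sum of the
# k smallest and the sum of the k largest eigenvalues.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n = 5
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
eigvals = np.linalg.eigvalsh(A)                        # ascending order
d = np.diag(A)
for k in range(1, n + 1):
    for idx in combinations(range(n), k):
        s = d[list(idx)].sum()
        assert eigvals[:k].sum() - 1e-10 <= s <= eigvals[-k:].sum() + 1e-10
```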
Schatten-norm Hölder inequality — Given Hermitian $A, B$ and a Hölder pair $1/p + 1/q = 1$,

$$|\operatorname{tr}(AB)| \leq \|A\|_{S^p} \|B\|_{S^q}$$
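A hedged numerical check of this inequality, computing Schatten norms as $\ell^p$ norms of singular values (the choice $p = 3$, $q = 3/2$ below is arbitrary, our own):

```python
# Sketch: verify |tr(AB)| <= ||A||_{S^p} ||B||_{S^q} for random Hermitian A, B.
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 3.0
q = p / (p - 1)                                        # Hölder conjugate
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2

def schatten(M, p):
    """Schatten p-norm: the l^p norm of the singular values."""
    return np.linalg.norm(np.linalg.svd(M, compute_uv=False), ord=p)

assert abs(np.trace(A @ B)) <= schatten(A, p) * schatten(B, q) + 1e-10
```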
Let N be the nilpotent matrix

$$N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
Define the Rayleigh quotient $R_N(x)$ exactly as above in the Hermitian case. Then the only eigenvalue of $N$ is zero, while the maximum value of the Rayleigh quotient is $1/2$. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
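The computation is short enough to run directly (the test vector below is our own choice; it attains the maximum):

```python
# Sketch: N has only the eigenvalue 0, but its Rayleigh quotient reaches 1/2.
import numpy as np

N = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.linalg.eigvals(N))                 # [0. 0.]
x = np.array([1.0, 1.0]) / np.sqrt(2)       # unit vector
print(x @ N @ x)                            # 0.5 > max eigenvalue
```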
The singular values $\{\sigma_k\}$ of a square matrix $M$ are the square roots of the eigenvalues of $M^*M$ (equivalently $MM^*$). An immediate consequence of the first equality in the min-max theorem is

$$\sigma_k^{\downarrow} = \max_{\substack{S: \dim(S) = k}} \; \min_{\substack{x \in S, \, \|x\| = 1}} \|Mx\|.$$
Similarly,

$$\sigma_k^{\downarrow} = \min_{\substack{S: \dim(S) = n - k + 1}} \; \max_{\substack{x \in S, \, \|x\| = 1}} \|Mx\|.$$
Here $\sigma_k^{\downarrow}$ denotes the $k$th entry in the decreasing sequence of the singular values, so that $\sigma_1^{\downarrow} \geq \sigma_2^{\downarrow} \geq \cdots$.
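A two-line sketch confirming this characterization numerically against NumPy's own SVD (random test matrix):

```python
# Sketch: singular values = square roots of the eigenvalues of M^T M.
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
sigma = np.sqrt(np.linalg.eigvalsh(M.T @ M))[::-1]     # descending order
assert np.allclose(sigma, np.linalg.svd(M, compute_uv=False))
```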
Main article: Poincaré separation theorem
Let A be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of A if there exists an orthogonal projection P onto a subspace of dimension m such that PAP* = B. The Cauchy interlacing theorem states:

Theorem — If the eigenvalues of $A$ are $\alpha_1 \leq \cdots \leq \alpha_n$, and those of $B$ are $\beta_1 \leq \cdots \leq \beta_m$, then for all $j \leq m$,

$$\alpha_j \leq \beta_j \leq \alpha_{n - m + j}.$$
This can be proven using the min-max principle. Let $\beta_i$ have corresponding eigenvector $b_i$, and let $S_j$ be the $j$-dimensional subspace $S_j = \operatorname{span}\{b_1, \ldots, b_j\}$; then

$$\beta_j = \max_{\substack{x \in S_j, \, \|x\| = 1}} (Bx, x) = \max_{\substack{x \in S_j, \, \|x\| = 1}} (PAP^* x, x) = \max_{\substack{x \in S_j, \, \|x\| = 1}} (A(P^* x), P^* x).$$
According to the first part of min-max, $\alpha_j \leq \beta_j$, since $P^* S_j$ is a $j$-dimensional subspace of $\mathbb{C}^n$. On the other hand, if we define $S_{m-j+1} = \operatorname{span}\{b_j, \ldots, b_m\}$, then

$$\beta_j = \min_{\substack{x \in S_{m-j+1}, \, \|x\| = 1}} (Bx, x) = \min_{\substack{x \in S_{m-j+1}, \, \|x\| = 1}} (A(P^* x), P^* x) \leq \alpha_{n - m + j},$$
where the last inequality is given by the second part of min-max.
When $n - m = 1$, we have $\alpha_j \leq \beta_j \leq \alpha_{j+1}$, hence the name interlacing theorem.
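The following hedged sketch compresses a random symmetric matrix to a random $m$-dimensional subspace and verifies the interlacing inequalities (eigenvalues in ascending order, matching the statement above):

```python
# Sketch of Cauchy interlacing: B = Q^T A Q for an orthonormal basis Q of a
# random m-dimensional subspace, then alpha_j <= beta_j <= alpha_{n-m+j}.
import numpy as np

rng = np.random.default_rng(6)
n, m = 6, 4
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))       # orthonormal columns
B = Q.T @ A @ Q                                        # compression of A

alpha = np.linalg.eigvalsh(A)                          # ascending order
beta = np.linalg.eigvalsh(B)
for j in range(m):
    assert alpha[j] - 1e-10 <= beta[j] <= alpha[n - m + j] + 1e-10
```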
Main article: Trace class § Lidskii's theorem
Lidskii inequality — If $1 \leq i_1 < \cdots < i_k \leq n$, then

$$\lambda_{i_1}(A + B) + \cdots + \lambda_{i_k}(A + B) \leq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \lambda_1(B) + \cdots + \lambda_k(B)$$
and

$$\lambda_{i_1}(A + B) + \cdots + \lambda_{i_k}(A + B) \geq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \xi_1(B) + \cdots + \xi_k(B).$$
Note that $\sum_i \lambda_i(A + B) = \operatorname{tr}(A + B) = \sum_i \lambda_i(A) + \lambda_i(B)$. Combined with the Lidskii inequality, this shows that $\lambda(A + B) - \lambda(A) \preceq \lambda(B)$, where $\preceq$ denotes majorization. By the Schur convexity theorem, we then have
p-Wielandt-Hoffman inequality — $\|\lambda(A + B) - \lambda(A)\|_{\ell^p} \leq \|B\|_{S^p}$, where $\|\cdot\|_{S^p}$ stands for the $p$-Schatten norm.
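A hedged check of this inequality for $p = 2$ (the classical Hoffman-Wielandt case; the test matrices are random, and the Schatten norm is computed from singular values):

```python
# Sketch: ||lambda(A+B) - lambda(A)||_p <= ||B||_{S^p} for random symmetric A, B.
import numpy as np

rng = np.random.default_rng(7)
n, p = 6, 2
X = rng.standard_normal((n, n)); A = (X + X.T) / 2
Y = rng.standard_normal((n, n)); B = (Y + Y.T) / 2

def spec(M):
    """Spectrum in descending order."""
    return np.linalg.eigvalsh(M)[::-1]

lhs = np.linalg.norm(spec(A + B) - spec(A), ord=p)
rhs = np.linalg.norm(np.linalg.svd(B, compute_uv=False), ord=p)  # Schatten p-norm
assert lhs <= rhs + 1e-10
```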
Let A be a compact, Hermitian operator on a Hilbert space H. Recall that the spectrum of such an operator (the set of eigenvalues) is a set of real numbers whose only possible cluster point is zero. It is thus convenient to list the positive eigenvalues of A as

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k \geq \cdots,$$
where entries are repeated with multiplicity, as in the matrix case. (To emphasize that the sequence is decreasing, we may write $\lambda_k = \lambda_k^{\downarrow}$.) When H is infinite-dimensional, the above sequence of eigenvalues is necessarily infinite. We now apply the same reasoning as in the matrix case. Letting $S_k \subset H$ be a $k$-dimensional subspace, we obtain the following theorem.

Theorem (min-max) — Let $A$ be a compact, self-adjoint operator on a Hilbert space $H$, whose positive eigenvalues are listed in decreasing order $\lambda_1 \geq \cdots \geq \lambda_k \geq \cdots$. Then:

$$\max_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}, \qquad \min_{S_{k-1}} \max_{\substack{x \perp S_{k-1}, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
A similar pair of equalities holds for negative eigenvalues.
Let $S'$ be the closure of the linear span $S' = \operatorname{span}\{u_k, u_{k+1}, \ldots\}$, where $u_i$ denotes an eigenvector corresponding to $\lambda_i$. The subspace $S'$ has codimension $k - 1$. By the same dimension-count argument as in the matrix case, $S' \cap S_k$ has positive dimension. So there exists $x \in S' \cap S_k$ with $\|x\| = 1$. Since it is an element of $S'$, such an $x$ necessarily satisfies

$$(Ax, x) \leq \lambda_k.$$
Therefore, for all $S_k$,

$$\inf_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
But A is compact, therefore the function $f(x) = (Ax, x)$ is weakly continuous. Furthermore, any bounded set in H is weakly compact. This lets us replace the infimum by a minimum:

$$\min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
So

$$\sup_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) \leq \lambda_k.$$
Because equality is achieved when $S_k = \operatorname{span}\{u_1, \ldots, u_k\}$,

$$\max_{S_k} \min_{\substack{x \in S_k, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
This is the first part of the min-max theorem for compact self-adjoint operators.
Analogously, consider now a $(k - 1)$-dimensional subspace $S_{k-1}$, whose orthogonal complement is denoted by $S_{k-1}^{\perp}$. If $S' = \operatorname{span}\{u_1, \ldots, u_k\}$, then $S' \cap S_{k-1}^{\perp} \neq \{0\}$, so there exists $x \in S' \cap S_{k-1}^{\perp}$ with $\|x\| = 1$ and $(Ax, x) \geq \lambda_k$.
This implies

$$\max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) \geq \lambda_k,$$
where the compactness of A was applied. Taking the infimum over all $(k - 1)$-dimensional subspaces gives

$$\inf_{S_{k-1}} \max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) \geq \lambda_k.$$
Picking $S_{k-1} = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$, we deduce

$$\min_{S_{k-1}} \max_{\substack{x \in S_{k-1}^{\perp}, \, \|x\| = 1}} (Ax, x) = \lambda_k^{\downarrow}.$$
The min-max theorem also applies to (possibly unbounded) self-adjoint operators.[3][4] Recall that the essential spectrum is the spectrum without isolated eigenvalues of finite multiplicity. Sometimes we have some eigenvalues below the essential spectrum, and we would like to approximate these eigenvalues and eigenfunctions. Let $A$ be self-adjoint, and let $E_1 \leq E_2 \leq E_3 \leq \cdots$ be the eigenvalues of $A$ below the essential spectrum. Then
$$E_n = \min_{\psi_1, \ldots, \psi_n} \max \{\langle \psi, A\psi \rangle : \psi \in \operatorname{span}(\psi_1, \ldots, \psi_n), \ \|\psi\| = 1\}.$$
If we only have $N$ eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing min-max with inf-sup.
$$E_n = \max_{\psi_1, \ldots, \psi_{n-1}} \min \{\langle \psi, A\psi \rangle : \psi \perp \psi_1, \ldots, \psi_{n-1}, \ \|\psi\| = 1\}.$$
If we only have $N$ eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing max-min with sup-inf.
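As a hedged illustration of how these formulas are used in practice (the operator, grid, and finite-difference discretization below are our own choices, not from the text), restricting the maximization in the first formula to a finite-dimensional grid space turns it into a matrix eigenvalue problem; for the harmonic oscillator $A = -\mathrm{d}^2/\mathrm{d}x^2 + x^2$ this recovers the known eigenvalues $E_n = 2n - 1$:

```python
# Sketch: finite-difference discretization of A = -d^2/dx^2 + x^2 on [-L, L]
# with Dirichlet boundary conditions; the lowest eigenvalues of the resulting
# matrix approximate E_n = 2n - 1.
import numpy as np

L, m = 10.0, 800                          # domain half-width and grid size
x = np.linspace(-L, L, m)
h = x[1] - x[0]
H = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2 + np.diag(x**2)
print(np.linalg.eigvalsh(H)[:4])          # approximately [1, 3, 5, 7]
```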
The proofs[5][6] use the following results about self-adjoint operators:
$$\inf \sigma(A) = \inf_{\substack{\psi \in \mathfrak{D}(A), \, \|\psi\| = 1}} \langle \psi, A\psi \rangle$$
and
$$\sup \sigma(A) = \sup_{\substack{\psi \in \mathfrak{D}(A), \, \|\psi\| = 1}} \langle \psi, A\psi \rangle.$$[8]: 77
Tao, Terence (2012). Topics in Random Matrix Theory. Graduate Studies in Mathematics. Providence, RI: American Mathematical Society. ISBN 978-0-8218-7430-1.
Teschl, G. Mathematical Methods in Quantum Mechanics (GSM 99). https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf
Lieb, Elliott H.; Loss, Michael (2001). Analysis. Graduate Studies in Mathematics. Vol. 14 (2nd ed.). Providence, RI: American Mathematical Society. ISBN 0-8218-2783-9.