In the mathematical fields of linear algebra and convex analysis, the numerical range or field of values of a complex $n\times n$ matrix $A$ is the set
$$W(A)=\left\{\frac{\mathbf{x}^{*}A\mathbf{x}}{\mathbf{x}^{*}\mathbf{x}}\mid \mathbf{x}\in\mathbb{C}^{n},\ \mathbf{x}\neq 0\right\}=\left\{\langle\mathbf{x},A\mathbf{x}\rangle\mid \mathbf{x}\in\mathbb{C}^{n},\ \|\mathbf{x}\|_{2}=1\right\}$$
where $\mathbf{x}^{*}$ denotes the conjugate transpose of the vector $\mathbf{x}$. The numerical range includes, in particular, the diagonal entries of the matrix (obtained by choosing $\mathbf{x}$ equal to the unit vectors along the coordinate axes) and the eigenvalues of the matrix (obtained by choosing $\mathbf{x}$ equal to unit eigenvectors).
In engineering, the numerical range is used to give a rough estimate of the eigenvalues of $A$. More recently, generalizations of the numerical range have been used to study quantum computing.
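The definition can be explored numerically. The following is a minimal sketch, assuming NumPy; the function names `sample_numerical_range` and `rayleigh` and the example matrix are illustrative choices, not part of the source. It samples points of $W(A)$ from random unit vectors and evaluates the quotient at coordinate vectors and eigenvectors.

```python
import numpy as np

def rayleigh(A, x):
    """Return x* A x / x* x for a nonzero vector x."""
    return np.vdot(x, A @ x) / np.vdot(x, x)

def sample_numerical_range(A, num_samples=2000, seed=0):
    """Return x* A x for random unit vectors x; every value lies in W(A)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Complex Gaussian vectors (one per row), normalised to unit length.
    X = rng.standard_normal((num_samples, n)) + 1j * rng.standard_normal((num_samples, n))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    # Row k of X @ A.T is (A x_k)^T, so this is the row-wise value x_k* A x_k.
    return np.einsum("ij,ij->i", X.conj(), X @ A.T)

# Illustrative matrix (an assumption, not taken from the text).
A = np.array([[1, 2], [0, -1]], dtype=complex)
points = sample_numerical_range(A)

# Coordinate vectors recover the diagonal entries, eigenvectors the eigenvalues.
eigvals, eigvecs = np.linalg.eig(A)
diag_values = [rayleigh(A, np.eye(2, dtype=complex)[:, k]) for k in range(2)]
eig_values = [rayleigh(A, eigvecs[:, k]) for k in range(2)]
```

Random sampling only fills the interior of the set unevenly, so it is suited to a quick visual check rather than to locating the boundary.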
A related concept is the numerical radius, which is the largest absolute value of the numbers in the numerical range, i.e.
$$r(A)=\sup\{|\lambda|:\lambda\in W(A)\}=\sup_{\|\mathbf{x}\|_{2}=1}|\langle\mathbf{x},A\mathbf{x}\rangle|.$$
Properties
In what follows, the sum of two sets denotes their sumset, $S+T=\{s+t : s\in S,\ t\in T\}$.
General properties
1. The numerical range is the range of the Rayleigh quotient.
2. (Hausdorff–Toeplitz theorem) The numerical range is convex and compact.
3. $W(\alpha A+\beta I)=\alpha W(A)+\{\beta\}$ for every square matrix $A$ and all complex numbers $\alpha$ and $\beta$. Here $I$ is the identity matrix.
4. $W(A)$ is a subset of the closed right half-plane if and only if $A+A^{*}$ is positive semidefinite.
5. The numerical range $W(\cdot)$ is the only function on the set of square matrices that satisfies (2), (3) and (4).
6. $W(UAU^{*})=W(A)$ for any unitary $U$.
7. $W(A^{*})=W(A)^{*}$, where $W(A)^{*}$ denotes the set of complex conjugates of the elements of $W(A)$.
8. If $A$ is Hermitian, then $W(A)$ lies on the real line. If $A$ is anti-Hermitian, then $W(A)$ lies on the imaginary axis.
9. $W(A)=\{z\}$ if and only if $A=zI$.
10. (Sub-additive) $W(A+B)\subseteq W(A)+W(B)$.
11. $W(A)$ contains all the eigenvalues of $A$.
12. The numerical range of a $2\times 2$ matrix is a filled ellipse (possibly degenerate).
13. $W(A)$ is a real line segment $[\alpha,\beta]$ if and only if $A$ is a Hermitian matrix whose smallest and largest eigenvalues are $\alpha$ and $\beta$.
Normal matrices
1. If $A$ is normal, and $x\in\operatorname{span}(v_{1},\dots,v_{k})$ is a unit vector, where $v_{1},\ldots,v_{k}$ are eigenvectors of $A$ corresponding to $\lambda_{1},\ldots,\lambda_{k}$, respectively, then $\langle x,Ax\rangle\in\operatorname{hull}\left(\lambda_{1},\ldots,\lambda_{k}\right)$.
2. If $A$ is a normal matrix, then $W(A)$ is the convex hull of its eigenvalues.
3. If $\alpha$ is a sharp point on the boundary of $W(A)$, then $\alpha$ is a normal eigenvalue of $A$.
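The half-plane property suggests a way to trace the boundary of $W(A)$ numerically: for each direction $e^{i\theta}$, the largest eigenvalue of the Hermitian part of $e^{-i\theta}A$ locates the supporting line in that direction, and a corresponding unit eigenvector $x$ yields the boundary point $x^{*}Ax$. The following is a hedged sketch of that idea, assuming NumPy; `boundary_points` is a hypothetical helper name and the test matrices are illustrative.

```python
import numpy as np

def boundary_points(A, num_angles=360):
    """Approximate boundary points of W(A), one per direction e^{i theta}."""
    points = []
    for theta in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
        rotated = np.exp(-1j * theta) * A
        H = (rotated + rotated.conj().T) / 2   # Hermitian part of e^{-i theta} A
        _, vecs = np.linalg.eigh(H)            # eigenvalues returned in ascending order
        x = vecs[:, -1]                        # unit eigenvector for the largest eigenvalue
        points.append(np.vdot(x, A @ x))       # a point of W(A) on the supporting line
    return np.array(points)

# Illustrative normal matrix: W(A) is the triangle with vertices 0, 1 and i.
T = np.diag([0, 1, 1j])
triangle_boundary = boundary_points(T)

# Illustrative non-normal matrix: W(A) is a disk centred at the origin.
J = np.array([[0, 1], [0, 0]], dtype=complex)
disk_boundary = boundary_points(J)
```

For a normal matrix the computed points cluster at the eigenvalues that are vertices of the convex hull, so the polygon is recovered by taking the convex hull of the output.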
Numerical radius
1. $r(\cdot)$ is a norm on the space of $n\times n$ matrices that is invariant under unitary similarity, i.e. $r(UAU^{*})=r(A)$ for every unitary $U$.
2. $r(A)\leq\|A\|_{\operatorname{op}}\leq 2r(A)$, where $\|\cdot\|_{\operatorname{op}}$ denotes the operator norm.[1][2][3][4]
3. $r(A)=\|A\|_{\operatorname{op}}$ if (but not only if) $A$ is normal.
4. $r(A^{k})\leq r(A)^{k}$ for every positive integer $k$.
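The numerical radius can be estimated from the identity $r(A)=\max_{\theta}\lambda_{\max}\!\left(\tfrac12(e^{i\theta}A+e^{-i\theta}A^{*})\right)$. The sketch below, assuming NumPy, evaluates the maximum on a finite grid of angles, so in general it returns a slight underestimate; `numerical_radius` is a hypothetical helper name and the matrices are illustrative.

```python
import numpy as np

def numerical_radius(A, num_angles=720):
    """Grid estimate of r(A) = max over theta of lambda_max(Hermitian part of e^{i theta} A)."""
    best = 0.0
    for theta in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
        rotated = np.exp(1j * theta) * A
        H = (rotated + rotated.conj().T) / 2
        best = max(best, np.linalg.eigvalsh(H)[-1])  # largest eigenvalue of H
    return best

# Non-normal example: the operator norm is twice the numerical radius.
J = np.array([[0, 1], [0, 0]], dtype=complex)
r_J, op_J = numerical_radius(J), np.linalg.norm(J, 2)

# Normal example: the numerical radius coincides with the operator norm.
N = np.diag([1, 1j, -2])
r_N, op_N = numerical_radius(N), np.linalg.norm(N, 2)
```

The first example shows that the constant 2 in the second property cannot be improved, and the second illustrates the equality for normal matrices.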
Proofs
Most of the claims follow directly from the definition. Proofs of the less immediate ones are given below.
General properties
Proof of (13)
If $A$ is Hermitian, then it is normal, so $W(A)$ is the convex hull of its eigenvalues, which are all real. Hence $W(A)$ is the segment between the smallest and the largest eigenvalue.
Conversely, assume $W(A)$ is on the real line. Decompose $A=B+C$, where $B$ is a Hermitian matrix and $C$ an anti-Hermitian matrix. Then $\langle x,Ax\rangle=\langle x,Bx\rangle+\langle x,Cx\rangle$, where the first term is real and the second is purely imaginary. If $C\neq 0$, then by property (9) $W(C)\neq\{0\}$, so $\langle x,Cx\rangle\neq 0$ for some unit vector $x$, and $W(A)$ would stray from the real line. Thus $C=0$, and $A$ is Hermitian.
Proof of (12)
Since $\langle x,Ax\rangle=\operatorname{tr}(Axx^{*})$, the elements of $W(A)$ are of the form $\operatorname{tr}(AP)$, where $P=xx^{*}$ is the orthogonal projection of $\mathbb{C}^{2}$ onto a one-dimensional subspace.
The space of all one-dimensional subspaces of $\mathbb{C}^{2}$ is $\mathbb{P}\mathbb{C}^{1}$, which is a 2-sphere. The image of a 2-sphere under a real affine map into the plane is a filled ellipse (possibly degenerating to a line segment or a point).
In more detail, such $P$ are of the form
$$\frac{1}{2}I+\frac{1}{2}\begin{bmatrix}\cos 2\theta & e^{i\phi}\sin 2\theta\\ e^{-i\phi}\sin 2\theta & -\cos 2\theta\end{bmatrix}=\frac{1}{2}\begin{bmatrix}1+z & x+iy\\ x-iy & 1-z\end{bmatrix}$$
where $(x,y,z)$, satisfying $x^{2}+y^{2}+z^{2}=1$, is a point on the unit 2-sphere.
Therefore $W(A)$, regarded as a subset of $\mathbb{R}^{2}$, is the image of the unit 2-sphere under the composition of two real affine maps, $(x,y,z)\mapsto\frac{1}{2}\begin{bmatrix}1+z & x+iy\\ x-iy & 1-z\end{bmatrix}$ and $M\mapsto\operatorname{tr}(AM)$, and this composition maps the 2-sphere to a filled ellipse.
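As a worked instance of this argument, the trace formula can be evaluated by hand for one concrete matrix. The matrix below is an illustrative choice, not taken from the source, and $P$ is the projection parametrized by $(x,y,z)$ as above.

```latex
A=\begin{bmatrix}1&2\\0&-1\end{bmatrix},\qquad
\operatorname{tr}(AP)=\tfrac12\bigl[(1+z)-(1-z)+2(x-iy)\bigr]=(x+z)-iy .
% Write u = x + z (real part) and v = -y (imaginary part).
% From x^2 + y^2 + z^2 = 1 we get (x+z)^2 \le 2(x^2+z^2) = 2(1-y^2), hence
\frac{u^{2}}{2}+v^{2}\le 1 ,
% and every (u,v) satisfying this inequality is attained by some point of the sphere.
```

So for this matrix $W(A)$ is the filled ellipse with semi-axis $\sqrt{2}$ along the real axis and semi-axis $1$ along the imaginary axis; its foci $\pm 1$ are the eigenvalues of $A$.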
Proof of (2)
$W(A)$ is the image of the unit sphere, which is compact, under the continuous map $x\mapsto\langle x,Ax\rangle$, so it is compact.
For convexity, take unit vectors $x,y$ with $\langle x,Ax\rangle\neq\langle y,Ay\rangle$; such $x,y$ are linearly independent. Let $P$ be an $n\times 2$ matrix whose columns form an orthonormal basis of the span of $x,y$, so that $P^{*}AP$ is the compression of $A$ to this span. Then $W(P^{*}AP)$ is a filled ellipse by the previous result, hence convex. Since $\|Pu\|_{2}=\|u\|_{2}$ and $\langle u,P^{*}APu\rangle=\langle Pu,APu\rangle$ for every $u\in\mathbb{C}^{2}$, we have
$$\langle x,Ax\rangle,\ \langle y,Ay\rangle\in W(P^{*}AP)\subseteq W(A),$$
and therefore the whole line segment joining $\langle x,Ax\rangle$ and $\langle y,Ay\rangle$ lies in $W(P^{*}AP)\subseteq W(A)$.
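The compression step in this proof can be checked numerically. The following is a small sketch, assuming NumPy and linearly independent inputs; `compress_to_span` and `quad` are hypothetical helper names, and the random test data are illustrative.

```python
import numpy as np

def quad(A, x):
    """Return the value x* A x."""
    return np.vdot(x, A @ x)

def compress_to_span(A, x, y):
    """Return (Q, A2): Q has orthonormal columns spanning span{x, y},
    and A2 = Q* A Q is the 2 x 2 compression of A to that span."""
    Q, _ = np.linalg.qr(np.column_stack([x, y]))  # reduced QR, Q is n x 2
    return Q, Q.conj().T @ A @ Q

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x /= np.linalg.norm(x)
y /= np.linalg.norm(y)

Q, A2 = compress_to_span(A, x, y)
u = Q.conj().T @ x               # coordinates of x in the orthonormal basis
gap = abs(quad(A2, u) - quad(A, x))   # should vanish up to rounding
```

Because the value at $x$ is reproduced by the $2\times 2$ compression, convexity of the full numerical range reduces to the elliptical case.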
Proof of (5)
Let $W$ satisfy properties (2), (3) and (4). Let $W_{0}$ be the original numerical range.
Fix some matrix $A$. We show that the supporting lines of $W(A)$ and $W_{0}(A)$ are identical. This then implies that $W(A)=W_{0}(A)$, since both are convex and compact.
By property (4), $W(A)$ is nonempty. Let $z$ be a point on the boundary of $W(A)$; then we can translate and rotate the complex plane so that $z$ moves to the origin and the region $W(A)$ falls entirely within the closed right half-plane $\mathbb{C}^{+}$. That is, for some $\phi\in\mathbb{R}$, the set $e^{i\phi}(W(A)-z)$ lies entirely within $\mathbb{C}^{+}$, while for any $t>0$, the set $e^{i\phi}(W(A)-z)-t$ does not lie entirely in $\mathbb{C}^{+}$.
Properties (3) and (4) of $W$ then imply that
$$e^{i\phi}(A-zI)+e^{-i\phi}(A-zI)^{*}\succeq 0$$
and that the inequality is sharp, meaning that $e^{i\phi}(A-zI)+e^{-i\phi}(A-zI)^{*}$ has a zero eigenvalue. This is a complete characterization of the supporting lines of $W(A)$.
The same argument applies to $W_{0}(A)$, so they have the same supporting lines.
Normal matrices
Proof of (1), (2)
For (2), if $A$ is normal, then it has an orthonormal basis of eigenvectors, so (2) reduces to (1) with $k=n$.
Since $A$ is normal, by the spectral theorem, there exists a unitary matrix $U$ such that $A=UDU^{*}$, where $D$ is a diagonal matrix containing the eigenvalues $\lambda_{1},\lambda_{2},\ldots,\lambda_{n}$ of $A$. In particular, eigenvectors for distinct eigenvalues are orthogonal, so we may assume that $v_{1},\ldots,v_{k}$ are orthonormal.
Let $x=c_{1}v_{1}+c_{2}v_{2}+\cdots+c_{k}v_{k}$ be a unit vector. Using the linearity of the inner product, that $Av_{j}=\lambda_{j}v_{j}$, and that $\left\{v_{i}\right\}$ are orthonormal, we have:
$$\langle x,Ax\rangle=\sum_{i,j=1}^{k}c_{i}^{*}c_{j}\left\langle v_{i},\lambda_{j}v_{j}\right\rangle=\sum_{i=1}^{k}\left|c_{i}\right|^{2}\lambda_{i}\in\operatorname{hull}\left(\lambda_{1},\ldots,\lambda_{k}\right),$$
where the last step uses $\left|c_{i}\right|^{2}\geq 0$ and $\sum_{i=1}^{k}\left|c_{i}\right|^{2}=\|x\|_{2}^{2}=1$.
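The convex-combination identity can be verified numerically. The sketch below, assuming NumPy, builds a normal matrix from a random unitary and random eigenvalues (both illustrative choices) and compares $\langle x,Ax\rangle$ with $\sum_{i}|c_{i}|^{2}\lambda_{i}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random unitary U from a QR factorisation, and random complex eigenvalues.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = U @ np.diag(lam) @ U.conj().T      # normal by construction

# Random unit vector and its coordinates in the orthonormal eigenbasis.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x /= np.linalg.norm(x)
c = U.conj().T @ x
weights = np.abs(c) ** 2               # nonnegative weights summing to 1

lhs = np.vdot(x, A @ x)                # the value <x, A x>
rhs = np.sum(weights * lam)            # convex combination of the eigenvalues
```

The two quantities agree up to rounding error, which is the content of the displayed equation with $k=n$.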
Proof of (3)
By the affine property (3) of $W$, we can translate and rotate the complex plane, so that we reduce to the case where $\partial W(A)$ has a sharp point at $0$, and the two supporting lines at that point make angles $\phi_{1},\phi_{2}$ with the imaginary axis, such that $\phi_{1}<\phi_{2}$ and $e^{i\phi_{1}}\neq e^{i\phi_{2}}$ since the point is sharp.
Since $0\in W(A)$, there exists a unit vector $x_{0}$ such that $x_{0}^{*}Ax_{0}=0$.
By general property (4), the numerical range lies in the sector defined by
$$\operatorname{Re}\left(e^{i\theta}\langle x,Ax\rangle\right)\geq 0\quad\text{for all }\theta\in[\phi_{1},\phi_{2}]\text{ and nonzero }x\in\mathbb{C}^{n}.$$
For each fixed $\theta$, the left-hand side is nonnegative and vanishes at $x=x_{0}$, so $x_{0}$ is a minimizer and the directional derivative in any direction $y$ must vanish there. Specifically:
$$\left.\frac{d}{dt}\operatorname{Re}\left(e^{i\theta}\langle x_{0}+ty,A(x_{0}+ty)\rangle\right)\right|_{t=0}=0\quad\forall y\in\mathbb{C}^{n},\ \theta\in[\phi_{1},\phi_{2}].$$
Expanding this derivative:
$$\operatorname{Re}\left(e^{i\theta}\left(\langle y,Ax_{0}\rangle+\langle x_{0},Ay\rangle\right)\right)=0\quad\forall y\in\mathbb{C}^{n},\ \theta\in[\phi_{1},\phi_{2}].$$
Since the above holds for all $\theta$ in the interval $[\phi_{1},\phi_{2}]$, which has positive length, we must have:
$$\langle y,Ax_{0}\rangle+\langle x_{0},Ay\rangle=0\quad\forall y\in\mathbb{C}^{n}.$$
For any $y\in\mathbb{C}^{n}$ and $\alpha\in\mathbb{C}$, substitute $\alpha y$ into the equation; because the inner product is conjugate-linear in its first argument, this gives
$$\alpha^{*}\langle y,Ax_{0}\rangle+\alpha\langle x_{0},Ay\rangle=0.$$
Choosing $\alpha=1$ and $\alpha=i$ and simplifying, we obtain $\langle y,Ax_{0}\rangle=0$ and $\langle x_{0},Ay\rangle=0$ for all $y$, thus $Ax_{0}=0$ and $A^{*}x_{0}=0$. Hence $x_{0}$ is an eigenvector of both $A$ and $A^{*}$ for the eigenvalue $0$, that is, $0$ is a normal eigenvalue of $A$.
Numerical radius
Proof of (2)
Let $v=\arg\max_{\|x\|_{2}=1}|\langle x,Ax\rangle|$; the maximum is attained because the unit sphere is compact. We have $r(A)=|\langle v,Av\rangle|$.
By Cauchy–Schwarz,
$$|\langle v,Av\rangle|\leq\|v\|_{2}\|Av\|_{2}=\|Av\|_{2}\leq\|A\|_{\operatorname{op}}.$$
For the second inequality, let $A=B+iC$, where $B=\tfrac{1}{2}(A+A^{*})$ and $C=\tfrac{1}{2i}(A-A^{*})$ are Hermitian. By the triangle inequality,
$$\|A\|_{\operatorname{op}}\leq\|B\|_{\operatorname{op}}+\|C\|_{\operatorname{op}}.$$
For every unit vector $x$ we have $\langle x,Ax\rangle=\langle x,Bx\rangle+i\langle x,Cx\rangle$ with $\langle x,Bx\rangle$ and $\langle x,Cx\rangle$ real, so $|\langle x,Bx\rangle|\leq|\langle x,Ax\rangle|$ and $|\langle x,Cx\rangle|\leq|\langle x,Ax\rangle|$. Since $B$ and $C$ are Hermitian, hence normal, it follows that $\|B\|_{\operatorname{op}}=r(B)\leq r(A)$ and $\|C\|_{\operatorname{op}}=r(C)\leq r(A)$, which gives $\|A\|_{\operatorname{op}}\leq 2r(A)$.
Generalisations
- C-numerical range
- Higher-rank numerical range
- Joint numerical range
- Product numerical range
- Polynomial numerical hull
Bibliography
- Toeplitz, Otto (1918). "Das algebraische Analogon zu einem Satze von Fejér" (PDF). Mathematische Zeitschrift (in German). 2 (1–2): 187–197. doi:10.1007/BF01212904. ISSN 0025-5874.
- Hausdorff, Felix (1919). "Der Wertvorrat einer Bilinearform". Mathematische Zeitschrift (in German). 3 (1): 314–316. doi:10.1007/BF01292610. ISSN 0025-5874.
- Choi, M.D.; Kribs, D.W.; Życzkowski (2006), "Quantum error correcting codes from the compression formalism", Rep. Math. Phys., 58 (1): 77–91, arXiv:quant-ph/0511101, Bibcode:2006RpMP...58...77C, doi:10.1016/S0034-4877(06)80041-8, S2CID 119427312.
- Bhatia, Rajendra (1997). Matrix analysis. Graduate texts in mathematics. New York Berlin Heidelberg: Springer. ISBN 978-0-387-94846-1.
- Dirr, G.; Helmke, U.; Kleinsteuber, M.; Schulte-Herbrüggen, Th. (2006), "A new type of C-numerical range arising in quantum computing", Proc. Appl. Math. Mech., 6: 711–712, doi:10.1002/pamm.200610336.
- Bonsall, F.F.; Duncan, J. (1971), Numerical Ranges of Operators on Normed Spaces and of Elements of Normed Algebras, Cambridge University Press, ISBN 978-0-521-07988-4.
- Bonsall, F.F.; Duncan, J. (1971), Numerical Ranges II, Cambridge University Press, ISBN 978-0-521-20227-5.
- Horn, Roger A.; Johnson, Charles R. (1991), Topics in Matrix Analysis, Cambridge University Press, Chapter 1, ISBN 978-0-521-46713-1.
- Horn, Roger A.; Johnson, Charles R. (1990), Matrix Analysis, Cambridge University Press, Ch. 5.7, ex. 21, ISBN 0-521-30586-1
- Li, C.K. (1996), "A simple proof of the elliptical range theorem", Proc. Am. Math. Soc., 124 (7): 1985, doi:10.1090/S0002-9939-96-03307-2.
- Keeler, Dennis S.; Rodman, Leiba; Spitkovsky, Ilya M. (1997), "The numerical range of 3 × 3 matrices", Linear Algebra and Its Applications, 252 (1–3): 115, doi:10.1016/0024-3795(95)00674-5.
- Johnson, Charles R. (1976). "Functional characterizations of the field of values and the convex hull of the spectrum" (PDF). Proceedings of the American Mathematical Society. 61 (2). American Mathematical Society (AMS): 201–204. doi:10.1090/s0002-9939-1976-0437555-3. ISSN 0002-9939.
References
1. "'Well-known' inequality for numerical radius of an operator". StackExchange. https://math.stackexchange.com/questions/3278149/
2. "Upper bound for norm of Hilbert space operator". StackExchange. https://math.stackexchange.com/questions/597880/
3. "Inequalities for numerical radius of complex Hilbert space operator". StackExchange. https://math.stackexchange.com/questions/4020968/
4. Hilary Priestley. "B4b hilbert spaces: extended synopses 9. Spectral theory" (PDF). "In fact, ‖T‖ = max(−mT , MT) = wT. This fails for non-self-adjoint operators, but wT ≤ ‖T‖ ≤ 2wT in the complex case."