Companion matrix

<h2 id="similarity-to-companion-matrix">Similarity to companion matrix</h2>
<p>Any <i>n</i> × <i>n</i> matrix <i>A</i> with entries in a <a href="/facts/Field_(mathematics)/xAjAS4ko">field</a> <i>F</i> has characteristic polynomial \(p(x)=\det(xI-A)\), which in turn has companion matrix \(C(p)\). These matrices are related as follows.
</p><p>The following statements are equivalent:
</p>
<ul><li><i>A</i> is <a href="/facts/Similar_(linear_algebra)/ySevpav1">similar</a> over <i>F</i> to \(C(p)\), i.e. <i>A</i> can be conjugated to its companion matrix by matrices in GL<sub><i>n</i></sub>(<i>F</i>);</li>
<li>the characteristic polynomial \(p(x)\) coincides with the minimal polynomial of <i>A</i>, i.e. the minimal polynomial has degree <i>n</i>;</li>
<li>the linear mapping \(A:F^{n}\to F^{n}\) makes \(F^{n}\) a <a href="/facts/Cyclic_module/w9tMplLC">cyclic</a> \(F[A]\)-module, having a basis of the form \(\{v,Av,\ldots ,A^{n-1}v\}\); or equivalently \(F^{n}\cong F[x]/(p(x))\) as \(F[A]\)-modules.</li></ul>
<p>If the above hold, one says that <i>A</i> is <i>non-derogatory</i>.
</p><p>Not every square matrix is similar to a companion matrix, but every square matrix is similar to a <a href="/facts/Block_diagonal/fPI8HX9f">block diagonal</a> matrix made of companion matrices. If we also demand that the polynomial of each diagonal block divides the next one, they are uniquely determined by <i>A</i>, and this gives the <a href="/facts/Frobenius_normal_form/EUPch5SM">rational canonical form</a> of <i>A</i>. 
</p>
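<p>The cyclic-vector criterion above can be checked mechanically. The following is a minimal sketch in plain Python with exact rational arithmetic. The helper names (<code>companion</code>, <code>krylov</code>, <code>rank</code>, <code>is_cyclic_vector</code>) are illustrative assumptions, and the companion matrix is taken with ones on the subdiagonal and the negated coefficients in the last column, so that its transpose has them in the last row.</p>

```python
from fractions import Fraction

def companion(c):
    """C(p) for monic p(x) = c[0] + c[1]*x + ... + c[n-1]*x**(n-1) + x**n.
    Assumed convention: ones on the subdiagonal, -c in the last column."""
    n = len(c)
    C = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = Fraction(1)
    for i in range(n):
        C[i][n - 1] = -Fraction(c[i])
    return C

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def krylov(M, v):
    """Return the list [v, Mv, ..., M^(n-1) v]."""
    vectors = [v]
    for _ in range(len(M) - 1):
        vectors.append(matvec(M, vectors[-1]))
    return vectors

def rank(rows):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [list(map(Fraction, r)) for r in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_cyclic_vector(M, v):
    """True if {v, Mv, ..., M^(n-1) v} is a basis."""
    return rank(krylov(M, v)) == len(M)

# p(x) = x^3 - 6x^2 + 11x - 6, so c = (-6, 11, -6).
C = companion([-6, 11, -6])
e1 = [Fraction(1), Fraction(0), Fraction(0)]
cyclic_for_companion = is_cyclic_vector(C, e1)

# The 2 x 2 identity matrix is derogatory: its minimal polynomial x - 1 has
# degree 1 < 2, so no vector is cyclic (a few candidates are sampled here).
I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
cyclic_for_identity = any(
    is_cyclic_vector(I2, [Fraction(a), Fraction(b)])
    for a in range(-2, 3) for b in range(-2, 3)
)
```

<p>For the companion matrix, the first standard basis vector is cyclic because each multiplication shifts it to the next basis vector. The identity matrix fails the test for every sampled vector.</p>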
<h2 id="diagonalizability">Diagonalizability</h2>
<p>The roots of the characteristic polynomial \(p(x)\) are the <a href="/facts/Eigenvalue/8TjEoT8u">eigenvalues</a> of \(C(p)\). If there are <i>n</i> distinct eigenvalues \(\lambda _{1},\ldots ,\lambda _{n}\), then \(C(p)\) is <a href="/facts/Diagonalizable/y3MkyTCy">diagonalizable</a> as \(C(p)=V^{-1}DV\), where <i>D</i> is the diagonal matrix and <i>V</i> is the <a href="/facts/Vandermonde_matrix/8oSbUk6u">Vandermonde matrix</a> corresponding to the λ's:

\[D={\begin{bmatrix}\lambda _{1}&0&\cdots &0\\0&\lambda _{2}&\cdots &0\\0&0&\cdots &\lambda _{n}\end{bmatrix}},\qquad V={\begin{bmatrix}1&\lambda _{1}&\lambda _{1}^{2}&\cdots &\lambda _{1}^{n-1}\\1&\lambda _{2}&\lambda _{2}^{2}&\cdots &\lambda _{2}^{n-1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&\lambda _{n}&\lambda _{n}^{2}&\cdots &\lambda _{n}^{n-1}\end{bmatrix}}.\]

Indeed, a direct computation shows that the transpose \(C(p)^{T}\) has eigenvectors \(v_{i}=(1,\lambda _{i},\ldots ,\lambda _{i}^{n-1})\) with \(C(p)^{T}(v_{i})=\lambda _{i}v_{i}\), which follows from \(p(\lambda _{i})=c_{0}+c_{1}\lambda _{i}+\cdots +c_{n-1}\lambda _{i}^{n-1}+\lambda _{i}^{n}=0\). Thus, its diagonalizing <a href="/facts/Change_of_basis/Hnu1Spcc">change of basis</a> matrix is \(V^{T}=[v_{1}^{T}\ldots v_{n}^{T}]\), meaning \(C(p)^{T}=V^{T}D\,(V^{T})^{-1}\), and taking the transpose of both sides gives \(C(p)=V^{-1}DV\).
</p><p>We can read the eigenvectors of \(C(p)\) with \(C(p)(w_{i})=\lambda _{i}w_{i}\) from the equation \(C(p)=V^{-1}DV\): they are the column vectors of the <a href="/facts/Vandermonde_matrix/8oSbUk6u">inverse Vandermonde matrix</a> \(V^{-1}=[w_{1}^{T}\cdots w_{n}^{T}]\). This matrix is known explicitly, giving the eigenvectors \(w_{i}=(L_{0i},\ldots ,L_{(n-1)i})\), with coordinates equal to the coefficients of the <a href="/facts/Lagrange_polynomial/MnpMmhtC">Lagrange polynomials</a>

\[L_{i}(x)=L_{0i}+L_{1i}x+\cdots +L_{(n-1)i}x^{n-1}=\prod _{j\neq i}{\frac {x-\lambda _{j}}{\lambda _{i}-\lambda _{j}}}={\frac {p(x)}{(x-\lambda _{i})\,p'(\lambda _{i})}}.\]

Alternatively, the scaled eigenvectors \({\tilde {w}}_{i}=p'(\lambda _{i})\,w_{i}\) have simpler coefficients, namely those of the polynomial \(p(x)/(x-\lambda _{i})\).
</p><p>If \(p(x)\) has multiple roots, then \(C(p)\) is not diagonalizable. Rather, the <a href="/facts/Jordan_canonical_form/M7ezYbnl">Jordan canonical form</a> of \(C(p)\) contains one <a href="/facts/Jordan_block/RSDz7mn5">Jordan block</a> for each distinct root; if the multiplicity of the root is <i>m</i>, then the block is an <i>m</i> × <i>m</i> matrix with \(\lambda \) on the diagonal and 1 in the entries just above the diagonal. In this case, <i>V</i> becomes a <a href="/facts/Vandermonde_matrix/8oSbUk6u">confluent Vandermonde matrix</a>.<a class="footnote-ref" id="fnref:2" href="#fn:2"><sup>2</sup></a>
</p>
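<p>As a small sanity check of the diagonalization, the sketch below uses \(p(x)=(x-1)(x-2)(x-3)\) and integer arithmetic only. The helper names are illustrative assumptions, and the companion matrix is taken with ones on the subdiagonal and the negated coefficients in the last column. It verifies \(VC(p)=DV\), which is equivalent to \(C(p)=V^{-1}DV\), and that the coefficients of \(p(x)/(x-\lambda _{i})\) form an eigenvector of \(C(p)\).</p>

```python
def companion(c):
    """C(p) for monic p(x) = c[0] + c[1]*x + ... + c[n-1]*x**(n-1) + x**n."""
    n = len(c)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1          # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -c[i]      # last column holds -c_0, ..., -c_{n-1}
    return C

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def vandermonde(lams):
    """Row i is (1, lam_i, lam_i**2, ..., lam_i**(n-1))."""
    n = len(lams)
    return [[lam ** k for k in range(n)] for lam in lams]

def diagonal(lams):
    n = len(lams)
    return [[lams[i] if i == j else 0 for j in range(n)] for i in range(n)]

def quotient_by_root(c, lam):
    """Coefficients (low to high degree) of p(x) / (x - lam), assuming p(lam) == 0.
    Uses q[k-1] = c[k] + lam * q[k], obtained by comparing coefficients
    in (x - lam) * q(x) = p(x)."""
    n = len(c)
    coeffs = list(c) + [1]       # append the leading coefficient of the monic p
    q = [0] * n
    q[n - 1] = coeffs[n]
    for k in range(n - 1, 0, -1):
        q[k - 1] = coeffs[k] + lam * q[k]
    return q

roots = [1, 2, 3]
c = [-6, 11, -6]                 # p(x) = x^3 - 6x^2 + 11x - 6
C = companion(c)
V = vandermonde(roots)
D = diagonal(roots)

VC = matmul(V, C)
DV = matmul(D, V)

# Scaled eigenvectors of C(p): coefficient vectors of p(x) / (x - lam_i).
scaled_eigenvectors = [quotient_by_root(c, lam) for lam in roots]
```

<p>Here <code>VC</code> and <code>DV</code> agree entry by entry, and each vector in <code>scaled_eigenvectors</code> is mapped by <code>C</code> to its multiple by the corresponding root.</p>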
<h2 id="linear-recursive-sequences">Linear recursive sequences</h2>
<p>A <a href="/facts/Linear_recursive_sequence/U3vHagFE">linear recursive sequence</a> defined by \(a_{k+n}=-c_{0}a_{k}-c_{1}a_{k+1}-\cdots -c_{n-1}a_{k+n-1}\) for \(k\geq 0\) has the characteristic polynomial \(p(x)=c_{0}+c_{1}x+\cdots +c_{n-1}x^{n-1}+x^{n}\), whose transpose companion matrix \(C(p)^{T}\) generates the sequence:

\[{\begin{bmatrix}a_{k+1}\\a_{k+2}\\\vdots \\a_{k+n-1}\\a_{k+n}\end{bmatrix}}={\begin{bmatrix}0&1&0&\cdots &0\\0&0&1&\cdots &0\\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\cdots &1\\-c_{0}&-c_{1}&-c_{2}&\cdots &-c_{n-1}\end{bmatrix}}{\begin{bmatrix}a_{k}\\a_{k+1}\\\vdots \\a_{k+n-2}\\a_{k+n-1}\end{bmatrix}}.\]

The vector \(v=(1,\lambda ,\lambda ^{2},\ldots ,\lambda ^{n-1})\) is an eigenvector of this matrix, where the eigenvalue \(\lambda \) is a root of \(p(x)\). Setting the initial values of the sequence equal to this vector produces a geometric sequence \(a_{k}=\lambda ^{k}\) which satisfies the recurrence. In the case of <i>n</i> distinct eigenvalues, an arbitrary solution \(a_{k}\) can be written as a linear combination of such geometric solutions, and the eigenvalues of largest complex norm give an <a href="/facts/Asymptotic_approximation/kt87qLpu">asymptotic approximation</a>.
</p>
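<p>The matrix recurrence can be tried out directly. The sketch below is an illustration in plain Python, with helper names chosen here rather than taken from any library. It builds the transpose companion matrix from the coefficients \(c_{0},\ldots ,c_{n-1}\) and iterates it on a state vector of <i>n</i> consecutive terms. The two examples are the Fibonacci recurrence, with \(p(x)=x^{2}-x-1\), and a geometric solution for \(p(x)=x^{2}-5x+6\), whose roots are 2 and 3.</p>

```python
def companion_transpose(c):
    """C(p)^T: ones on the superdiagonal, last row (-c[0], ..., -c[n-1])."""
    n = len(c)
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    rows.append([-ci for ci in c])
    return rows

def step(Ct, state):
    """Map (a_k, ..., a_{k+n-1}) to (a_{k+1}, ..., a_{k+n})."""
    return [sum(row[j] * state[j] for j in range(len(state))) for row in Ct]

def sequence(c, initial, length):
    """First `length` terms of the recurrence with the given initial values."""
    Ct = companion_transpose(c)
    state = list(initial)
    terms = []
    for _ in range(length):
        terms.append(state[0])
        state = step(Ct, state)
    return terms

# Fibonacci: a_{k+2} = a_k + a_{k+1}, so c_0 = c_1 = -1.
fib = sequence([-1, -1], [0, 1], 10)

# a_{k+2} = -6 a_k + 5 a_{k+1}, so c_0 = 6, c_1 = -5; lambda = 2 is a root of p.
# Starting from the eigenvector (1, lambda) yields the geometric sequence 2^k.
geometric = sequence([6, -5], [1, 2], 6)
```

<p>Starting from the eigenvector \((1,\lambda )\) keeps the state an eigenvector at every step, which is why the second example produces pure powers of 2.</p>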
<h2 id="from-linear-ode-to-first-order-linear-ode-system">From linear ODE to first-order linear ODE system</h2>
<p>Similarly to the above case of linear recursions, consider a homogeneous <a href="/facts/Linear_ODE/nLEbIIVf">linear ODE</a> of order <i>n</i> for the scalar function \(y=y(t)\):

\[y^{(n)}+c_{n-1}y^{(n-1)}+\dots +c_{1}y^{(1)}+c_{0}y=0.\]

This can be equivalently described as a coupled system of homogeneous linear ODEs of order 1 for the vector function \(z(t)=(y(t),y'(t),\ldots ,y^{(n-1)}(t))\):

\[z'=C(p)^{T}z\]

where \(C(p)^{T}\) is the transpose companion matrix for the characteristic polynomial

\[p(x)=x^{n}+c_{n-1}x^{n-1}+\cdots +c_{1}x+c_{0}.\]

Here the coefficients \(c_{i}=c_{i}(t)\) may also be functions of <i>t</i>, not just constants.
</p><p>If \(C(p)^{T}\) is diagonalizable, then a diagonalizing change of basis will transform this into a decoupled system equivalent to one scalar homogeneous first-order linear ODE in each coordinate.
</p><p>An inhomogeneous equation

\[y^{(n)}+c_{n-1}y^{(n-1)}+\dots +c_{1}y^{(1)}+c_{0}y=f(t)\]

is equivalent to the system:

\[z'=C(p)^{T}z+F(t)\]

with the inhomogeneity term \(F(t)=(0,\ldots ,0,f(t))\).
</p><p>Again, a diagonalizing change of basis will transform this into a decoupled system of scalar inhomogeneous first-order linear ODEs.
</p>
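<p>To illustrate the reduction to a first-order system, the sketch below treats the constant-coefficient example \(y''+3y'+2y=0\) with \(y(0)=1\), \(y'(0)=0\), whose exact solution is \(y(t)=2e^{-t}-e^{-2t}\). It integrates \(z'=C(p)^{T}z\) with the classical fourth-order Runge–Kutta method. The choice of integrator, the step count, and the helper names are assumptions made for this illustration only.</p>

```python
import math

def companion_transpose(c):
    """C(p)^T: ones on the superdiagonal, last row (-c[0], ..., -c[n-1])."""
    n = len(c)
    rows = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    rows.append([-float(ci) for ci in c])
    return rows

def rhs(Ct, z):
    """Right-hand side of z' = C(p)^T z."""
    return [sum(row[j] * z[j] for j in range(len(z))) for row in Ct]

def rk4(Ct, z0, t_end, steps):
    """Integrate z' = C(p)^T z from t = 0 to t_end with fixed-step RK4."""
    h = t_end / steps
    z = list(z0)
    for _ in range(steps):
        k1 = rhs(Ct, z)
        k2 = rhs(Ct, [zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
        k3 = rhs(Ct, [zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
        k4 = rhs(Ct, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6.0 * (a + 2.0 * b + 2.0 * d + e)
             for zi, a, b, d, e in zip(z, k1, k2, k3, k4)]
    return z

# y'' + 3y' + 2y = 0, so c_0 = 2, c_1 = 3 and z = (y, y').
Ct = companion_transpose([2, 3])
z_num = rk4(Ct, [1.0, 0.0], 1.0, 1000)

# Exact values at t = 1 for comparison.
y_exact = 2.0 * math.exp(-1.0) - math.exp(-2.0)
dy_exact = -2.0 * math.exp(-1.0) + 2.0 * math.exp(-2.0)
```

<p>The first component of <code>z_num</code> approximates \(y(1)\) and the second approximates \(y'(1)\), so one integration of the system recovers the solution together with its derivatives up to order \(n-1\).</p>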
<h2 id="cyclic-shift-matrix">Cyclic shift matrix</h2>
<p>In the case of \(p(x)=x^{n}-1\), when the eigenvalues are the complex <a href="/facts/Root_of_unity/EqH0ho1Y">roots of unity</a>, the companion matrix and its transpose both reduce to Sylvester's cyclic <a href="/facts/Generalizations_of_Pauli_matrices/SjqqBE28">shift matrix</a>, a <a href="/facts/Circulant_matrix/71ohEKT2">circulant matrix</a>.
</p>
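<p>A quick check for \(n=4\), written as a plain Python sketch with illustrative helper names and assuming the companion matrix has ones on the subdiagonal and the negated coefficients in the last column: for \(p(x)=x^{4}-1\), the companion matrix and its transpose shift the entries of a vector cyclically in opposite directions, both are circulant, and the fourth power is the identity.</p>

```python
def companion(c):
    """C(p) for monic p(x) = c[0] + c[1]*x + ... + c[n-1]*x**(n-1) + x**n."""
    n = len(c)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1          # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -c[i]      # last column holds -c_0, ..., -c_{n-1}
    return C

def transpose(M):
    return [list(row) for row in zip(*M)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_circulant(M):
    """Each row equals the previous row rotated one place to the right."""
    return all(M[i] == M[i - 1][-1:] + M[i - 1][:-1] for i in range(1, len(M)))

n = 4
C = companion([-1] + [0] * (n - 1))      # p(x) = x^4 - 1
Ct = transpose(C)
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

v = [1, 2, 3, 4]
shifted_by_C = matvec(C, v)              # last entry moves to the front
shifted_by_Ct = matvec(Ct, v)            # first entry moves to the back

power = identity
for _ in range(n):
    power = matmul(power, C)             # after the loop, power == C^n
```

<p>Since the two matrices are permutation matrices that are transposes of each other, each is also the inverse of the other.</p>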
<h2 id="multiplication-map-on-a-simple-field-extension">Multiplication map on a simple field extension</h2>
<p>Consider a polynomial \(p(x)=x^{n}+c_{n-1}x^{n-1}+\cdots +c_{1}x+c_{0}\) with coefficients in a <a href="/facts/Field_(mathematics)/xAjAS4ko">field</a> \(F\), and suppose \(p(x)\) is <a href="/facts/Irreducible_polynomial/PTeuKm4W">irreducible</a> in the <a href="/facts/Polynomial_ring/ua3vk8Ih">polynomial ring</a> \(F[x]\). Then <a href="/facts/Simple_extension/xegn8Bpl">adjoining a root</a> \(\lambda \) of \(p(x)\) produces a <a href="/facts/Field_extension/L7yIDtyK">field extension</a> \(K=F(\lambda )\cong F[x]/(p(x))\), which is also a vector space over \(F\) with standard basis \(\{1,\lambda ,\lambda ^{2},\ldots ,\lambda ^{n-1}\}\). Then the \(F\)-linear multiplication mapping
</p>

{\displaystyle m_{\lambda }:K\to K}  defined by  {\displaystyle m_{\lambda }(\alpha )=\lambda \alpha }

<p>has an <i>n</i> × <i>n</i> matrix {\displaystyle [m_{\lambda }]} with respect to the standard basis. Since {\displaystyle m_{\lambda }(\lambda ^{i})=\lambda ^{i+1}} for <i>i</i> &lt; <i>n</i> − 1 and {\displaystyle m_{\lambda }(\lambda ^{n-1})=\lambda ^{n}=-c_{0}-\cdots -c_{n-1}\lambda ^{n-1}}, this is the companion matrix of {\displaystyle p(x)}:

{\displaystyle [m_{\lambda }]=C(p).}

Assuming this extension is <a href="/facts/Separable_extension/9KBvXmz8">separable</a> (for example if {\displaystyle F} has <a href="/facts/Characteristic_zero/D2VzcQaG">characteristic zero</a> or is a <a href="/facts/Finite_field/UkMGJo3m">finite field</a>), {\displaystyle p(x)} has distinct roots {\displaystyle \lambda _{1},\ldots ,\lambda _{n}} with {\displaystyle \lambda _{1}=\lambda }, so that

{\displaystyle p(x)=(x-\lambda _{1})\cdots (x-\lambda _{n}),}

and it has <a href="/facts/Splitting_field/m38eWUH1">splitting field</a> {\displaystyle L=F(\lambda _{1},\ldots ,\lambda _{n})}. Now {\displaystyle m_{\lambda }} is not diagonalizable over {\displaystyle F}; rather, we must <a href="/facts/Extension_of_scalars/RYGsTsYk">extend</a> it to an {\displaystyle L}-linear map on {\displaystyle L^{n}\cong L\otimes _{F}K}, a vector space over {\displaystyle L} with standard basis {\displaystyle \{1{\otimes }1,\,1{\otimes }\lambda ,\,1{\otimes }\lambda ^{2},\ldots ,1{\otimes }\lambda ^{n-1}\}}, containing vectors {\displaystyle w=(\beta _{1},\ldots ,\beta _{n})=\beta _{1}{\otimes }1+\cdots +\beta _{n}{\otimes }\lambda ^{n-1}}. The extended mapping is defined by {\displaystyle m_{\lambda }(\beta \otimes \alpha )=\beta \otimes (\lambda \alpha )}.
</p><p>The matrix {\displaystyle [m_{\lambda }]=C(p)} is unchanged, but as above, it can be diagonalized by matrices with entries in {\displaystyle L}:

{\displaystyle [m_{\lambda }]=C(p)=V^{-1}\!DV,}

for the diagonal matrix {\displaystyle D=\operatorname {diag} (\lambda _{1},\ldots ,\lambda _{n})} and the <a href="/facts/Vandermonde_matrix/8oSbUk6u">Vandermonde matrix</a> <i>V</i> corresponding to {\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in L}. The explicit formula for the eigenvectors (the scaled column vectors of the <a href="/facts/Vandermonde_matrix/8oSbUk6u">inverse Vandermonde matrix</a> {\displaystyle V^{-1}}) can be written as:

{\displaystyle {\tilde {w}}_{i}=\beta _{0i}{\otimes }1+\beta _{1i}{\otimes }\lambda +\cdots +\beta _{(n-1)i}{\otimes }\lambda ^{n-1}=\prod _{j\neq i}(1{\otimes }\lambda -\lambda _{j}{\otimes }1)}

where {\displaystyle \beta _{ij}\in L} are the coefficients of the scaled Lagrange polynomial

{\displaystyle {\frac {p(x)}{x-\lambda _{i}}}=\prod _{j\neq i}(x-\lambda _{j})=\beta _{0i}+\beta _{1i}x+\cdots +\beta _{(n-1)i}x^{n-1}.}

</p>
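<p>The diagonalization and the eigenvector formula can be illustrated numerically. The sketch below is an example under stated assumptions rather than part of the derivation: it takes {\displaystyle F=\mathbb {Q} } and {\displaystyle p(x)=x^{3}-2}, approximates the splitting field by floating-point complex numbers, and uses NumPy. The names <code>C</code>, <code>V</code>, <code>D</code> and <code>W</code> are chosen for this example only.</p>

```python
import numpy as np

# Illustrative polynomial (an assumption): p(x) = x^3 - 2, irreducible over Q
c = np.array([-2.0, 0.0, 0.0])        # coefficients c_0, c_1, c_2
n = len(c)

# Matrix of m_lambda in the basis 1, lambda, lambda^2: the companion matrix C(p)
C = np.zeros((n, n), dtype=complex)
C[1:, :-1] = np.eye(n - 1)            # lambda^i maps to lambda^(i+1)
C[:, -1] = -c                         # lambda^n = -c_0 - ... - c_{n-1} lambda^(n-1)

# Roots of p in the complex numbers, standing in for the splitting field L
omega = np.exp(2j * np.pi / 3)
roots = 2 ** (1 / 3) * omega ** np.arange(n)

V = np.vander(roots, increasing=True)  # Vandermonde matrix, rows (1, lambda_i, lambda_i^2)
D = np.diag(roots)

# Column i holds beta_{0i}, ..., beta_{(n-1)i}: the coefficients of
# prod_{j != i} (x - lambda_j), lowest degree first
W = np.column_stack(
    [np.poly(np.delete(roots, i))[::-1] for i in range(n)]
)
```

<p>Up to rounding, <code>V @ C</code> equals <code>D @ V</code>, which is the relation {\displaystyle C(p)=V^{-1}\!DV}, and <code>C @ W</code> equals <code>W @ D</code>, so each column of <code>W</code> is an eigenvector. The product <code>V @ W</code> is diagonal with entries {\displaystyle p'(\lambda _{i})}, which shows that the columns of <code>W</code> are scaled columns of {\displaystyle V^{-1}}.</p>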
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Frobenius_endomorphism/b9hkmc97">Frobenius endomorphism</a></li>
<li><a href="/facts/Cayley%E2%80%93Hamilton_theorem/pXdJXWkR">Cayley–Hamilton theorem</a></li>
<li><a href="/facts/Krylov_subspace/wtr5EXn6">Krylov subspace</a></li></ul>
<h2 id="notes">Notes</h2>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Horn, Roger A.; Johnson, Charles R. (1985). Matrix Analysis. Cambridge, UK: Cambridge University Press. pp. 146–147. ISBN 0-521-30586-1. <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
<li id="fn:2"><p>Turnbull, H. W.; Aitken, A. C. (1961). An Introduction to the Theory of Canonical Matrices. New York: Dover. p. 60. ISBN 978-0486441689. <a href="#fnref:2" class="footnote-back-ref">↩</a></p></li>
</ol>
