Bartels–Stewart algorithm

<h2 id="the-algorithm">The algorithm</h2>
<p>Let \(A \in \mathbb{R}^{m\times m}\), \(B \in \mathbb{R}^{n\times n}\), and \(X, C \in \mathbb{R}^{m\times n}\), and assume that \(A\) and \(B\) have no eigenvalue in common. Then the matrix equation \(AX - XB = C\) has a unique solution. The Bartels–Stewart algorithm computes \(X\) by applying the following steps:<a class="footnote-ref" id="fnref:3" href="#fn:3"><sup>3</sup></a> 
</p><p>1. Compute the <a href="/facts/Schur_decomposition/EYNrqHmh">real Schur decompositions</a>
</p>

\[R = U^{T}AU,\]

\[S = V^{T}B^{T}V.\]

<p>The matrices \(R\) and \(S\) are block upper triangular, with diagonal blocks of size \(1\times 1\) or \(2\times 2\).
</p><p>2. Set \(F = U^{T}CV.\)

</p><p>3. Solve the simplified system \(RY - YS^{T} = F\), where \(Y = U^{T}XV\). This can be done using back substitution on the blocks, working from the last column of \(Y\) to the first. Specifically, if \(s_{k,k-1} = 0\), then
</p>

\[(R - s_{kk}I)y_{k} = f_{k} + \sum_{j=k+1}^{n} s_{kj}y_{j},\]

<p>where \(y_{k}\) is the \(k\)th column of \(Y\) and \(f_{k}\) is the \(k\)th column of \(F\). When \(s_{k,k-1} \neq 0\), the columns \([y_{k-1} \mid y_{k}]\) should be concatenated and solved for simultaneously. 
</p><p>4. Set \(X = UYV^{T}.\)

</p>
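<p>The four steps above can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name <code>bartels_stewart</code> is hypothetical, and for brevity each column system is solved with a general dense solver instead of exploiting the block triangular structure of \(R\), so the sketch does not attain the flop counts of a tuned implementation.</p>

```python
import numpy as np
from scipy.linalg import schur


def bartels_stewart(A, B, C):
    """Solve A X - X B = C via real Schur forms (illustrative sketch)."""
    m, n = C.shape
    # Step 1: real Schur decompositions A = U R U^T and B^T = V S V^T.
    R, U = schur(A, output="real")
    S, V = schur(B.T, output="real")
    # Step 2: transform the right-hand side.
    F = U.T @ C @ V
    # Step 3: solve R Y - Y S^T = F, from the last column of Y to the first.
    Y = np.zeros((m, n))
    I = np.eye(m)
    k = n - 1
    while k >= 0:
        if k > 0 and S[k, k - 1] != 0.0:
            # 2x2 diagonal block of S: columns k-1 and k are coupled.
            block = S[k - 1:k + 1, k - 1:k + 1]
            G = F[:, k - 1:k + 1] + Y[:, k + 1:] @ S[k - 1:k + 1, k + 1:].T
            M = np.kron(np.eye(2), R) - np.kron(block, I)
            z = np.linalg.solve(M, G.T.reshape(-1))
            Y[:, k - 1] = z[:m]
            Y[:, k] = z[m:]
            k -= 2
        else:
            # 1x1 diagonal block: (R - s_kk I) y_k = f_k + sum_{j>k} s_kj y_j.
            g = F[:, k] + Y[:, k + 1:] @ S[k, k + 1:]
            Y[:, k] = np.linalg.solve(R - S[k, k] * I, g)
            k -= 1
    # Step 4: transform back.
    return U @ Y @ V.T


# Usage: the shifts keep the spectra of A and B well separated.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 10.0 * np.eye(5)
B = rng.standard_normal((4, 4)) - 10.0 * np.eye(4)
C = rng.standard_normal((5, 4))
X = bartels_stewart(A, B, C)
residual = np.linalg.norm(A @ X - X @ B - C)
```

<p>The 2×2 case stacks the two coupled columns into one vector of length \(2m\), which is one straightforward way to solve for them simultaneously.</p>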
<h3>Computational cost</h3>
<p>Using the <a href="/facts/QR_algorithm/7LfgtT40">QR algorithm</a>, the <a href="/facts/Schur_decomposition/EYNrqHmh">real Schur decompositions</a> in step 1 require approximately \(10(m^{3}+n^{3})\) flops, so that the overall computational cost is \(10(m^{3}+n^{3}) + 2.5(mn^{2}+nm^{2})\) flops.<a class="footnote-ref" id="fnref:4" href="#fn:4"><sup>4</sup></a> 
</p>
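<p>To get a feel for these estimates, the short Python sketch below evaluates the formula for given dimensions. The function names are hypothetical and the figures are only the approximate counts quoted above, not measurements.</p>

```python
def bartels_stewart_flops(m, n):
    """Approximate flop count quoted above for the Bartels-Stewart algorithm."""
    schur_cost = 10 * (m**3 + n**3)                # step 1
    remaining_cost = 2.5 * (m * n**2 + n * m**2)   # remaining steps
    return schur_cost + remaining_cost


def schur_share(m, n):
    """Fraction of the estimated cost spent on the Schur decompositions."""
    return 10 * (m**3 + n**3) / bartels_stewart_flops(m, n)


total = bartels_stewart_flops(100, 100)
share = schur_share(100, 100)
```

<p>For \(m = n\) the estimate reduces to \(25m^{3}\), of which \(20m^{3}\) comes from step 1, so by this estimate the Schur decompositions account for four fifths of the work.</p>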
<h3>Simplifications and special cases</h3>
<p>In the special case where \(B = -A^{T}\) and \(C\) is symmetric, the solution \(X\) will also be symmetric. This symmetry can be exploited so that \(Y\) is found more efficiently in step 3 of the algorithm.<a class="footnote-ref" id="fnref:5" href="#fn:5"><sup>5</sup></a>
</p>
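<p>The symmetry follows from uniqueness: with \(B = -A^{T}\) the equation reads \(AX + XA^{T} = C\), and transposing it shows that \(X^{T}\) is also a solution whenever \(C = C^{T}\). The Python sketch below checks this numerically. It relies on SciPy's <code>solve_sylvester</code>, which is not mentioned in the text above and solves \(AX + XB = Q\); the test matrices are arbitrary choices made for illustration.</p>

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
n = 4
# The shift keeps every eigenvalue of A in the left half-plane,
# so A and -A^T share no eigenvalue.
A = rng.standard_normal((n, n)) - 10.0 * np.eye(n)
M = rng.standard_normal((n, n))
C = M + M.T  # symmetric right-hand side

# A X - X B = C with B = -A^T is the same as A X + X A^T = C.
X = solve_sylvester(A, A.T, C)

symmetry_error = np.linalg.norm(X - X.T)
residual = np.linalg.norm(A @ X + X @ A.T - C)
```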
<h2 id="the-hessenbergschur-algorithm">The Hessenberg–Schur algorithm</h2>
<p>The Hessenberg–Schur algorithm<a class="footnote-ref" id="fnref:6" href="#fn:6"><sup>6</sup></a> replaces the decomposition \(R = U^{T}AU\) in step 1 with the decomposition \(H = Q^{T}AQ\), where \(H\) is an <a href="/facts/Hessenberg_matrix/vy7p7s3Y">upper-Hessenberg matrix</a>. This leads to a system of the form \(HY - YS^{T} = F\), with \(F = Q^{T}CV\) and \(Y = Q^{T}XV\), that can be solved one column of \(Y\) at a time by substitution. The advantage of this approach is that \(H = Q^{T}AQ\) can be found using <a href="/facts/Householder_transformation/1L87qsqE">Householder reflections</a> at a cost of \((5/3)m^{3}\) flops, compared to the \(10m^{3}\) flops required to compute the real Schur decomposition of \(A\). 
</p>
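<p>A minimal Python sketch of this variant follows, assuming SciPy's <code>hessenberg</code> and <code>schur</code> routines. The function name <code>hessenberg_schur</code> is hypothetical, and the sketch solves each Hessenberg system with a general dense solver, whereas a practical implementation would exploit the Hessenberg structure.</p>

```python
import numpy as np
from scipy.linalg import hessenberg, schur


def hessenberg_schur(A, B, C):
    """Solve A X - X B = C with A reduced only to Hessenberg form (sketch)."""
    m, n = C.shape
    H, Q = hessenberg(A, calc_q=True)   # A = Q H Q^T, H upper Hessenberg
    S, V = schur(B.T, output="real")    # B^T = V S V^T
    F = Q.T @ C @ V
    Y = np.zeros((m, n))
    I = np.eye(m)
    k = n - 1
    while k >= 0:                       # last column of Y first
        if k > 0 and S[k, k - 1] != 0.0:
            # 2x2 diagonal block of S: solve for columns k-1 and k together.
            block = S[k - 1:k + 1, k - 1:k + 1]
            G = F[:, k - 1:k + 1] + Y[:, k + 1:] @ S[k - 1:k + 1, k + 1:].T
            M = np.kron(np.eye(2), H) - np.kron(block, I)
            z = np.linalg.solve(M, G.T.reshape(-1))
            Y[:, k - 1] = z[:m]
            Y[:, k] = z[m:]
            k -= 2
        else:
            # 1x1 diagonal block: a single shifted Hessenberg system.
            g = F[:, k] + Y[:, k + 1:] @ S[k, k + 1:]
            Y[:, k] = np.linalg.solve(H - S[k, k] * I, g)
            k -= 1
    return Q @ Y @ V.T


# Usage: the shifts keep the spectra of A and B well separated.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6)) + 10.0 * np.eye(6)
B = rng.standard_normal((3, 3)) - 10.0 * np.eye(3)
C = rng.standard_normal((6, 3))
X = hessenberg_schur(A, B, C)
residual = np.linalg.norm(A @ X - X @ B - C)
```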
<h2 id="software-and-implementation">Software and implementation</h2>
<p>The subroutines required for the Hessenberg–Schur variant of the Bartels–Stewart algorithm are implemented in the SLICOT library. These are used in the MATLAB Control System Toolbox.
</p>
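<p>As a point of comparison outside the libraries named above, SciPy provides <code>scipy.linalg.solve_sylvester</code>, whose documentation describes it as using the Bartels–Stewart algorithm. It solves \(AX + XB = Q\), so the equation \(AX - XB = C\) used in this article requires passing \(-B\). The matrices below are arbitrary illustrative choices.</p>

```python
import numpy as np
from scipy.linalg import solve_sylvester

A = np.array([[4.0, 1.0],
              [0.0, 3.0]])    # eigenvalues 4 and 3
B = np.array([[-1.0, 2.0],
              [0.0, -2.0]])   # eigenvalues -1 and -2
C = np.array([[1.0, 0.0],
              [2.0, 1.0]])

# SciPy's convention is A X + X B = Q, hence the sign flip on B.
X = solve_sylvester(A, -B, C)
residual = np.linalg.norm(A @ X - X @ B - C)
```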
<h2 id="alternative-approaches">Alternative approaches</h2>
<p>For large systems, the \(\mathcal{O}(m^{3}+n^{3})\) cost of the Bartels–Stewart algorithm can be prohibitive. When \(A\) and \(B\) are sparse or structured, so that linear solves and matrix–vector multiplications involving them are efficient, iterative algorithms can potentially perform better. These include projection-based methods, which use <a href="/facts/Krylov_subspace_method/uKMAJhEh">Krylov subspace</a> iterations, methods based on the <a href="/facts/Alternating_direction_implicit_method/uV9e0ccz">alternating direction implicit</a> (ADI) iteration, and hybridizations that involve both projection and ADI.<a class="footnote-ref" id="fnref:7" href="#fn:7"><sup>7</sup></a> Iterative methods can also be used to directly construct <a href="/facts/Low-rank_approximation/qj1Wq2FE">low-rank approximations</a> to \(X\) when solving \(AX - XB = C\). 
</p>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation AX + XB = C [F4]". Communications of the ACM. 15 (9): 820–826. doi:10.1145/361573.361582. ISSN 0001-0782. <a href="https://doi.org/10.1145%2F361573.361582" target="_blank">https://doi.org/10.1145%2F361573.361582</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
<li id="fn:2"><p>Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286. <a href="https://doi.org/10.1109%2FTAC.1979.1102170" target="_blank">https://doi.org/10.1109%2FTAC.1979.1102170</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></p></li>
<li id="fn:3"><p>Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286. <a href="https://doi.org/10.1109%2FTAC.1979.1102170" target="_blank">https://doi.org/10.1109%2FTAC.1979.1102170</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></p></li>
<li id="fn:4"><p>Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286. <a href="https://doi.org/10.1109%2FTAC.1979.1102170" target="_blank">https://doi.org/10.1109%2FTAC.1979.1102170</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></p></li>
<li id="fn:5"><p>Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation AX + XB = C [F4]". Communications of the ACM. 15 (9): 820–826. doi:10.1145/361573.361582. ISSN 0001-0782. <a href="https://doi.org/10.1145%2F361573.361582" target="_blank">https://doi.org/10.1145%2F361573.361582</a> <a href="#fnref:5" class="footnote-back-ref">↩</a></p></li>
<li id="fn:6"><p>Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286. <a href="https://doi.org/10.1109%2FTAC.1979.1102170" target="_blank">https://doi.org/10.1109%2FTAC.1979.1102170</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></p></li>
<li id="fn:7"><p>Simoncini, V. (2016). "Computational Methods for Linear Matrix Equations". SIAM Review. 58 (3): 377–441. doi:10.1137/130912839. hdl:11585/586011. ISSN 0036-1445. S2CID 17271167. <a href="https://doi.org/10.1137%2F130912839" target="_blank">https://doi.org/10.1137%2F130912839</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></p></li>
</ol>
