Maximal pair

<h2 id="example">Example</h2>
<table><tbody>
<tr><td>Index</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td><td>8</td><td>9</td><td>10</td><td>11</td><td>12</td><td>13</td><td>14</td></tr>
<tr><td>Character</td><td>x</td><td style="color:red">a</td><td style="color:red">b</td><td style="color:red">c</td><td>y</td><td style="color:blue">a</td><td style="color:blue">b</td><td style="color:blue">c</td><td>w</td><td style="color:green">a</td><td style="color:green">b</td><td style="color:green">c</td><td>y</td><td>z</td></tr>
</tbody></table>
<p>For example, in this table, the substrings at indices 2 to 4 (in red) and indices 6 to 8 (in blue) are a maximal pair, because they contain identical characters (abc), and they have different characters to the left (x at index 1 and y at index 5) and different characters to the right (y at index 5 and w at index 9). Similarly, the substrings at indices 6 to 8 (in blue) and indices 10 to 12 (in green) are a maximal pair.
</p><p>However, the substrings at indices 2 to 4 (in red) and indices 10 to 12 (in green) are not a maximal pair, as the character y follows both substrings, and so they can be extended to the right to make a longer pair.
</p>
<h2 id="formal-definition">Formal definition</h2>
<p>Formally, a maximal pair of substrings with starting positions \(p_{1}\) and \(p_{2}\) respectively, and both of length \(l\), is specified by a <a href="/facts/Tuple/BI1HIxL8">triple</a> \((p_{1},p_{2},l)\), such that, given a string \(S\) of length \(n\), \(S[p_{1}..p_{1}+l-1]=S[p_{2}..p_{2}+l-1]\) (meaning that the substrings have identical contents), but \(S[p_{1}-1]\neq S[p_{2}-1]\) (they have different characters to their left) and \(S[p_{1}+l]\neq S[p_{2}+l]\) (they also have different characters to their right; together, these two inequalities are the condition for being maximal). Thus, in the example above, \((2,6,3)\) (the red and blue substrings) and \((6,10,3)\) (the blue and green substrings) are maximal pairs, whereas \((2,10,3)\) is not a maximal pair.
</p>
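<p>The definition can be checked directly, one condition at a time. The following Python sketch is illustrative only: the names <code>char_at</code> and <code>is_maximal_pair</code> are hypothetical, positions are 1-based as in the table, and it assumes that a neighbour falling outside the string counts as a differing character, a boundary case the definition above does not spell out.</p>

```python
def char_at(s, i):
    """Return the character at 1-based position i, or None outside the string."""
    return s[i - 1] if 1 <= i <= len(s) else None


def is_maximal_pair(s, p1, p2, l):
    """Check whether the triple (p1, p2, l) is a maximal pair in s (1-based)."""
    # Reject degenerate triples and substrings that do not fit in s.
    if p1 == p2 or l < 1:
        return False
    if min(p1, p2) < 1 or max(p1, p2) + l - 1 > len(s):
        return False
    # Identical contents: S[p1..p1+l-1] == S[p2..p2+l-1].
    if s[p1 - 1:p1 - 1 + l] != s[p2 - 1:p2 - 1 + l]:
        return False
    # Assumption: a missing neighbour (None at a string boundary) counts as
    # different from any real character. Both sides cannot be missing at
    # once, because p1 != p2.
    left_differs = char_at(s, p1 - 1) != char_at(s, p2 - 1)
    right_differs = char_at(s, p1 + l) != char_at(s, p2 + l)
    return left_differs and right_differs


S = "xabcyabcwabcyz"  # the string from the example table
print(is_maximal_pair(S, 2, 6, 3))   # True
print(is_maximal_pair(S, 6, 10, 3))  # True
print(is_maximal_pair(S, 2, 10, 3))  # False: both occurrences are followed by y
```

<p>Each call costs time proportional to the length being compared, so this is useful for checking a single triple, not for enumerating all of them.</p>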
<h2 id="related-concepts-and-time-complexity">Related concepts and time complexity</h2>
<p>A maximal repeat is the string represented by a maximal pair. A supermaximal repeat is a maximal repeat never occurring as a proper substring of another maximal repeat. In the above example, abc and abcy are both maximal repeats, but only abcy is a supermaximal repeat.
</p><p>Maximal pairs, maximal repeats and supermaximal repeats can each be found in \(\Theta (n+z)\) time using a <a href="/facts/Suffix_tree/pdQdCxbe">suffix tree</a>,<a class="footnote-ref" id="fnref:1" href="#fn:1"><sup>1</sup></a> if there are \(z\) such structures.
</p>
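<p>To make the three notions concrete, the sketch below enumerates them by brute force. It is not the suffix-tree algorithm cited above and does not achieve the stated time bound; it simply tries every pair of starting positions. The function names are hypothetical, positions are 1-based as in the example, and a string boundary is assumed to count as a differing neighbour.</p>

```python
def maximal_pairs(s):
    """List all maximal pairs (p1, p2, l) with p1 < p2, using 1-based positions."""
    n = len(s)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip starts that could be extended to the left.
            if i > 0 and s[i - 1] == s[j - 1]:
                continue
            # Extend to the right for as long as the characters agree.
            l = 0
            while j + l < n and s[i + l] == s[j + l]:
                l += 1
            if l > 0:
                pairs.append((i + 1, j + 1, l))
    return pairs


def maximal_repeats(s):
    """Return the set of strings represented by some maximal pair."""
    return {s[p1 - 1:p1 - 1 + l] for p1, _, l in maximal_pairs(s)}


def supermaximal_repeats(s):
    """Return maximal repeats that are not a proper substring of another one."""
    repeats = maximal_repeats(s)
    return {
        r for r in repeats
        if not any(r != other and r in other for other in repeats)
    }


S = "xabcyabcwabcyz"  # the string from the example table
print(maximal_pairs(S))          # [(2, 6, 3), (2, 10, 4), (6, 10, 3)]
print(sorted(maximal_repeats(S)))       # ['abc', 'abcy']
print(sorted(supermaximal_repeats(S)))  # ['abcy']
```

<p>On the example string this reproduces the statement above: abc and abcy are maximal repeats, and only abcy is supermaximal.</p>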

<h2 id="external-links">External links</h2>
<ul><li><a href="https://code.google.com/p/py-rstr-max/">Project for the computation of all maximal repeats in one or more strings in Python</a>, using a <a href="/facts/Suffix_array/Q87p7bYH">suffix array</a>.</li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Gusfield, Dan (1999) [1997]. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. USA: Cambridge University Press. p. 143. ISBN 0-521-58519-8. <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
</ol>
