For example, in this table, the substrings at indices 2 to 4 (in red) and indices 6 to 8 (in blue) are a maximal pair, because they contain identical characters (abc), and they have different characters to the left (x at index 1 and y at index 5) and different characters to the right (y at index 5 and w at index 9). Similarly, the substrings at indices 6 to 8 (in blue) and indices 10 to 12 (in green) are a maximal pair.
However, the substrings at indices 2 to 4 (in red) and indices 10 to 12 (in green) are not a maximal pair, as the character y follows both substrings, and so they can be extended to the right to make a longer pair.
Formally, a maximal pair of substrings with starting positions p 1 {\displaystyle p_{1}} and p 2 {\displaystyle p_{2}} respectively, and both of length l {\displaystyle l} , is specified by a triple ( p 1 , p 2 , l ) {\displaystyle (p_{1},p_{2},l)} , such that, given a string S {\displaystyle S} of length n {\displaystyle n} , S [ p 1 . . p 1 + l − 1 ] = S [ p 2 . . p 2 + l − 1 ] {\displaystyle S[p_{1}..p_{1}+l-1]=S[p_{2}..p_{2}+l-1]} (meaning that the substrings have identical contents), but S [ p 1 − 1 ] ≠ S [ p 2 − 1 ] {\displaystyle S[p_{1}-1]\neq S[p_{2}-1]} (they have different characters to their left) and S [ p 1 + l ] ≠ S [ p 2 + l ] {\displaystyle S[p_{1}+l]\neq S[p_{2}+l]} (they also have different characters to their right; together, these two inequalities are the condition for being maximal). Thus, in the example above, the maximal pairs are ( 2 , 6 , 3 ) {\displaystyle (2,6,3)} (the red and blue substrings) and ( 6 , 10 , 3 ) {\displaystyle (6,10,3)} (the green and blue substrings), and ( 2 , 10 , 3 ) {\displaystyle (2,10,3)} is not a maximal pair.
A maximal repeat is the string represented by a maximal pair. A supermaximal repeat is a maximal repeat never occurring as a proper substring of another maximal repeat. In the above example, abc and abcy are both maximal repeats, but only abcy is a supermaximal repeat.
Maximal pairs, maximal repeats and supermaximal repeats can each be found in Θ ( n + z ) {\displaystyle \Theta (n+z)} time using a suffix tree,1 if there are z {\displaystyle z} such structures.
Gusfield, Dan (1999) [1997]. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. USA: Cambridge University Press. p. 143. ISBN 0-521-58519-8. 0-521-58519-8 ↩