The two strings 0010 and 1000 may be distinguished from each other by a three-state automaton in which the transitions from the start state go to two different states, both of which are terminal in the sense that subsequent transitions from these two states always return to the same state. The state of this automaton records the first symbol of the input string. If one of the two terminal states is accepting and the other is rejecting, then the automaton will accept only one of the strings 0010 and 1000. However, these two strings cannot be distinguished by any automaton with fewer than three states.
For proving bounds on this problem, it may be assumed without loss of generality that the inputs are strings over a two-letter alphabet. For, if two strings over a larger alphabet differ then there exists a string homomorphism that maps them to binary strings of the same length that also differ. Any automaton that distinguishes the binary strings can be translated into an automaton that distinguishes the original strings, without any increase in the number of states.
It may also be assumed that the two strings have equal length. For strings of unequal length, there always exists a prime number p whose value is logarithmic in the smaller of the two input lengths, such that the two lengths are different modulo p. An automaton that counts the length of its input modulo p can be used to distinguish the two strings from each other in this case. Therefore, strings of unequal lengths can always be distinguished from each other by automata with few states.
The problem of bounding the size of an automaton that distinguishes two given strings was first formulated by Goralčík & Koubek (1986), who showed that the automaton size is always sublinear. Later, Robson (1989) proved the upper bound O(n2/5(log n)3/5) on the automaton size that may be required. This was improved by Chase (2020) to O(n1/3(log n)7).
Several special cases of the separating words problem are known to be solvable using few states:
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4
Goralčík, P.; Koubek, V. (1986), "On discerning words by automata", Automata, Languages and Programming: 13th International Colloquium, Rennes, France, July 15–19, 1986, Proceedings, Lecture Notes in Computer Science, vol. 226, Berlin: Springer-Verlag, pp. 116–122, doi:10.1007/3-540-16761-7_61, ISBN 978-3-540-16761-7, MR 0864674. 978-3-540-16761-7
Robson, J. M. (1989), "Separating strings with small automata", Information Processing Letters, 30 (4): 209–214, doi:10.1016/0020-0190(89)90215-9, MR 0986823. /wiki/Doi_(identifier)
Chase, Z. (2020), "A New Upper Bound for Separating Words", arXiv:2007.12097 [math.CO]. /wiki/ArXiv_(identifier)
Chase, Zachary (2021-06-15). "Separating words and trace reconstruction". Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing. STOC 2021. New York, NY, USA: Association for Computing Machinery. pp. 21–31. doi:10.1145/3406325.3451118. ISBN 978-1-4503-8053-9. 978-1-4503-8053-9
Shallit, Jeffrey (2014), "Open Problems in Automata Theory: An Idiosyncratic View", British Colloquium for Theoretical Computer Science (BCTCS 2014), Loughborough University (PDF). /wiki/Jeffrey_Shallit
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4
Demaine, Erik D.; Eisenstat, Sarah; Shallit, Jeffrey; Wilson, David A. (2011), "Remarks on separating words", Descriptional Complexity of Formal Systems: 13th International Workshop, DCFS 2011, Gießen/Limburg, Germany, July 25-27, 2011, Proceedings, Lecture Notes in Computer Science, vol. 6808, Heidelberg: Springer-Verlag, pp. 147–157, arXiv:1103.4513, doi:10.1007/978-3-642-22600-7_12, ISBN 978-3-642-22599-4, MR 2910373, S2CID 6959459. 978-3-642-22599-4