An S-box substitutes a small block of bits (the input of the S-box) by another block of bits (the output of the S-box). This substitution should be one-to-one, to ensure invertibility (hence decryption). In particular, the length of the output should be the same as the length of the input (the picture on the right has S-boxes with 4 input and 4 output bits), which is different from S-boxes in general that could also change the length, as in Data Encryption Standard (DES), for example. An S-box is usually not simply a permutation of the bits. Rather, in a good S-box each output bit will be affected by every input bit. More precisely, in a good S-box each output bit will be changed with 50% probability by every input bit. Since each output bit changes with the 50% probability, about half of the output bits will actually change with an input bit change (cf. Strict avalanche criterion).1
A P-box is a permutation of all the bits: it takes the outputs of all the S-boxes of one round, permutes the bits, and feeds them into the S-boxes of the next round. A good P-box has the property that the output bits of any S-box are distributed to as many S-box inputs as possible.
At each round, the round key (obtained from the key with some simple operations, for instance, using S-boxes and P-boxes) is combined using some group operation, typically XOR.
A single typical S-box or a single P-box alone does not have much cryptographic strength: an S-box could be thought of as a substitution cipher, while a P-box could be thought of as a transposition cipher. However, a well-designed SP network with several alternating rounds of S- and P-boxes already satisfies Shannon's confusion and diffusion properties:
Although a Feistel network that uses S-boxes (such as DES) is quite similar to SP networks, there are some differences that make either this or that more applicable in certain situations. For a given amount of confusion and diffusion, an SP network has more "inherent parallelism"2 and so — given a CPU with many execution units — can be computed faster than a Feistel network.3 CPUs with few execution units — such as most smart cards — cannot take advantage of this inherent parallelism. Also SP ciphers require S-boxes to be invertible (to perform decryption); Feistel inner functions have no such restriction and can be constructed as one-way functions.
Webster, A. F.; Tavares, Stafford E. (1985). "On the design of S-boxes". Advances in Cryptology – Crypto '85. Lecture Notes in Computer Science. Vol. 218. New York, NY: Springer-Verlag New York, Inc. pp. 523–534. ISBN 0-387-16463-4. 0-387-16463-4 ↩
"Principles and Performance of Cryptographic Algorithms" by Bart Preneel, Vincent Rijmen, and Antoon Bosselaers. http://www.ddj.com/184410756 ↩
"The Skein Hash Function Family" Archived 2009-01-15 at the Wayback Machine 2008 by Niels Ferguson, Stefan Lucks, Bruce Schneier, Doug Whiting, Mihir Bellare, Tadayoshi Kohno, Jon Callas, Jesse Walker page 40. http://www.schneier.com/skein1.1.pdf ↩