In coding theory, decoding is the process of translating received messages into codewords of a given code. There are many common methods of mapping messages to codewords. These methods are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.
Notation
$C \subset \mathbb{F}_2^n$ is a binary code of length $n$; $x, y$ are elements of $\mathbb{F}_2^n$; and $d(x,y)$ is the Hamming distance between those elements.
Ideal observer decoding
Given a received message $x \in \mathbb{F}_2^n$, ideal observer decoding generates the codeword $y \in C$ that maximises
$$\mathbb{P}(y \text{ sent} \mid x \text{ received}),$$
that is, the codeword $y$ that is most likely to have been sent given that the message $x$ was received.
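For illustration, the following minimal sketch (the two-word code, the non-uniform prior on codewords, and the crossover probability are assumptions chosen purely for this example) carries out ideal observer decoding over a binary symmetric channel by brute force, using Bayes' theorem:

```python
def hamming(a, b):
    """Number of positions in which two equal-length binary tuples differ."""
    return sum(ai != bi for ai, bi in zip(a, b))

def ideal_observer_decode(x, code, prior, p):
    """Pick the codeword y maximising P(y sent | x received) on a BSC.

    By Bayes' theorem, P(y | x) is proportional to P(x | y) * P(y), where
    P(x | y) = (1 - p)**(n - d) * p**d and d = hamming(x, y).
    """
    n = len(x)
    scores = {y: (1 - p) ** (n - hamming(x, y)) * p ** hamming(x, y) * prior[y]
              for y in code}
    total = sum(scores.values())                 # equals P(x received)
    posterior = {y: s / total for y, s in scores.items()}
    return max(posterior, key=posterior.get), posterior

# Hypothetical two-word code with a non-uniform prior (illustration only).
code = [(0, 0, 0), (1, 1, 1)]
prior = {(0, 0, 0): 0.9, (1, 1, 1): 0.1}
best, posterior = ideal_observer_decode((1, 1, 0), code, prior, p=0.2)
print(best, posterior)                           # -> (0, 0, 0) wins despite being farther
```

Here the farther codeword can still win because its prior probability of being sent is much higher; this is exactly what separates ideal observer decoding from maximum likelihood decoding when the prior is not uniform.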
Decoding conventions
The decoded codeword is not always uniquely determined: there may be more than one codeword with an equal likelihood of mutating into the received message. In such a case, the sender and receiver(s) must agree ahead of time on a decoding convention. Popular conventions include:
- Request that the codeword be resent – automatic repeat-request.
- Choose a codeword at random from the set of most likely codewords.
- If another code follows, mark the ambiguous bits of the codeword as erasures and hope that the outer code disambiguates them.
- Report a decoding failure to the system.
Maximum likelihood decoding
Further information: Maximum likelihood
Given a received vector $x \in \mathbb{F}_2^n$, maximum likelihood decoding picks a codeword $y \in C$ that maximises
$$\mathbb{P}(x \text{ received} \mid y \text{ sent}),$$
that is, the codeword $y$ that maximises the probability that $x$ was received, given that $y$ was sent. If all codewords are equally likely to be sent then this scheme is equivalent to ideal observer decoding. In fact, by Bayes' theorem,
$$\mathbb{P}(x \text{ received} \mid y \text{ sent}) = \frac{\mathbb{P}(x \text{ received},\, y \text{ sent})}{\mathbb{P}(y \text{ sent})} = \mathbb{P}(y \text{ sent} \mid x \text{ received}) \cdot \frac{\mathbb{P}(x \text{ received})}{\mathbb{P}(y \text{ sent})}.$$
Here the received word $x$ is fixed, so $\mathbb{P}(x \text{ received})$ is a constant, and $\mathbb{P}(y \text{ sent})$ is also constant because all codewords are equally likely to be sent. Therefore, $\mathbb{P}(x \text{ received} \mid y \text{ sent})$ is maximised as a function of the variable $y$ precisely when $\mathbb{P}(y \text{ sent} \mid x \text{ received})$ is maximised, and the claim follows.
As with ideal observer decoding, a convention must be agreed to for non-unique decoding.
The maximum likelihood decoding problem can also be modeled as an integer programming problem.[1]
The maximum likelihood decoding algorithm is an instance of the "marginalize a product function" problem which is solved by applying the generalized distributive law.[2]
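A brute-force sketch of maximum likelihood decoding over a binary symmetric channel follows; the length-5 repetition code and the crossover probability are assumptions made only for the example, and for general linear codes exact ML decoding is computationally hard.

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary tuples."""
    return sum(ai != bi for ai, bi in zip(a, b))

def ml_decode_bsc(x, code, p):
    """Brute-force ML decoding on a binary symmetric channel with crossover p.

    Picks the codeword y maximising P(x received | y sent) = (1-p)**(n-d) * p**d.
    """
    n = len(x)
    def likelihood(y):
        d = hamming(x, y)
        return (1 - p) ** (n - d) * p ** d
    return max(code, key=likelihood)

# Toy example: the length-5 repetition code (assumed purely for illustration).
code = [(0,) * 5, (1,) * 5]
print(ml_decode_bsc((1, 0, 1, 1, 0), code, p=0.1))   # -> (1, 1, 1, 1, 1)
```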
Minimum distance decoding
Given a received vector $x \in \mathbb{F}_2^n$, minimum distance decoding picks a codeword $y \in C$ that minimises the Hamming distance
$$d(x,y) = \#\{i : x_i \neq y_i\},$$
i.e. it chooses the codeword $y$ that is as close as possible to $x$.
Note that if the probability of error $p$ on a discrete memoryless channel is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if
$$d(x,y) = d,$$
then
$$\mathbb{P}(x \text{ received} \mid y \text{ sent}) = (1-p)^{n-d} \cdot p^{d} = (1-p)^{n} \cdot \left(\frac{p}{1-p}\right)^{d},$$
which (since $p$ is less than one half) is maximised by minimising $d$.
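The sketch below (reusing the toy repetition code assumed earlier) decodes by minimising the Hamming distance and checks numerically that for $p < \tfrac{1}{2}$ the likelihood $(1-p)^n \left(\frac{p}{1-p}\right)^d$ is strictly decreasing in $d$, so the nearest codeword is also the maximum likelihood choice.

```python
def hamming(a, b):
    return sum(ai != bi for ai, bi in zip(a, b))

def min_distance_decode(x, code):
    """Nearest-neighbour decoding: return the codeword closest to x."""
    return min(code, key=lambda y: hamming(x, y))

def likelihood(d, n, p):
    """P(x received | y sent) on a BSC when d(x, y) = d."""
    return (1 - p) ** n * (p / (1 - p)) ** d

n, p = 5, 0.1
# For p < 1/2 the factor p/(1-p) is below 1, so the likelihood falls as d grows.
assert all(likelihood(d, n, p) > likelihood(d + 1, n, p) for d in range(n))

code = [(0,) * 5, (1,) * 5]
print(min_distance_decode((1, 0, 1, 1, 0), code))    # -> (1, 1, 1, 1, 1), the ML answer too
```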
Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:
- The probability p {\displaystyle p} that an error occurs is independent of the position of the symbol.
- Errors are independent events – an error at one position in the message does not affect other positions.
These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.
As with other decoding methods, a convention must be agreed to for non-unique decoding.
Syndrome decoding
Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel, i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. This is allowed by the linearity of the code.[3]
Suppose that $C \subset \mathbb{F}_2^n$ is a linear code of length $n$, dimension $k$, and minimum distance $d$ with parity-check matrix $H$. Then clearly $C$ is capable of correcting up to
$$t = \left\lfloor \frac{d-1}{2} \right\rfloor$$
errors made by the channel (since if no more than $t$ errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).
Now suppose that a codeword $x \in \mathbb{F}_2^n$ is sent over the channel and the error pattern $e \in \mathbb{F}_2^n$ occurs. Then $z = x + e$ is received. Ordinary minimum distance decoding would look up the vector $z$ in a table of size $|C|$ for the nearest match, i.e. an element (not necessarily unique) $c \in C$ with
$$d(c,z) \leq d(y,z)$$
for all $y \in C$. Syndrome decoding takes advantage of the property of the parity-check matrix that
$$Hx = 0$$
for all $x \in C$. The syndrome of the received word $z = x + e$ is defined to be
$$Hz = H(x+e) = Hx + He = 0 + He = He.$$
To perform ML decoding on a binary symmetric channel, one has to look up a precomputed table of size $2^{n-k}$, mapping $He$ to $e$.
Note that this is already of significantly less complexity than that of standard array decoding.
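A minimal sketch of syndrome decoding follows, using the [7,4] Hamming code as an assumed example (the construction of $H$ below, with the binary expansions of 1 to 7 as its columns, is one standard choice and is not taken from the text): the table maps each syndrome $He$ to the minimum-weight error pattern $e$ producing it, and decoding is a single lookup followed by XORing the error out of the received word.

```python
from itertools import combinations

n, k, t = 7, 4, 1     # the [7,4] Hamming code corrects t = 1 error

# Parity-check matrix H: its columns are the binary expansions of 1..7.
H = [[(j >> bit) & 1 for j in range(1, n + 1)] for bit in range(n - k - 1, -1, -1)]

def syndrome(v):
    """Hv over GF(2), returned as a tuple of n - k bits."""
    return tuple(sum(h * vj for h, vj in zip(row, v)) % 2 for row in H)

# Reduced lookup table: syndrome -> error pattern of weight <= t.
# It has sum_{i=0}^{t} C(n, i) = 8 entries; the Hamming code is perfect,
# so here this coincides with the full set of 2**(n - k) = 8 syndromes.
table = {}
for weight in range(t + 1):
    for positions in combinations(range(n), weight):
        e = tuple(1 if i in positions else 0 for i in range(n))
        table[syndrome(e)] = e

def syndrome_decode(z):
    """Look up the error pattern for Hz and remove it from z."""
    e = table[syndrome(z)]
    return tuple(zi ^ ei for zi, ei in zip(z, e))

c = (0, 0, 1, 0, 1, 1, 0)                # a codeword: Hc = 0
z = list(c)
z[6] ^= 1                                # the channel flips one bit
print(syndrome_decode(tuple(z)) == c)    # -> True
```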
However, under the assumption that no more than $t$ errors were made during transmission, the receiver can look up the value $He$ in a further reduced table of size
$$\sum_{i=0}^{t} \binom{n}{i}.$$
List decoding
Main article: List decoding
Information set decoding
This is a family of Las Vegas probabilistic methods, all based on the observation that it is easier to guess enough error-free positions than it is to guess all the error positions.
The simplest form is due to Prange: Let $G$ be the $k \times n$ generator matrix of $C$ used for encoding. Select $k$ columns of $G$ at random, and denote by $G'$ the corresponding $k \times k$ submatrix of $G$. With reasonable probability $G'$ will have full rank, which means that if we let $c'$ be the sub-vector of the corresponding positions of any codeword $c = mG$ of $C$ for a message $m$, we can recover $m$ as $m = c'G'^{-1}$. Hence, if we were lucky enough that these $k$ positions of the received word $y$ contained no errors, and hence equalled the positions of the sent codeword, then we may decode.
If $t$ errors occurred, the probability of such a fortunate selection of columns is given by $\binom{n-t}{k} / \binom{n}{k}$.
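A rough sketch of Prange's method is given below (illustrative only; the generator matrix is an assumed systematic [7,4] Hamming code generator, and the accept test that re-encodes the candidate message and checks its distance to $y$ is a common way to verify the guess, not spelled out above).

```python
import random

def gf2_solve(M, b):
    """Solve M x = b over GF(2) by Gauss-Jordan elimination.

    M is a k x k bit matrix (list of row lists), b a length-k bit list.
    Returns x as a bit list, or None if M is singular.
    """
    k = len(M)
    A = [row[:] + [bi] for row, bi in zip(M, b)]          # augmented matrix [M | b]
    for col in range(k):
        pivot = next((r for r in range(col, k) if A[r][col]), None)
        if pivot is None:
            return None                                    # not full rank
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col and A[r][col]:
                A[r] = [x ^ y for x, y in zip(A[r], A[col])]
    return [A[r][k] for r in range(k)]

def encode(m, G):
    """c = m G over GF(2)."""
    return [sum(mi * G[i][j] for i, mi in enumerate(m)) % 2 for j in range(len(G[0]))]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def prange_decode(y, G, t, max_tries=1000):
    """Information set decoding in Prange's simplest form (sketch).

    Repeatedly guess k coordinates, solve m G' = y' on those coordinates,
    and accept m if re-encoding lands within distance t of y.
    """
    k, n = len(G), len(G[0])
    for _ in range(max_tries):
        cols = random.sample(range(n), k)
        Mt = [[G[i][j] for i in range(k)] for j in cols]   # transpose of G'
        m = gf2_solve(Mt, [y[j] for j in cols])
        if m is not None and hamming(encode(m, G), y) <= t:
            return m
    return None

# Assumed systematic generator matrix of the [7,4] Hamming code (d = 3, t = 1).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
m = [1, 0, 1, 1]
y = encode(m, G)
y[5] ^= 1                                   # one transmission error
print(prange_decode(y, G, t=1))             # -> [1, 0, 1, 1] with high probability
```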
This method has been improved in various ways, e.g. by Stern[4] and by Canteaut and Sendrier.[5]
Partial response maximum likelihood
Main article: PRML
Partial response maximum likelihood (PRML) is a method for converting the weak analog signal from the head of a magnetic disk or tape drive into a digital signal.
Viterbi decoder
Main article: Viterbi decoder
A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code. The Hamming distance is used as a metric for hard decision Viterbi decoders. The squared Euclidean distance is used as a metric for soft decision decoders.
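A minimal hard-decision Viterbi sketch follows, using Hamming distance as the branch metric; the rate-1/2 convolutional code with octal generators 7 and 5, the two zero tail bits, and the full traceback at the end are textbook assumptions, not details taken from the text above.

```python
def conv_encode(bits):
    """Rate-1/2 convolutional encoder, generators 7 and 5 (octal), 2 tail zeros."""
    s1 = s2 = 0                            # the two previous input bits
    out = []
    for b in list(bits) + [0, 0]:
        out.append(b ^ s1 ^ s2)            # generator 111
        out.append(b ^ s2)                 # generator 101
        s1, s2 = b, s1
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding with the Hamming distance metric."""
    INF = float("inf")
    metrics = {0: 0, 1: INF, 2: INF, 3: INF}   # state (s1, s2) packed as s1*2 + s2
    paths = {0: []}
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metrics = {s: INF for s in range(4)}
        new_paths = {}
        for s, metric in metrics.items():
            if metric == INF:
                continue
            s1, s2 = s >> 1, s & 1
            for b in (0, 1):
                branch = [b ^ s1 ^ s2, b ^ s2]                 # expected output bits
                cost = metric + sum(e != x for e, x in zip(branch, r))
                nxt = (b << 1) | s1
                if cost < new_metrics[nxt]:                    # keep the survivor path
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[s] + [b]
        metrics, paths = new_metrics, new_paths
    return paths[0][:-2]                   # tail bits force state 0; drop them

msg = [1, 0, 1, 1, 0, 0, 1]
coded = conv_encode(msg)
coded[4] ^= 1                              # a single channel error
print(viterbi_decode(coded) == msg)        # -> True
```

A soft-decision decoder would replace the Hamming branch metric above with the squared Euclidean distance to the unquantised channel outputs.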
Optimal decision decoding algorithm (ODDA)
An optimal decision decoding algorithm (ODDA) has been proposed for an asymmetric TWRC system.[6]
Further reading
- Hill, Raymond (1986). A first course in coding theory. Oxford Applied Mathematics and Computing Science Series. Oxford University Press. ISBN 978-0-19-853803-5.
- Pless, Vera (1982). Introduction to the theory of error-correcting codes. Wiley-Interscience Series in Discrete Mathematics. John Wiley & Sons. ISBN 978-0-471-08684-0.
- van Lint, Jacobus H. (1992). Introduction to Coding Theory. Graduate Texts in Mathematics (GTM). Vol. 86 (2 ed.). Springer-Verlag. ISBN 978-3-540-54894-2.
References
1. Feldman, Jon; Wainwright, Martin J.; Karger, David R. (March 2005). "Using Linear Programming to Decode Binary Linear Codes". IEEE Transactions on Information Theory. 51 (3): 954–972. CiteSeerX 10.1.1.111.6585. doi:10.1109/TIT.2004.842696. S2CID 3120399.
2. Aji, Srinivas M.; McEliece, Robert J. (March 2000). "The Generalized Distributive Law" (PDF). IEEE Transactions on Information Theory. 46 (2): 325–343. doi:10.1109/18.825794. https://authors.library.caltech.edu/1541/1/AJIieeetit00.pdf
3. Beutelspacher, Albrecht; Rosenbaum, Ute (1998). Projective Geometry. Cambridge University Press. p. 190. ISBN 0-521-48277-1.
4. Stern, Jacques (1989). "A method for finding codewords of small weight". Coding Theory and Applications. Lecture Notes in Computer Science. Vol. 388. Springer-Verlag. pp. 106–113. doi:10.1007/BFb0019850. ISBN 978-3-540-51643-9.
5. Ohta, Kazuo; Pei, Dingyi, eds. (1998). Advances in Cryptology — ASIACRYPT'98. Lecture Notes in Computer Science. Vol. 1514. pp. 187–199. doi:10.1007/3-540-49649-1. ISBN 978-3-540-65109-3. S2CID 37257901.
6. Siamack Ghadimi (2020). "Optimal decision decoding algorithm (ODDA) for an asymmetric TWRC system". Universal Journal of Electrical and Electronic Engineering.