Additive white Gaussian noise

<h2 id="channel-capacity">Channel capacity</h2>
The AWGN channel is represented by a series of outputs {\displaystyle Y_{i}} at discrete-time event index {\displaystyle i}. {\displaystyle Y_{i}} is the sum of the input {\displaystyle X_{i}} and noise {\displaystyle Z_{i}}, where the {\displaystyle Z_{i}} are <a href="/facts/Independent_and_identically_distributed_random_variables/othIRaWt">independent and identically distributed</a> and drawn from a zero-mean <a href="/facts/Normal_distribution/UapjjPyQ">normal distribution</a> with <a href="/facts/Variance/ULBJKXD1">variance</a> {\displaystyle N} (the noise power). The {\displaystyle Z_{i}} are further assumed to be uncorrelated with the {\displaystyle X_{i}}.

{\displaystyle Z_{i}\sim {\mathcal {N}}(0,N)}

{\displaystyle Y_{i}=X_{i}+Z_{i}.}
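The channel model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `awgn_channel` and its parameters are hypothetical, and the noise is drawn with the standard library's `random.gauss`, which takes a standard deviation rather than a variance.

```python
import random


def awgn_channel(inputs, noise_variance, rng=None):
    """Return outputs Y_i = X_i + Z_i with Z_i drawn i.i.d. from N(0, noise_variance).

    `awgn_channel` is a hypothetical helper name used only for this sketch.
    """
    if noise_variance < 0:
        raise ValueError("noise_variance must be non-negative")
    if rng is None:
        rng = random.Random()
    sigma = noise_variance ** 0.5  # random.gauss expects a standard deviation
    return [x + rng.gauss(0.0, sigma) for x in inputs]
```

Passing a seeded `random.Random` instance makes a run reproducible; with `noise_variance` set to zero the outputs equal the inputs.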

The capacity of the channel is infinite unless the noise {\displaystyle N} is nonzero, and the {\displaystyle X_{i}} are sufficiently constrained. The most common constraint on the input is the so-called "power" constraint, requiring that for a codeword {\displaystyle (x_{1},x_{2},\dots ,x_{k})} transmitted through the channel, we have:

{\displaystyle {\frac {1}{k}}\sum _{i=1}^{k}x_{i}^{2}\leq P,}

where {\displaystyle P} represents the maximum channel power.
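As a small illustration of the power constraint, the check below computes the average energy per symbol of a codeword and compares it with the limit. The helper names `average_power` and `satisfies_power_constraint` are assumptions made for this sketch.

```python
def average_power(codeword):
    """Average energy per symbol: (1/k) * sum of x_i squared."""
    return sum(x * x for x in codeword) / len(codeword)


def satisfies_power_constraint(codeword, max_power):
    """True if the codeword's average power does not exceed max_power (P)."""
    return average_power(codeword) <= max_power
```

For example, the codeword `[1.0, -1.0, 1.0, -1.0]` has average power 1 and meets a limit of 1, while `[2.0, 0.0]` has average power 2 and does not.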
Therefore, the <a href="/facts/Channel_capacity/55wJYsS6">channel capacity</a> for the power-constrained channel is given by:

{\displaystyle C=\max \left\{I(X;Y):f{\text{ s.t. }}E\left(X^{2}\right)\leq P\right\}}

where {\displaystyle f} is the distribution of {\displaystyle X}. Expand {\displaystyle I(X;Y)}, writing it in terms of the <a href="/facts/Differential_entropy/EoJSyw95">differential entropy</a>:

{\displaystyle {\begin{aligned}I(X;Y)&=h(Y)-h(Y\mid X)\\[5pt]&=h(Y)-h(X+Z\mid X)\\[5pt]&=h(Y)-h(Z\mid X)\end{aligned}}}

But {\displaystyle X} and {\displaystyle Z} are independent, therefore:

{\displaystyle I(X;Y)=h(Y)-h(Z)}

Evaluating the <a href="/facts/Differential_entropy/EoJSyw95">differential entropy</a> of a Gaussian gives:

{\displaystyle h(Z)={\frac {1}{2}}\log(2\pi eN)}

Because {\displaystyle X} and {\displaystyle Z} are independent and their sum gives {\displaystyle Y}:

{\displaystyle E(Y^{2})=E((X+Z)^{2})=E(X^{2})+2E(X)E(Z)+E(Z^{2})\leq P+N}

Since the normal distribution maximizes differential entropy for a given second moment, this bound implies that

{\displaystyle h(Y)\leq {\frac {1}{2}}\log(2\pi e(P+N))}

Therefore, the channel capacity is given by the highest achievable bound on the <a href="/facts/Mutual_information/HIUvsjvV">mutual information</a>:

{\displaystyle I(X;Y)\leq {\frac {1}{2}}\log(2\pi e(P+N))-{\frac {1}{2}}\log(2\pi eN)}

The mutual information {\displaystyle I(X;Y)} attains this bound, and is therefore maximized, when:

{\displaystyle X\sim {\mathcal {N}}(0,P)}

Thus the channel capacity {\displaystyle C} for the AWGN channel is given by:

{\displaystyle C={\frac {1}{2}}\log \left(1+{\frac {P}{N}}\right)}
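The derivation can be checked numerically. The sketch below assumes base-2 logarithms, so results are in bits per channel use; the text writes only "log", so the base is an assumption here. The function names are hypothetical. It evaluates the Gaussian differential entropy and the capacity formula, so that the difference of the entropies of variance P + N and variance N can be compared with the capacity.

```python
import math


def gaussian_entropy_bits(variance):
    """Differential entropy of N(0, variance): (1/2) * log2(2 * pi * e * variance)."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return 0.5 * math.log2(2 * math.pi * math.e * variance)


def awgn_capacity_bits(signal_power, noise_power):
    """Capacity C = (1/2) * log2(1 + P/N), in bits per channel use."""
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    return 0.5 * math.log2(1 + signal_power / noise_power)
```

With P = 3 and N = 1, `gaussian_entropy_bits(4) - gaussian_entropy_bits(1)` and `awgn_capacity_bits(3, 1)` both come to one bit per channel use, up to floating-point rounding.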

<h3>Channel capacity and sphere packing</h3>
Suppose that we are sending messages through the channel with index ranging from {\displaystyle 1} to {\displaystyle M}, the number of distinct possible messages. If we encode the {\displaystyle M} messages to {\displaystyle n} bits, then we define the rate {\displaystyle R} as:

{\displaystyle R={\frac {\log M}{n}}}

A rate is said to be achievable if there is a sequence of codes so that the maximum probability of error tends to zero as {\displaystyle n} approaches infinity. The capacity {\displaystyle C} is the highest achievable rate.
Consider a codeword of length {\displaystyle n} sent through the AWGN channel with noise level {\displaystyle N}. When received, each component of the vector has variance {\displaystyle N}, and the mean of the vector is the codeword sent. The vector is very likely to be contained in a sphere of radius {\textstyle {\sqrt {n(N+\varepsilon )}}} around the codeword sent. If we decode by mapping every message received onto the codeword at the center of this sphere, then an error occurs only when the received vector is outside of this sphere, which is very unlikely.
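The claim that the received vector almost always falls inside the sphere can be illustrated with a small Monte Carlo experiment. This is a sketch under assumptions: the function name and parameters are hypothetical, and only the noise vector is simulated, since the received vector minus the codeword sent is exactly the noise.

```python
import math
import random


def fraction_inside_noise_sphere(n, noise_variance, eps, trials, rng):
    """Estimate how often the noise vector lies within radius sqrt(n * (N + eps))."""
    radius_squared = n * (noise_variance + eps)
    sigma = math.sqrt(noise_variance)
    inside = 0
    for _ in range(trials):
        # Squared Euclidean norm of one length-n noise vector.
        norm_squared = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))
        if norm_squared <= radius_squared:
            inside += 1
    return inside / trials
```

For a fixed `eps`, the returned fraction should approach 1 as `n` grows, which is the law-of-large-numbers behaviour the argument relies on.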
Each codeword vector has an associated sphere of received codeword vectors which are decoded to it and each such sphere must map uniquely onto a codeword. Because these spheres therefore must not intersect, we are faced with the problem of <a href="/facts/Sphere_packing/NY2FKyMf">sphere packing</a>. How many distinct codewords can we pack into our {\displaystyle n}-bit codeword vector? The received vectors have a maximum energy of {\displaystyle n(P+N)} and therefore must occupy a sphere of radius {\textstyle {\sqrt {n(P+N)}}}. Each codeword sphere has radius {\displaystyle {\sqrt {nN}}}. The volume of an n-dimensional sphere is directly proportional to {\displaystyle r^{n}}, so the maximum number of uniquely decodable spheres that can be packed into our sphere with transmission power P is:

{\displaystyle {\frac {(n(P+N))^{n/2}}{(nN)^{n/2}}}=2^{(n/2)\log \left(1+P/N\right)}}

By this argument, the rate R can be no more than {\displaystyle {\frac {1}{2}}\log \left(1+{\frac {P}{N}}\right)}.
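The sphere-packing count and the resulting rate bound can be worked through numerically. The sketch below assumes base-2 logarithms and uses hypothetical function names. It works with the logarithm of the codeword count rather than the count itself, because the count grows exponentially in n.

```python
import math


def log2_max_codewords(signal_power, noise_power, n):
    """log2 of the volume ratio ((P + N) / N)^(n/2), i.e. (n/2) * log2(1 + P/N)."""
    return (n / 2) * math.log2(1 + signal_power / noise_power)


def sphere_packing_rate_bound(signal_power, noise_power, n):
    """Rate bound log2(M) / n implied by the sphere-packing count."""
    return log2_max_codewords(signal_power, noise_power, n) / n
```

For P = 3, N = 1 and n = 4, the volume ratio is 4 squared, or 16 codewords, giving a rate of log2(16)/4 = 1 bit per symbol. The bound does not depend on n and equals the capacity formula.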

<h3>Achievability</h3>
In this section, we show achievability of the upper bound on the rate from the last section.
A codebook, known to both encoder and decoder, is generated by selecting codewords of length n, i.i.d. Gaussian with variance {\displaystyle P-\varepsilon } and mean zero. For large n, the empirical variance of each codeword will be very close to the variance of its distribution, so the power constraint is violated only with small probability.
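The random codebook construction can be sketched as follows. The function names, the codebook size, and the use of `random.gauss` are assumptions made for illustration. The second helper measures how many codewords exceed the power limit P.

```python
import random


def generate_gaussian_codebook(num_messages, n, power, eps, rng):
    """Draw num_messages codewords of length n, each entry i.i.d. N(0, power - eps)."""
    sigma = (power - eps) ** 0.5
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(num_messages)]


def fraction_violating_power(codebook, power):
    """Fraction of codewords whose average power exceeds the limit."""
    violations = sum(
        1 for codeword in codebook
        if sum(x * x for x in codeword) / len(codeword) > power
    )
    return violations / len(codebook)
```

With a fixed margin `eps`, the fraction of violating codewords should shrink as `n` grows, because each codeword's empirical power concentrates around `power - eps`.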
Received messages are decoded to a message in the codebook which is uniquely jointly typical. If there is no such message or if the power constraint is violated, a decoding error is declared.
Let {\displaystyle X^{n}(i)} denote the codeword for message {\displaystyle i}, while {\displaystyle Y^{n}} is, as before, the received vector. Define the following three events:

<ol><li>Event {\displaystyle U}: the power of the received message is larger than {\displaystyle P}.</li>
<li>Event {\displaystyle V}: the transmitted and received codewords are not jointly typical.</li>
<li>Event {\displaystyle E_{j}}: {\displaystyle (X^{n}(j),Y^{n})} is in {\displaystyle A_{\varepsilon }^{(n)}}, the <a href="/facts/Typical_set/trEjxlVz">typical set</a>, where {\displaystyle i\neq j}, which is to say that the incorrect codeword is jointly typical with the received vector.</li></ol>
An error therefore occurs if {\displaystyle U}, {\displaystyle V} or any of the {\displaystyle E_{j}} occur. By the law of large numbers, {\displaystyle P(U)} goes to zero as n approaches infinity, and by the joint <a href="/facts/Asymptotic_Equipartition_Property/Jcy9SLWc">Asymptotic Equipartition Property</a> the same applies to {\displaystyle P(V)}. Therefore, for a sufficiently large {\displaystyle n}, both {\displaystyle P(U)} and {\displaystyle P(V)} are each less than {\displaystyle \varepsilon }. Since {\displaystyle X^{n}(i)} and {\displaystyle X^{n}(j)} are independent for {\displaystyle i\neq j}, we have that {\displaystyle X^{n}(j)} and {\displaystyle Y^{n}} are also independent. Therefore, by the joint AEP, {\displaystyle P(E_{j})\leq 2^{-n(I(X;Y)-3\varepsilon )}}. This allows us to bound {\displaystyle P_{e}^{(n)}}, the probability of error, as follows:

{\displaystyle {\begin{aligned}P_{e}^{(n)}&\leq P(U)+P(V)+\sum _{j\neq i}P(E_{j})\\&\leq \varepsilon +\varepsilon +\sum _{j\neq i}2^{-n(I(X;Y)-3\varepsilon )}\\&\leq 2\varepsilon +(2^{nR}-1)2^{-n(I(X;Y)-3\varepsilon )}\\&\leq 2\varepsilon +(2^{3n\varepsilon })2^{-n(I(X;Y)-R)}\\&\leq 3\varepsilon \end{aligned}}}

Therefore, as n approaches infinity, {\displaystyle P_{e}^{(n)}} goes to zero provided that {\displaystyle R<I(X;Y)-3\varepsilon }. Therefore, there is a code of rate R arbitrarily close to the capacity derived earlier.
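To see why the condition on R matters, the union-bound term from the error estimate can be evaluated for growing n. The sketch below is illustrative only: the function name and the sample values of R, I(X;Y) and ε are assumptions. The term is rewritten as a difference of two powers of two so that the large factor 2^{nR} is never formed on its own.

```python
def union_bound_term(n, rate, mutual_information, eps):
    """Evaluate (2^{nR} - 1) * 2^{-n(I - 3*eps)} as a difference of two powers of two."""
    exponent = mutual_information - 3 * eps
    # (2^{nR} - 1) * 2^{-n*exponent} = 2^{-n*(exponent - R)} - 2^{-n*exponent}
    return 2.0 ** (-n * (exponent - rate)) - 2.0 ** (-n * exponent)
```

With `mutual_information=1.0`, `eps=0.05` and `rate=0.5`, the term decays exponentially in `n`. With `rate=0.9`, which exceeds `I - 3*eps = 0.85`, it grows instead, so the bound becomes useless.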

<h3>Coding theorem converse</h3>
Here we show that rates above the capacity {\displaystyle C={\frac {1}{2}}\log \left(1+{\frac {P}{N}}\right)} are not achievable.
Suppose that the power constraint is satisfied for a codebook, and further suppose that the messages follow a uniform distribution. Let {\displaystyle W} be the input messages and {\displaystyle {\hat {W}}} the output messages. Thus the information flows as:

{\displaystyle W\longrightarrow X^{(n)}(W)\longrightarrow Y^{(n)}\longrightarrow {\hat {W}}}

Making use of <a href="/facts/Fano%27s_inequality/6vC7eQd5">Fano's inequality</a> gives:

{\displaystyle H(W\mid {\hat {W}})\leq 1+nRP_{e}^{(n)}=n\varepsilon _{n}} where {\displaystyle \varepsilon _{n}\rightarrow 0} as {\displaystyle P_{e}^{(n)}\rightarrow 0}.
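Dividing the Fano bound by n gives an explicit expression for the slack term: ε_n = 1/n + R·P_e^{(n)}. The helper below, whose name is an assumption for this sketch, evaluates it so the decay can be inspected for concrete values.

```python
def fano_epsilon(n, rate, error_probability):
    """epsilon_n defined by n * epsilon_n = 1 + n * R * P_e, i.e. 1/n + R * P_e."""
    return 1.0 / n + rate * error_probability
```

For n = 100, R = 0.5 and an error probability of 0.01, this gives roughly 0.015. Both the 1/n term and the error-probability term must shrink for ε_n to vanish.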

Let {\displaystyle X_{i}} be the encoded message of codeword index i. Then:

{\displaystyle {\begin{aligned}nR&=H(W)\\&=I(W;{\hat {W}})+H(W\mid {\hat {W}})\\&\leq I(W;{\hat {W}})+n\varepsilon _{n}\\&\leq I(X^{(n)};Y^{(n)})+n\varepsilon _{n}\\&=h(Y^{(n)})-h(Y^{(n)}\mid X^{(n)})+n\varepsilon _{n}\\&=h(Y^{(n)})-h(Z^{(n)})+n\varepsilon _{n}\\&\leq \sum _{i=1}^{n}h(Y_{i})-h(Z^{(n)})+n\varepsilon _{n}\\&\leq \sum _{i=1}^{n}I(X_{i};Y_{i})+n\varepsilon _{n}\end{aligned}}}

Let {\displaystyle P_{i}} be the average power of the codeword of index i:

{\displaystyle P_{i}={\frac {1}{2^{nR}}}\sum _{w}x_{i}^{2}(w)}

where the sum is over all input messages {\displaystyle w}. {\displaystyle X_{i}} and {\displaystyle Z_{i}} are independent, thus the expectation of the power of {\displaystyle Y_{i}} is, for noise level {\displaystyle N}:

{\displaystyle E(Y_{i}^{2})=P_{i}+N}

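And, because the normal distribution maximizes differential entropy for a given second moment, with equality when {\displaystyle Y_{i}} is normally distributed, we have that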

{\displaystyle h(Y_{i})\leq {\frac {1}{2}}\log(2\pi e(P_{i}+N))}

Therefore,

{\displaystyle {\begin{aligned}nR&\leq \sum (h(Y_{i})-h(Z_{i}))+n\varepsilon _{n}\\&\leq \sum \left({\frac {1}{2}}\log(2\pi e(P_{i}+N))-{\frac {1}{2}}\log(2\pi eN)\right)+n\varepsilon _{n}\\&=\sum {\frac {1}{2}}\log \left(1+{\frac {P_{i}}{N}}\right)+n\varepsilon _{n}\end{aligned}}}

We may apply Jensen's inequality to {\displaystyle \log(1+x)}, a concave (downward) function of x, to get:

{\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}{\frac {1}{2}}\log \left(1+{\frac {P_{i}}{N}}\right)\leq {\frac {1}{2}}\log \left(1+{\frac {1}{n}}\sum _{i=1}^{n}{\frac {P_{i}}{N}}\right)}
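The Jensen step can be checked on concrete numbers. The sketch below assumes base-2 logarithms and a hypothetical function name. It returns both sides of the inequality for a given list of per-position powers.

```python
import math


def jensen_sides(powers, noise_power):
    """Return (lhs, rhs) of the Jensen inequality for (1/2) * log2(1 + P_i / N)."""
    n = len(powers)
    # Left side: average of the per-position terms.
    lhs = sum(0.5 * math.log2(1 + p / noise_power) for p in powers) / n
    # Right side: the same function applied to the average power.
    rhs = 0.5 * math.log2(1 + sum(powers) / (n * noise_power))
    return lhs, rhs
```

For powers `[0.5, 1.5]` and N = 1 the left side is about 0.477 and the right side is 0.5. The two sides coincide when all the powers are equal.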

Because each codeword individually satisfies the power constraint, the average also satisfies the power constraint. Therefore,

{\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}{\frac {P_{i}}{N}}\leq {\frac {P}{N}},}

which we may apply to simplify the inequality above and get:

{\displaystyle {\frac {1}{2}}\log \left(1+{\frac {1}{n}}\sum _{i=1}^{n}{\frac {P_{i}}{N}}\right)\leq {\frac {1}{2}}\log \left(1+{\frac {P}{N}}\right).\,\!}

Therefore, it must be that {\displaystyle R\leq {\frac {1}{2}}\log \left(1+{\frac {P}{N}}\right)+\varepsilon _{n}}. Since {\displaystyle \varepsilon _{n}\rightarrow 0}, R can be no larger than a value arbitrarily close to the capacity derived earlier.
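
The chain of inequalities above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `converse_bound_terms` and the example powers are hypothetical, the logarithms are natural (so the results are in nats), and the per-index powers are chosen so that their mean equals the power limit P.

```python
import math

def converse_bound_terms(powers, noise_var, power_limit):
    """Return the three quantities compared in the converse argument (in nats).

    powers      -- hypothetical per-index average powers P_i
    noise_var   -- noise variance N
    power_limit -- power constraint P
    """
    n = len(powers)
    # Left side of Jensen's inequality: average of the per-index terms.
    avg_of_logs = sum(0.5 * math.log(1 + p / noise_var) for p in powers) / n
    # Right side of Jensen's inequality: same function applied to the average ratio.
    log_of_avg = 0.5 * math.log(1 + (sum(powers) / n) / noise_var)
    # Capacity expression evaluated at the power limit P.
    capacity = 0.5 * math.log(1 + power_limit / noise_var)
    return avg_of_logs, log_of_avg, capacity

# Illustrative values only: the mean of these powers is 2.0, which meets P = 2.0.
example_powers = [0.5, 1.0, 2.5, 4.0]
avg_of_logs, log_of_avg, capacity = converse_bound_terms(example_powers, 1.0, 2.0)
print(avg_of_logs, log_of_avg, capacity)
```

With unequal powers the first value is strictly smaller than the second, which illustrates why spreading power evenly across indices is the best case under this bound.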

<h2 id="effects-in-time-domain">Effects in time domain</h2>

In serial data communications, the AWGN mathematical model is used to model the timing error caused by random <a href="/facts/Jitter/UpLIJDVQ">jitter</a> (RJ).
The graph to the right shows an example of timing errors associated with AWGN. The variable Δt represents the uncertainty in the zero crossing. As the amplitude of the AWGN is increased, the <a href="/facts/Signal-to-noise_ratio/qohClhyG">signal-to-noise ratio</a> decreases. This results in increased uncertainty Δt.<a class="footnote-ref" id="fnref:1" href="#fn:1">1</a>
When a sine wave corrupted by AWGN is passed through a narrow bandpass filter, the average number of positive-going (or, equally, negative-going) zero crossings per second at the filter output is

{\displaystyle {\begin{aligned}&{\frac {\text{positive zero crossings}}{\text{second}}}={\frac {\text{negative zero crossings}}{\text{second}}}\\[8pt]={}&f_{0}{\sqrt {\frac {{\text{SNR}}+1+{\frac {B^{2}}{12f_{0}^{2}}}}{{\text{SNR}}+1}}},\end{aligned}}}

where

{\displaystyle f_{0}} = the center frequency of the filter,
B = the filter bandwidth,
SNR = the signal-to-noise power ratio in linear terms.
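
The zero-crossing formula above can be evaluated directly. The sketch below is illustrative only: the function names and the example filter (10 MHz center frequency, 1 MHz bandwidth) are assumptions, not values from the source. Note that the formula takes the SNR as a linear power ratio, so a value in decibels has to be converted first.

```python
import math

def db_to_linear(snr_db):
    """Convert a signal-to-noise power ratio from decibels to a linear ratio."""
    return 10.0 ** (snr_db / 10.0)

def zero_crossing_rate(f0_hz, bandwidth_hz, snr_linear):
    """Average positive-going (or negative-going) zero crossings per second.

    f0_hz        -- center frequency of the filter
    bandwidth_hz -- filter bandwidth B
    snr_linear   -- signal-to-noise power ratio in linear terms
    """
    if f0_hz <= 0 or snr_linear < 0:
        raise ValueError("f0 must be positive and SNR must be non-negative")
    bandwidth_term = bandwidth_hz ** 2 / (12.0 * f0_hz ** 2)
    return f0_hz * math.sqrt((snr_linear + 1.0 + bandwidth_term) / (snr_linear + 1.0))

# Hypothetical filter: 10 MHz center frequency, 1 MHz bandwidth.
rate_high_snr = zero_crossing_rate(10e6, 1e6, db_to_linear(40.0))
rate_no_signal = zero_crossing_rate(10e6, 1e6, 0.0)
print(rate_high_snr, rate_no_signal)
```

As the SNR grows, the rate approaches the center frequency; with no signal present (SNR of 0) it is slightly higher, by the factor that depends only on the ratio of bandwidth to center frequency.
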
<h2 id="effects-in-phasor-domain">Effects in phasor domain</h2>

In modern communication systems, bandlimited AWGN cannot be ignored. When modeling bandlimited AWGN in the <a href="/facts/Phasor/LyjSSUDo">phasor</a> domain, statistical analysis reveals that the amplitudes of the real and imaginary contributions are independent variables which follow the <a href="/facts/Gaussian_distribution/UapjjPyQ">Gaussian distribution</a> model. When combined, the resultant phasor's magnitude is a <a href="/facts/Rayleigh_distribution/vLFh4zHr">Rayleigh-distributed</a> random variable, while the phase is uniformly distributed from 0 to 2π.
The graph to the right shows an example of how bandlimited AWGN can affect a coherent carrier signal. The instantaneous response of the noise vector cannot be precisely predicted; however, its time-averaged response can be predicted statistically. As shown in the graph, the noise phasor is expected to lie inside the 1σ circle about 39% of the time, inside the 2σ circle about 86% of the time, and inside the 3σ circle about 99% of the time.<a class="footnote-ref" id="fnref:2" href="#fn:2">2</a>
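
The percentages for the σ circles follow from the Rayleigh distribution of the phasor magnitude, whose cumulative distribution function gives a probability of 1 − exp(−k²/2) of lying within k standard deviations. The sketch below is a hedged illustration: the function names, trial count, and seed are assumptions, and σ is taken to be the standard deviation of each of the real and imaginary Gaussian components. It compares the closed form with a simple simulation.

```python
import math
import random

def prob_inside_circle(k):
    """Closed-form probability that a Rayleigh-distributed magnitude is <= k*sigma."""
    return 1.0 - math.exp(-0.5 * k * k)

def simulate_fraction_inside(k, sigma=1.0, trials=100_000, seed=0):
    """Estimate the same probability by drawing independent Gaussian components."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        real_part = rng.gauss(0.0, sigma)
        imag_part = rng.gauss(0.0, sigma)
        # The phasor magnitude is the Euclidean norm of the two components.
        if math.hypot(real_part, imag_part) <= k * sigma:
            inside += 1
    return inside / trials

for k in (1, 2, 3):
    print(k, prob_inside_circle(k), simulate_fraction_inside(k))
```

The closed form evaluates to roughly 0.393, 0.865, and 0.989 for k = 1, 2, and 3, close to the rounded percentages quoted above, and the simulated fractions should agree to within sampling error.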

<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Ground_bounce/DENdhdUJ">Ground bounce</a></li>
<li><a href="/facts/Noisy-channel_coding_theorem/vvr8u4Gg">Noisy-channel coding theorem</a></li>
<li><a href="/facts/Gaussian_process/MrBq7kYW">Gaussian process</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">McClaning, Kevin, Radio Receiver Design, Noble Publishing Corporation <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">McClaning, Kevin, Radio Receiver Design, Noble Publishing Corporation <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
</ol>
