Poisson binomial distribution

<h2 id="definitions">Definitions</h2>
<h3>Probability mass function</h3>
The probability of having k successful trials out of a total of n can be written as the sum
<a class="footnote-ref" id="fnref:1" href="#fn:1">1</a>

\[ \Pr(K=k)=\sum_{A\in F_{k}}\prod_{i\in A}p_{i}\prod_{j\in A^{c}}(1-p_{j}) \]

where \(F_{k}\) is the set of all subsets of k integers that can be selected from \(\{1,2,3,\dots,n\}\). For example, if n = 3, then \(F_{2}=\left\{\{1,2\},\{1,3\},\{2,3\}\right\}\). \(A^{c}\) is the <a href="/facts/Complement_(set_theory)/yWUePIYY">complement</a> of \(A\), i.e. \(A^{c}=\{1,2,3,\dots,n\}\smallsetminus A\).

\(F_{k}\) contains \(n!/((n-k)!\,k!)\) elements, so the sum is infeasible to compute in practice unless the number of trials n is small (e.g. if n = 30, \(F_{15}\) contains 155,117,520 elements, i.e. over \(10^{8}\), each contributing a product of 30 factors). However, there are other, more efficient ways to calculate \(\Pr(K=k)\).
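The subset sum can be evaluated literally for small n. The following is a minimal sketch, not an efficient method; the function name `pmf_bruteforce` and the 0-based trial indices are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

def pmf_bruteforce(p, k):
    """Pr(K = k) by summing over every k-subset A of the trial indices."""
    n = len(p)
    total = 0.0
    for A in combinations(range(n), k):
        chosen = set(A)
        # Product of p_i over the successes in A ...
        success = prod(p[i] for i in chosen)
        # ... times the product of (1 - p_j) over the complement of A.
        failure = prod(1 - p[j] for j in range(n) if j not in chosen)
        total += success * failure
    return total

p = [0.1, 0.2, 0.3]
print([pmf_bruteforce(p, k) for k in range(len(p) + 1)])
```

The number of terms grows as the binomial coefficient, which is why this approach is only practical for small n.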
As long as none of the success probabilities are equal to one, one can calculate the probability of k successes using the recursive formula 
<a class="footnote-ref" id="fnref:2" href="#fn:2">2</a>
<a class="footnote-ref" id="fnref:3" href="#fn:3">3</a>

\[ \Pr(K=k)={\begin{cases}\prod\limits_{i=1}^{n}(1-p_{i})&k=0\\ \dfrac{1}{k}\sum\limits_{i=1}^{k}(-1)^{i-1}\Pr(K=k-i)\,T(i)&k>0\end{cases}} \]

where

\[ T(i)=\sum_{j=1}^{n}\left(\frac{p_{j}}{1-p_{j}}\right)^{i}. \]

The recursive formula is not <a href="/facts/Numerically_stable/jRDdo8zV">numerically stable</a>, and should be avoided if \(n\) is greater than approximately 20.
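A direct transcription of the recursion may help make the indices concrete. This is a sketch under the stated restriction that no success probability equals one; the name `pmf_recursive` is an assumption, and the code inherits the numerical instability noted above.

```python
def pmf_recursive(p):
    """Return [Pr(K=0), ..., Pr(K=n)] via the recursion; needs every p_j < 1."""
    n = len(p)
    if any(pj >= 1 for pj in p):
        raise ValueError("the recursive formula requires all p_j < 1")
    odds = [pj / (1 - pj) for pj in p]
    # T[i] = sum_j (p_j / (1 - p_j))**i for i = 1..n; T[0] is an unused placeholder.
    T = [0.0] + [sum(o ** i for o in odds) for i in range(1, n + 1)]
    # Base case k = 0: product of all failure probabilities.
    base = 1.0
    for pj in p:
        base *= 1 - pj
    pmf = [base]
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * pmf[k - i] * T[i] for i in range(1, k + 1))
        pmf.append(s / k)
    return pmf

print(pmf_recursive([0.1, 0.2, 0.3]))
```

The alternating signs in the inner sum are the source of cancellation error for larger n.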
An alternative is to use a <a href="/facts/Divide-and-conquer_algorithm/pC1X7Ws7">divide-and-conquer algorithm</a>: if we assume \(n=2^{b}\) is a power of two, denoting by \(f(p_{i:j})\) the Poisson binomial of \(p_{i},\dots,p_{j}\) and \(*\) the <a href="/facts/Convolution_of_probability_distributions/l467HgVK">convolution</a> operator, we have \(f(p_{1:2^{b}})=f(p_{1:2^{b-1}})*f(p_{2^{b-1}+1:2^{b}})\).
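The halving identity can be sketched as a short recursive routine. The helper names `convolve` and `pmf_divide_and_conquer` are assumptions, and the convolution here is a plain quadratic one; a practical implementation would presumably use an FFT-based convolution to benefit from the split. The sketch also accepts n that is not a power of two by splitting at the midpoint.

```python
def convolve(a, b):
    """Discrete convolution of two PMF vectors (quadratic time)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def pmf_divide_and_conquer(p):
    """PMF of the Poisson binomial, obtained by recursively halving p."""
    if len(p) == 0:
        return [1.0]               # zero trials: K = 0 with certainty
    if len(p) == 1:
        return [1 - p[0], p[0]]    # a single Bernoulli trial
    mid = len(p) // 2
    left = pmf_divide_and_conquer(p[:mid])
    right = pmf_divide_and_conquer(p[mid:])
    return convolve(left, right)

print(pmf_divide_and_conquer([0.1, 0.2, 0.3, 0.4]))
```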
More generally, the probability mass function of a Poisson binomial can be expressed as the convolution of the vectors \(P_{1},\dots,P_{n}\) where \(P_{i}=[1-p_{i},p_{i}]\). This observation leads to the direct convolution (DC) algorithm for computing \(\Pr(K=0)\) through \(\Pr(K=n)\):

<pre>
// PMF and nextPMF begin at index 0
function DC(p_1, …, p_n) is
    declare new PMF array of size 1
    PMF[0] = 1
    for i = 1 to n do
        declare new nextPMF array of size i + 1
        nextPMF[0] = (1 - p_i) * PMF[0]
        nextPMF[i] = p_i * PMF[i - 1]
        for k = 1 to i - 1 do
            nextPMF[k] = p_i * PMF[k - 1] + (1 - p_i) * PMF[k]
        repeat
        PMF = nextPMF
    repeat
    return PMF
end function
</pre>
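The pseudocode translates almost line for line into a runnable routine. The following Python sketch is one possible rendering; the function name `dc` and the use of plain lists are assumptions, not part of the cited algorithm description.

```python
def dc(p):
    """Direct convolution: returns [Pr(K=0), ..., Pr(K=n)] for probabilities p."""
    pmf = [1.0]  # with zero trials, K = 0 with probability 1
    for i, pi in enumerate(p, start=1):
        nxt = [0.0] * (i + 1)
        nxt[0] = (1 - pi) * pmf[0]       # all i trials fail
        nxt[i] = pi * pmf[i - 1]         # all i trials succeed
        for k in range(1, i):
            # k successes: either trial i succeeds (k-1 before) or fails (k before)
            nxt[k] = pi * pmf[k - 1] + (1 - pi) * pmf[k]
        pmf = nxt
    return pmf

print(dc([0.1, 0.2, 0.3]))
```

Every update adds products of non-negative numbers, so no cancellation occurs, which is consistent with the stability of the method.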

\(\Pr(K=k)\) will be found in PMF[k]. DC is numerically stable, exact, and, when implemented as a software routine, exceptionally fast for \(n\leq 2000\). It can also be quite fast for larger \(n\), depending on the distribution of the \(p_{i}\).<a class="footnote-ref" id="fnref:4" href="#fn:4">4</a>
Another possibility is using the <a href="/facts/Discrete_Fourier_transform/uK9Tjsbl">discrete Fourier transform</a>.<a class="footnote-ref" id="fnref:5" href="#fn:5">5</a>

\[ \Pr(K=k)=\frac{1}{n+1}\sum_{\ell=0}^{n}C^{-\ell k}\prod_{m=1}^{n}\left(1+(C^{\ell}-1)p_{m}\right) \]

where \(C=\exp\left(\frac{2i\pi}{n+1}\right)\) and \(i=\sqrt{-1}\).
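The DFT expression can be evaluated term by term with complex arithmetic. The sketch below is a naive quadratic-time evaluation written for clarity rather than speed (a real implementation would likely use an FFT routine); the function name `pmf_dft` is an assumption.

```python
import cmath

def pmf_dft(p):
    """Return [Pr(K=0), ..., Pr(K=n)] from the DFT formula, evaluated naively."""
    n = len(p)
    C = cmath.exp(2j * cmath.pi / (n + 1))
    # x[l] = prod_m (1 + (C^l - 1) p_m), for l = 0..n
    x = []
    for l in range(n + 1):
        z = C ** l
        term = 1 + 0j
        for pm in p:
            term *= 1 + (z - 1) * pm
        x.append(term)
    pmf = []
    for k in range(n + 1):
        s = sum(C ** (-l * k) * x[l] for l in range(n + 1))
        # The imaginary part is rounding noise; keep the real part only.
        pmf.append((s / (n + 1)).real)
    return pmf

print(pmf_dft([0.1, 0.2, 0.3]))
```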
Still other methods are described in "Statistical Applications of the Poisson-Binomial and conditional Bernoulli distributions" by Chen and Liu<a class="footnote-ref" id="fnref:6" href="#fn:6">6</a> and in "A simple and fast method for computing the Poisson binomial distribution function" by Biscarri et al.<a class="footnote-ref" id="fnref:7" href="#fn:7">7</a>

<h3>Cumulative distribution function</h3>
The <a href="/facts/Cumulative_distribution_function/WaKU8tp4">cumulative distribution function</a> (CDF) can be expressed as:

\[ \Pr(K\leq k)=\sum_{\ell=0}^{k}\sum_{A\in F_{\ell}}\prod_{i\in A}p_{i}\prod_{j\in A^{c}}(1-p_{j}), \]

where \(F_{\ell}\) is the set of all subsets of size \(\ell\) that can be selected from \(\{1,2,3,\ldots,n\}\).
It can be computed by invoking the DC function above, and then adding elements \(0\) through \(k\) of the returned PMF array.
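As a sketch of that procedure, the running sum of the PMF gives every CDF value at once. The names `dc` and `cdf` are assumptions; `dc` is a compact restatement of the direct convolution idea so that the example runs on its own.

```python
from itertools import accumulate

def dc(p):
    """Direct convolution PMF: [Pr(K=0), ..., Pr(K=n)]."""
    pmf = [1.0]
    for pi in p:
        # Shift-and-add: the new trial either fails (keep k) or succeeds (k -> k+1).
        fail = [(1 - pi) * q for q in pmf] + [0.0]
        succ = [0.0] + [pi * q for q in pmf]
        pmf = [a + b for a, b in zip(fail, succ)]
    return pmf

def cdf(p):
    """Return [Pr(K<=0), ..., Pr(K<=n)] as the running sum of the PMF."""
    return list(accumulate(dc(p)))

print(cdf([0.1, 0.2, 0.3]))
```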

<h2 id="properties">Properties</h2>
<h3>Mean and variance</h3>
Since a Poisson binomial distributed variable is a sum of n independent Bernoulli distributed variables, its mean and variance will simply be sums of the mean and variance of the n Bernoulli distributions:

\[ \mu=\sum_{i=1}^{n}p_{i} \]

\[ \sigma^{2}=\sum_{i=1}^{n}(1-p_{i})p_{i} \]
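These two sums can be cross-checked against moments computed from the full PMF. The sketch below does so for a small example; the function names are assumptions, and `pmf` is a compact convolution-based helper included only to make the check self-contained.

```python
def mean_and_variance(p):
    """Closed-form mean and variance of the Poisson binomial."""
    mu = sum(p)
    var = sum(pi * (1 - pi) for pi in p)
    return mu, var

def pmf(p):
    """PMF by repeated convolution with [1 - p_i, p_i]."""
    out = [1.0]
    for pi in p:
        fail = [(1 - pi) * q for q in out] + [0.0]
        succ = [0.0] + [pi * q for q in out]
        out = [a + b for a, b in zip(fail, succ)]
    return out

def moments_from_pmf(p):
    """Mean and variance computed directly from the PMF, for comparison."""
    probs = pmf(p)
    m1 = sum(k * q for k, q in enumerate(probs))
    m2 = sum(k * k * q for k, q in enumerate(probs))
    return m1, m2 - m1 ** 2

p = [0.1, 0.2, 0.3]
print(mean_and_variance(p), moments_from_pmf(p))
```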

<h3>Entropy</h3>
There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.<a class="footnote-ref" id="fnref:8" href="#fn:8">8</a>
The Shepp–Olkin concavity conjecture, due to <a href="/facts/Lawrence_Shepp/un54gtmS">Lawrence Shepp</a> and <a href="/facts/Ingram_Olkin/omLnnzI2">Ingram Olkin</a> in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities \(p_{1},p_{2},\dots,p_{n}\).<a class="footnote-ref" id="fnref:9" href="#fn:9">9</a> This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.<a class="footnote-ref" id="fnref:10" href="#fn:10">10</a> The Shepp–Olkin monotonicity conjecture, from the same 1981 paper, states that the entropy is monotone increasing in \(p_{i}\) if all \(p_{i}\leq 1/2\). This conjecture was also proved by Hillion and Johnson, in 2019.<a class="footnote-ref" id="fnref:11" href="#fn:11">11</a>

<h3>Chernoff bound</h3>
Writing \(S=\sum_{i}X_{i}\) for the sum of the independent Bernoulli variables \(X_{i}\), the probability that a Poisson binomial random variable takes a large value can be bounded using its moment generating function as follows (valid when \(s\geq\mu\) and for any \(t>0\)):

\[ \begin{aligned}
\Pr[S\geq s]&\leq \exp(-st)\operatorname{E}\left[\exp\left[t\sum_{i}X_{i}\right]\right]\\
&=\exp(-st)\prod_{i}(1-p_{i}+e^{t}p_{i})\\
&=\exp\left(-st+\sum_{i}\log\left(p_{i}(e^{t}-1)+1\right)\right)\\
&\leq\exp\left(-st+\sum_{i}\log\left(\exp(p_{i}(e^{t}-1))\right)\right)\\
&=\exp\left(-st+\sum_{i}p_{i}(e^{t}-1)\right)\\
&=\exp\left(s-\mu-s\log\frac{s}{\mu}\right),
\end{aligned} \]

where we took \(t=\log\left(s/\mu\right)\). This is similar to the <a href="/facts/Binomial_distribution/UMoFMjDj">tail bounds of a binomial distribution</a>.
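To get a feel for how tight the bound is, it can be compared with the exact upper tail for a small example. This is an illustrative sketch; the function names are assumptions, and the exact tail is obtained from a convolution-based PMF helper included for self-containment.

```python
from math import exp, log

def chernoff_bound(p, s):
    """exp(s - mu - s*log(s/mu)), the final expression of the derivation (needs s >= mu > 0)."""
    mu = sum(p)
    if mu <= 0 or s < mu:
        raise ValueError("the bound requires s >= mu > 0")
    return exp(s - mu - s * log(s / mu))

def pmf(p):
    """PMF by repeated convolution with [1 - p_i, p_i]."""
    out = [1.0]
    for pi in p:
        fail = [(1 - pi) * q for q in out] + [0.0]
        succ = [0.0] + [pi * q for q in out]
        out = [a + b for a, b in zip(fail, succ)]
    return out

def upper_tail(p, s):
    """Exact Pr[S >= s]."""
    return sum(q for k, q in enumerate(pmf(p)) if k >= s)

p = [0.05, 0.1, 0.15, 0.2, 0.25]
print(upper_tail(p, 3), chernoff_bound(p, 3))
```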

<h2 id="related-distribution">Related distributions</h2>
<h3>Approximation by binomial distribution</h3>
A Poisson binomial distribution \(PB\) can be approximated by a binomial distribution \(B\) with the same number of trials \(n\), where \(\mu\), the mean of the \(p_{i}\) (that is, \(\mu=\tfrac{1}{n}\sum_{i=1}^{n}p_{i}\) in this subsection, not the distribution mean defined above), is the success probability of \(B\). The variances of \(PB\) and \(B\) are related by the formula

\[ \operatorname{Var}(PB)=\operatorname{Var}(B)-\sum_{i=1}^{n}(p_{i}-\mu)^{2} \]

As can be seen, the closer the \(p_{i}\) are to \(\mu\), that is, the more the \(p_{i}\) tend to homogeneity, the larger \(PB\)'s variance. When all the \(p_{i}\) are equal to \(\mu\), \(PB\) becomes \(B\), \(\operatorname{Var}(PB)=\operatorname{Var}(B)\), and the variance is at its maximum.<a class="footnote-ref" id="fnref:12" href="#fn:12">12</a>
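The variance relation can be verified in a few lines. The derivation below is a sketch that assumes \(B\) has \(n\) trials and success probability \(\mu\) equal to the mean of the \(p_{i}\), so that \(\operatorname{Var}(B)=n\mu(1-\mu)\) and \(\sum_{i}p_{i}=n\mu\):

```latex
\begin{aligned}
\operatorname{Var}(B)-\sum_{i=1}^{n}(p_{i}-\mu)^{2}
&= n\mu(1-\mu)-\left(\sum_{i=1}^{n}p_{i}^{2}-2\mu\sum_{i=1}^{n}p_{i}+n\mu^{2}\right)\\
&= n\mu-n\mu^{2}-\sum_{i=1}^{n}p_{i}^{2}+2n\mu^{2}-n\mu^{2}\\
&= \sum_{i=1}^{n}p_{i}-\sum_{i=1}^{n}p_{i}^{2}
 = \sum_{i=1}^{n}p_{i}(1-p_{i})=\operatorname{Var}(PB).
\end{aligned}
```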
Ehm has determined bounds for the <a href="/facts/Total_variation_distance_of_probability_measures/agdAvDuW">total variation distance</a> of \(PB\) and \(B\), in effect providing bounds on the error introduced when approximating \(PB\) with \(B\). Let \(\nu=1-\mu\) and \(d(PB,B)\) be the total variation distance of \(PB\) and \(B\). Then

\[ d(PB,B)\leq(1-\mu^{n+1}-\nu^{n+1})\frac{\sum_{i=1}^{n}(p_{i}-\mu)^{2}}{(n+1)\mu\nu} \]

\[ d(PB,B)\geq C\min\left\{1,\frac{1}{n\mu\nu}\right\}\sum_{i=1}^{n}(p_{i}-\mu)^{2} \]

where \(C\geq\frac{1}{124}\).

\(d(PB,B)\) tends to 0 if and only if \(\operatorname{Var}(PB)/\operatorname{Var}(B)\) tends to 1.<a class="footnote-ref" id="fnref:13" href="#fn:13">13</a>
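The upper bound can be compared with the actual total variation distance on a small example. The sketch below assumes \(\mu\) is the mean of the \(p_{i}\) and that \(B\) has \(n\) trials; the function names are hypothetical, and the total variation distance is taken as half the sum of absolute PMF differences.

```python
from math import comb

def pmf(p):
    """Poisson binomial PMF by repeated convolution with [1 - p_i, p_i]."""
    out = [1.0]
    for pi in p:
        fail = [(1 - pi) * q for q in out] + [0.0]
        succ = [0.0] + [pi * q for q in out]
        out = [a + b for a, b in zip(fail, succ)]
    return out

def tv_to_binomial(p):
    """Total variation distance between PB and Binomial(n, mean of p)."""
    n = len(p)
    mu = sum(p) / n
    pb = pmf(p)
    b = [comb(n, k) * mu ** k * (1 - mu) ** (n - k) for k in range(n + 1)]
    return 0.5 * sum(abs(x - y) for x, y in zip(pb, b))

def ehm_upper_bound(p):
    """Right-hand side of the upper bound on d(PB, B)."""
    n = len(p)
    mu = sum(p) / n
    nu = 1 - mu
    spread = sum((pi - mu) ** 2 for pi in p)
    return (1 - mu ** (n + 1) - nu ** (n + 1)) * spread / ((n + 1) * mu * nu)

p = [0.1, 0.2, 0.3, 0.4]
print(tv_to_binomial(p), ehm_upper_bound(p))
```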

<h3>Approximation by Poisson distribution</h3>
A Poisson binomial distribution \(PB\) can also be approximated by a <a href="/facts/Poisson_distribution/CvCzXkHr">Poisson distribution</a> \(Po\) with mean \(\lambda=\sum_{i=1}^{n}p_{i}\). Barbour and Hall have shown that

\[ \frac{1}{32}\min\left\{\frac{1}{\lambda},1\right\}\sum_{i=1}^{n}p_{i}^{2}\leq d(PB,Po)\leq\frac{1-e^{-\lambda}}{\lambda}\sum_{i=1}^{n}p_{i}^{2} \]

where \(d(PB,Po)\) is the total variation distance of \(PB\) and \(Po\).<a class="footnote-ref" id="fnref:14" href="#fn:14">14</a> It can be seen that the smaller the \(p_{i}\), the better \(Po\) approximates \(PB\).
As \(\operatorname{Var}(Po)=\lambda=\sum_{i=1}^{n}p_{i}\) and \(\operatorname{Var}(PB)=\sum_{i=1}^{n}p_{i}-\sum_{i=1}^{n}p_{i}^{2}\), \(\operatorname{Var}(Po)>\operatorname{Var}(PB)\); so a Poisson binomial distribution's variance is bounded above by the variance of a Poisson distribution with \(\lambda=\sum_{i=1}^{n}p_{i}\), and the smaller the \(p_{i}\), the closer \(\operatorname{Var}(Po)\) will be to \(\operatorname{Var}(PB)\).
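The Barbour and Hall bounds can be checked numerically on a small example. The sketch below assumes the upper bound uses the exponential \(e^{-\lambda}\) and treats the Poisson mass above \(n\) as part of the distance, since the Poisson binomial has no mass there; the function names are hypothetical.

```python
from math import exp, factorial

def pmf(p):
    """Poisson binomial PMF by repeated convolution with [1 - p_i, p_i]."""
    out = [1.0]
    for pi in p:
        fail = [(1 - pi) * q for q in out] + [0.0]
        succ = [0.0] + [pi * q for q in out]
        out = [a + b for a, b in zip(fail, succ)]
    return out

def tv_to_poisson(p):
    """Total variation distance between PB and Poisson(sum of p)."""
    n = len(p)
    lam = sum(p)
    pb = pmf(p)
    po = [exp(-lam) * lam ** k / factorial(k) for k in range(n + 1)]
    tail = 1 - sum(po)  # Poisson mass on k > n
    return 0.5 * (sum(abs(x - y) for x, y in zip(pb, po)) + tail)

def barbour_hall_bounds(p):
    """(lower, upper) bounds on d(PB, Po)."""
    lam = sum(p)
    s2 = sum(pi * pi for pi in p)
    lower = min(1 / lam, 1) * s2 / 32
    upper = (1 - exp(-lam)) / lam * s2
    return lower, upper

p = [0.1, 0.2, 0.3, 0.4]
print(barbour_hall_bounds(p), tv_to_poisson(p))
```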

<h2 id="computational-methods">Computational methods</h2>
The reference <a class="footnote-ref" id="fnref:15" href="#fn:15">15</a> discusses techniques of evaluating the probability mass function of the Poisson binomial distribution. The following software implementations are based on it:

<ul><li>An R package <a href="https://CRAN.R-project.org/package=poibin">poibin</a> was provided along with the paper;<a class="footnote-ref" id="fnref:16" href="#fn:16">16</a> it computes the CDF, PMF, and quantile function of the Poisson binomial distribution and generates random numbers from it. For the PMF, either a DFT algorithm or a recursive algorithm can be selected to compute exact values, and approximations based on the normal and Poisson distributions are also available.</li>
<li><a href="https://github.com/tsakim/poibin">poibin – Python implementation</a> – computes the PMF and CDF using the DFT method described in the paper.</li></ul>
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Le_Cam%2527s_theorem/lN6AW5Cu">Le Cam's theorem</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312. <a href="http://www3.stat.sinica.edu.tw/statistica/oldpdf/A3n23.pdf" target="_blank">http://www3.stat.sinica.edu.tw/statistica/oldpdf/A3n23.pdf</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">Shah, B. K. (1994). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639. <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">Chen, X. H.; A. P. Dempster; J. S. Liu (1994). "Weighted finite population sampling to maximize entropy" (PDF). Biometrika. 81 (3): 457. doi:10.1093/biomet/81.3.457. <a href="http://www.people.fas.harvard.edu/~junliu/TechRept/94folder/cdl94.pdf" target="_blank">http://www.people.fas.harvard.edu/~junliu/TechRept/94folder/cdl94.pdf</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">Biscarri, William; Zhao, Sihai Dave; Brunner, Robert J. (2018-06-01). "A simple and fast method for computing the Poisson binomial distribution function". Computational Statistics & Data Analysis. 122: 92–100. doi:10.1016/j.csda.2018.01.007. ISSN 0167-9473. <a href="https://www.sciencedirect.com/science/article/pii/S0167947318300082" target="_blank">https://www.sciencedirect.com/science/article/pii/S0167947318300082</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
<li id="fn:5">Fernandez, M.; S. Williams (2010). "Closed-Form Expression for the Poisson-Binomial Probability Density Function". IEEE Transactions on Aerospace and Electronic Systems. 46 (2): 803–817. Bibcode:2010ITAES..46..803F. doi:10.1109/TAES.2010.5461658. S2CID 1456258. <a href="#fnref:5" class="footnote-back-ref">↩</a></li>
<li id="fn:6">Chen, S. X.; J. S. Liu (1997). "Statistical Applications of the Poisson–Binomial and conditional Bernoulli distributions". Statistica Sinica. 7: 875–892. <a href="http://www3.stat.sinica.edu.tw/statistica/password.asp?vol=7&num=4&art=4" target="_blank">http://www3.stat.sinica.edu.tw/statistica/password.asp?vol=7&num=4&art=4</a> <a href="#fnref:6" class="footnote-back-ref">↩</a></li>
<li id="fn:7">Biscarri, William; Zhao, Sihai Dave; Brunner, Robert J. (2018-06-01). "A simple and fast method for computing the Poisson binomial distribution function". Computational Statistics & Data Analysis. 122: 92–100. doi:10.1016/j.csda.2018.01.007. ISSN 0167-9473. <a href="https://www.sciencedirect.com/science/article/pii/S0167947318300082" target="_blank">https://www.sciencedirect.com/science/article/pii/S0167947318300082</a> <a href="#fnref:7" class="footnote-back-ref">↩</a></li>
<li id="fn:8">Harremoës, P. (2001). "Binomial and Poisson distributions as maximum entropy distributions" (PDF). IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936. <a href="https://web.archive.org/web/20160116154755id_/http://yaroslavvb.com/papers/harremoes-binomial.pdf" target="_blank">https://web.archive.org/web/20160116154755id_/http://yaroslavvb.com/papers/harremoes-binomial.pdf</a> <a href="#fnref:8" class="footnote-back-ref">↩</a></li>
<li id="fn:9">Shepp, Lawrence; Olkin, Ingram (1981). "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution". In Gani, J.; Rohatgi, V.K. (eds.). Contributions to probability: A collection of papers dedicated to Eugene Lukacs. New York: Academic Press. pp. 201–206. ISBN 0-12-274460-8. MR 0618689. <a href="#fnref:9" class="footnote-back-ref">↩</a></li>
<li id="fn:10">Hillion, Erwan; Johnson, Oliver (2015-03-05). "A proof of the Shepp–Olkin entropy concavity conjecture". Bernoulli. 23 (4B): 3638–3649. arXiv:1503.01570. doi:10.3150/16-BEJ860. S2CID 8358662. <a href="#fnref:10" class="footnote-back-ref">↩</a></li>
<li id="fn:11">Hillion, Erwan; Johnson, Oliver (2019-11-09). "A proof of the Shepp–Olkin entropy monotonicity conjecture". Electronic Journal of Probability. 24 (126): 1–14. arXiv:1810.09791. doi:10.1214/19-EJP380. <a href="https://doi.org/10.1214%2F19-EJP380" target="_blank">https://doi.org/10.1214%2F19-EJP380</a> <a href="#fnref:11" class="footnote-back-ref">↩</a></li>
<li id="fn:12">Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312. <a href="http://www3.stat.sinica.edu.tw/statistica/oldpdf/A3n23.pdf" target="_blank">http://www3.stat.sinica.edu.tw/statistica/oldpdf/A3n23.pdf</a> <a href="#fnref:12" class="footnote-back-ref">↩</a></li>
<li id="fn:13">Ehm, Werner (1991-01-01). "Binomial approximation to the Poisson binomial distribution". Statistics & Probability Letters. 11 (1): 7–16. doi:10.1016/0167-7152(91)90170-V. ISSN 0167-7152. <a href="https://dx.doi.org/10.1016/0167-7152%2891%2990170-V" target="_blank">https://dx.doi.org/10.1016/0167-7152%2891%2990170-V</a> <a href="#fnref:13" class="footnote-back-ref">↩</a></li>
<li id="fn:14">Barbour, A. D.; Hall, Peter (1984). "On the Rate of Poisson Convergence" (PDF). Mathematical Proceedings of the Cambridge Philosophical Society. 95 (3): 473–480. doi:10.1017/S0305004100061806. <a href="https://www.zora.uzh.ch/id/eprint/23054/11/Barbour_1984V.pdf" target="_blank">https://www.zora.uzh.ch/id/eprint/23054/11/Barbour_1984V.pdf</a> <a href="#fnref:14" class="footnote-back-ref">↩</a></li>
<li id="fn:15">Hong, Yili (March 2013). "On computing the distribution function for the Poisson binomial distribution". Computational Statistics & Data Analysis. 59: 41–51. doi:10.1016/j.csda.2012.10.006. <a href="#fnref:15" class="footnote-back-ref">↩</a></li>
<li id="fn:16">Hong, Yili (March 2013). "On computing the distribution function for the Poisson binomial distribution". Computational Statistics & Data Analysis. 59: 41–51. doi:10.1016/j.csda.2012.10.006. <a href="#fnref:16" class="footnote-back-ref">↩</a></li>
</ol>
