The probability of having k successful trials out of a total of n can be written as the sum 1
where F k {\displaystyle F_{k}} is the set of all subsets of k integers that can be selected from { 1 , 2 , 3 , . . . , n } {\displaystyle \{1,2,3,...,n\}} . For example, if n = 3, then F 2 = { { 1 , 2 } , { 1 , 3 } , { 2 , 3 } } {\displaystyle F_{2}=\left\{\{1,2\},\{1,3\},\{2,3\}\right\}} . A c {\displaystyle A^{c}} is the complement of A {\displaystyle A} , i.e. A c = { 1 , 2 , 3 , … , n } ∖ A {\displaystyle A^{c}=\{1,2,3,\dots ,n\}\smallsetminus A} .
F k {\displaystyle F_{k}} will contain n ! / ( ( n − k ) ! k ! ) {\displaystyle n!/((n-k)!k!)} elements, the sum over which is infeasible to compute in practice unless the number of trials n is small (e.g. if n = 30, F 15 {\displaystyle F_{15}} contains over 1020 elements). However, there are other, more efficient ways to calculate Pr ( K = k ) {\displaystyle \Pr(K=k)} .
As long as none of the success probabilities are equal to one, one can calculate the probability of k successes using the recursive formula 2 3
where
The recursive formula is not numerically stable, and should be avoided if n {\displaystyle n} is greater than approximately 20.
An alternative is to use a divide-and-conquer algorithm: if we assume n = 2 b {\displaystyle n=2^{b}} is a power of two, denoting by f ( p i : j ) {\displaystyle f(p_{i:j})} the Poisson binomial of p i , … , p j {\displaystyle p_{i},\dots ,p_{j}} and ∗ {\displaystyle *} the convolution operator, we have f ( p 1 : 2 b ) = f ( p 1 : 2 b − 1 ) ∗ f ( p 2 b − 1 + 1 : 2 b ) {\displaystyle f(p_{1:2^{b}})=f(p_{1:2^{b-1}})*f(p_{2^{b-1}+1:2^{b}})} .
More generally, the probability mass function of a Poisson binomial can be expressed as the convolution of the vectors P 1 , … , P n {\displaystyle P_{1},\dots ,P_{n}} where P i = [ 1 − p i , p i ] {\displaystyle P_{i}=[1-p_{i},p_{i}]} . This observation leads to the direct convolution (DC) algorithm for computing Pr ( K = 0 ) {\displaystyle \Pr(K=0)} through Pr ( K = n ) {\displaystyle \Pr(K=n)} :
Pr ( K = k ) {\displaystyle \Pr(K=k)} will be found in PMF[k]. DC is numerically stable, exact, and, when implemented as a software routine, exceptionally fast for n ≤ 2000 {\displaystyle n\leq 2000} . It can also be quite fast for larger n {\displaystyle n} , depending on the distribution of the p i {\displaystyle p_{i}} .4
Another possibility is using the discrete Fourier transform.5
where C = exp ( 2 i π n + 1 ) {\displaystyle C=\exp \left({\frac {2i\pi }{n+1}}\right)} and i = − 1 {\displaystyle i={\sqrt {-1}}} .
Still other methods are described in "Statistical Applications of the Poisson-Binomial and conditional Bernoulli distributions" by Chen and Liu6 and in "A simple and fast method for computing the Poisson binomial distribution function" by Biscarri et al.7
The cumulative distribution function (CDF) can be expressed as:
where F ℓ {\displaystyle F_{\ell }} is the set of all subsets of size ℓ {\displaystyle \ell } that can be selected from { 1 , 2 , 3 , … , n } {\displaystyle \{1,2,3,\ldots ,n\}} .
It can be computed by invoking the DC function above, and then adding elements 0 {\displaystyle 0} through k {\displaystyle k} of the returned PMF array.
Since a Poisson binomial distributed variable is a sum of n independent Bernoulli distributed variables, its mean and variance will simply be sums of the mean and variance of the n Bernoulli distributions:
There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.8
The Shepp–Olkin concavity conjecture, due to Lawrence Shepp and Ingram Olkin in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities p 1 , p 2 , … , p n {\displaystyle p_{1},p_{2},\dots ,p_{n}} .9 This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.10 The Shepp–Olkin monotonicity conjecture, also from the same 1981 paper, is that the entropy is monotone increasing in p i {\displaystyle p_{i}} , if all p i ≤ 1 / 2 {\displaystyle p_{i}\leq 1/2} . This conjecture was also proved by Hillion and Johnson, in 2019.11
The probability that a Poisson binomial distribution gets large, can be bounded using its moment generating function as follows (valid when s ≥ μ {\displaystyle s\geq \mu } and for any t > 0 {\displaystyle t>0} ):
where we took t = log ( s / μ ) {\textstyle t=\log \left(s/\mu \right)} . This is similar to the tail bounds of a binomial distribution.
A Poisson binomial distribution P B {\displaystyle PB} can be approximated by a binomial distribution B {\displaystyle B} where μ {\displaystyle \mu } , the mean of the p i {\displaystyle p_{i}} , is the success probability of B {\displaystyle B} . The variances of P B {\displaystyle PB} and B {\displaystyle B} are related by the formula
As can be seen, the closer the p i {\displaystyle p_{i}} are to μ {\displaystyle \mu } , that is, the more the p i {\displaystyle p_{i}} tend to homogeneity, the larger P B {\displaystyle PB} 's variance. When all the p i {\displaystyle p_{i}} are equal to μ {\displaystyle \mu } , P B {\displaystyle PB} becomes B {\displaystyle B} , Var ( P B ) = Var ( B ) {\displaystyle \operatorname {Var} (PB)=\operatorname {Var} (B)} , and the variance is at its maximum.12
Ehm has determined bounds for the total variation distance of P B {\displaystyle PB} and B {\displaystyle B} , in effect providing bounds on the error introduced when approximating P B {\displaystyle PB} with B {\displaystyle B} . Let ν = 1 − μ {\displaystyle \nu =1-\mu } and d ( P B , B ) {\displaystyle d(PB,B)} be the total variation distance of P B {\displaystyle PB} and B {\displaystyle B} . Then
where C ≥ 1 124 {\displaystyle C\geq {\frac {1}{124}}} .
d ( P B , B ) {\displaystyle d(PB,B)} tends to 0 if and only if Var ( P B ) / Var ( B ) {\displaystyle \operatorname {Var} (PB)/\operatorname {Var} (B)} tends to 1.13
A Poisson binomial distribution P B {\displaystyle PB} can also be approximated by a Poisson distribution P o {\displaystyle Po} with mean λ = ∑ i = 1 n p i {\displaystyle \lambda =\sum _{i=1}^{n}p_{i}} . Barbour and Hall have shown that
where d ( P B , B ) {\displaystyle d(PB,B)} is the total variation distance of P B {\displaystyle PB} and P o {\displaystyle Po} .14 It can be seen that the smaller the p i {\displaystyle p_{i}} , the better P o {\displaystyle Po} approximates P B {\displaystyle PB} .
As Var ( P o ) = λ = ∑ i = 1 n p i {\displaystyle \operatorname {Var} (Po)=\lambda =\sum _{i=1}^{n}p_{i}} and Var ( P B ) = ∑ i = 1 n p i − ∑ i = 1 n p i 2 {\displaystyle \operatorname {Var} (PB)=\sum \limits _{i=1}^{n}p_{i}-\sum \limits _{i=1}^{n}p_{i}^{2}} , Var ( P o ) > Var ( P B ) {\displaystyle \operatorname {Var} (\mathrm {Po} )>\operatorname {Var} (PB)} ; so a Poisson binomial distribution's variance is bounded above by a Poisson distribution with λ = ∑ i = 1 n p i {\displaystyle \lambda =\sum _{i=1}^{n}p_{i}} , and the smaller the p i {\displaystyle p_{i}} , the closer Var ( P o ) {\displaystyle \operatorname {Var} (\mathrm {Po} )} will be to Var ( P B ) {\displaystyle \operatorname {Var} (PB)} .
The reference 15 discusses techniques of evaluating the probability mass function of the Poisson binomial distribution. The following software implementations are based on it:
Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312. http://www3.stat.sinica.edu.tw/statistica/oldpdf/A3n23.pdf ↩
Shah, B. K. (1994). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639. /wiki/JSTOR_(identifier) ↩
Chen, X. H.; A. P. Dempster; J. S. Liu (1994). "Weighted finite population sampling to maximize entropy" (PDF). Biometrika. 81 (3): 457. doi:10.1093/biomet/81.3.457. http://www.people.fas.harvard.edu/~junliu/TechRept/94folder/cdl94.pdf ↩
Biscarri, William; Zhao, Sihai Dave; Brunner, Robert J. (2018-06-01). "A simple and fast method for computing the Poisson binomial distribution function". Computational Statistics & Data Analysis. 122: 92–100. doi:10.1016/j.csda.2018.01.007. ISSN 0167-9473. https://www.sciencedirect.com/science/article/pii/S0167947318300082 ↩
Fernandez, M.; S. Williams (2010). "Closed-Form Expression for the Poisson-Binomial Probability Density Function". IEEE Transactions on Aerospace and Electronic Systems. 46 (2): 803–817. Bibcode:2010ITAES..46..803F. doi:10.1109/TAES.2010.5461658. S2CID 1456258. /wiki/Bibcode_(identifier) ↩
Chen, S. X.; J. S. Liu (1997). "Statistical Applications of the Poisson–Binomial and conditional Bernoulli distributions". Statistica Sinica. 7: 875–892. http://www3.stat.sinica.edu.tw/statistica/password.asp?vol=7&num=4&art=4 ↩
Harremoës, P. (2001). "Binomial and Poisson distributions as maximum entropy distributions" (PDF). IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936. https://web.archive.org/web/20160116154755id_/http://yaroslavvb.com/papers/harremoes-binomial.pdf ↩
Shepp, Lawrence; Olkin, Ingram (1981). "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution". In Gani, J.; Rohatgi, V.K. (eds.). Contributions to probability: A collection of papers dedicated to Eugene Lukacs. New York: Academic Press. pp. 201–206. ISBN 0-12-274460-8. MR 0618689. 0-12-274460-8 ↩
Hillion, Erwan; Johnson, Oliver (2015-03-05). "A proof of the Shepp–Olkin entropy concavity conjecture". Bernoulli. 23 (4B): 3638–3649. arXiv:1503.01570. doi:10.3150/16-BEJ860. S2CID 8358662. /wiki/ArXiv_(identifier) ↩
Hillion, Erwan; Johnson, Oliver (2019-11-09). "A proof of the Shepp–Olkin entropy monotonicity conjecture". Electronic Journal of Probability. 24 (126): 1–14. arXiv:1810.09791. doi:10.1214/19-EJP380. https://doi.org/10.1214%2F19-EJP380 ↩
Ehm, Werner (1991-01-01). "Binomial approximation to the Poisson binomial distribution". Statistics & Probability Letters. 11 (1): 7–16. doi:10.1016/0167-7152(91)90170-V. ISSN 0167-7152. https://dx.doi.org/10.1016/0167-7152%2891%2990170-V ↩
Barbour, A. D.; Hall, Peter (1984). "On the Rate of Poisson Convergence" (PDF). Mathematical Proceedings of the Cambridge Philosophical Society. 95 (3): 473–480. doi:10.1017/S0305004100061806 (inactive 24 December 2024).{{cite journal}}: CS1 maint: DOI inactive as of December 2024 (link) https://www.zora.uzh.ch/id/eprint/23054/11/Barbour_1984V.pdf ↩
Hong, Yili (March 2013). "On computing the distribution function for the Poisson binomial distribution". Computational Statistics & Data Analysis. 59: 41–51. doi:10.1016/j.csda.2012.10.006. /wiki/Doi_(identifier) ↩