Dirichlet negative multinomial distribution

<h2 id="motivation">Motivation</h2>
<h3>Dirichlet negative multinomial as a compound distribution</h3>
<p>The Dirichlet distribution is a <a href="/facts/Conjugate_distribution/ScCFcs8b">conjugate distribution</a> to the negative multinomial distribution. This fact leads to an analytically tractable <a href="/facts/Compound_distribution/ccJIMBxT">compound distribution</a>.
For a random vector of category counts \(\mathbf{x} = (x_{1}, \dots, x_{m})\), distributed according to a <a href="/facts/Negative_multinomial_distribution/L5F1qvyl">negative multinomial distribution</a>, the compound distribution is obtained by integrating over the distribution of \(\mathbf{p}\), which is treated as a <a href="/facts/Random_vector/qMfooyVf">random vector</a> following a Dirichlet distribution:
</p>

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \int_{\mathbf{p}} \mathrm{NegMult}(\mathbf{x} \mid x_{0}, \mathbf{p})\, \mathrm{Dir}(\mathbf{p} \mid \alpha_{0}, \boldsymbol{\alpha})\, \mathrm{d}\mathbf{p}\]

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_{i}\right)}{\Gamma(x_{0}) \prod_{i=1}^{m} x_{i}!}\, \frac{1}{\mathrm{B}(\boldsymbol{\alpha}_{+})} \int_{\mathbf{p}} \prod_{i=0}^{m} p_{i}^{x_{i} + \alpha_{i} - 1}\, \mathrm{d}\mathbf{p}\]

<p>which results in the following formula:
</p>

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_{i}\right)}{\Gamma(x_{0}) \prod_{i=1}^{m} x_{i}!}\, \frac{\mathrm{B}(\mathbf{x_{+}} + \boldsymbol{\alpha}_{+})}{\mathrm{B}(\boldsymbol{\alpha}_{+})}\]

<p>where \(\mathbf{x_{+}}\) and \(\boldsymbol{\alpha}_{+}\) are the \((m+1)\)-dimensional vectors created by appending the scalars \(x_{0}\) and \(\alpha_{0}\) to the \(m\)-dimensional vectors \(\mathbf{x}\) and \(\boldsymbol{\alpha}\), respectively, and \(\mathrm{B}\) is the multivariate version of the <a href="/facts/Beta_function/Dcm1JAxb">beta function</a>. We can write this equation explicitly as
</p>

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = x_{0}\, \frac{\Gamma\left(\sum_{i=0}^{m} x_{i}\right) \Gamma\left(\sum_{i=0}^{m} \alpha_{i}\right)}{\Gamma\left(\sum_{i=0}^{m} (x_{i} + \alpha_{i})\right)} \prod_{i=0}^{m} \frac{\Gamma(x_{i} + \alpha_{i})}{\Gamma(x_{i} + 1)\, \Gamma(\alpha_{i})}.\]

<p>Alternative formulations exist. One convenient representation<a class="footnote-ref" id="fnref:1" href="#fn:1"><sup>1</sup></a> is
</p>

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \frac{\Gamma(x_{\bullet})}{\Gamma(x_{0}) \prod_{i=1}^{m} \Gamma(x_{i} + 1)} \times \frac{\Gamma(\alpha_{\bullet})}{\prod_{i=0}^{m} \Gamma(\alpha_{i})} \times \frac{\prod_{i=0}^{m} \Gamma(x_{i} + \alpha_{i})}{\Gamma(x_{\bullet} + \alpha_{\bullet})}\]

<p>where \(x_{\bullet} = x_{0} + x_{1} + \cdots + x_{m}\) and \(\alpha_{\bullet} = \alpha_{0} + \alpha_{1} + \cdots + \alpha_{m}\).
</p><p>This can also be written
</p>

\[\Pr(\mathbf{x} \mid x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_{\bullet}, \alpha_{\bullet})}{\mathrm{B}(x_{0}, \alpha_{0})} \prod_{i=1}^{m} \frac{\Gamma(x_{i} + \alpha_{i})}{x_{i}!\, \Gamma(\alpha_{i})}.\]
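
<p>The closed forms above can be cross-checked numerically. The following is a minimal sketch using only Python's standard library; the function names <code>dnm_logpmf</code> and <code>dnm_logpmf_beta_form</code> and the parameter values are illustrative assumptions, not part of any library. It evaluates the three-factor Gamma representation and the beta-function representation on the log scale and compares them at one point.</p>

```python
from math import lgamma, exp

def dnm_logpmf(x, x0, alpha0, alpha):
    """Log of Pr(x | x0, alpha0, alpha) using the three-factor Gamma form."""
    xs = [x0, *x]            # x_+ : counts with x0 in position 0
    al = [alpha0, *alpha]    # alpha_+ : parameters with alpha0 in position 0
    x_dot, a_dot = sum(xs), sum(al)
    # Gamma(x_dot) / (Gamma(x0) * prod_{i>=1} Gamma(x_i + 1))
    out = lgamma(x_dot) - lgamma(x0) - sum(lgamma(xi + 1) for xi in x)
    # Gamma(alpha_dot) / prod_{i>=0} Gamma(alpha_i)
    out += lgamma(a_dot) - sum(lgamma(a) for a in al)
    # prod_{i>=0} Gamma(x_i + alpha_i) / Gamma(x_dot + alpha_dot)
    out += sum(lgamma(xi + a) for xi, a in zip(xs, al)) - lgamma(x_dot + a_dot)
    return out

def dnm_logpmf_beta_form(x, x0, alpha0, alpha):
    """Same quantity via B(x_dot, alpha_dot) / B(x0, alpha0) times a product over i >= 1."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    x_dot, a_dot = x0 + sum(x), alpha0 + sum(alpha)
    out = log_beta(x_dot, a_dot) - log_beta(x0, alpha0)
    out += sum(lgamma(xi + a) - lgamma(xi + 1) - lgamma(a) for xi, a in zip(x, alpha))
    return out

# Arbitrary illustrative values: m = 3 categories.
p_gamma = exp(dnm_logpmf([2, 0, 1], 3.0, 4.5, [1.0, 2.0, 0.5]))
p_beta = exp(dnm_logpmf_beta_form([2, 0, 1], 3.0, 4.5, [1.0, 2.0, 0.5]))
print(p_gamma, p_beta)  # the two representations should agree up to rounding
```

<p>Working with <code>lgamma</code> rather than the Gamma function itself avoids overflow for large counts; exponentiate only at the end if a probability is needed.</p>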

<h2 id="properties">Properties</h2>
<h3>Marginal distributions</h3>
<p>To obtain the <a href="/facts/Marginal_distribution/U9XBWAd1">marginal distribution</a> over a subset of Dirichlet negative multinomial random variables, one only needs to drop from the vector \(\boldsymbol{\alpha}\) the entries \(\alpha_{i}\) that correspond to the variables being marginalized out. The joint distribution of the remaining random variates is \(\mathrm{DNM}(x_{0}, \alpha_{0}, \boldsymbol{\alpha_{(-)}})\), where \(\boldsymbol{\alpha_{(-)}}\) is the vector \(\boldsymbol{\alpha}\) with those \(\alpha_{i}\) removed. In particular, the univariate marginals are <a href="/facts/Beta_negative_binomial_distribution/FdNmEqFQ">beta negative binomially</a> distributed.
</p>
<h3>Conditional distributions</h3>
<p>If the <i>m</i>-dimensional vector \(\mathbf{x}\) is partitioned as follows
</p>

\[\mathbf{x} = \begin{bmatrix} \mathbf{x}^{(1)} \\ \mathbf{x}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m-q) \times 1 \end{bmatrix}\]

<p>and \(\boldsymbol{\alpha}\) is partitioned accordingly as
</p>

\[\boldsymbol{\alpha} = \begin{bmatrix} \boldsymbol{\alpha}^{(1)} \\ \boldsymbol{\alpha}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m-q) \times 1 \end{bmatrix}\]

<p>then the <a href="/facts/Conditional_probability_distribution/0eGm3P9W">conditional distribution</a> of \(\mathbf{X}^{(1)}\) given \(\mathbf{X}^{(2)} = \mathbf{x}^{(2)}\) is \(\mathrm{DNM}(x_{0}^{\prime}, \alpha_{0}^{\prime}, \boldsymbol{\alpha}^{(1)})\), where
</p>

\[x_{0}^{\prime} = x_{0} + \sum_{i=1}^{m-q} x_{i}^{(2)}\]

\[\alpha_{0}^{\prime} = \alpha_{0} + \sum_{i=1}^{m-q} \alpha_{i}^{(2)}.\]
<p>That is,
</p>

\[\Pr(\mathbf{x}^{(1)} \mid \mathbf{x}^{(2)}, x_{0}, \alpha_{0}, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_{\bullet}, \alpha_{\bullet})}{\mathrm{B}(x_{0}^{\prime}, \alpha_{0}^{\prime})} \prod_{i=1}^{q} \frac{\Gamma(x_{i}^{(1)} + \alpha_{i}^{(1)})}{(x_{i}^{(1)}!)\, \Gamma(\alpha_{i}^{(1)})}\]
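
<p>The conditional formula can be checked against the definition of conditional probability: the joint probability divided by the marginal probability of the conditioning block, where the marginal is obtained by dropping the parameters of the other block. The sketch below does this at a single point using only the standard library; the helper names and the numerical values are illustrative assumptions.</p>

```python
from math import lgamma

def log_beta(a, b):
    """Log of the two-argument beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-PMF in the form B(x_dot, a_dot) / B(x0, a0) * prod_i Gamma(x_i + a_i) / (x_i! Gamma(a_i))."""
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    return out + sum(lgamma(xi + a) - lgamma(xi + 1) - lgamma(a)
                     for xi, a in zip(x, alpha))

x0, alpha0 = 2.0, 3.5
x1, a1 = [1, 4], [0.7, 1.2]            # block (1): q = 2 components of interest
x2, a2 = [0, 3, 2], [2.0, 0.4, 1.5]    # block (2): m - q = 3 conditioning components

# Route 1: log Pr(x1, x2) - log Pr(x2), with Pr(x2) from DNM(x0, alpha0, alpha^(2)).
log_ratio = dnm_logpmf(x1 + x2, x0, alpha0, a1 + a2) - dnm_logpmf(x2, x0, alpha0, a2)

# Route 2: DNM(x0', alpha0', alpha^(1)) evaluated at x^(1).
x0_prime = x0 + sum(x2)
alpha0_prime = alpha0 + sum(a2)
log_cond = dnm_logpmf(x1, x0_prime, alpha0_prime, a1)

print(log_ratio, log_cond)  # the two routes should agree up to rounding
```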

<h4>Conditional on the sum</h4>
<p>The conditional distribution of a Dirichlet negative multinomial distribution given \(\sum_{i=1}^{m} x_{i} = n\) is a <a href="/facts/Dirichlet-multinomial_distribution/b5b8vqSS">Dirichlet-multinomial distribution</a> with parameters \(n\) and \(\boldsymbol{\alpha}\). That is,
</p>

\[\Pr\left(\mathbf{x} \,\middle|\, \sum_{i=1}^{m} x_{i} = n, x_{0}, \alpha_{0}, \boldsymbol{\alpha}\right) = \frac{n!\, \Gamma\left(\sum_{i=1}^{m} \alpha_{i}\right)}{\Gamma\left(n + \sum_{i=1}^{m} \alpha_{i}\right)} \prod_{i=1}^{m} \frac{\Gamma(x_{i} + \alpha_{i})}{x_{i}!\, \Gamma(\alpha_{i})}.\]
<p>Notice that the expression does not depend on \(x_{0}\) or \(\alpha_{0}\).
</p>
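<p>Because the set of count vectors with a fixed sum is finite, this property can be verified by direct enumeration. The sketch below, which assumes only the standard library and uses illustrative helper names and parameter values, normalizes the Dirichlet negative multinomial probabilities over all vectors with sum \(n\) and compares the result with the Dirichlet-multinomial formula.</p>

```python
from math import lgamma, exp

def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-PMF of the Dirichlet negative multinomial (beta-function form)."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    return out + sum(lgamma(xi + a) - lgamma(xi + 1) - lgamma(a)
                     for xi, a in zip(x, alpha))

def dirmult_logpmf(x, alpha):
    """Log-PMF of the Dirichlet-multinomial with n = sum(x)."""
    n, a_sum = sum(x), sum(alpha)
    out = lgamma(n + 1) + lgamma(a_sum) - lgamma(n + a_sum)
    return out + sum(lgamma(xi + a) - lgamma(xi + 1) - lgamma(a)
                     for xi, a in zip(x, alpha))

def compositions(n, m):
    """Yield every length-m tuple of non-negative integers that sums to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, m - 1):
            yield (first, *rest)

n, alpha = 5, [0.8, 1.5, 2.2]
x0, alpha0 = 3.0, 4.0                      # these should cancel in the conditional

support = list(compositions(n, len(alpha)))
weights = [exp(dnm_logpmf(x, x0, alpha0, alpha)) for x in support]
total = sum(weights)                       # Pr(sum_i x_i = n) under the DNM
conditional = [w / total for w in weights]
dirmult = [exp(dirmult_logpmf(x, alpha)) for x in support]

max_gap = max(abs(c - d) for c, d in zip(conditional, dirmult))
print(max_gap)  # should be at the level of floating-point rounding
```

<p>Changing <code>x0</code> or <code>alpha0</code> in this sketch changes <code>total</code> but should leave <code>conditional</code> unchanged, in line with the remark above.</p>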
<h3>Aggregation</h3>
<p>If
</p>

\[X = (X_{1}, \ldots, X_{m}) \sim \operatorname{DNM}(x_{0}, \alpha_{0}, \alpha_{1}, \ldots, \alpha_{m})\]

<p>then, if the random variables with positive subscripts <i>i</i> and <i>j</i> are dropped from the vector and replaced by their sum,
</p>

\[X'=(X_{1},\ldots ,X_{i}+X_{j},\ldots ,X_{m})\sim \operatorname {DNM} \left(x_{0},\alpha _{0},\alpha _{1},\ldots ,\alpha _{i}+\alpha _{j},\ldots ,\alpha _{m}\right).\]
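<p>The aggregation property can be checked numerically for a small case. The sketch below assumes the closed-form probability mass function that results from the compound integral in the Motivation section; that formula is restated in the code as an assumption, and the name <code>dnm_log_pmf</code> and all parameter values are illustrative. It merges the first two of three categories and compares the summed probability with the probability under the aggregated parameters.</p>

```python
from math import exp, lgamma


def dnm_log_pmf(x, x0, alpha0, alpha):
    """Log pmf of DNM(x0, alpha0, alpha) at the count vector x.

    Assumed closed form (from integrating the negative multinomial
    against the Dirichlet density), with x_0 = x0 as the 0-th count:
      Gamma(sum_{i>=0} x_i) / (Gamma(x0) * prod_{i>=1} x_i!)
      * Gamma(sum_{i>=0} alpha_i) / Gamma(sum_{i>=0} (x_i + alpha_i))
      * prod_{i>=0} Gamma(x_i + alpha_i) / Gamma(alpha_i)
    """
    n = sum(x)
    a_tot = alpha0 + sum(alpha)
    out = lgamma(x0 + n) - lgamma(x0) - sum(lgamma(xi + 1) for xi in x)
    out += lgamma(a_tot) - lgamma(x0 + n + a_tot)
    out += lgamma(x0 + alpha0) - lgamma(alpha0)
    out += sum(lgamma(xi + ai) - lgamma(ai) for xi, ai in zip(x, alpha))
    return out


# Illustrative parameters, not taken from the source.
x0, alpha0 = 3, 2.5
alpha = (1.5, 2.0, 0.7)
merged_alpha = (alpha[0] + alpha[1], alpha[2])

# Probability that X_1 + X_2 = s and X_3 = t, summed over all splits of s.
s, t = 4, 2
lhs = sum(exp(dnm_log_pmf((k, s - k, t), x0, alpha0, alpha)) for k in range(s + 1))
# Probability of (s, t) under the aggregated two-category distribution.
rhs = exp(dnm_log_pmf((s, t), x0, alpha0, merged_alpha))
```

<p>Under the stated assumption the two quantities <code>lhs</code> and <code>rhs</code> should agree up to floating-point error.</p>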

<h3>Correlation matrix</h3>
<p>For \(\alpha _{0}>2\) the entries of the <a href="/facts/Correlation_matrix/egkluAEm">correlation matrix</a> are
</p>

\[\rho (X_{i},X_{i})=1,\]

\[\rho (X_{i},X_{j})={\frac {\operatorname {cov} (X_{i},X_{j})}{\sqrt {\operatorname {var} (X_{i})\operatorname {var} (X_{j})}}}={\sqrt {\frac {\alpha _{i}\alpha _{j}}{(\alpha _{0}+\alpha _{i}-1)(\alpha _{0}+\alpha _{j}-1)}}},\qquad i\neq j.\]
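<p>A minimal sketch of how the correlation matrix could be computed from the parameters follows. The function name <code>dnm_correlation_matrix</code> and the example values are illustrative assumptions; the formula is the one stated above, applied to the off-diagonal entries.</p>

```python
from math import sqrt


def dnm_correlation_matrix(alpha0, alpha):
    """Correlation matrix of a DNM vector; defined only for alpha0 > 2."""
    if alpha0 <= 2:
        raise ValueError("correlations require alpha0 > 2")
    m = len(alpha)
    # Start from a matrix of ones so the diagonal is already correct.
    rho = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j:
                num = alpha[i] * alpha[j]
                den = (alpha0 + alpha[i] - 1) * (alpha0 + alpha[j] - 1)
                rho[i][j] = sqrt(num / den)
    return rho


# Illustrative values: alpha0 = 3 and two categories with alpha_i = 2.
rho = dnm_correlation_matrix(3.0, [2.0, 2.0])
```

<p>For these values the off-diagonal entry is \(\sqrt{4/16}=0.5\). The formula also shows that the correlations are always positive and do not involve \(x_{0}\).</p>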

<h3>Heavy tailed</h3>
<p>The Dirichlet negative multinomial is a <a href="/facts/Heavy_tailed_distribution/BHLSHPku">heavy tailed distribution</a>. It does not have a <a href="/facts/List_of_mathematical_jargon/fTRoAtn2">finite</a> <a href="/facts/Mean/swcEd4Pg">mean</a> for \(\alpha _{0}\leq 1\), and it has an infinite <a href="/facts/Covariance_matrix/kLhgjHC6">covariance matrix</a> for \(\alpha _{0}\leq 2\). As with any heavy tailed distribution, the <a href="/facts/Moment_generating_function/eTmk43QU">moment generating function</a> does not exist.
</p>
<h2 id="applications">Applications</h2>
<h3>Dirichlet negative multinomial as a Pólya urn model</h3>
<p>When the \(m+2\) parameters \(x_{0},\alpha _{0}\) and \({\boldsymbol {\alpha }}\) are positive integers, the Dirichlet negative multinomial can also be motivated by an <a href="/facts/Urn_model/rkjeqn8E">urn model</a>, or more specifically a basic <a href="/facts/P%C3%B3lya_urn_model/B4z1nsww">Pólya urn model</a>. Consider an urn initially containing \(\sum _{i=0}^{m}{\alpha _{i}}\) balls of \(m+1\) different colors, including \(\alpha _{0}\) red balls (the stopping color). The vector \({\boldsymbol {\alpha }}\) gives the respective counts of the balls of the other \(m\) non-red colors. At each step of the model, a ball is drawn at random from the urn and replaced, along with one additional ball of the same color. The process is repeated until \(x_{0}\) red balls have been drawn. The random vector \(\mathbf {X}\) of observed draws of the other \(m\) non-red colors is distributed according to a \(\mathrm {DNM} (x_{0},\alpha _{0},{\boldsymbol {\alpha }})\) distribution. Note that at the end of the experiment the urn always contains the fixed number \(x_{0}+\alpha _{0}\) of red balls, while the numbers of balls of the other \(m\) colors are given by the random vector \(\mathbf {X} +{\boldsymbol {\alpha }}\).
</p>
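<p>The urn scheme described above can be sketched as a short simulation. This is only an illustration under stated assumptions: the function name <code>simulate_polya_dnm</code>, the seed, and the parameter values are hypothetical, and index 0 of the urn is taken to be the red (stopping) color.</p>

```python
import random


def simulate_polya_dnm(x0, alpha0, alpha, rng):
    """Run one Polya urn experiment and return (X, final urn contents).

    The urn starts with alpha0 red balls (index 0) and alpha[i] balls of
    each non-red color. Each drawn ball is returned together with one
    extra ball of the same color, until x0 red balls have been drawn.
    """
    urn = [alpha0] + list(alpha)
    draws = [0] * len(urn)
    while draws[0] < x0:
        # Pick a color with probability proportional to its current count.
        color = rng.choices(range(len(urn)), weights=urn)[0]
        draws[color] += 1
        urn[color] += 1
    return draws[1:], urn


# Illustrative parameters and seed, not taken from the source.
rng = random.Random(0)
x0, alpha0, alpha = 3, 5, [2, 1, 4]
x, final_urn = simulate_polya_dnm(x0, alpha0, alpha, rng)
```

<p>Whatever the random outcome, the final urn holds \(x_{0}+\alpha _{0}\) red balls and \(X_{i}+\alpha _{i}\) balls of each other color, matching the closing remark of the paragraph. For small \(\alpha _{0}\) a single run may take very long, which reflects the heavy tails of the distribution.</p>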
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Beta_negative_binomial_distribution/FdNmEqFQ">Beta negative binomial distribution</a></li>
<li><a href="/facts/Negative_multinomial_distribution/L5F1qvyl">Negative multinomial distribution</a></li>
<li><a href="/facts/Dirichlet-multinomial_distribution/b5b8vqSS">Dirichlet-multinomial distribution</a></li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1"><p>Farewell, Daniel &amp; Farewell, Vernon (2012). Dirichlet negative multinomial regression for overdispersed correlated count data. Biostatistics (Oxford, England), 14. doi:10.1093/biostatistics/kxs050. <a href="#fnref:1" class="footnote-back-ref">↩</a></p></li>
</ol>
