Skip to contents

The negative binomial distribution describes the probability of seeing a given number of failtures failures before obtaining rr successes in iid Bernoulli trials with probability parameter pp. More generally, let XNB(r,p)X\sim \operatorname{NB}(r, p) for real parameters r>0r>0, p(0,1)p\in(0,1), then XX has distribution given by (X=x)=Γ(r+x)k!Γ(r)(1p)xpr,x0.\begin{align*} \mathbb{P}(X=x) = \frac{\Gamma(r+x)}{k!\Gamma(r)}(1-p)^x p^r, x\in\mathbb{N}_0. \end{align*} The stats::rbinom function uses this parametrization, and we have 𝔼X=r(1p)p1,𝕍ar(X)=r(1p)p2.\begin{align*} \mathbb{E}X = r(1-p)p^{-1}, \quad \mathbb{V}\!\!\operatorname{ar}(X) = r(1-p)p^{-2}. \end{align*} Further, if X1NB(r1,p)X_1\sim\operatorname{NB}(r_{1}, p) and X2NB(r2,p)X_2\sim\operatorname{NB}(r_{2}, p) are independent then X1+X2NB(r1+r2,p)X_1+X_2\sim\operatorname{NB}(r_{1}+r_{2}, p).

To simulate from a negative binomial distribution with specific mean and variance we can use the carts::rnb function

x <- carts::rnb(1e4, mean = 1, variance = 2)
c(mean(x), var(x))
## [1] 0.98320 1.92371

The negative binomial distribution can also be constructed as a gamma-poisson mixture. We write ZΓ(α,β)Z\sim\Gamma(\alpha, \beta) when ZZ is gamma-distributed with shape α>0\alpha>0 and rate parameter β>0\beta>0, and the density function is given by f(z)=βαΓ(α)zα1exp(βz),z>0.\begin{align*} f(z) = \frac{\beta^\alpha}{\Gamma(\alpha)}z^{\alpha-1}\exp(-\beta z), \quad z>0. \end{align*}

This parametrization leads to 𝔼Z=αβ1,𝕍ar(Z)=αβ2.\begin{align*} \mathbb{E}Z= \alpha\beta^{-1}, \quad \mathbb{V}\!\!\operatorname{ar}(Z) = \alpha\beta^{-2}. \end{align*}

We will exploit that the gamma distribution is closed under both convolution and scaling. Let Z1Γ(α1,β)Z_{1}\sim\Gamma(\alpha_{1}, \beta) and Z2Γ(α2,β)Z_{2}\sim\Gamma(\alpha_{2}, \beta) be independent and λ>0\lambda>0 then Z1+Z2Γ(α1+α2,β),λZ1Γ(α1,λ1β).\begin{align*} Z_1 + Z_2 \sim \Gamma(\alpha_1+\alpha_2, \beta), \quad \lambda Z_1 \sim \Gamma(\alpha_1, \lambda^{-1}\beta). \end{align*}

b <- 2
a1 <- 1
a2 <- 3
r <- 0.3
n <- 1e4
## Shape-rate parametrization
z1 <- rgamma(n, a1, b)
z2 <- rgamma(n, a2, b)
## r(z1+z2) ~  gamma(a1+a2, b/r)
plot(density(r * (z1 + z2)))
curve(dgamma(x, a1 + a2, b / r), add = TRUE, col = "red", lty = 2)

The negative binomial distribution now appears as the gamma-poisson mixture in the following way. If we let XλPois(λ)X\mid\lambda\sim\operatorname{Pois}(\lambda) with stochastic rate λΓ(α,β)\lambda\sim\Gamma(\alpha, \beta), then XNB(α,β(1+β)1)X\sim\operatorname{NB}(\alpha, \beta(1+\beta)^{-1}).

Consider now ZΓ(ν,ν)Z\sim\Gamma(\nu, \nu), hence 𝔼Z=1\mathbb{E}Z=1 and 𝕍ar(Z)=ν1\mathbb{V}\!\!\operatorname{ar}(Z)=\nu^{-1}, and let YZPois(λ0Z)Y\mid Z\sim\operatorname{Pois}(\lambda_{0}Z) for some fixed λ0>0\lambda_0>0, then direct calculations shows that YNB(ν,ν(ν+λ0)1)Y\sim\operatorname{NB}(\nu, \nu(\nu + \lambda_{0})^{-1}) and 𝔼Y=λ0,𝕍ar(Y)=λ02ν1+λ0.\begin{align*} \mathbb{E}Y = \lambda_0, \quad \mathbb{V}\!\!\operatorname{ar}(Y) = \lambda_0^2\nu^{-1} + \lambda_0. \end{align*}

z <- carts::rnb(1e5,
  mean = 2,
  gamma.variance = 3
)
c(mean(z), var(z))
## [1]  2.00573 14.21768
x <- seq(0, 30)
mf <- function(y, x) sapply(x, function(x) mean(y == x))
plot(x, mf(z, x), type = "h", ylab = "p(x)")
y <- rpois(length(z), 2 * rgamma(length(z), 1 / 3, 1 / 3))
lines(x + 0.1, mf(y, x), type = "h", col = "red")

Bibliography