I don't know the name of this distribution, but you can derive it from the law of total probability. Suppose $X, Y$ have negative binomial distributions with parameters $(r_1, p_1)$ and $(r_2, p_2)$, respectively. I'm using the parameterization where $X, Y$ represent the number of successes before the $r_1$'th and $r_2$'th failures, respectively. Then,
$$P(X-Y=k) = E_Y\big[P(X-Y=k \mid Y)\big] = E_Y\big[P(X=k+Y \mid Y)\big] = \sum_{y=0}^{\infty} P(Y=y)\,P(X=k+y)$$
We know
$$P(X=k+y) = \binom{k+y+r_1-1}{k+y}(1-p_1)^{r_1}\,p_1^{k+y}$$
and
$$P(Y=y) = \binom{y+r_2-1}{y}(1-p_2)^{r_2}\,p_2^{y}$$
so
$$P(X-Y=k) = \sum_{y=0}^{\infty} \binom{y+r_2-1}{y}(1-p_2)^{r_2}\,p_2^{y} \cdot \binom{k+y+r_1-1}{k+y}(1-p_1)^{r_1}\,p_1^{k+y}$$
That's not pretty (yikes!). The only simplification I see right away is
$$p_1^{k}(1-p_1)^{r_1}(1-p_2)^{r_2} \sum_{y=0}^{\infty} (p_1 p_2)^{y} \binom{y+r_2-1}{y}\binom{k+y+r_1-1}{k+y}$$
which is still pretty ugly. I'm not sure whether this is useful, but it can also be rewritten as
$$\frac{p_1^{k}(1-p_1)^{r_1}(1-p_2)^{r_2}}{(r_1-1)!\,(r_2-1)!} \sum_{y=0}^{\infty} (p_1 p_2)^{y}\, \frac{(y+r_2-1)!\,(k+y+r_1-1)!}{y!\,(k+y)!}$$
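(To spell out the step between the last two expressions: it only uses the factorial form of the binomial coefficients,
$$\binom{y+r_2-1}{y} = \frac{(y+r_2-1)!}{y!\,(r_2-1)!}, \qquad \binom{k+y+r_1-1}{k+y} = \frac{(k+y+r_1-1)!}{(k+y)!\,(r_1-1)!},$$
so the factor $(r_1-1)!\,(r_2-1)!$ can be pulled out in front of the sum.)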
I verified with simulation that the above calculation is correct. Here is a crude R function to calculate this mass function and carry out a few simulations
f = function(k, r1, r2, p1, p2, UB)
{
  # P(X - Y = k) using the last expression above, truncating the infinite sum at y = UB
  S = 0
  # constant factor pulled outside the sum: p1^k (1-p1)^r1 (1-p2)^r2 / ( (r1-1)! (r2-1)! )
  const = (p1^k) * ((1-p1)^r1) * ((1-p2)^r2)
  const = const/( factorial(r1-1) * factorial(r2-1) )
  for(y in 0:UB)
  {
    # y'th summand: (p1*p2)^y * (y+r2-1)! * (k+y+r1-1)! / ( y! * (k+y)! )
    iy = ((p1*p2)^y) * factorial(y+r2-1) * factorial(k+y+r1-1)
    iy = iy/( factorial(y) * factorial(y+k) )
    S = S + iy
  }
  return(S*const)
}
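For larger $r_1, r_2$ or $k$ the explicit factorials above will overflow; a log-scale variant (my own sketch using lchoose, not part of the original answer) along the lines of

f_log = function(k, r1, r2, p1, p2, UB)
{
  # same mass function, written with binomial coefficients and assembled on the log scale
  y = 0:UB
  log_const = k*log(p1) + r1*log(1-p1) + r2*log(1-p2)
  log_terms = y*log(p1*p2) + lchoose(y+r2-1, y) + lchoose(k+y+r1-1, k+y)
  sum( exp(log_const + log_terms) )
}

should agree with f() up to floating point error for the values used below, while remaining stable for much larger parameters.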
### Sims
r1 = 6; r2 = 4;
p1 = .7; p2 = .53;
X = rnbinom(1e5,r1,p1)
Y = rnbinom(1e5,r2,p2)
mean( (X-Y) == 2 )
[1] 0.08508
f(2,r1,r2,1-p1,1-p2,20)
[1] 0.08509068
mean( (X-Y) == 1 )
[1] 0.11581
f(1,r1,r2,1-p1,1-p2,20)
[1] 0.1162279
mean( (X-Y) == 0 )
[1] 0.13888
f(0,r1,r2,1-p1,1-p2,20)
[1] 0.1363209
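As one more sanity check (my addition, not part of the original calculation), the mass function should sum to roughly 1. The function above is written with $k \geq 0$ in mind; by symmetry $P(X-Y=-k) = P(Y-X=k)$, so negative $k$ can be handled by swapping the roles of X and Y:

total = sum( sapply(0:40, function(k) f(k, r1, r2, 1-p1, 1-p2, 20)) ) +
        sum( sapply(1:40, function(k) f(k, r2, r1, 1-p2, 1-p1, 20)) )
total  # should come out very close to 1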
I've found the sum converges very quickly for all of the values I tried, so setting UB higher than 10 or so is not necessary. Note that R's built-in rnbinom function parameterizes the negative binomial in terms of the number of failures before the $r$'th success, in which case you'd need to replace all of the $p_1, p_2$'s in the above formulas with $1-p_1, 1-p_2$ for compatibility.
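If you want a quick check of that parameterization (again, just an illustrative aside, not from the original post), R's documentation gives the mean of rnbinom(n, size, prob) as size*(1-prob)/prob, the expected number of failures before the size'th success:

mean( rnbinom(1e5, r1, p1) )  # should be close to 6*0.3/0.7, i.e. about 2.57
mean( rnbinom(1e5, r2, p2) )  # should be close to 4*0.47/0.53, i.e. about 3.55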