The relationship between the gamma distribution and the normal distribution



Recently I found it necessary to derive a pdf for the square of a normal random variable with mean 0. For whatever reason, I chose not to normalize the variance beforehand. If I did this correctly, the pdf is as follows:

$$N^2(x;\sigma^2)=\frac{1}{\sigma\sqrt{2\pi x}}\,e^{-\frac{x}{2\sigma^2}}$$
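One way to verify this is the usual change-of-variables step, where $\Phi$ and $\phi$ are the standard normal cdf and pdf:

$$F_Y(x)=P\!\left(X^2\le x\right)=2\,\Phi\!\left(\frac{\sqrt{x}}{\sigma}\right)-1
\;\Longrightarrow\;
f_Y(x)=2\,\phi\!\left(\frac{\sqrt{x}}{\sigma}\right)\frac{1}{2\sigma\sqrt{x}}
=\frac{1}{\sigma\sqrt{2\pi x}}\,e^{-\frac{x}{2\sigma^2}},\qquad x>0.$$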

I realized that this was just a parametrization of a gamma distribution:

$$N^2(x;\sigma^2)=\mathrm{Gamma}\!\left(x;\tfrac{1}{2},\,2\sigma^2\right)$$

And then, from the fact that the sum of two gammas (with the same scale parameter) is another gamma, it follows that the gamma is equivalent to the sum of $k$ squared normal random variables:

$$N^2_{\Sigma}(x;k,\sigma^2)=\mathrm{Gamma}\!\left(x;\tfrac{k}{2},\,2\sigma^2\right)$$
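A quick way to sanity-check this identification is by simulation; a minimal sketch, assuming scipy's shape/scale parameterization of the gamma (shape $k/2$, scale $2\sigma^2$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, sigma = 5, 3.0            # number of squared normals and their common sd
n = 200_000                  # Monte Carlo sample size

# Sum of k independent squared N(0, sigma^2) variables
samples = (rng.normal(0.0, sigma, size=(n, k)) ** 2).sum(axis=1)

# Compare against Gamma(shape=k/2, scale=2*sigma^2) with a KS test
d, p = stats.kstest(samples, "gamma", args=(k / 2, 0, 2 * sigma**2))
print(f"KS statistic = {d:.4f}, p-value = {p:.3f}")  # large p => consistent
```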

This was a bit surprising to me. Although I knew the $\chi^2$ distribution, the distribution of the sum of squared standard normal RVs, was a special case of the gamma, I didn't realize that the gamma was essentially a generalization allowing the sum of squared normal random variables of any variance. This also leads to other characterizations I had not seen before, such as the exponential distribution being equivalent to the sum of two squared normal distributions.

This is all somewhat mysterious to me. Is the normal distribution fundamental to the derivation of the gamma distribution, in the way I described above? Most resources I checked make no mention that the two distributions are intrinsically related in this way, or even, for that matter, describe how the gamma is derived. This makes me think that some lower-level truth is at play that I have simply highlighted in a convoluted way?


Many undergraduate textbooks on probability theory mention all of the above results; but perhaps statistics texts don't cover these ideas? In any case, an $N(0,\sigma^2)$ random variable $Y_i$ is just $\sigma X_i$ where $X_i$ is a standard normal random variable, and so (for iid variables) $\sum_i Y_i^2 = \sigma^2 \sum_i X_i^2$ is simply a scaled $\chi^2$ random variable; this is not surprising to those who have studied probability theory.
Dilip Sarwate

I'm from a computer vision background, so I don't normally run into probability theory. None of my textbooks (or Wikipedia) mention this interpretation. I suppose I'm also asking, what is so special about the sum of the squares of two normal distributions that makes it a good model for waiting time (i.e., the exponential distribution)? It still feels like I'm missing something deeper.
timxyz

Since Wikipedia defines the chi-squared distribution as a sum of squared normals at en.wikipedia.org/wiki/Chi-squared_distribution#Definition and mentions that the chi-squared is a special case of the Gamma (at en.wikipedia.org/wiki/Gamma_distribution#Others), it can hardly be claimed that these relationships are not well known. The variance itself merely sets the unit of measurement (a scale parameter) in all cases and therefore introduces no additional complication.
whuber

While these results are well known in the field of probability and statistics, congratulations @timxyz on rediscovering them in your own analysis.
Reinstate Monica

The connection is not mysterious; it is because they are members of the exponential family of distributions, whose salient property is that they can be reached from one another by substitution of variables and/or parameters. See the longer answer below with examples.
Carl

Answers:



As Professor Sarwate's comment noted, the relations between squared normal and chi-squared variables are a very widely disseminated fact, as should also be the fact that a chi-squared is just a special case of the Gamma distribution:

$$X \sim N(0,\sigma^2) \;\Longrightarrow\; X^2/\sigma^2 \sim \chi^2_1 \;\Longrightarrow\; X^2 \sim \sigma^2\chi^2_1 = \mathrm{Gamma}\!\left(\tfrac{1}{2},\,2\sigma^2\right)$$

the last equality following from the scaling property of the Gamma.
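Here the scaling property in question is the following: for $c>0$,

$$Y\sim\mathrm{Gamma}(k,\theta)\;\Longrightarrow\;cY\sim\mathrm{Gamma}(k,\,c\theta),$$

so multiplying a $\chi^2_1=\mathrm{Gamma}\!\left(\tfrac12,2\right)$ variable by $\sigma^2$ changes the scale from $2$ to $2\sigma^2$.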

As for the relation with the exponential, to be exact it is the sum of two squared zero-mean normals, each scaled by the variance of the other, that leads to the exponential distribution:

$$X_1 \sim N(0,\sigma_1^2),\; X_2 \sim N(0,\sigma_2^2) \;\Longrightarrow\; \frac{X_1^2}{\sigma_1^2}+\frac{X_2^2}{\sigma_2^2} \sim \chi^2_2 \;\Longrightarrow\; \frac{\sigma_2^2X_1^2+\sigma_1^2X_2^2}{\sigma_1^2\sigma_2^2} \sim \chi^2_2$$

$$\Longrightarrow\; \sigma_2^2X_1^2+\sigma_1^2X_2^2 \sim \sigma_1^2\sigma_2^2\,\chi^2_2 = \mathrm{Gamma}\!\left(1,\,2\sigma_1^2\sigma_2^2\right) = \mathrm{Exp}\!\left(\frac{1}{2\sigma_1^2\sigma_2^2}\right)$$
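As a numerical sanity check of this last line, a minimal simulation sketch (assuming scipy, whose expon uses scale $=2\sigma_1^2\sigma_2^2$ for this rate):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
s1, s2 = 1.5, 0.7                       # sigma_1 and sigma_2
n = 200_000

x1 = rng.normal(0.0, s1, n)
x2 = rng.normal(0.0, s2, n)
y = s2**2 * x1**2 + s1**2 * x2**2       # sigma_2^2 X_1^2 + sigma_1^2 X_2^2

# Exp(rate = 1/(2 sigma_1^2 sigma_2^2))  <=>  scale = 2 sigma_1^2 sigma_2^2
d, p = stats.kstest(y, "expon", args=(0, 2 * s1**2 * s2**2))
print(f"KS statistic = {d:.4f}, p-value = {p:.3f}")
```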

But the suspicion that there is "something special" or "deeper" in the sum of two squared zero mean normals that "makes them a good model for waiting time" is unfounded: First of all, what is special about the Exponential distribution that makes it a good model for "waiting time"? Memorylessness of course, but is there something "deeper" here, or just the simple functional form of the Exponential distribution function, and the properties of e? Unique properties are scattered around all over Mathematics, and most of the time, they don't reflect some "deeper intuition" or "structure" - they just exist (thankfully).

Second, the square of a variable has very little relation to its level. Just consider $f(x)=x$ on, say, $[-2,2]$:

[figure omitted]

...or graph the standard normal density against the chi-square density: they reflect and represent totally different stochastic behaviors, even though they are so intimately related, since the second is the density of a variable that is the square of the first. The normal may be a very important pillar of the mathematical system we have developed to model stochastic behavior - but once you square it, it becomes something else entirely.


Thanks for addressing in particular the questions in my last paragraph.
timxyz

You're welcome. I have to admit I am glad my answer reached the original OP 26 months after the question was posted.
Alecos Papadopoulos


Let us address the question posed: "This is all somewhat mysterious to me. Is the normal distribution fundamental to the derivation of the gamma distribution...?" No mystery really; it is simply that the normal distribution and the gamma distribution are members, among others, of the exponential family of distributions, which family is defined by the ability to convert between equational forms by substitution of parameters and/or variables. As a consequence, there are many conversions by substitution between distributions, a few of which are summarized in the figure below.

[figure: univariate distribution relationship chart]

Leemis, Lawrence M.; McQueston, Jacquelyn T. (February 2008). "Univariate Distribution Relationships" (PDF). American Statistician. 62 (1): 45–53. doi:10.1198/000313008x270448

Here are two normal and gamma distribution relationships in greater detail (among an unknown number of others, like via chi-squared and beta).

First, a more direct relationship between the gamma distribution (GD) and the normal distribution (ND) with mean zero follows. Simply put, the GD becomes normal in shape as its shape parameter is allowed to increase. Proving that this is the case is more difficult. For the GD,

$$\mathrm{GD}(z;a,b)=\begin{cases}\dfrac{b^{-a}\,z^{a-1}\,e^{-\frac{z}{b}}}{\Gamma(a)} & z>0\\[4pt] 0 & \text{other}.\end{cases}$$

As the GD shape parameter $a\to\infty$, the GD shape becomes more symmetric and normal; however, as the mean increases with increasing $a$, we have to left shift the GD by $(a-1)\sqrt{\tfrac{1}{a}}\,k$ to hold it stationary, and finally, if we wish to maintain the same standard deviation for our shifted GD, we have to decrease the scale parameter $b$ in proportion to $\sqrt{\tfrac{1}{a}}$.
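These choices follow from the moments of the shape-scale Gamma used above:

$$\operatorname{mean}=ab,\qquad \operatorname{sd}=\sqrt{a}\,b,\qquad \operatorname{mode}=(a-1)\,b\quad(a\ge1),$$

so setting $b=\sqrt{\tfrac{1}{a}}\,k$ fixes the standard deviation at $k$, and the resulting mode $(a-1)\sqrt{\tfrac{1}{a}}\,k$ is exactly the left shift used below.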

To wit, to transform a GD to a limiting-case ND we set the standard deviation to be a constant $k$ by letting $b=\sqrt{\tfrac{1}{a}}\,k$ and shift the GD to the left to have a mode of zero by substituting $z=(a-1)\sqrt{\tfrac{1}{a}}\,k+x$. Then

$$\mathrm{GD}\!\left((a-1)\sqrt{\tfrac{1}{a}}\,k+x;\ a,\ \sqrt{\tfrac{1}{a}}\,k\right)=\begin{cases}\dfrac{\left(\dfrac{k}{\sqrt{a}}\right)^{-a} e^{-\frac{\sqrt{a}\,x}{k}-a+1}\left(\dfrac{(a-1)k}{\sqrt{a}}+x\right)^{a-1}}{\Gamma(a)} & x>\dfrac{k(1-a)}{\sqrt{a}}\\[6pt] 0 & \text{other}.\end{cases}$$

Note that in the limit as $a\to\infty$, the most negative value of $x$ for which this GD is nonzero tends to $-\infty$. That is, the semi-infinite GD support becomes infinite. Taking the limit as $a\to\infty$ of the reparameterized GD, we find

$$\lim_{a\to\infty}\frac{\left(\dfrac{k}{\sqrt{a}}\right)^{-a} e^{-\frac{\sqrt{a}\,x}{k}-a+1}\left(\dfrac{(a-1)k}{\sqrt{a}}+x\right)^{a-1}}{\Gamma(a)}=\frac{e^{-\frac{x^2}{2k^2}}}{\sqrt{2\pi}\,k}=\mathrm{ND}(x;0,k^2)$$
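A small numerical sketch of this convergence (assuming scipy; it evaluates the shifted, rescaled GD density directly and compares it with the $\mathrm{ND}(x;0,k^2)$ density):

```python
import numpy as np
from scipy.stats import gamma, norm

k = 2.0
x = np.linspace(-4, 4, 401)

for a in (4, 64, 1024):
    b = np.sqrt(1.0 / a) * k                 # scale shrinks like 1/sqrt(a)
    shift = (a - 1) * b                      # left shift by the mode
    gd = gamma.pdf(x + shift, a, scale=b)    # reparameterized GD density at x
    nd = norm.pdf(x, 0.0, k)                 # limiting N(0, k^2) density
    print(f"a = {a:5d}: max |GD - ND| = {np.abs(gd - nd).max():.4f}")
```

The maximum discrepancy shrinks as $a$ grows, illustrating the limit above.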

Graphically, for $k=2$ and $a=1,2,4,8,16,32,64$, the GD is in blue and the limiting $\mathrm{ND}(x;0,2^2)$ is in orange, below:

[figure: GD densities (blue) converging to the limiting ND density (orange)]

Second, let us make the point that, due to the similarity of form between these distributions, one can pretty much develop relationships between the gamma and normal distributions by pulling them out of thin air. To wit, we next develop an "unfolded" gamma distribution generalization of a normal distribution.

Note first that it is the semi-infinite support of the gamma distribution that impedes a more direct relationship with the normal distribution. However, that impediment can be removed when considering the half-normal distribution, which also has a semi-infinite support. Thus, one can generalize the normal distribution (ND) by first folding it to be half-normal (HND), relating that to the generalized gamma distribution (GD), then for our tour de force, we "unfold" both (HND and GD) to make a generalized ND (a GND), thusly.

The generalized gamma distribution

$$\mathrm{GD}(x;\alpha,\beta,\gamma,\mu)=\begin{cases}\dfrac{\gamma\, e^{-\left(\frac{x-\mu}{\beta}\right)^{\gamma}}\left(\dfrac{x-\mu}{\beta}\right)^{\alpha\gamma-1}}{\beta\,\Gamma(\alpha)} & x>\mu\\[6pt] 0 & \text{other},\end{cases}$$

can be reparameterized to be the half-normal distribution,

$$\mathrm{GD}\!\left(x;\tfrac{1}{2},\tfrac{\sqrt{\pi}}{\theta},2,0\right)=\begin{cases}\dfrac{2\theta\, e^{-\frac{\theta^2 x^2}{\pi}}}{\pi} & x>0\\[4pt] 0 & \text{other}\end{cases}=\mathrm{HND}(x;\theta)$$
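To verify, substitute $\alpha=\tfrac12$, $\beta=\tfrac{\sqrt{\pi}}{\theta}$, $\gamma=2$, $\mu=0$ into the generalized gamma density above (recall $\Gamma\!\left(\tfrac12\right)=\sqrt{\pi}$): for $x>0$,

$$\frac{\gamma\,e^{-\left(\frac{x-\mu}{\beta}\right)^{\gamma}}\left(\frac{x-\mu}{\beta}\right)^{\alpha\gamma-1}}{\beta\,\Gamma(\alpha)}=\frac{2\,e^{-\frac{\theta^2x^2}{\pi}}\left(\frac{\theta x}{\sqrt{\pi}}\right)^{0}}{\frac{\sqrt{\pi}}{\theta}\,\sqrt{\pi}}=\frac{2\theta}{\pi}\,e^{-\frac{\theta^2x^2}{\pi}},$$

which is exactly $\mathrm{HND}(x;\theta)$.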

Note that $\theta=\dfrac{\sqrt{\pi}}{\sigma\sqrt{2}}$. Thus,

$$\mathrm{ND}(x;0,\sigma^2)=\tfrac{1}{2}\mathrm{HND}(x;\theta)+\tfrac{1}{2}\mathrm{HND}(-x;\theta)=\tfrac{1}{2}\mathrm{GD}\!\left(x;\tfrac{1}{2},\tfrac{\sqrt{\pi}}{\theta},2,0\right)+\tfrac{1}{2}\mathrm{GD}\!\left(-x;\tfrac{1}{2},\tfrac{\sqrt{\pi}}{\theta},2,0\right),$$

which implies that

$$\mathrm{GND}(x;\mu,\alpha,\beta)=\tfrac{1}{2}\mathrm{GD}\!\left(x;\tfrac{1}{\beta},\alpha,\beta,\mu\right)+\tfrac{1}{2}\mathrm{GD}\!\left(-x;\tfrac{1}{\beta},\alpha,\beta,-\mu\right)=\frac{\beta\, e^{-\left(\frac{|x-\mu|}{\alpha}\right)^{\beta}}}{2\alpha\,\Gamma\!\left(\tfrac{1}{\beta}\right)},$$

is a generalization of the normal distribution, where $\mu$ is the location, $\alpha>0$ is the scale, and $\beta>0$ is the shape, and where $\beta=2$ yields a normal distribution. It includes the Laplace distribution when $\beta=1$. As $\beta\to\infty$, the density converges pointwise to a uniform density on $(\mu-\alpha,\mu+\alpha)$. Below is the generalized normal distribution plotted for $\beta=1/2,1,4$ in blue, with the normal case $\beta=2$ in orange (the scale $\alpha$ is held at the same value throughout):

[figure: generalized normal densities for $\beta=1/2,1,4$ (blue) and the normal case $\beta=2$ (orange)]
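The special cases just mentioned are easy to check numerically. A minimal sketch, assuming scipy's gennorm (whose pdf matches the GND form above, with shape beta, location loc and scale alpha):

```python
import numpy as np
from scipy.stats import gennorm, norm, laplace

x = np.linspace(-5, 5, 201)
mu, alpha = 0.0, 1.3

# beta = 2 reproduces a normal density with variance alpha^2 / 2
gnd2 = gennorm.pdf(x, 2, loc=mu, scale=alpha)
nd = norm.pdf(x, loc=mu, scale=alpha / np.sqrt(2))
print("beta=2 vs normal :", np.abs(gnd2 - nd).max())

# beta = 1 reproduces a Laplace density with scale alpha
gnd1 = gennorm.pdf(x, 1, loc=mu, scale=alpha)
lap = laplace.pdf(x, loc=mu, scale=alpha)
print("beta=1 vs Laplace:", np.abs(gnd1 - lap).max())
```

Both printed differences should be essentially zero.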

The above can be seen as the generalized normal distribution, Version 1; in different parameterizations it is known as the exponential power distribution or the generalized error distribution, which are in turn among several other generalized normal distributions.



The derivation of the chi-squared distribution from the normal distribution is closely analogous to the derivation of the gamma distribution from the exponential distribution.

We should be able to generalize this:

  • If the $X_i$ are independent variables from a generalized normal distribution with power coefficient $m$, then $Y=\sum_i^{n} X_i^{m}$ can be related to some scaled chi-squared distribution (with "degrees of freedom" equal to $n/m$).

The analogy is as follows:

Normal and Chi-squared distributions relate to the sum of squares

  • The joint density distribution of multiple independent standard normal distributed variables depends on $\sum_i x_i^2$:

    $$f(x_1,x_2,\dots,x_n)=\frac{\exp\!\left(-\tfrac{1}{2}\sum_{i=1}^{n}x_i^2\right)}{(2\pi)^{n/2}}$$

  • If $X_i \sim N(0,1)$

    then $\sum_{i=1}^{n}X_i^2 \sim \chi^2(n)$

Exponential and gamma distributions relate to the regular sum

  • The joint density distribution of multiple independent exponentially distributed variables depends on $\sum_i x_i$:

    $$f(x_1,x_2,\dots,x_n)=\lambda^{n}\exp\!\left(-\lambda\sum_{i=1}^{n}x_i\right)$$

  • If $X_i \sim \mathrm{Exp}(\lambda)$

    then $\sum_{i=1}^{n}X_i \sim \mathrm{Gamma}(n,\lambda)$
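Both halves of the analogy can be confirmed with a short simulation sketch (assuming scipy; its gamma takes a shape and a scale $=1/\lambda$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, lam, size = 6, 1.7, 200_000

# Sum of n squared standard normals ~ chi-squared with n degrees of freedom
z = (rng.standard_normal((size, n)) ** 2).sum(axis=1)
print(stats.kstest(z, "chi2", args=(n,)))

# Sum of n Exp(lambda) variables ~ Gamma(n, rate=lambda), i.e. scale = 1/lambda
e = rng.exponential(1.0 / lam, size=(size, n)).sum(axis=1)
print(stats.kstest(e, "gamma", args=(n, 0, 1.0 / lam)))
```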


The derivation can be done by a change of variables, integrating not over all of $x_1,x_2,\dots,x_n$ but instead only over the summed term (this is what Pearson did in 1900). This unfolds very similarly in both cases.

For the $\chi^2$ distribution:

$$f_{\chi^2(n)}(s)\,ds=\frac{e^{-s/2}}{(2\pi)^{n/2}}\,\frac{dV}{ds}\,ds=\frac{e^{-s/2}}{(2\pi)^{n/2}}\,\frac{\pi^{n/2}}{\Gamma(n/2)}\,s^{n/2-1}\,ds=\frac{1}{2^{n/2}\,\Gamma(n/2)}\,s^{n/2-1}\,e^{-s/2}\,ds$$

where $V(s)=\dfrac{\pi^{n/2}}{\Gamma(n/2+1)}\,s^{n/2}$ is the $n$-dimensional volume of an $n$-ball with squared radius $s$.

For the gamma distribution:

$$f_{G(n,\lambda)}(s)\,ds=e^{-\lambda s}\,\lambda^{n}\,\frac{dV}{ds}\,ds=e^{-\lambda s}\,\lambda^{n}\,\frac{n\,s^{n-1}}{n!}\,ds=\frac{\lambda^{n}}{\Gamma(n)}\,s^{n-1}\,e^{-\lambda s}\,ds$$

where $V(s)=\dfrac{s^{n}}{n!}$ is the $n$-dimensional volume of an $n$-polytope with $\sum_i x_i < s$.
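The two volume-derivative computations above can be checked numerically against scipy's chi2 and gamma densities; a minimal sketch:

```python
import numpy as np
from scipy.stats import chi2, gamma
from scipy.special import gammaln

n, lam = 5, 1.7
s = np.linspace(0.1, 15, 50)

# chi-squared: exp(-s/2)/(2*pi)^(n/2) * dV/ds, V the n-ball volume at squared radius s
dV_ball = np.exp((n / 2) * np.log(np.pi) - gammaln(n / 2)) * s ** (n / 2 - 1)
f_chi2 = np.exp(-s / 2) / (2 * np.pi) ** (n / 2) * dV_ball
print("chi2 max error :", np.abs(f_chi2 - chi2.pdf(s, n)).max())

# gamma: exp(-lam*s)*lam^n * dV/ds, V the simplex volume s^n/n!
dV_simplex = n * s ** (n - 1) / np.exp(gammaln(n + 1))
f_gamma = np.exp(-lam * s) * lam**n * dV_simplex
print("gamma max error:", np.abs(f_gamma - gamma.pdf(s, n, scale=1 / lam)).max())
```

Both errors should be at the level of floating-point round-off.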


The gamma distribution can be seen as the waiting time $Y$ for the $n$-th event in a Poisson process, which is distributed as the sum of $n$ exponentially distributed variables.

As Alecos Papadopoulos already noted, there is no deeper connection that makes sums of squared normal variables 'a good model for waiting time'. The gamma distribution is the distribution of a sum of generalized-normal-distributed variables. That is how the two come together.

But the type of sum and the type of variables may be different. While the gamma distribution, when derived from the exponential distribution ($p=1$), gets the interpretation of the exponential distribution (waiting time), you cannot go in reverse, back to a sum of squared Gaussian variables, and use that same interpretation.

The density for a waiting time falls off exponentially, and the density for a Gaussian error falls off exponentially (with a square). That is another way to see the two as connected.

Licensed under cc by-sa 3.0 with attribution required.