A problem on estimability of parameters



Let $Y_1, Y_2, Y_3$ and $Y_4$ be four random variables such that $E(Y_1)=\theta_1-\theta_3$, $E(Y_2)=\theta_1+\theta_2-\theta_3$, $E(Y_3)=\theta_1-\theta_3$, $E(Y_4)=\theta_1-\theta_2-\theta_3$, where $\theta_1,\theta_2,\theta_3$ are unknown parameters. Suppose also that $\mathrm{Var}(Y_i)=\sigma^2$, $i=1,2,3,4$. Then which of the following is true?

A. $\theta_1,\theta_2,\theta_3$ are estimable.

B. $\theta_1+\theta_3$ is estimable.

C. $\theta_1-\theta_3$ is estimable, and $\frac12(Y_1+Y_3)$ is the best linear unbiased estimate of $\theta_1-\theta_3$.

D. $\theta_2$ is estimable.

The given answer is C, which seems strange to me (because I got D).

Why did I get D? Because $E(Y_2-Y_4)=2\theta_2$.

Why don't I see how C could be the answer? OK, I can see that $\frac{Y_1+Y_2+Y_3+Y_4}{4}$ is an unbiased estimator of $\theta_1-\theta_3$, and its variance is smaller than that of $\frac{Y_1+Y_3}{2}$.
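(For anyone who wants to check this numerically, here is a quick Monte Carlo sketch; the particular values of $\theta_1,\theta_2,\theta_3,\sigma$ and the independent normal errors are assumptions made only for the simulation.)

```python
import numpy as np

rng = np.random.default_rng(0)
theta1, theta2, theta3, sigma = 2.0, 1.0, 0.5, 1.0   # arbitrary values for the check
n = 200_000

# Simulate Y1..Y4 with the stated means and variance sigma^2
means = np.array([theta1 - theta3,
                  theta1 + theta2 - theta3,
                  theta1 - theta3,
                  theta1 - theta2 - theta3])
Y = means + sigma * rng.standard_normal((n, 4))

est_half = (Y[:, 0] + Y[:, 2]) / 2     # (Y1 + Y3)/2
est_mean = Y.mean(axis=1)              # (Y1 + Y2 + Y3 + Y4)/4

print(est_half.mean(), est_mean.mean())   # both close to theta1 - theta3 = 1.5
print(est_half.var(), est_mean.var())     # close to sigma^2/2 = 0.5 and sigma^2/4 = 0.25
```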

Please tell me where I am going wrong.

Also posted here: /math/2568894/a-problem-on-estimability-of-parameters


Add a self-study tag or someone will come along and close your question.
Carl

@Carl done, but why?
Stat_prob_001

Those are the site's rules, not my rules; the site's rules.
Carl

Is $Y_1 \equiv Y_3$?
Carl

@Carl you can think of it this way: $Y_1=\theta_1-\theta_3+\epsilon_1$, where $\epsilon_1$ is an r.v. with mean $0$ and variance $\sigma^2$. And $Y_3=\theta_1-\theta_3+\epsilon_3$, where $\epsilon_3$ is an r.v. with mean $0$ and variance $\sigma^2$.
Stat_prob_001

Answers:



This answer emphasizes the verification of estimability. The minimum-variance property is only my secondary consideration.

To begin with, summarize the information in the matrix form of a linear model as follows:

$$Y:=\begin{bmatrix}Y_1\\ Y_2\\ Y_3\\ Y_4\end{bmatrix}=\begin{bmatrix}1&0&-1\\ 1&1&-1\\ 1&0&-1\\ 1&-1&-1\end{bmatrix}\begin{bmatrix}\theta_1\\ \theta_2\\ \theta_3\end{bmatrix}+\begin{bmatrix}\varepsilon_1\\ \varepsilon_2\\ \varepsilon_3\\ \varepsilon_4\end{bmatrix}:=X\beta+\varepsilon,\tag{1}$$

where $E(\varepsilon)=0$ and $\mathrm{Var}(\varepsilon)=\sigma^2 I$ (to discuss estimability, the sphericity assumption is not needed; but to discuss the Gauss-Markov property, we do need to assume the sphericity of $\varepsilon$).

If the design matrix $X$ is of full column rank, then the original parameter $\beta$ admits a unique least squares estimate $\hat\beta=(X'X)^{-1}X'Y$. Consequently, any parameter $\phi$, defined as a linear functional $\phi(\beta)$ of $\beta$, is estimable in the sense that it can be unambiguously estimated from the data through the least squares estimate $\hat\beta$ as $\hat\phi=p'\hat\beta$.
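For the design matrix $X$ in $(1)$, a quick numerical check (a NumPy sketch; the entries are copied straight from $(1)$) shows that it is not of full column rank, which is exactly the subtle case discussed next:

```python
import numpy as np

# Design matrix X from model (1)
X = np.array([[1,  0, -1],
              [1,  1, -1],
              [1,  0, -1],
              [1, -1, -1]])

print(np.linalg.matrix_rank(X))  # 2, i.e. rank(X) < 3 (third column is -1 times the first)
```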

The subtlety arises when $X$ is not of full column rank. To have a thorough discussion, we first fix some notation and terms (I follow the convention of The Coordinate-Free Approach to Linear Models, Section 4.8; some of the terms sound unnecessarily technical). In addition, the discussion applies to the general linear model $Y=X\beta+\varepsilon$ with $X\in\mathbb{R}^{n\times k}$ and $\beta\in\mathbb{R}^k$.

  1. A regression manifold is the collection of mean vectors as $\beta$ varies over $\mathbb{R}^k$: $M=\{X\beta:\beta\in\mathbb{R}^k\}$.
  2. A parametric functional $\phi=\phi(\beta)$ is a linear functional of $\beta$: $\phi(\beta)=p'\beta=p_1\beta_1+\cdots+p_k\beta_k$.

As mentioned above, when $\mathrm{rank}(X)<k$, not every parametric functional $\phi(\beta)$ is estimable. But, wait, what is the definition of the term estimable technically? It seems difficult to give a clear definition without bothering with a little linear algebra. One definition, which I think is the most intuitive, is as follows (from the same aforementioned reference):

Definition 1. A parametric functional $\phi(\beta)$ is estimable if it is uniquely determined by $X\beta$ in the sense that $\phi(\beta_1)=\phi(\beta_2)$ whenever $\beta_1,\beta_2\in\mathbb{R}^k$ satisfy $X\beta_1=X\beta_2$.

Interpretation. The above definition stipulates that the mapping from the regression manifold $M$ to the parameter space of $\phi$ must be one-to-one, which is guaranteed when $\mathrm{rank}(X)=k$ (i.e., when $X$ itself is one-to-one). When $\mathrm{rank}(X)<k$, we know that there exist $\beta_1\neq\beta_2$ such that $X\beta_1=X\beta_2$. The definition of estimability above in effect rules out those structurally deficient parametric functionals that take different values even at parameters producing the same point of $M$, which would make no sense. On the other hand, an estimable parametric functional $\phi(\cdot)$ does allow the case $\phi(\beta_1)=\phi(\beta_2)$ with $\beta_1\neq\beta_2$, as long as the condition $X\beta_1=X\beta_2$ is fulfilled.

There are other equivalent conditions to check the estimability of a parametric functional given in the same reference, Proposition 8.4.
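One such equivalent condition, stated here as a standard linear-algebra criterion rather than as a quotation of Proposition 8.4, is that $\phi(\beta)=p'\beta$ is estimable if and only if $p'$ lies in the row space of $X$. A minimal sketch of that check:

```python
import numpy as np

def is_estimable(X, p):
    """p'beta is estimable iff p' lies in the row space of X,
    i.e. appending p' as an extra row does not increase the rank."""
    X = np.asarray(X, dtype=float)
    p = np.atleast_2d(p).astype(float)
    return np.linalg.matrix_rank(np.vstack([X, p])) == np.linalg.matrix_rank(X)
```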

After such a verbose background introduction, let's come back to your question.

A. $\beta$ itself is non-estimable for the reason that $\mathrm{rank}(X)<3$, which entails $X\beta_1=X\beta_2$ with $\beta_1\neq\beta_2$. Although the above definition is given for scalar functionals, it is easily generalized to vector-valued functionals.

B. $\phi_1(\beta)=\theta_1+\theta_3=(1,0,1)'\beta$ is non-estimable. To wit, consider $\beta_1=(0,1,0)'$ and $\beta_2=(1,1,1)'$, which give $X\beta_1=X\beta_2$ but $\phi_1(\beta_1)=0+0=0\neq\phi_1(\beta_2)=1+1=2$.

C. $\phi_2(\beta)=\theta_1-\theta_3=(1,0,-1)'\beta$ is estimable, because $X\beta_1=X\beta_2$ trivially implies $\theta_1^{(1)}-\theta_3^{(1)}=\theta_1^{(2)}-\theta_3^{(2)}$, i.e., $\phi_2(\beta_1)=\phi_2(\beta_2)$.

D. $\phi_3(\beta)=\theta_2=(0,1,0)'\beta$ is also estimable. The derivation from $X\beta_1=X\beta_2$ to $\phi_3(\beta_1)=\phi_3(\beta_2)$ is also trivial.
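The same row-space criterion confirms all four conclusions at once (a self-contained NumPy sketch; the vectors below are just the coefficient vectors of the functionals in options A through D, with $\theta_1$ standing in for option A, since if $\beta$ were estimable every coordinate would be):

```python
import numpy as np

X = np.array([[1, 0, -1],
              [1, 1, -1],
              [1, 0, -1],
              [1, -1, -1]], dtype=float)

functionals = {
    "A: theta_1 (one coordinate of beta)": [1, 0, 0],
    "B: theta_1 + theta_3":                [1, 0, 1],
    "C: theta_1 - theta_3":                [1, 0, -1],
    "D: theta_2":                          [0, 1, 0],
}

rX = np.linalg.matrix_rank(X)
for name, p in functionals.items():
    estimable = np.linalg.matrix_rank(np.vstack([X, p])) == rX
    print(f"{name}: estimable = {estimable}")
# A: False, B: False, C: True, D: True
```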

After the estimability is verified, there is a theorem (Proposition 8.16, same reference) that claims the Gauss-Markov property of $\phi(\hat\beta)$. Based on that theorem, the second part of option C is incorrect. The best linear unbiased estimate is $\bar Y=(Y_1+Y_2+Y_3+Y_4)/4$, by the theorem below.

Theorem. Let $\phi(\beta)=p'\beta$ be an estimable parametric functional; then its best linear unbiased estimate (aka Gauss-Markov estimate) is $\phi(\hat\beta)$ for any solution $\hat\beta$ to the normal equations $X'X\hat\beta=X'Y$.

The proof goes as follows:

Proof. Straightforward calculation shows that the normal equations are

$$\begin{bmatrix}4&0&-4\\ 0&2&0\\ -4&0&4\end{bmatrix}\hat\beta=\begin{bmatrix}1&1&1&1\\ 0&1&0&-1\\ -1&-1&-1&-1\end{bmatrix}Y,$$

which, after simplification, is

$$\begin{bmatrix}\phi(\hat\beta)\\ \hat\theta_2/2\\ \phi(\hat\beta)\end{bmatrix}=\begin{bmatrix}\bar Y\\ (Y_2-Y_4)/4\\ \bar Y\end{bmatrix},$$

i.e., $\phi(\hat\beta)=\bar Y$.

Therefore, option D is the only correct answer.
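A numerical way to see the same conclusion for option C is to read off the coefficients that the Gauss-Markov estimate $\phi_2(\hat\beta)=p'\hat\beta$ applies to $Y$, using the minimum-norm least squares solution $\hat\beta=X^{+}Y$ (a NumPy sketch; the pseudoinverse is just one convenient generalized inverse, and the answer does not depend on that choice because $\phi_2$ is estimable):

```python
import numpy as np

X = np.array([[1, 0, -1],
              [1, 1, -1],
              [1, 0, -1],
              [1, -1, -1]], dtype=float)
p = np.array([1.0, 0.0, -1.0])   # phi_2(beta) = theta_1 - theta_3

# p' beta_hat = p' X^+ Y, so the coefficients applied to (Y1, Y2, Y3, Y4) are:
coef = p @ np.linalg.pinv(X)
print(coef)   # [0.25 0.25 0.25 0.25]  ->  phi_2(beta_hat) = (Y1+Y2+Y3+Y4)/4
```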


Addendum: The connection between estimability and identifiability

When I was at school, a professor briefly mentioned that the estimability of the parametric functional $\phi$ corresponds to the model identifiability. I took this claim for granted then. However, the equivalence needs to be spelled out more explicitly.

According to A.C. Davison's monograph Statistical Models p.144,

Definition 2. A parametric model in which each parameter $\theta$ generates a different distribution is called identifiable.

For linear model $(1)$, regardless of the sphericity condition $\mathrm{Var}(\varepsilon)=\sigma^2 I$, it can be reformulated as

$$E[Y]=X\beta,\quad\beta\in\mathbb{R}^k.\tag{2}$$

It is such a simple model that we only specify the first-moment form of the response vector $Y$. When $\mathrm{rank}(X)=k$, model $(2)$ is identifiable since $\beta_1\neq\beta_2$ implies $X\beta_1\neq X\beta_2$ (the word "distribution" in the original definition naturally reduces to "mean" under model $(2)$).

Now suppose that $\mathrm{rank}(X)<k$ and that we are given a parametric functional $\phi(\beta)=p'\beta$; how do we reconcile Definition 1 and Definition 2?

Well, by manipulating notation and words, we can show (the "proof" is rather trivial) that the estimability of $\phi(\beta)$ is equivalent to the identifiability of model $(2)$ when it is parametrized with the parameter $\phi=\phi(\beta)=p'\beta$ (the design matrix changes accordingly; call this re-indexed model $(3)$). To prove it, suppose $\phi(\beta)$ is estimable, so that $X\beta_1=X\beta_2$ implies $p'\beta_1=p'\beta_2$; by definition, this is $\phi_1=\phi_2$, hence model $(3)$ is identifiable when indexed by $\phi$. Conversely, suppose model $(3)$ is identifiable, so that $X\beta_1=X\beta_2$ implies $\phi_1=\phi_2$, which is exactly $\phi(\beta_1)=\phi(\beta_2)$.

Intuitively, when $X$ is rank-deficient, the model indexed by $\beta$ has redundant parameters (too many parameters), hence a non-redundant, lower-dimensional reparametrization (which could consist of a collection of linear functionals) is possible. When is such a new representation possible? The key is estimability.

To illustrate the above statements, let's reconsider your example. We have verified that the parametric functionals $\phi_2(\beta)=\theta_1-\theta_3$ and $\phi_3(\beta)=\theta_2$ are estimable. Therefore, we can rewrite model $(1)$ in terms of the reparametrized parameter $(\phi_2,\phi_3)$ as follows:

$$E[Y]=\begin{bmatrix}1&0\\ 1&1\\ 1&0\\ 1&-1\end{bmatrix}\begin{bmatrix}\phi_2\\ \phi_3\end{bmatrix}=\tilde X\gamma.\tag{3}$$

Clearly, since $\tilde X$ is of full column rank, the model with the new parameter $\gamma$ is identifiable.
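A one-line check of that claim (same NumPy setup as before; the entries of $\tilde X$ are read off the display above):

```python
import numpy as np

X_tilde = np.array([[1,  0],
                    [1,  1],
                    [1,  0],
                    [1, -1]])

print(np.linalg.matrix_rank(X_tilde))  # 2 = number of columns, so X_tilde has full column rank
```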


If you need a proof for the second part of option C, I will supplement my answer.
Zhanxiong

Thanks for such a detailed answer! Now, about the second part of C: I know that "best" relates to minimum variance. So, why is $\frac14(Y_1+Y_2+Y_3+Y_4)$ not the "best"?
Stat_prob_001

Oh, I don't know why I thought it is the estimator in C. Actually $(Y_1+Y_2+Y_3+Y_4)/4$ is the best estimator. Will edit my answer.
Zhanxiong


Apply the definitions.

I will provide details to demonstrate how you can use elementary techniques: you don't need to know any special theorems about estimation, nor will it be necessary to assume anything about the (marginal) distributions of the $Y_i$. We will need to supply one missing assumption about the moments of their joint distribution.

Definitions

All linear estimators are of the form

$$t_\lambda(Y)=\sum_{i=1}^4\lambda_i Y_i$$

for constants $\lambda=(\lambda_i)$.

An estimator of $\theta_1-\theta_3$ is unbiased if and only if its expectation is $\theta_1-\theta_3$. By linearity of expectation,

$$\begin{aligned}\theta_1-\theta_3=E[t_\lambda(Y)]&=\sum_{i=1}^4\lambda_i E[Y_i]\\&=\lambda_1(\theta_1-\theta_3)+\lambda_2(\theta_1+\theta_2-\theta_3)+\lambda_3(\theta_1-\theta_3)+\lambda_4(\theta_1-\theta_2-\theta_3)\\&=(\lambda_1+\lambda_2+\lambda_3+\lambda_4)(\theta_1-\theta_3)+(\lambda_2-\lambda_4)\theta_2.\end{aligned}$$

Comparing coefficients of the unknown quantities $\theta_i$ reveals

$$\lambda_2-\lambda_4=0\quad\text{and}\quad\lambda_1+\lambda_2+\lambda_3+\lambda_4=1.\tag{1}$$

In the context of linear unbiased estimation, "best" always means with least variance. The variance of $t_\lambda$ is

$$\operatorname{Var}(t_\lambda)=\sum_{i=1}^4\lambda_i^2\operatorname{Var}(Y_i)+\sum_{i\neq j}\lambda_i\lambda_j\operatorname{Cov}(Y_i,Y_j).$$

The only way to make progress is to add an assumption about the covariances: most likely, the question intended to stipulate that they are all zero. (This does not imply the $Y_i$ are independent. Furthermore, the problem can be solved by making any assumption that stipulates those covariances up to a common multiplicative constant; the solution depends on the covariance structure.)

Since $\operatorname{Var}(Y_i)=\sigma^2$, we obtain

$$\operatorname{Var}(t_\lambda)=\sigma^2(\lambda_1^2+\lambda_2^2+\lambda_3^2+\lambda_4^2).\tag{2}$$

The problem therefore is to minimize (2) subject to constraints (1).

Solution

The constraints $(1)$ permit us to express all the $\lambda_i$ in terms of just two linear combinations of them. Let $u=\lambda_1-\lambda_3$ and $v=\lambda_1+\lambda_3$ (which are linearly independent). These determine $\lambda_1$ and $\lambda_3$, while the constraints determine $\lambda_2$ and $\lambda_4$. All we have to do is minimize $(2)$, which can be written

$$\sigma^2(\lambda_1^2+\lambda_2^2+\lambda_3^2+\lambda_4^2)=\frac{\sigma^2}{4}\left(2u^2+(2v-1)^2+1\right).$$
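Since this identity is easy to mistype, here is a symbolic check (a SymPy sketch; the substitutions $\lambda_1=(u+v)/2$, $\lambda_3=(v-u)/2$, $\lambda_2=\lambda_4=(1-v)/2$ follow from the definitions of $u,v$ and constraints $(1)$):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)

# Express the lambdas in terms of u and v using constraints (1)
lam1, lam3 = (u + v) / 2, (v - u) / 2
lam2 = lam4 = (1 - v) / 2

lhs = lam1**2 + lam2**2 + lam3**2 + lam4**2
rhs = (2*u**2 + (2*v - 1)**2 + 1) / 4

print(sp.simplify(lhs - rhs))  # 0, so the identity holds
```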

No constraints apply to $(u,v)$. Assume $\sigma^2\neq 0$ (so that the variables aren't just constants). Since $u^2$ and $(2v-1)^2$ are smallest only when $u=2v-1=0$, it is now obvious that the unique solution is

$$\lambda=(\lambda_1,\lambda_2,\lambda_3,\lambda_4)=(1/4,\,1/4,\,1/4,\,1/4).$$
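Equivalently, minimizing $(2)$ subject to constraints $(1)$ is a minimum-norm problem, and the minimum-norm solution of the constraint system reproduces this $\lambda$ (a NumPy sketch; the matrix $A$ and vector $b$ merely encode constraints $(1)$):

```python
import numpy as np

# Constraints (1): lambda_1 + lambda_2 + lambda_3 + lambda_4 = 1 and lambda_2 - lambda_4 = 0
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, -1.0]])
b = np.array([1.0, 0.0])

# The minimum-norm solution of A lam = b minimizes sum(lam**2) over all feasible lam
lam = A.T @ np.linalg.solve(A @ A.T, b)
print(lam)        # [0.25 0.25 0.25 0.25]
print(lam @ lam)  # 0.25 -> Var = sigma^2/4, smaller than sigma^2/2 for (Y1+Y3)/2
```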

Option (C) is false because it does not give the best unbiased linear estimator. Option (D), although it doesn't give full information, nevertheless is correct, because

$$\theta_2=E\left[t_{(0,\,1/2,\,0,\,-1/2)}(Y)\right]$$

is the expectation of a linear estimator.

It is easy to see that neither (A) nor (B) can be correct, because the space of expectations of linear estimators is generated by $\{\theta_2,\theta_1-\theta_3\}$ and none of $\theta_1$, $\theta_3$, or $\theta_1+\theta_3$ is in that space.

Consequently (D) is the unique correct answer.
