There are a few possible ways to keep the gender dummy in a fixed effects regression.
Within Estimator
Suppose you have a model similar to your pooled OLS model:
$$y_{it} = \beta_1 + \sum_{t=2}^{10} \beta_t d_t + \gamma_1 (male_i) + \sum_{t=2}^{10} \gamma_t (d_t \cdot male_i) + X'_{it}\theta + c_i + \epsilon_{it}$$
where the variables are as before. Note that $\beta_1$ and $\beta_1 + \gamma_1 (male_i)$ cannot be identified because the within estimator cannot distinguish them from the fixed effect $c_i$. Given that $\beta_1$ is the intercept for the base year $t = 1$, $\gamma_1$ is the gender effect on earnings in this period. What we can identify in this case are $\gamma_2, \dots, \gamma_{10}$, because they are interacted with your time dummies. They measure the differences in the partial effect of your gender variable relative to the first time period. So if $\gamma_2, \dots, \gamma_{10}$ increase over time, this indicates a widening of the earnings gap between men and women.
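To make the identification argument concrete, here is a minimal sketch on simulated data. The panel dimensions, the coefficient values and the helper `within` are all assumptions made up for illustration, not part of the original model. It shows that demeaning wipes out the main effect of `male`, while the year-by-gender interactions are still recovered.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 2000, 10                          # hypothetical panel: 2000 people, 10 years

male = rng.integers(0, 2, size=n)        # time-invariant gender dummy
c = rng.normal(size=n) + 0.5 * male      # fixed effect, correlated with gender
gamma = 0.05 * np.arange(T)              # assumed gamma_t relative to t = 1 (gamma[0] = 0)

ids = np.repeat(np.arange(n), T)         # person index for each row
year = np.tile(np.arange(T), n)          # 0-based year index for each row
y = (1.0 + 0.1 * year + 0.3 * male[ids] + gamma[year] * male[ids]
     + c[ids] + rng.normal(scale=0.2, size=n * T))

def within(v, ids, n):
    """Subtract each individual's time average from every column of v."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    sums = np.zeros((n, v.shape[1]))
    np.add.at(sums, ids, v)
    counts = np.bincount(ids, minlength=n)[:, None]
    return v - (sums / counts)[ids]

# Year dummies d_2..d_10 and their interactions with male_i
D = (year[:, None] == np.arange(1, T)).astype(float)
DM = D * male[ids][:, None]

# The main effect of male_i is removed entirely by the within transformation
male_within = within(male[ids], ids, n)
print(np.abs(male_within).max())         # 0.0: gamma_1 is not identified

# Within regression: the interaction coefficients are still estimable
X = within(np.hstack([D, DM]), ids, n)
yw = within(y, ids, n).ravel()
coef, *_ = np.linalg.lstsq(X, yw, rcond=None)
gamma_hat = coef[T - 1:]                 # estimates of gamma_2..gamma_10
print(np.round(gamma_hat, 2))
```

In this simulated setup the estimated interaction coefficients rise over the years, which is the pattern you would read as a widening gap.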
First-Difference Estimator
If you want to know the overall effect of the difference between men and women over time, you can try the following model:
$$y_{it} = \beta_1 + \sum_{t=2}^{10} \beta_t d_t + \gamma (t \cdot male_i) + X'_{it}\theta + c_i + \epsilon_{it}$$
where the variable $t = 1, 2, \dots, 10$ is interacted with the time-invariant gender dummy. Now if you take first differences, $\beta_1$ and $c_i$ drop out and you get
$$y_{it} - y_{i(t-1)} = \sum_{t=3}^{10} \beta_t (d_t - d_{(t-1)}) + \gamma (t \cdot male_i - [(t-1) male_i]) + (X'_{it} - X'_{i(t-1)})\theta + \epsilon_{it} - \epsilon_{i(t-1)}$$
Then $\gamma (t \cdot male_i - [(t-1) male_i]) = \gamma [(t - (t-1)) \cdot male_i] = \gamma (male_i)$ and you can identify the gender difference in earnings $\gamma$. So the final regression equation will be:
$$\Delta y_{it} = \sum_{t=3}^{10} \beta_t \Delta d_t + \gamma (male_i) + \Delta X'_{it}\theta + \Delta \epsilon_{it}$$
and you get your effect of interest. The nice thing is that this is easily implemented in any statistical software but you lose a time period.
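As a rough illustration of the first-difference approach, the following sketch simulates a panel and recovers $\gamma$ from the differenced regression. All numbers (`gamma_true`, the year effects, the noise scale) are assumptions for the example. For simplicity it uses one dummy per differenced period in place of the differenced year dummies $\Delta d_t$, and it omits the $X_{it}$ covariates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 2000, 10                               # hypothetical panel dimensions

male = rng.integers(0, 2, size=n).astype(float)
c = rng.normal(size=n) + 0.5 * male           # fixed effect, correlated with gender
gamma_true = 0.05                             # assumed coefficient on t * male_i
beta = rng.normal(scale=0.1, size=T)          # assumed year effects
beta[0] = 0.0                                 # base year
t = np.arange(1, T + 1)

# y[i, s] is the outcome of person i in period s + 1
y = (1.0 + beta[None, :] + gamma_true * t[None, :] * male[:, None]
     + c[:, None] + rng.normal(scale=0.2, size=(n, T)))

# First differences remove the intercept and the fixed effect c_i
dy = np.diff(y, axis=1)                       # shape (n, T - 1): one period is lost
n_diff = T - 1

# Regressors: a dummy for each differenced period, plus male_i itself,
# because t * male_i - (t - 1) * male_i = male_i
period = np.tile(np.arange(n_diff), n)
P = (period[:, None] == np.arange(n_diff)).astype(float)
male_long = np.repeat(male, n_diff)
X = np.column_stack([P, male_long])

coef, *_ = np.linalg.lstsq(X, dy.ravel(), rcond=None)
gamma_hat = coef[-1]
print(round(gamma_hat, 3))                    # should be close to gamma_true
```

Note that `male_i` survives differencing here only because it was interacted with the trend `t` in the levels equation; a plain `male_i` term would drop out just like `c_i`.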
Hausman-Taylor Estimator
Let 1 denote variables that are uncorrelated with $c_i$ and 2 those that are correlated with it, and let's say your gender variable is the only time-invariant variable. The Hausman-Taylor estimator then applies the random effects transformation:
$$\tilde{y}_{it} = \tilde{X}'_{1it} + \tilde{X}'_{2it} + \gamma (\widetilde{male}_{i2}) + \tilde{c}_i + \tilde{\epsilon}_{it}$$
where the tilde notation means $\tilde{X}_{1it} = X_{1it} - \hat{\theta}_i \bar{X}_{1i}$, where $\hat{\theta}_i$ is used for the random effects transformation and $\bar{X}_{1i}$ is the time average for each individual. This isn't like the usual random effects estimator that you wanted to avoid, because group 2 variables are instrumented in order to remove the correlation with $c_i$. For $\tilde{X}_{2it}$ the instrument is $X_{2it} - \bar{X}_{2i}$. The same is done for the time-invariant variables, so if you specify the gender variable to be potentially correlated with the fixed effect, it gets instrumented with $\bar{X}_{1i}$, so you need at least as many time-varying group 1 variables as time-invariant group 2 variables.
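The random effects transformation on its own can be sketched in a few lines. This is not the Hausman-Taylor estimator, only the quasi-demeaning step. The function name `quasi_demean`, the toy data and the single scalar `theta` (instead of an estimated, individual-specific $\hat{\theta}_i$) are assumptions for illustration.

```python
import numpy as np

def quasi_demean(x, ids, theta):
    """Return x_it - theta * xbar_i, where xbar_i is person i's time average."""
    x = np.asarray(x, dtype=float)
    means = np.bincount(ids, weights=x) / np.bincount(ids)
    return x - theta * means[ids]

# Toy panel: two people observed for three periods each (made-up numbers)
ids = np.array([0, 0, 0, 1, 1, 1])
male = np.array([1, 1, 1, 0, 0, 0])
wage = np.array([10.0, 12.0, 14.0, 8.0, 9.0, 10.0])

# theta = 1 is the within transformation: the gender dummy is wiped out
print(quasi_demean(male, ids, 1.0))

# theta < 1 keeps a scaled copy (1 - theta) * male_i, so gamma stays estimable
print(quasi_demean(male, ids, 0.6))

# Time-varying variables are only partially demeaned
print(quasi_demean(wage, ids, 0.6))
```

This is why a time-invariant regressor survives the transformation whenever `theta` is below one, which is what lets the gender coefficient be estimated at all.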
All of this might sound a little complicated, but there are canned packages for this estimator. For instance, in Stata the corresponding command is `xthtaylor`. For further information on this method you could read Cameron and Trivedi (2009), "Microeconometrics Using Stata". Otherwise you can just stick with the two previous methods, which are a bit easier.
Inference
For your hypothesis tests there is not much to consider beyond what you would need to do anyway in a fixed effects regression. You need to account for autocorrelation in the errors, for example by clustering on the individual ID variable. This allows for an arbitrary correlation structure within clusters (individuals), which deals with autocorrelation. For a reference see again Cameron and Trivedi (2009).
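For intuition on what clustering does, here is a minimal sketch of the cluster-robust sandwich formula on simulated data. The function `cluster_robust_se` and the data-generating process are assumptions for the example, and the sketch leaves out the finite-sample correction that statistical packages often apply, so the numbers will not match packaged output exactly.

```python
import numpy as np

def cluster_robust_se(X, resid, ids):
    """Sandwich standard errors allowing arbitrary error correlation within clusters."""
    X = np.asarray(X, dtype=float)
    bread = np.linalg.inv(X.T @ X)
    groups = np.unique(ids)
    idx = np.searchsorted(groups, ids)            # map each row to its cluster position
    scores = np.zeros((len(groups), X.shape[1]))
    np.add.at(scores, idx, X * resid[:, None])    # per-cluster sum of x_it * u_it
    meat = scores.T @ scores
    return np.sqrt(np.diag(bread @ meat @ bread))

rng = np.random.default_rng(2)
n, T = 500, 10                                    # hypothetical panel dimensions
ids = np.repeat(np.arange(n), T)

# Regressor and error both contain a person-level component,
# so the errors are correlated over time within each individual
x = np.repeat(rng.normal(size=n), T)
u = np.repeat(rng.normal(size=n), T) + rng.normal(size=n * T)
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(n * T), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

se_cluster = cluster_robust_se(X, resid, ids)

# Conventional OLS standard errors, which ignore the within-person correlation
sigma2 = resid @ resid / (len(y) - X.shape[1])
se_naive = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))

print(se_naive, se_cluster)
```

In this simulated setup the clustered standard error on the slope is larger than the conventional one, because ignoring the within-person correlation overstates how much independent information the panel contains.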