Linear GMM
An explicit solution for minimizing the criterion function is available when the moment conditions take the special form
\[\mathrm{E}\left[(Y_i - \theta'\mathbf{X}_i) \mathbf{Z}_i\right] = \mathbf{0}\]
where the residuals from linear equations are orthogonal to a vector of instruments. In this case, an optimization solver is unnecessary for the estimation. If the parameters are just-identified, we have
\[\hat{\theta} = (\mathbf{Z}'\mathbf{X})^{-1}\mathbf{Z}'\mathbf{Y}\]
where $\mathbf{Z}$, $\mathbf{X}$ and $\mathbf{Y}$ are matrices formed by stacking $\mathbf{Z}_i'$, $\mathbf{X}_i'$ and $Y_i$ across observations. The parameter estimates can be obtained in one step by solving this linear system directly; they are the instrumental-variables (IV) estimates from linear regression. If the parameters are over-identified, we have
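To make the just-identified formula concrete, here is a minimal single-regressor sketch (in Python purely for illustration; the data are made up). With one instrument and one regressor, $(\mathbf{Z}'\mathbf{X})^{-1}\mathbf{Z}'\mathbf{Y}$ reduces to a ratio of inner products:

```python
# Just-identified IV in the scalar case: theta_hat = (Z'X)^{-1} Z'Y
# reduces to a ratio of inner products. Toy data (made up):
z = [1.0, 2.0, 3.0, 4.0]    # instrument
x = [2.0, 3.0, 5.0, 6.0]    # endogenous regressor
y = [4.0, 6.0, 10.0, 12.0]  # outcome, constructed as y = 2x

zy = sum(zi * yi for zi, yi in zip(z, y))  # Z'Y
zx = sum(zi * xi for zi, xi in zip(z, x))  # Z'X
theta_hat = zy / zx
print(theta_hat)  # 2.0, recovering the coefficient exactly
```

Because the outcome was constructed with no residual, the estimator recovers the coefficient exactly; with noisy data it would be consistent rather than exact.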
\[\hat{\theta} = (\mathbf{X}'\mathbf{Z}\mathbf{W}\mathbf{Z}'\mathbf{X})^{-1} \mathbf{X}'\mathbf{Z}\mathbf{W}\mathbf{Z}'\mathbf{Y}\]
where $\mathbf{W}$ is a weight matrix. The parameter estimates can be obtained iteratively as in the general nonlinear scenario; however, in each iteration $\hat{\theta}$ is evaluated directly from the closed-form expression above rather than via a numerical solver. The two-stage least squares (2SLS) estimator can be viewed as a special case where the weight matrix is specified as
\[\mathbf{W} = \left(\mathbf{Z}'\mathbf{Z}\right)^{-1}\]
with no further iteration conducted after the first evaluation.
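The over-identified formula can likewise be sketched in a few lines (Python for illustration only, with made-up data): one regressor and two instruments, so $\mathbf{Z}'\mathbf{Z}$ is $2\times 2$ and can be inverted by hand. Since the outcome is constructed exactly as $y = 2x$, the estimator returns 2 for any positive-definite weight matrix:

```python
# 2SLS as linear GMM with W = (Z'Z)^{-1}: one regressor, two instruments.
# Toy data (made up); y = 2x exactly, so theta_hat should be 2.
z1 = [1.0, 2.0, 3.0, 4.0]
z2 = [1.0, 1.0, 2.0, 2.0]
x  = [2.0, 3.0, 5.0, 6.0]
y  = [2 * xi for xi in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Z'Z (2x2), Z'X (2-vector), Z'Y (2-vector)
ZtZ = [[dot(z1, z1), dot(z1, z2)],
       [dot(z2, z1), dot(z2, z2)]]
ZtX = [dot(z1, x), dot(z2, x)]
ZtY = [dot(z1, y), dot(z2, y)]

# W = (Z'Z)^{-1} via the 2x2 inverse formula
det = ZtZ[0][0] * ZtZ[1][1] - ZtZ[0][1] * ZtZ[1][0]
W = [[ ZtZ[1][1] / det, -ZtZ[0][1] / det],
     [-ZtZ[1][0] / det,  ZtZ[0][0] / det]]

def quad(u, M, v):  # u' M v for 2-vectors
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

# theta_hat = (X'Z W Z'X)^{-1} X'Z W Z'Y, scalar here
theta_hat = quad(ZtX, W, ZtY) / quad(ZtX, W, ZtX)
print(theta_hat)  # 2.0 up to floating-point rounding
```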
Example: Two-Stage Least Squares
We replicate Example 2 from the Stata manual for gmm:
using MethodOfMoments, CSV, DataFrames
# exampledata loads data from CSV files bundled with MethodOfMoments.jl
exampledata(name::Union{Symbol,String}) =
DataFrame(CSV.read(MethodOfMoments.datafile(name), DataFrame), copycols=true)
data = exampledata(:hsng2)
vce = RobustVCE(3, 6, size(data,1))
eq = (:rent, (:hsngval, :pcturban), (:pcturban, :faminc, Symbol.(:reg, 2:4)...))
r = fit(IteratedLinearGMM, vce, data, eq, maxiter=1)
LinearGMM with 6 moments and 3 parameters over 50 observations:
Iterated Linear GMM estimator:
iter 1 => Q(θ) = 1.10916e+02 max|θ-θlast| = 1.20707e+02
Jstat = NaN Pr(>J) = NaN
Heteroskedasticity-robust covariance estimator
────────────────────────────────────────────────────────────────────────────────
Estimate Std. Error z Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────────────
hsngval 0.00223983 0.000672003 3.33 0.0009 0.000922731 0.00355693
pcturban 0.081516 0.444594 0.18 0.8545 -0.789872 0.952904
cons 120.707 15.2555 7.91 <1e-14 90.8064 150.607
────────────────────────────────────────────────────────────────────────────────
Notice that we have specified the estimator by passing IteratedLinearGMM as the first argument. For 2SLS estimation, however, we restrict maxiter=1 to prevent further iterations. The regression equation is specified as a Tuple of three elements: the first is the column name of the outcome variable in data; the second and third are the names of the endogenous variables and the IVs respectively. Unless the keyword nocons is specified, a constant term named cons is added automatically to both the endogenous variables and the IVs. The default initial weight matrix for IteratedLinearGMM is the one that yields the 2SLS estimates.
The implementation of the estimator does not handle multicollinearity. If some of the regressors are highly correlated, the estimator may fail to produce a result.
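To see why perfect collinearity is fatal, consider the just-identified case with a duplicated regressor (Python sketch with made-up data): $\mathbf{Z}'\mathbf{X}$ becomes singular, so the closed-form inverse does not exist.

```python
# Perfect collinearity makes Z'X singular, so (Z'X)^{-1} does not exist.
x1 = [1.0, 2.0, 3.0]
x2 = [2.0, 4.0, 6.0]  # x2 = 2*x1: perfectly collinear
z1, z2 = x1, x2       # use the regressors as their own instruments

dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
ZtX = [[dot(z1, x1), dot(z1, x2)],
       [dot(z2, x1), dot(z2, x2)]]
det = ZtX[0][0] * ZtX[1][1] - ZtX[0][1] * ZtX[1][0]
print(det)  # 0.0: the matrix cannot be inverted
```

With near-perfect (rather than exact) collinearity the determinant is close to zero instead, and the solve becomes numerically unstable, which is the failure mode the warning above refers to.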
Example: GMM IV Estimation with Clustering
We replicate Example 4 from the Stata manual for ivregress:
data = exampledata(:nlswork)
data[!,:age2] = data.age.^2
dropmissing!(data, [:ln_wage, :age, :birth_yr, :grade, :tenure, :union, :wks_work, :msp])
vce = ClusterVCE(data, :idcode, 6, 8)
eq = (:ln_wage, (:tenure=>[:union, :wks_work, :msp], :age, :age2, :birth_yr, :grade))
r = fit(IteratedLinearGMM, vce, data, eq, maxiter=2)
LinearGMM with 8 moments and 6 parameters over 18625 observations:
Iterated Linear GMM estimator:
iter 2 => Q(θ) = 6.38275e-04 max|θ-θlast| = 5.04467e-02
Jstat = 11.89 Pr(>J) = 0.0026
Cluster-robust covariance estimator: idcode
────────────────────────────────────────────────────────────────────────────────
Estimate Std. Error z Pr(>|z|) Lower 95% Upper 95%
────────────────────────────────────────────────────────────────────────────────
tenure 0.099221 0.00377642 26.27 <1e-99 0.0918194 0.106623
age 0.0171146 0.00668953 2.56 0.0105 0.00400338 0.0302259
age2 -0.000519104 0.000110954 -4.68 <1e-05 -0.000736571 -0.000301637
birth_yr -0.00859937 0.00219321 -3.92 <1e-04 -0.012898 -0.00430076
grade 0.071574 0.0029938 23.91 <1e-99 0.0657062 0.0774417
cons 0.857507 0.161627 5.31 <1e-06 0.540723 1.17429
────────────────────────────────────────────────────────────────────────────────
Here we make use of the cluster-robust VCE via ClusterVCE and specify idcode as the variable that identifies the clusters. The regression equation is specified in an alternative format with a Tuple containing only two elements. The first element is still the name of the outcome variable. The second element contains the names of all the regressors, with the endogenous variable tenure paired with a vector of IVs.
Example: System of Simultaneous Equations
It is possible to specify a system of equations that are estimated jointly. We consider a modified Example 16 from the Stata manual for gmm, replacing the variance-covariance estimator with RobustVCE:[1]
data = exampledata(:klein)
vce = RobustVCE(7, 8, nrow(data))
eqs = [(:consump, (:wagepriv, :wagegovt), (:wagegovt, :govt, :capital1)),
(:wagepriv, (:consump, :govt, :capital1), (:wagegovt, :govt, :capital1))]
r = fit(IteratedLinearGMM, vce, data, eqs, maxiter=2)
LinearGMM with 8 moments and 7 parameters over 22 observations:
Iterated Linear GMM estimator:
iter 2 => Q(θ) = 5.60704e-02 max|θ-θlast| = 4.39993e+00
Jstat = 1.23 Pr(>J) = 0.2667
Heteroskedasticity-robust covariance estimator
───────────────────────────────────────────────────────────────────────────────────
Estimate Std. Error z Pr(>|z|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────────────────────
consump_wagepriv 0.778481 0.0660542 11.79 <1e-31 0.649018 0.907945
consump_wagegovt 0.974761 0.23845 4.09 <1e-04 0.507407 1.44212
consump_cons 20.5013 2.05553 9.97 <1e-22 16.4726 24.5301
wagepriv_consump 0.427942 0.198266 2.16 0.0309 0.0393469 0.816537
wagepriv_govt 1.11404 0.388362 2.87 0.0041 0.352861 1.87521
wagepriv_capital1 -0.0255532 0.0547334 -0.47 0.6406 -0.132829 0.0817223
wagepriv_cons 12.8435 11.6789 1.10 0.2715 -10.0468 35.7338
───────────────────────────────────────────────────────────────────────────────────
For multiple equations, we simply collect the specification for each equation in a vector. The name of each parameter is prefixed with its corresponding outcome variable to indicate the equation it belongs to.
Example: Just-Identified IV Regression
Lastly, we illustrate the use of the specialized estimator JustIdentifiedLinearGMM for just-identified linear GMM, using the auto dataset from Stata:
data = exampledata(:auto)
vce = RobustVCE(3, 3, size(data,1))
eq = (:mpg, [:weight, :length=>:trunk])
r = fit(JustIdentifiedLinearGMM, vce, data, eq)
LinearGMM with 3 moments and 3 parameters over 74 observations:
Just-identified linear GMM estimator
Heteroskedasticity-robust covariance estimator
──────────────────────────────────────────────────────────────────────────
Estimate Std. Error z Pr(>|z|) Lower 95% Upper 95%
──────────────────────────────────────────────────────────────────────────
weight -0.00298026 0.00454921 -0.66 0.5124 -0.0118966 0.00593603
length -0.111738 0.156728 -0.71 0.4759 -0.41892 0.195444
cons 51.2953 15.7791 3.25 0.0012 20.3689 82.2217
──────────────────────────────────────────────────────────────────────────
If we omit the IV in this example, the familiar OLS regression is estimated.
[1] Identical estimates can be produced in Stata by changing the wmatrix option to wmatrix(robust).