Simple Tests for Exogeneity of a Binary Explanatory Variable in Count Data Regression Models

Abstract

This article investigates the power and size of several tests for exogeneity of a binary explanatory variable in count models by means of extensive Monte Carlo simulations. The tests under consideration are Hausman contrast tests and univariate Wald tests, including a new test that is notably easy to implement. The performance of the tests is explored under misspecification of the underlying model and under different conditions regarding the instruments. The results indicate that tests that are simpler to compute often outperform more demanding ones. This is especially true of the new test.

Keywords: Dummy variable; Endogeneity; Poisson; Testing

Mathematics Subject Classification: 62H15; 62F03

Acknowledgments

The author wishes to thank João M.C. Santos Silva and an anonymous referee for helpful comments on this article. Special thanks to Rainer Winkelmann for extensive discussion and advice, which significantly improved this article. Any remaining errors are the author's sole responsibility.

Notes

For example, there are routines for both Mullahy's (1997) NLIV/GMM estimator and Terza's (1998) full information maximum likelihood estimator in Stata. See Nichols (2007) and Miranda (2004), respectively.

Evidently, exponential conditional mean functions are not limited to count data, and many of the procedures and results discussed here are in principle applicable to continuous data as well.

An alternative justification for this representation is the interpretability of the model in terms of ceteris paribus marginal effects (cf. Winkelmann, 2008, p. 160).

While the partial effects do not depend on the first element of β, predictions of the CEF are consistent because the estimated linear index is consistent for x′β + ln E[exp(ϵ)].

Creel's (2004) approach is optimal GMM, while Weesie (1999) does not use a second-step weighting matrix.
Clearly, in the just-identified case under consideration both amount to the same, as the choice of the weighting matrix does not affect the estimates.

Terza et al. (2008) showed that residual inclusion in nonlinear models is inconsistent in general. Discussions of consistency of residual inclusion in Poisson PML models with continuous endogenous regressors, and of its inconsistency with binary regressors, can be found inter alia in Wooldridge (1997) and Winkelmann (2008).

An important aspect of leaving f(y | d, x, z, ϵ) unspecified is that it broadens the class of models to which this estimator is applicable to include other non-count exponential CEF models. See, for instance, Egger et al. (2009), who applied such a model to bilateral trade.

The argument for Poisson pseudo-MLE against NLS is presented extensively by Santos Silva and Tenreyro (2006) in the context of non-count exponential CEF models.

This technique has also been used by Angrist (2001) to approximate a Tobit MLE.

Monte Carlo studies of count data models with unit coefficient on endogenous variables include Creel (2004), Romeu and Vera-Hernandez (2005), and Chmelarova (2007).

*Marginal distributions of the copulae: ϵ ∼ expGamma(1, 1), v ∼ Logistic(0, 3/π).

Notes: Number of replications = 10,000 (FIML: 2,000 replications). Nominal test size = 0.05.

The corresponding confidence interval for 2,000 replications is approximately [0.0405, 0.0595].

Some authors prefer to use what is called size-corrected power to make comparisons across tests. Here, no size-corrected power is presented, since the question addressed is how these tests work in practice and which are useful under given characteristics of the data generating process.

Notes: Number of replications = 10,000 (FIML: 2,000 replications). Nominal test size = 0.05. IV-strength as detailed in text or Table 1.

Notes: Number of replications = 10,000 (FIML: 2,000 replications for N = 500, 1,000 replications for N = 2,000).
Nominal test size = 0.05. IV-strength of columns (1) and (2) as detailed in text or Table 1.

Monfardini and Radice (2008) investigated exogeneity testing with no instruments in the bivariate probit model, which is related to the model under consideration through the bivariate normality assumption. The present results are in line with theirs, as they report high overrejection rates for Wald tests. They find likelihood ratio tests to have appropriate empirical size.

Notes: Number of replications = 10,000 (FIML: 2,000 replications). Nominal test size = 0.05. Sample size = 500. IV-strength as detailed in text or Table 1.

Corollary 1 in Romeu and Vera-Hernandez (2005) established consistency of the coefficient estimates excluding the constant element, which is shifted.

The estimate is inconsistent for ρ but equals 0 whenever ρ does, securing consistency of the exogeneity test.

Second-stage tests do not use RI-TSA and GRI-TSA estimates, as these are inconsistent unless ρ = 0.

Notes: Number of replications = 10,000 (FIML: 2,000 replications). Nominal test size = 0.05. Sample size = 500. IV-strength of columns (1) and (2) as detailed in text or Table 1.
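
The note on the confidence interval for the empirical size under 2,000 replications can be illustrated with a short calculation. The following is a minimal sketch, not the article's own procedure: it assumes a 95% normal approximation to the binomial sampling error of the rejection frequency around the nominal size of 0.05. The function name size_confidence_interval and the critical value 1.96 are illustrative choices and do not appear in the article.

```python
import math


def size_confidence_interval(nominal_size, replications, z=1.96):
    """Normal-approximation interval for the simulated rejection rate
    of a test whose true size equals nominal_size.

    Assumption (not from the article): replications are independent,
    so the rejection count is binomial and z = 1.96 gives roughly
    95% coverage.
    """
    std_error = math.sqrt(nominal_size * (1.0 - nominal_size) / replications)
    return nominal_size - z * std_error, nominal_size + z * std_error


if __name__ == "__main__":
    # Replication counts mentioned in the table notes.
    for reps in (10_000, 2_000, 1_000):
        lower, upper = size_confidence_interval(0.05, reps)
        print(f"{reps:>6} replications: [{lower:.4f}, {upper:.4f}]")
```

An empirical rejection rate outside the interval for the relevant number of replications would, under these assumptions, point to a size distortion rather than simulation noise. The interval widens as the number of replications falls, which is why the FIML results based on fewer replications are subject to more simulation error.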
