Spatial differencing for sample selection models

Keynes College

by Alex Klein and Guy Tchuente, discussion paper KDPE 1701, December 2016.

Non-technical summary

This paper offers an identification strategy for situations in which researchers work with cross-sectional data, face unobserved heterogeneity that causes an endogeneity problem, lack instrumental variables and, on top of that, face a sample selection problem. To accomplish this, we take advantage of recent advances in spatial econometrics. What motivates us to consider cross-sectional data whose data-generating process involves sample selection, a seemingly unsolvable endogeneity problem, and no instrumental variables?

Recent decades have witnessed a rise in panel data sets, accompanied by a proliferation of estimation techniques that exploit the time and cross-section dimensions to identify the causal effect of regressors on the variables of interest. Similarly, considerable advances have been made in estimation with weak and imperfect instrumental variables. Together, these offer researchers a range of identification strategies for a wide variety of empirical models, even in situations where strong instrumental variables are not available or exclusion restrictions do not necessarily hold. But what if panel data sets or instrumental variables are not readily available to researchers?

There are three broad possibilities. The first is to dispense with causal claims and treat the regression results as sophisticated correlations. The second is offered by the literature that identifies causal effects using higher moments. The third is spatial differencing, in which the empirical model takes advantage of the spatial dimension of the data to control for unobserved heterogeneity that might otherwise render the estimator biased and inconsistent. Our paper contributes to this last literature. So far, spatial differencing has been used only in the context of linear regressions. We extend the approach to cross-sectional data with sample selection. Specifically, we offer a solution to the problem of differencing out spatial unobserved effects when a nonlinear element (in our case, the inverse Mills ratio) is present, propose an estimation procedure, and derive a formula for estimating standard errors.