Lorena Moreno
Analítika, Revista de análisis estadístico, Vol. 13 (1), 2017
the work simplified and represented in equation 3. Recent empirical literature on the matter
has focused on nonparametric local polynomial estimators with complementary bandwidth
choice procedures. These estimators are obtained from weighted polynomial regressions fitted
separately above and below the threshold.
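As an illustration of the weighted regressions described above, the following is a minimal Python sketch, not the implementation used in this study. The function names, the triangular kernel, the hand-picked bandwidth h and the simulated data are all assumptions made for the example. The fuzzy estimate is taken as the ratio of the jump in the outcome to the jump in treatment take-up.

```python
import numpy as np

def triangular_kernel(u):
    """Triangular weights: 1 - |u| inside the window, 0 outside (assumed kernel)."""
    return np.clip(1.0 - np.abs(u), 0.0, None)

def local_poly_intercept(x, y, cutoff, h, p=1):
    """Kernel-weighted polynomial fit of order p; returns the fitted value at the cutoff."""
    xc = x - cutoff
    w = triangular_kernel(xc / h)
    keep = w > 0
    # Design matrix with columns 1, xc, xc^2, ..., xc^p
    X = np.vander(xc[keep], N=p + 1, increasing=True)
    sw = np.sqrt(w[keep])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return beta[0]

def sharp_rd_estimate(x, y, cutoff, h, p=1):
    """Difference between the fits just above and just below the cutoff."""
    above = x >= cutoff
    mu_plus = local_poly_intercept(x[above], y[above], cutoff, h, p)
    mu_minus = local_poly_intercept(x[~above], y[~above], cutoff, h, p)
    return mu_plus - mu_minus

def fuzzy_rd_estimate(x, y, d, cutoff, h, p=1):
    """Jump in the outcome divided by the jump in treatment take-up."""
    return sharp_rd_estimate(x, y, cutoff, h, p) / sharp_rd_estimate(x, d, cutoff, h, p)

# Simulated, noise-free data (hypothetical): take-up jumps by 0.6 at the cutoff
# and the outcome responds to take-up with a coefficient of 5.
x = np.linspace(-1.0, 1.0, 401)
d = 0.2 + 0.6 * (x >= 0.0)
y = 1.0 + 2.0 * x + 5.0 * d

jump_y = sharp_rd_estimate(x, y, cutoff=0.0, h=0.5)
effect = fuzzy_rd_estimate(x, y, d, cutoff=0.0, h=0.5)
print(jump_y, effect)
```

The bandwidth is fixed by hand here only to keep the sketch short; in practice it would come from a data-driven selector.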
These regression approximations require a choice of bandwidth, generally based on
selectors that balance the squared bias and the variance of the effect estimates. The main
weakness of these procedures is that the selected window is too “large”, so
that the validity of the assumptions behind the distributional approximations cannot be ensured.
This increases the probability of biased confidence intervals, which leads to over-rejecting the
null hypothesis of no treatment effect. The Cross-Validation (CV) bandwidth choice method,
developed by Ludwig and Miller (2007), and the Mean Square Error optimal (MSE) method of
Imbens and Kalyanaraman (2012) are both affected by this problem.
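A stylised version of this trade-off can be written as follows. The symbols are assumptions introduced for illustration and do not appear in the original text: h is the bandwidth, n the sample size, p the polynomial order, B the leading bias constant and V the variance constant.

$$
\mathrm{MSE}(h) \approx h^{2(p+1)} B^{2} + \frac{V}{nh},
\qquad
h_{\mathrm{MSE}} = \left( \frac{V}{2(p+1)\, B^{2}\, n} \right)^{\frac{1}{2p+3}} .
$$

With this choice of bandwidth, the bias of the standardised estimator does not vanish as the sample grows:

$$
\sqrt{n\, h_{\mathrm{MSE}}}\; h_{\mathrm{MSE}}^{\,p+1}\, |B| = \sqrt{\frac{V}{2(p+1)}} \neq 0 ,
$$

so a conventional confidence interval built around the point estimate is miscentered. This is the sense in which the selected window is too “large” for inference.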
To address this issue, novel work has been developed in Calonico et al. (2014), Calonico
et al. (2016a) and Calonico et al. (2016b). The authors implemented a data-driven local
polynomial RDD point estimator with bias-corrected confidence intervals. In light of the
above, the use of this latter procedure is justified in the present study. A simplified
outline of the proposal is as follows:
1. Bias-correction of the FRDD z-score estimator: instead of using the large-sample
approximation for the standardised t-statistic, the procedure re-centers this statistic
with an estimate of the leading bias.
2. Re-scaling the t-statistic: to complement the conventional bias correction performed
in step 1 (which suffers from poor finite-sample performance due to a low-quality
distributional approximation), the corrected t-statistic is re-scaled with a novel standard
error specification intended to account for the variability added by the estimated
bias.
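The two steps above can be summarised schematically. The notation is an assumption made for illustration and is not taken from the original text: \hat{\tau} is the point estimate, \tau the true effect, \hat{B} the estimated leading bias, \hat{V} the conventional variance estimate and \hat{V}^{bc} a variance estimate for the bias-corrected quantity \hat{\tau} - \hat{B}.

$$
T = \frac{\hat{\tau} - \tau}{\sqrt{\hat{V}}},
\qquad
T^{bc} = \frac{\hat{\tau} - \hat{B} - \tau}{\sqrt{\hat{V}}},
\qquad
T^{rbc} = \frac{\hat{\tau} - \hat{B} - \tau}{\sqrt{\hat{V}^{bc}}} .
$$

Step 1 moves from T to T^{bc} by re-centering. Step 2 moves from T^{bc} to T^{rbc} by replacing \hat{V} with \hat{V}^{bc}, which also reflects the sampling variability of \hat{B}. The corresponding confidence interval at level 1 - \alpha would then take the form

$$
\left[\, \hat{\tau} - \hat{B} \;\pm\; z_{1-\alpha/2} \sqrt{\hat{V}^{bc}} \,\right] .
$$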
Regarding computation, I used the companion Stata commands rdrobust, rdbwselect
and rdplot for different specifications mirroring the parametric analysis, and checked for
differences between them. For both approaches, I included polynomial transformations
of the forcing variable up to the quadratic, as recommended by Gelman and
Imbens (2014).
The identification strategy of the RDD is not directly testable, since we never
observe the conditional expectation of the counterfactual outcomes, though there are
indirect ways to address this. Specifically, falsification tests stem from two general
concerns: effects due to reasons other than the treatment, and manipulation of the forcing
variable. Particular attention is given to balance checks for jumps in covariates, for which I
reproduced the data-driven regression discontinuity plots developed by Calonico et al. (2015),
and to density checks for sorting around the cut-off, for which I present the McCrary (2008) and
Cattaneo et al. (2015) tests. In the next section, I show the main findings of the reviewed
empirical design.