Month: March 2023

Upcoming RMME/STAT Colloquium (4/7): Luke Miratrix, “A Bayesian Nonparametric Approach to Geographic and Two-Dimensional Regression Discontinuity Designs”

RMME/STAT Joint Colloquium

A Bayesian Nonparametric Approach to Geographic and Two-Dimensional Regression Discontinuity Designs

Dr. Luke Miratrix
Harvard University

Friday, April 7, at 11AM ET

https://tinyurl.com/rmme-Miratrix

Geographical and two-dimensional regression discontinuity designs (RDDs) extend the classic, univariate RDD to multivariate, spatial contexts. We propose a framework for analyzing such designs with Gaussian process regression. This yields a Bayesian posterior distribution of the treatment effect at every point along the border, allowing for impact heterogeneity. We can then aggregate along the border to obtain an overall local average treatment effect (LATE) estimate. We address nuances of having a functional estimand defined on a border with potentially intricate topology, particularly with respect to defining the target estimand of interest. The Bayesian estimate of the LATE can also be used as a test statistic in a hypothesis test with good frequentist properties, which we validate using simulations and placebo tests. We demonstrate our methodology with a dataset of property sales in New York City to assess whether there is a discontinuity in housing prices at the border between school districts. We also discuss applying this method to settings where treatment is a function of two forcing variables, such as falling below a threshold on either a reading or a math test.

 


Upcoming RMME/STAT Colloquium (3/24): Joseph L. Schafer, “Modeling Coarsened Categorical Variables: Techniques and Software”

RMME/STAT Joint Colloquium

Modeling Coarsened Categorical Variables: Techniques and Software

Dr. Joseph L. Schafer
U.S. Census Bureau

Friday, March 24, at 11AM ET

https://tinyurl.com/rmme-Schafer

Coarsened data can express intermediate states of knowledge between fully observed and fully missing. For example, when classifying survey respondents by cigarette smoking behavior as 1=never smoked, 2=former smoker, or 3=current smoker, we may encounter some who reported having smoked in the past but whose current activity is unknown (either 2 or 3, but not 1). Software for categorical data modeling typically provides codes for missing values but lacks convenient ways to convey states of partial knowledge. A new R package, cvam (Coarsened Variable Modeling), extends R’s implementation of categorical variables (factors) and fits log-linear and latent-class models to incomplete datasets containing coarsened and missing values. Methods include maximum likelihood estimation using an expectation-maximization algorithm, as well as approximate Bayesian and Bayesian inference via Markov chain Monte Carlo. Functions are also provided for comparing models, predicting missing values, creating multiple imputations, and generating partially or fully synthetic data. In the first major application of this software, data from the U.S. Decennial Census and administrative records were combined to predict citizenship status for 309 million residents of the United States.

 
