Assistant Professor, Department of Information Science, Cornell University
I study topics in stratification and inequality. I strive to produce substantive findings that are conceptually precise and that rely on credible assumptions. These principles often lead me to computational and machine learning methods and to the development of new approaches.
Cornell students: Consider taking my PhD seminar on causal inference!
My CV contains links to all papers and replication files. Scroll down for an overview of my research targeting estimands that are predictive, descriptive, and causal.
Estimands are the starting point for methodological choices
In a vision of social science research laid out with Rebecca Johnson and Brandon Stewart, I argue that social scientists should set the research goal in precise terms that do not involve regression coefficients. This frees us to state the goal of greatest interest, even if our methods to achieve that goal involve a model that is only a rough approximation.
A pivot away from regression coefficients and toward more general nonparametric estimands empowers social scientists to convey more information to readers under more credible assumptions. For example, a methodological comment written with Brandon Stewart proposes a new visualization to summarize economic mobility.
Predictive estimands call for the direct application of data science
It is well known that social science models do not predict well. But is this just for lack of trying?
A new way of doing science. We collaborated with hundreds of social scientists and data scientists in a research design optimized for prediction. Teams trained predictive models on a standard social science dataset. We evaluated them on a holdout set locked away until the end.
We learned new things. The best predictive performance observed holds new weight because (1) it was evaluated on holdout data and (2) it represents the best out of many diverse attempts.
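The holdout logic described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the actual research design: the data-generating process, the two toy "submissions", and all function names are assumptions made for the example.

```python
import random

def fit_mean(y_train):
    """Baseline submission: always predict the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean

def fit_line(x_train, y_train):
    """One-predictor least squares fit, using training data only."""
    n = len(x_train)
    mx = sum(x_train) / n
    my = sum(y_train) / n
    sxx = sum((x - mx) ** 2 for x in x_train)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_train, y_train))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def holdout_mse(model, x_hold, y_hold):
    """Mean squared error on data the model never saw during fitting."""
    return sum((model(x) - y) ** 2 for x, y in zip(x_hold, y_hold)) / len(y_hold)

# Synthetic data (an assumption for illustration): y = x + noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [xi + rng.gauss(0, 1) for xi in x]

# The last 200 cases are locked away until every model is final.
x_train, y_train = x[:800], y[:800]
x_hold, y_hold = x[800:], y[800:]

submissions = {
    "mean": fit_mean(y_train),
    "line": fit_line(x_train, y_train),
}

# Score every submission once on the holdout set and report the best.
scores = {name: holdout_mse(model, x_hold, y_hold) for name, model in submissions.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Because no submission touches the holdout cases during fitting, the best holdout score is an honest measure of out-of-sample performance rather than a product of overfitting.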
Descriptive estimands show where policies are failing
Housing eviction is more common than you think
Demographers often summarize events per person-year. By this metric, eviction is rare: only 2-3% of households per year.
A new goal. But often we care whether someone ever experiences an event over a longer period, such as at any point in childhood. It only takes one eviction to upend a child's life.
The goal matters. More than 1 in 4 children born into poverty in a large U.S. city from 1998 to 2000 experienced eviction by age 15.
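A back-of-the-envelope calculation shows why the choice of estimand changes the picture. The sketch below assumes, purely for illustration, a constant annual risk that is independent across years. It is not the method behind the 1-in-4 estimate, and the function name is hypothetical.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one event within `years` years,
    assuming a constant, independent annual risk (illustrative only)."""
    return 1.0 - (1.0 - annual_risk) ** years

# Annual risks in the range cited for the per-year metric.
for annual in (0.02, 0.03):
    ever = cumulative_risk(annual, 15)
    print(f"annual risk {annual:.0%} -> risk of ever experiencing the event in 15 years: {ever:.0%}")
```

Under these simplifying assumptions, an event that looks rare in any single year becomes common over the span of a childhood.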
Causal estimands prescribe policy solutions
Public housing protects families from eviction
Public housing provides tenants with reduced rent as well as an internal grievance procedure to resolve conflicts with the housing authority. Does public housing reduce eviction?
A new goal. An explicitly causal estimand clarifies the precise assumptions under which observational data point to a policy solution.
The goal matters. In our target population, public housing reduces eviction from 11 percent to 3 percent. It is difficult to argue that this large difference arises from confounding alone: a causal effect is more plausible.
Seemingly descriptive quantities often involve a hypothetical intervention
Demographers frequently study disparities across social categories such as race, gender, and class conditional on policy-amenable variables.
A new goal. Drawing on work in epidemiology, one can formalize the goal as a post-intervention gap: the expected gap if a random individual were sampled from each category and assigned to a single treatment value. This approach sidesteps the problem of hypothesizing an intervention to a characteristic that might be immutable or a complex social construction.
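One way to write the post-intervention gap down, as a sketch in notation assumed here rather than taken from the papers: let $X$ denote the social category (coded 0 and 1 for two groups), and let $Y(t)$ denote the potential outcome a person would realize if assigned to treatment value $t$.

```latex
\tau(t) = \mathrm{E}\left[ Y(t) \mid X = 1 \right] - \mathrm{E}\left[ Y(t) \mid X = 0 \right]
```

In this formulation the hypothetical intervention sets the treatment, not the category. Category membership only defines the two populations whose expected outcomes are compared.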
The goal matters. The estimand guides the adaptation of double machine learning to this demographic goal.
Interpretation of complex empirical quantities often requires sharpened theory
Cousins' incomes are sometimes similar. This does not imply a direct grandparent effect.
Cousins' incomes are sometimes remarkably similar. This might suggest something about how family background constrains life chances.
A new goal. What do we really mean by the "influence" of family background? We can formalize our theoretical model mathematically.
The goal matters. Several plausible theoretical models could generate any given set of sibling and cousin correlations.
2020 Graduate Student Paper Award, American Sociological Association Section on Inequality, Poverty, and Mobility.