Hi! I am a data scientist at BESTSELLER. Here you can find my research papers, presentations, and teaching material from my previous work in academia.
2024. Post-Instrument Bias (with Adam Glynn & Miguel Rueda)
Conditionally accepted at American Journal of Political Science.
When using instrumental variables, researchers often assume that causal effects are only identified conditional on covariates. We show that the role of these covariates in applied research is often unclear and that there is confusion regarding their ability to mitigate violations of the exclusion restriction. We explain when and how existing adjustment strategies may lead to bias. We then discuss assumptions, some of them new, that are sufficient to identify various treatment effects when the exclusion restriction only holds conditionally. In general, these assumptions are highly restrictive, although they are sometimes testable. We also show that other existing tests are generally misleading. We then introduce an alternative sensitivity analysis that uses information on variables influenced by the instrument to gauge the effect of potential violations of the exclusion restriction. We illustrate this approach in two replications of existing analyses and summarize our results in easy-to-understand guidelines.
2024. Identification and Sensitivity Analysis for Teacher Bias Designs Based on Administrative Data
R&R at Sociological Methods & Research.
A series of papers uses administrative data on school students' grades to assess whether teachers discriminate against certain demographic groups. Often, standardized test grades are subtracted from teacher grades, and the difference is then regressed on student-level variables. However, it is unclear under what circumstances such an estimation strategy is valid. We conceptualize teacher bias as a direct causal effect of student-level attributes on teacher grades, holding student ability fixed. Standardized tests merely proxy for student ability; moreover, there may be confounders of ability and teacher grades. Accordingly, teacher bias is nonparametrically unidentified. However, we suggest substantive and parametric assumptions that ensure identification using difference-in-grades estimators. Estimators that instead control for test grades in a regression are shown to be inconsistent even under these strong assumptions. We then develop a parametric sensitivity analysis that allows researchers to investigate the consequences of departures from critical assumptions. We illustrate our methodology using administrative data from Denmark.
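To see why regression control can fail here while differencing can work, consider a minimal toy simulation under a linear model with classical measurement error in the test grade. This is an illustrative sketch, not the paper's setup; all variable names and parameter values are hypothetical.

```python
# Illustrative toy simulation (not the paper's model): teacher bias as a direct
# effect of a student attribute D on teacher grades, with the standardized test
# grade as a noisy proxy for ability. Names and parameter values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

d = rng.binomial(1, 0.5, n)                     # student attribute (group indicator)
ability = 0.5 * d + rng.normal(size=n)          # ability differs across groups
teacher = ability + 0.2 * d + rng.normal(scale=0.3, size=n)  # true teacher bias = 0.2
test = ability + rng.normal(scale=0.7, size=n)  # unbiased but noisy ability proxy

# Difference-in-grades estimator: compare mean(teacher - test) across groups.
gap = teacher - test
diff_est = gap[d == 1].mean() - gap[d == 0].mean()

# Regression-control estimator: coefficient on d from OLS of teacher on (1, test, d).
X = np.column_stack([np.ones(n), test, d])
ols_est = np.linalg.lstsq(X, teacher, rcond=None)[0][2]

print(f"difference-in-grades: {diff_est:.3f}")  # close to 0.2
print(f"regression control:   {ols_est:.3f}")   # noticeably above 0.2
```

In this toy model, measurement error attenuates the control for ability, so the regression coefficient on the attribute absorbs part of the group difference in ability; the difference-in-grades estimator recovers the true bias only because the test is assumed to be an unbiased, equal-loading proxy.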
2024. Post-Instrument Bias in Linear Models (with Adam Glynn & Miguel Rueda)
Sociological Methods & Research. Published paper Working paper
Post-instrument covariates are often included as controls in IV analyses to address a violation of the exclusion restriction. However, we show that such analyses are subject to biases unless strong assumptions hold. Using linear constant-effects models, we present asymptotic bias formulas for three estimators (with and without measurement error): IV with post-instrument covariates, IV without post-instrument covariates, and OLS. In large samples, and when the model provides a reasonable approximation, these formulas sometimes allow the analyst to bracket the parameter of interest with two estimators and to choose the estimator with the least asymptotic bias. We illustrate these points with a discussion of Acemoglu, Johnson, and Robinson (2001).
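For intuition, the textbook single-instrument case (a standard result, not the paper's exact formulas) shows how a direct effect of the instrument propagates into the IV estimand:

```latex
% Standard linear constant-effects setup in which the instrument Z has a
% direct effect \gamma on the outcome, violating the exclusion restriction.
\begin{align*}
  Y &= \beta D + \gamma Z + \varepsilon, \qquad D = \pi Z + v,\\
  \hat{\beta}_{\mathrm{IV}} &\xrightarrow{\;p\;}
    \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, D)}
    = \beta + \frac{\gamma}{\pi}.
\end{align*}
```

A weak first stage (small \(\pi\)) amplifies even a minor exclusion-restriction violation, which is why comparing the asymptotic biases of several estimators can be informative.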
Forthcoming. Graphical Causal Models for Survey Inference (with Peter Selb)
Sociological Methods & Research. Published paper Working paper
Directed acyclic graphs (DAGs) are an increasingly popular tool to inform causal inferences in observational research. We demonstrate how DAGs can be used to encode and communicate theoretical assumptions about nonprobability samples and survey nonresponse, determine whether typical population parameters of interest to survey researchers can be identified from a sample, and support the choice of adjustment strategies. Following an introduction to basic concepts in graph and probability theory, we discuss sources of bias and assumptions for eliminating it in selection scenarios familiar from the missing data literature. We then introduce and analyze graphical representations of the multiple selection stages in the survey data collection process, in line with the Total Survey Error approach. Finally, we identify areas for future survey methodology research that can benefit from advances in causal graph theory.
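As a flavor of the selection problems such graphs encode, consider the classic collider structure X → S ← Y, where S indicates survey response. A hypothetical toy simulation (not from the paper) shows how conditioning on respondents distorts a population association:

```python
# Toy illustration (hypothetical, not from the paper): survey response as
# conditioning on a collider. X and Y are independent in the population, but
# both increase the chance of responding (S = 1), so they become negatively
# associated among respondents.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

x = rng.normal(size=n)
y = rng.normal(size=n)                   # independent of x in the population
p_respond = 1 / (1 + np.exp(-(x + y)))   # response propensity depends on both
s = rng.binomial(1, p_respond) == 1      # S = 1: unit responds to the survey

print(f"corr(X, Y) in population:    {np.corrcoef(x, y)[0, 1]:+.3f}")        # ~ 0
print(f"corr(X, Y) among responders: {np.corrcoef(x[s], y[s])[0, 1]:+.3f}")  # < 0
```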
Forthcoming. Facial Finetuning: Using Pretrained Image Classification Models to Predict Politicians' Success (with Asbjørn Lindholm & Christian Hjorth)
Political Science Research & Methods. Working paper
There is a long-standing interest in how the visual appearance of politicians predicts their success. Usually, the scope of such studies is limited by the need for human-rated facial features. We instead fine-tune pretrained image classification models based on convolutional neural networks to predict the facial features of several thousand Danish politicians. Attractiveness and trustworthiness scores correlate positively and robustly with both ballot paper placement (a proxy for intra-party success) and the number of votes gained in local and national elections, while dominance scores correlate inconsistently. Effect sizes are at times substantial. We find no moderation by politician gender or election type. However, dominance scores correlate significantly with outcomes for conservative politicians. We discuss possible causal mechanisms behind our results.
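The general fine-tuning recipe can be sketched as follows. This is a minimal illustration using torchvision's ResNet-18, not the authors' exact architecture or training pipeline; data loading, augmentation, and evaluation are omitted.

```python
# Minimal sketch of the general recipe (not the authors' exact pipeline):
# fine-tune a pretrained CNN to predict a continuous facial-trait score.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():                   # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 1)  # new regression head (trainable)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(images: torch.Tensor, scores: torch.Tensor) -> float:
    """One gradient step on a batch of face images and trait scores."""
    optimizer.zero_grad()
    pred = model(images).squeeze(1)   # shape (batch,)
    loss = loss_fn(pred, scores)
    loss.backward()
    optimizer.step()
    return loss.item()
```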
2023. Compensating Discrimination: Behavioral Evidence from Danish School Registers (with Kim Mannemar Sønderskov)
We suggest that discriminatory practices may vary significantly across decision-makers, which allows for deeper insights into the mechanisms behind discrimination. We study this in the context of biased grading in schools. We develop a theory of teacher biases driven by heuristic beliefs stemming from concrete classroom experiences. Because teachers may also care about grade equality, such a mechanism can lead to either inequality-reinforcing or compensating biases in grading. Based on large-scale administrative data on Danish students, we find strong evidence for highly heterogeneous teacher biases: up to 45% of teachers exhibit a bias of the opposite sign to the average bias. Furthermore, there is a robust and substantively large compensation effect. Teachers who have experienced a visible demographic group (defined by gender or migration background) academically underperforming relative to a reference group show a more positive bias toward that group than teachers for whom the same group overperformed. We find little evidence for alternative explanations of bias. To fully grasp discrimination, we must go beyond averages and consider the wide variety of biases shaped by individual experiences.
2020. Power Analysis for Conjoint Experiments (with Markus Freitag)
Conjoint experiments aiming to estimate average marginal component effects and related quantities have become a standard tool for social scientists. However, existing solutions for power analyses to find appropriate sample sizes for such studies have various shortcomings, and, accordingly, explicit sample size planning is rare. Based on recent advances in statistical inference for factorial experiments, we derive simple yet generally applicable formulae to calculate power and minimum required sample sizes for testing average marginal component effects (AMCEs), conditional AMCEs, and interaction effects in forced-choice conjoint experiments. The only inputs needed are expected effect sizes. Our approach assumes only random sampling of individuals or randomization of profiles and avoids any parametric assumptions. Furthermore, we show that clustering standard errors on individuals is unnecessary and does not affect power. Our results caution against designing conjoint experiments with small sample sizes, especially for detecting heterogeneity and interactions. We provide an R package that implements our approach.
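For a sense of the mechanics, here is a generic normal-approximation power calculation for an AMCE in a forced-choice design, treating profiles as independent (consistent with the point that clustering does not affect power). This is a sketch in the spirit of the paper, not its exact formulae; the conservative variance bound of 0.25 for a binary outcome, the two-level attribute, and all numeric inputs are illustrative.

```python
# Generic normal-approximation power sketch (not necessarily the paper's exact
# formulae). Assumes a binary forced-choice outcome with Var(Y) <= 0.25 and a
# two-level attribute randomized uniformly over n_profiles profiles in total.
from math import ceil, sqrt
from scipy.stats import norm

def power_amce(delta, n_profiles, alpha=0.05):
    """Approximate power of a two-sided z-test for an AMCE of size delta."""
    se = sqrt(0.25 / (n_profiles / 2) + 0.25 / (n_profiles / 2))
    z = norm.ppf(1 - alpha / 2)
    return norm.cdf(delta / se - z) + norm.cdf(-delta / se - z)

def min_profiles(delta, target_power=0.8, alpha=0.05):
    """Smallest total number of profiles achieving the target power."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(target_power)
    return ceil((z / delta) ** 2)

print(power_amce(delta=0.03, n_profiles=8000))  # about 0.77 with these inputs
print(min_profiles(delta=0.03))                 # roughly 8,700 profiles in total
```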
2024 Workshop at InFER, Goethe-Universität Frankfurt am Main, on causal graphs. Slides
2024 Talk at InFER, Goethe-Universität Frankfurt am Main, on "Rethinking Discrimination: Counterfactuals, Measurement, and Inequality". Slides
2023 Talk at AU Interacting Minds Centre on analyzing teacher bias
2023 Talk at AU Centre for Educational Development EdTech Group on using LLMs when teaching data science
2023 Talk at Constructive Institute about the "Present and Future of AI": Slides. Visual summary by Mette Stentoft
2023 Talk on analyzing teacher bias at Wissenschaftszentrum Berlin: Slides
2023 Workshop on causal graphs at Wissenschaftszentrum Berlin: Slides
2022 Talk on causal graphs at Data Science Darmstadt (in German): Slides
2022 Talk on analyzing teacher bias at Danish Data Science 2022: Slides
2022 Slides on basic power analysis: Slides
2022 Two-day course on causal graphs at the German Centre for Higher Education Research and Science Studies: Slides Day 1, Slides Day 2
2021 Four-hour workshop on Research Design and Causal Analysis with R at the Data Science Summer School, Hertie School Berlin: Slides
2021 Three-day course on causal graphs at GESIS Summer School: Syllabus, Slides Day 1, Slides Day 2, Slides Day 3
2021 Two-day course on causal mediation analysis at UPF Barcelona: Slides Day 1, Slides Day 2
2021 Talk on using causal graphs for econometric applications at Econometrics and Business Statistics Seminar, Aarhus University: Slides
2020 Talk on selection bias and external validity/transportability at DGS Methoden / University of Potsdam: Slides
2020 Presentation on Knox et al. "Administrative records mask racially biased policing" at the LMU DAG Reading group: Slides
2019 One-hour introductory workshop on causal graphs at MZES Mannheim: YouTube Video, Slides
2018 Full-term course on causal graphs: here. This course was awarded the 2019 "Causality in Statistics Education Award" by the American Statistical Association.
The style of this website was originally inspired by Cosma Shalizi's homepage, but now looks different. Imprint.