About APE
The Problem
Governments around the world have enacted many thousands, probably millions, of policies. Which ones work? Which are ineffective, or even harmful? Rigorous policy evaluation, typically carried out by PhD-trained economists, takes months, sometimes years. Policies outnumber the researchers available to study them by orders of magnitude, so only a tiny fraction are ever evaluated.
The Question
Could AI do it for us? We genuinely don't know. Maybe it's decades away. Maybe it's fundamentally impossible: rigorous causal inference might require judgment that AI doesn't have. Our guess is that AI-driven policy evaluation will arrive sooner than most expect.
The Experiment
This project is an attempt to find out. An autonomous system tries to produce original empirical research using public data. An automated tournament measures the quality of AI-written papers against human-written ones, using working papers forthcoming in top journals such as the American Economic Review as benchmarks. Human expertise and judgment will have to play a key role in evaluating progress, but we do not currently have the mechanisms or the capacity to evaluate every AI-written paper, so some triage system is needed. In that spirit, we make everything public: the papers, the code, the data, the failures.
We are building an autonomous pipeline that strives to produce novel research papers from scratch, run replications that check for errors, and revise existing papers based on feedback. We are also exploring whether the system can recursively self-improve by editing its own code. The goal is for APE to evolve continuously.
The Team
APE is a project of the Social Catalyst Lab at the University of Zurich, led by Prof. David Yanagizawa-Drott. The lab explores how AI can accelerate the discovery of effective policies. Its primary work focuses on automating policy experimentation; APE complements this by exploring the automation of observational policy evaluation, which uses natural variation in policy adoption to assess what works. Olaf de Rohan Willner is the lab's main predoctoral fellow assigned to advancing the project.