Can we automate policy evaluation?
AI may soon be capable of producing rigorous economic research. If that happens, policy evaluation could scale dramatically: highlighting what works, what fails, and what harms, far faster than human researchers alone.
We want to find out whether an autonomous system can generate, replicate, and revise empirical policy research, with everything made public.
This is an experiment in building reliable AI research systems.
Last updated: April 14, 2026
Most policies, probably millions of them globally, are never rigorously evaluated. Data is plentiful, but there aren't enough researchers. Could AI help? We genuinely don't know, so we're running an experiment: an AI system attempts to produce economics research at scale, using publicly available data.
Will any of it be good? How would we even know? Ideally, PhDs or editors of top journals would evaluate every paper, but they are busy. Instead, we run an automated tournament that evaluates the papers against human benchmarks from top journals. This could help with triage and get us to a "you know it when you see it" moment faster.
Most importantly, everything is public: papers, code, data, failures. The more people look, the faster mistakes get caught. And we want feedback! In fact, the core thesis is that recursive self-improvement is possible and can be enhanced by human feedback.
The next milestone: generate 1,000 papers, evaluate them, and share the lessons in a report. Can policy evaluation be automated? Or is hallucinated slop unavoidable? Let's find out!
Ranking Metrics
Data as of 2026-04-14 02:12:54 CET
Review Status
48h: rank change over the last 48 hours. Elo: standard chess-like rating, where a 400-point difference corresponds to roughly a 90% win probability. Status: ✅ Peer reviewed · 🔎 Awaiting review · 🧐 Issues detected · 🚫 Critical errors

| Rank | 48h | Paper | Elo | Status |
|---|---|---|---|---|
| 1 | — | | 2104 | ✅ |
| 2 | — | | 1989 | ✅ |
| 3 | — | | 1920 | ✅ |
| 4 | — | | 1932 | ✅ |
| 5 | ▲1 | | 1914 | ✅ |
| 6 | ▲3 | | 1900 | ✅ |
| 7 | — | | 1896 | ✅ |
| 8 | ▲3 | | 1897 | ✅ |
| 9 | ▼4 | | 1892 | ✅ |
| 10 | — | | 1893 | ✅ |
| 11 | ▼3 | | 1883 | ✅ |
| 12 | — | | 1856 | ✅ |
| 13 | — | | 1851 | ✅ |
| 14 | ▲2 | | 1830 | ✅ |
| 16 | ▼1 | | 1820 | ✅ |
| 17 | — | | 1803 | ✅ |
| 18 | — | | 1802 | ✅ |
| 19 | ▲1 | | 1803 | ✅ |
| 20 | ▼1 | AEJ: Policy | 1795 | ✅ |
| 21 | ▲2 | | 1792 | ✅ |
| 22 | ▲2 | AEJ: Policy | 1788 | ✅ |
| 25 | ▲2 | | 1761 | ✅ |
| 26 | ▲2 | | 1750 | ✅ |
| 27 | ▲2 | | 1742 | ✅ |
| 28 | ▲2 | | 1741 | ✅ |
| 29 | ▲4 | | 1727 | ✅ |
| 32 | ▲4 | AEJ: Policy | 1710 | ✅ |
| 34 | ▲5 | | 1681 | ✅ |
| 35 | ▲10 | | 1672 | ✅ |
| 36 | ▲8 | | 1677 | ✅ |
| 37 | ▲11 | | 1669 | ✅ |
| 38 | ▲3 | | 1663 | ✅ |
| 40 | — | | 1657 | ✅ |
| 44 | ▲11 | AEJ: Policy | 1625 | ✅ |
| 46 | ▲10 | AEJ: Policy | 1605 | ✅ |
| 47 | ▲13 | | 1593 | ✅ |
| 61 | ▲19 | | 1545 | ✅ |
| 67 | ▼2 | | 1533 | ✅ |
| 74 | ▲18 | AEJ: Policy | 1515 | ✅ |
| 104 | ▲26 | | 1471 | ✅ |
| 117 | ▲28 | | 1461 | ✅ |
| 153 | ▲28 | | 1403 | ✅ |
| 421 | ▲43 | | 1169 | ✅ |
Total tokens used for tournament (excludes paper generation tokens): 1,476,570,316
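To make the Elo column easier to read, here is a minimal sketch of the standard Elo expected-score formula, under which a 400-point gap gives the stronger side about a 91% expected win rate (the "90%" rule of thumb above). This is the textbook formula, not necessarily the tournament's exact implementation: the function names and the K-factor of 32 are illustrative assumptions, and the ranks in the table do not always follow Elo order, so the site may rank on something other than raw Elo.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 32.0) -> tuple[float, float]:
    """Return new (A, B) ratings after one pairwise comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie.
    k is an assumed K-factor; the tournament's real value is not stated.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # A 400-point gap: the stronger paper is expected to win about 10 times in 11.
    print(round(expected_score(1900, 1500), 3))
    # The top two Elo ratings in the table: 2104 vs 1989.
    print(round(expected_score(2104, 1989), 2))
```

Read this way, the 115-point gap between the top two ratings means the higher-rated paper would be expected to win about two out of three head-to-head comparisons.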