Abstract
Human values are not fixed: they evolve as people encounter new experiences, arguments, and evidence. Yet AI systems trained on human feedback learn a static snapshot of preferences — and then shape the very interactions that could have updated them. We call this dynamic the Lock-in Hypothesis: human-AI feedback loops systematically suppress the exploration and revision that moral progress requires, freezing collective values in place and producing stagnation by algorithm.
We provide two forms of evidence. First, we simulate human-AI interaction dynamics using a computational model of belief updating, showing that RLHF-style feedback loops create attractor states around whatever preferences were prevalent at training time. Second, we conduct causal inference on real-world ChatGPT usage data, finding that repeated AI exposure entrains user beliefs toward AI outputs across ideologically diverse populations. Together, these results suggest that widely deployed AI systems may already be constraining the diversity of human values in ways that compound over time — not through any single dramatic intervention, but through the accumulated weight of everyday interactions.
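To make the first line of evidence concrete, here is a minimal sketch (not the paper's actual model) of the kind of belief-updating feedback loop the simulation studies: an "AI" is repeatedly refit to the current population mean, standing in for RLHF on a snapshot of prevalent preferences, and users partially update toward its outputs. All variable names and parameter values below are illustrative assumptions; under them, belief diversity collapses toward an attractor.

import numpy as np

rng = np.random.default_rng(0)

n_agents = 1000
belief = rng.normal(loc=0.0, scale=1.0, size=n_agents)  # initial spread of user beliefs
trust = 0.1    # how far each interaction pulls a user toward the AI's output (assumed)
drift = 0.02   # independent exploration noise in human belief updating (assumed)

ai_output = belief.mean()  # "training-time" snapshot of prevalent preferences

for step in range(200):
    # Users update partway toward the AI's output, plus a little exploration noise.
    belief = (1 - trust) * belief + trust * ai_output + rng.normal(0.0, drift, n_agents)
    # The AI is periodically refit on feedback from the (already entrained) population.
    if step % 10 == 0:
        ai_output = belief.mean()

print(f"final belief std: {belief.std():.3f}  (vs. initial 1.0)")

Running this prints a final standard deviation far below the initial 1.0: no single interaction moves anyone much, but the loop steadily narrows the distribution around whatever the AI learned at the start, which is the attractor behavior the abstract describes.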
Cite
@inproceedings{qiu2025lockin,
  title     = {The Lock-in Hypothesis: Stagnation by Algorithm},
  author    = {Qiu, Tianyi and He, Zhonghao and Chugh, Tushar and Kleiman-Weiner, Max},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  url       = {https://thelockinhypothesis.com}
}