Stay True to the Evidence: Martingale Score for Bayesian Rationality in LLM Reasoning
Abstract
A rational reasoner should update beliefs in proportion to the evidence: each step of reasoning should move beliefs toward the truth, not away from it. Formally, beliefs should form a martingale with respect to the evidence sequence — expected future beliefs should equal current beliefs, conditional on all information so far. We introduce the Martingale Score, an unsupervised, regression-based metric that measures how far a language model's sequential reasoning deviates from this Bayesian ideal.
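To make the martingale condition concrete, here is a minimal sketch. It assumes access to a per-step belief sequence B_0, ..., B_T (for instance, the model's stated probability that its answer is correct after each reasoning or evidence step) and illustrates one natural operationalization: regress the per-step belief change on the current belief and check whether the fitted drift is zero, as the martingale property requires. The function name, inputs, and regression specification are illustrative assumptions, not the paper's exact estimator, which is not reproduced in this abstract.

import numpy as np

def martingale_drift(beliefs):
    """Toy illustration of the martingale idea (not the paper's exact estimator).

    beliefs: probabilities B_0..B_T, the model's belief in its answer after
    each reasoning/evidence step. Under the Bayesian ideal,
    E[B_{t+1} | all information so far] = B_t, so belief changes should have
    zero drift. We fit a simple OLS regression of the change (B_{t+1} - B_t)
    on the current belief B_t; systematically nonzero coefficients indicate
    drift, e.g. confidence that keeps rising regardless of the evidence.
    """
    b = np.asarray(beliefs, dtype=float)
    x = b[:-1]                  # current belief B_t
    y = np.diff(b)              # update B_{t+1} - B_t
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    intercept, slope = coef
    return {"intercept": intercept, "slope": slope}

# Example: a reasoner whose confidence creeps upward at every step,
# independent of the evidence -- positive drift, violating the martingale.
print(martingale_drift([0.55, 0.62, 0.70, 0.78, 0.85]))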
Applying the Martingale Score to a range of LLMs on diverse reasoning tasks, we find that iterative reasoning (chain-of-thought, self-critique, multi-turn dialogue) often deepens confirmation bias rather than advancing truth-seeking: models grow more confident in their initial conclusions regardless of what the evidence warrants. On tasks where ground-truth labels exist, the Martingale Score predicts accuracy without using those labels, making it a practical unsupervised proxy for reasoning quality. Our results suggest that additional reasoning steps do not uniformly improve epistemic outcomes, and that standard benchmarks may miss systematic biases in how models integrate new information.
Cite
@inproceedings{he2025martingale,
  title     = {Stay True to the Evidence: Martingale Score for Bayesian Rationality in {LLM} Reasoning},
  author    = {He, Zhonghao and Qiu, Tianyi and Shirado, Hirokazu and Sap, Maarten},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://arxiv.org/abs/2512.02914}
}