Self-Consistency: sample multiple reasoning paths for better answers

In one sentence Wang et al. (Google Brain) show that sampling N diverse reasoning paths and taking the most frequent answer beats greedy decoding on all reasoning benchmarks.

Verified Official source

ShareLinkedIn X

When you ask a model to solve a problem, it usually takes the "most likely" path to the answer. Self-Consistency does something different: it asks the model to solve the same problem many times in different ways, then picks the answer that shows up most often.

It's like asking ten people to solve a math problem independently and then voting on the result: even if someone makes a mistake, the majority gets it right.

The outcome is surprising: without changing anything in the model, just by sampling multiple times, you get large improvements on arithmetic and logical reasoning.