Abstract
Large language models have achieved remarkable capabilities across domains, yet the mechanisms underlying sophisticated reasoning remain under investigation1,2. Recent reasoning-reinforced models, including OpenAI’s o-series and DeepSeek-r1, outperform models that are merely instruction-tuned on complex cognitive tasks3,4, an advantage commonly attributed to extended test-time computation through longer chains of thought5. Here we show that enhanced reasoning emerges not from extended computation alone, but from the systematic simulation of complex, multi-agent interactions (a society of thought), which enables deliberate diversification of, and debate among, internal cognitive perspectives characterized by distinct personality traits and domain expertise. Through quantitative analysis of classified outputs and mechanistic interpretability methods applied to reasoning traces6–8, we find that reasoning models like DeepSeek-r1 exhibit much greater perspective diversity than baseline models, activating a broader range of heterogeneous personality- and expertise-related features, with more conflict between them, during reasoning. This multi-agent structure manifests in conversational behaviors, including question-answering sequences, perspective shifts, and reconciliation of conflicting views, as well as in socio-emotional roles that characterize back-and-forth conversation, which together account for over 60% of the accuracy advantage in reasoning tasks through both direct and indirect facilitation of cognitive strategies9,10. Controlled reinforcement learning experiments further reveal that priming models with conversational scaffolding, even when the dialogues lead to incorrect solutions, substantially accelerates reasoning improvement compared to answer-only training. These findings indicate that the social organization of thought, rather than correctness alone, enables effective exploration of solution spaces.
We suggest that reasoning models establish a computational parallel to collective intelligence in human groups11–13, where diversity enables superior problem-solving when systematically structured, and that this parallel opens new opportunities for agent organization to harness the wisdom of crowds.