Language Agent Tree Search. Explore multiple solution paths like a chess engine — using self-evaluation to guide which branches to pursue and which to abandon.
Most AI agents commit to a single path and follow it to the end. If it's the wrong path, they either fail or start over from scratch. LATS does something fundamentally different: it explores multiple paths simultaneously, scores each one, and strategically decides which branches deserve more exploration — exactly like a chess engine searching through possible moves.
The genius is in combining three patterns that each have a critical weakness on their own. Tree-of-Thoughts explores multiple paths but can't take real actions. ReAct takes real actions but can't backtrack. Reflexion learns from mistakes but restarts completely. LATS weaves all three together: explore paths (Tree-of-Thoughts), take real actions at each step (ReAct), and learn from failures without starting over (Reflexion).
This system unifies three Level 2 compositions:
ReAct, Reflexion, and Tree-of-Thoughts. Tree-of-Thoughts provides the branching structure, ReAct grounds each branch in real actions and tool use, and Reflexion propagates lessons from failed branches back up the tree. Monte Carlo Tree Search (MCTS) orchestrates the whole process.
Selection: use the Upper Confidence Bound (UCB) to balance exploring new paths with exploiting paths that have scored well, like a chess engine deciding which move to analyze more deeply.
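A minimal sketch of UCB-style selection (the exact scoring constant LATS uses may differ; `c` and the `(value_sum, visits)` representation are assumptions for illustration):

```python
import math

def uct_score(value_sum: float, visits: int, parent_visits: int, c: float = 1.414) -> float:
    """UCB1 applied to trees: exploit high-scoring nodes, explore rarely visited ones."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    exploit = value_sum / visits  # average score of this branch so far
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children: list[tuple[float, int]]) -> int:
    """Return the index of the child with the highest UCT score.

    `children` is a list of (value_sum, visits) pairs; for simplicity,
    parent visits are taken as the sum of child visits.
    """
    parent_visits = max(1, sum(v for _, v in children))
    scores = [uct_score(s, v, parent_visits) for s, v in children]
    return scores.index(max(scores))
```

Note how a branch with few visits can outscore a branch with a better average: the exploration term keeps alternatives alive.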
Expansion: at the chosen branch point, generate several candidate next actions. Past reflections inform what to try and what to avoid, creating the branching structure.
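A sketch of expansion conditioned on stored reflections. `propose_actions` is a stand-in for an LLM call; in a real agent the reflections would be injected into the prompt, while here they simply filter out candidates a past reflection has warned against:

```python
def propose_actions(state: str, reflections: list[str], k: int = 3) -> list[str]:
    """Stand-in for an LLM call that proposes k candidate next actions.

    Reflections of the form "avoid:<action>" veto candidates that
    previously led to failed branches.
    """
    candidates = [f"{state} -> action_{i}" for i in range(k * 2)]
    avoided = {r.removeprefix("avoid:") for r in reflections if r.startswith("avoid:")}
    return [c for c in candidates if c not in avoided][:k]

# Expansion: each surviving candidate becomes a new child node.
reflections = ["avoid:root -> action_0"]
children = propose_actions("root", reflections)
```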
Simulation: from the expanded node, run a full think-act-observe trajectory with real tools. This grounds the evaluation in actual outcomes, not hypothetical reasoning.
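The rollout can be sketched as a short ReAct-style loop. `run_tool` is a stub for a real tool call, and the final score is a placeholder where LATS would apply LLM self-evaluation or an environment reward:

```python
def run_tool(action: str) -> str:
    """Stub for a real tool call (search, calculator, code execution)."""
    return f"observation for {action!r}"

def rollout(start_state: str, max_steps: int = 3) -> tuple[list[dict], float]:
    """Run a short think-act-observe trajectory and return (trace, score)."""
    trace, state = [], start_state
    for step in range(max_steps):
        thought = f"step {step}: decide what to do from {state!r}"  # think
        action = f"act_{step}"                                      # act
        observation = run_tool(action)                              # observe
        trace.append({"thought": thought, "action": action, "observation": observation})
        state = observation
    score = 1.0 if trace else 0.0  # placeholder for self-evaluation
    return trace, score
```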
Backpropagation: propagate the score back up the tree. For failed branches, generate a reflection asking what went wrong, and store the lesson so future expansions avoid the same mistake.
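A minimal sketch of backpropagation over dict-based nodes with parent pointers. The 0.5 failure threshold and the reflection format are assumptions; LATS would generate the reflection text via self-evaluation rather than a template:

```python
def backpropagate(node: dict, score: float, reflections: list[str]) -> None:
    """Walk parent pointers, updating visit counts and value sums.

    On a failed rollout (low score), record a reflection so later
    expansions can steer away from this branch.
    """
    if score < 0.5:  # failure threshold is an assumption for illustration
        reflections.append(f"avoid: {node['action']} (scored {score:.2f})")
    while node is not None:
        node["visits"] += 1
        node["value_sum"] += score
        node = node["parent"]

# Tiny two-node tree to show the update.
root = {"action": "root", "visits": 0, "value_sum": 0.0, "parent": None}
child = {"action": "a1", "visits": 0, "value_sum": 0.0, "parent": root}

refs: list[str] = []
backpropagate(child, 0.2, refs)  # failed branch: reflection recorded
backpropagate(child, 0.9, refs)  # good branch: only statistics updated
```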
Each node is a state in a ReAct trajectory. Values are updated through Reflexion self-evaluation. The best solution found across all branches is returned.
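One way such a node might be represented (the field names are illustrative, not the paper's API):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LATSNode:
    """One node in the search tree: a state in a ReAct trajectory."""
    state: str                                   # trajectory/conversation so far
    action: Optional[str] = None                 # action that led to this state
    parent: Optional["LATSNode"] = None
    children: list["LATSNode"] = field(default_factory=list)
    visits: int = 0                              # MCTS statistics
    value_sum: float = 0.0                       # sum of self-evaluation scores

    @property
    def value(self) -> float:
        """Mean score, used as the exploitation term in selection."""
        return self.value_sum / self.visits if self.visits else 0.0
```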
Task: "Find the population of France and calculate years until it reaches 70 million at the current growth rate."
Best solution: Node A1 with score 0.95. The verified answer is returned along with its full reasoning trace.
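The final calculation step of this task is simple compound growth. The numbers below are illustrative placeholders only; in the real run the population and growth rate would come from the agent's search tool, not be hard-coded:

```python
import math

# Illustrative inputs -- NOT retrieved facts.
population = 68_000_000   # assumed current population of France
growth_rate = 0.003       # assumed 0.3% annual growth
target = 70_000_000

# Solve population * (1 + r)^t = target for t.
years = math.log(target / population) / math.log(1 + growth_rate)
```

With these assumed inputs the answer comes out just under a decade; the point is that the agent verifies the formula with a real calculation rather than estimating in-context.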
LATS consistently outperforms each of its component patterns used alone. The combination beats any single component: the whole is greater than the sum of its parts.
LATS succeeds because it addresses the fundamental limitation of each component pattern. Tree-of-Thoughts generates options but never tests them in reality. ReAct takes real actions but can't backtrack when it goes wrong. Reflexion learns from failure but loses all progress. Together, they cover each other's weaknesses.
The UCB selection formula is the key to efficient exploration. Rather than exhaustively searching every branch, it focuses effort where it's most likely to improve the result — deepening promising paths while occasionally checking alternatives. This makes LATS practical even for complex tasks.
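In standard UCT notation (the constant LATS uses in practice may differ), the selection score for child $i$ is:

```latex
\mathrm{UCT}(i) = \bar{v}_i + c \sqrt{\frac{\ln N}{n_i}}
```

where $\bar{v}_i$ is the child's mean score, $n_i$ its visit count, $N$ the parent's visit count, and $c$ the exploration constant. The first term deepens promising paths; the second decays as a branch is visited, nudging the search toward neglected alternatives.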
Explore multiple solution paths. Take real actions at each step. Score every branch. Learn from failures without starting over. Pursue the most promising paths while keeping alternatives alive.
LATS can plug into the Reason stage of the Cognitive Loop for especially hard problems. The Adaptive Pattern Router typically routes to LATS when it detects "complex + uncertain" tasks where single-path approaches are likely to fail.
At Level 4, World Model Agents extend LATS by adding internal simulation — instead of just searching solution paths, they model how the world would respond, enabling even deeper planning.