Don't just say "try again." Find the exact step that went wrong, isolate it as a simpler problem, fix it, and rebuild. Surgical correction, not blind retrying.
Here's a surprising finding: when AI gets a multi-step problem wrong and you simply ask "try again," performance actually gets worse with each retry. Vague feedback like "that's wrong, fix it" causes the model to second-guess correct steps while failing to fix the actual error.
Recursive Chain-of-Feedback takes a targeted approach: find the exact step that went wrong, extract it as its own simpler problem, solve that smaller problem (which is much easier to get right), then plug the corrected answer back into the full solution. If the sub-problem is still too hard, decompose it further. It's self-correction that actually works.
This composition builds on:

- Check Your Work
- Loop Until Done

R-CoF combines self-evaluation (finding errors) with iterative improvement, but adds a crucial element: recursive decomposition. Instead of retrying the whole problem, it isolates and fixes the specific broken piece.
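A minimal sketch of one correction pass, assuming a hypothetical `llm(prompt)` helper that returns the model's reply as text; the prompt wording and the `solve_sub` hook are illustrative, not a fixed API:

```python
def rcof_single_pass(problem: str, llm, solve_sub=None) -> str:
    """One R-CoF correction pass: solve, locate the faulty step, fix it in
    isolation, then merge the fix back into the solution."""
    # How the extracted sub-problem gets answered; by default it is just asked directly.
    solve_sub = solve_sub or (lambda sub: llm(f"Answer this question on its own:\n{sub}"))

    # 1. Ask for a numbered solution so individual steps can be referenced.
    solution = llm(f"Solve step by step, numbering each step:\n{problem}")

    # 2. Self-evaluation: have the model name the single broken step and
    #    restate it as a standalone question.
    verdict = llm(
        "Review the solution below. If every step is correct, reply CORRECT. "
        "Otherwise restate the one incorrect step as a standalone question.\n"
        f"Problem: {problem}\nSolution:\n{solution}"
    )
    if verdict.strip().startswith("CORRECT"):
        return solution

    # 3. Solve the extracted sub-problem -- a smaller, easier question.
    sub_answer = solve_sub(verdict)

    # 4. Plug the corrected step back in and update only what depends on it.
    return llm(
        f"Problem: {problem}\nOriginal solution:\n{solution}\n"
        f"The faulty step has this corrected answer: {sub_answer}\n"
        "Keep the correct steps and update anything that depends on the fix."
    )
```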
With blind retrying:

- Attempt 1: 70% correct
- "Try again" → Attempt 2: 65% correct
- "Try again" → Attempt 3: 58% correct

Performance degrades because the model overthinks correct steps while missing the real error.

With targeted correction (R-CoF):

- Attempt 1: 70% correct
- Find error → Fix step → 85% correct
- Find error → Fix step → 92% correct

Performance improves because each correction is targeted and precise.
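To make the contrast concrete, here is roughly what the two kinds of feedback look like as prompts (the wording and the step number are illustrative):

```python
# Vague feedback: the model reworks everything and may break steps that were right.
vague_retry = "That's wrong. Try again."

# Targeted feedback: names the broken step and asks only for that fix.
targeted_fix = (
    "Step 3 of your solution is incorrect. "
    "Answer that step as a standalone question first, "
    "then update step 3 and anything that depends on it, leaving the other steps unchanged."
)
```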
Question: "A store sells apples for $2 each. John buys 5 apples and pays with a $20 bill. How much change does he get?"
Suppose the model's solution goes wrong at the step that totals the cost of the apples. R-CoF extracts just that step as its own question: "What is 5 × 2?" This simpler question is much easier to get right than retrying the whole problem, and the corrected value then feeds back into the final change calculation.
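A hypothetical trace of that extraction on the apples problem (the faulty first attempt is invented for illustration):

```python
# Hypothetical first attempt, wrong only at step 2:
#   Step 1: John buys 5 apples at $2 each.
#   Step 2: Total cost = 5 x 2 = $12      <- the single broken step
#   Step 3: Change = 20 - 12 = $8
#
# R-CoF extracts step 2 as its own question and plugs the answer back in:
sub_problem = "What is 5 x 2?"
sub_answer = 10                    # the model's answer to the isolated question
change = 20 - sub_answer           # downstream step recomputed: $10
```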
What if the sub-problem is also too hard? The technique applies itself recursively. Imagine a complex physics problem where the error is in a calculus step, and the calculus sub-problem has an algebra error. R-CoF would:

- Extract the faulty calculus step as its own sub-problem
- Discover that this sub-problem also goes wrong, at an algebra step, and extract that step as a sub-sub-problem
- Solve the algebra, plug the result back into the calculus step, then plug the corrected calculus back into the physics solution
Each level of recursion makes the problem simpler, until it's easy enough to solve correctly.
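Putting that together, the recursion is just the single pass from the earlier sketch calling itself on the extracted sub-problem via the `solve_sub` hook. A minimal sketch, with an illustrative `max_depth` guard against runaway recursion:

```python
def rcof(problem: str, llm, max_depth: int = 3) -> str:
    """Recursive Chain-of-Feedback: if the extracted sub-problem is still too
    hard, the same correction pass is applied to it, one level deeper."""
    if max_depth == 0:
        # Depth guard: at the bottom, just answer the (by now much simpler) question.
        return llm(f"Answer this question:\n{problem}")
    return rcof_single_pass(
        problem,
        llm,
        solve_sub=lambda sub: rcof(sub, llm, max_depth - 1),
    )
```

Each recursive call works on a strictly smaller question than the one above it, which is exactly the property that makes the bottom-level fixes reliable.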
The insight is that AI errors are usually local, not global. When a 10-step solution goes wrong, typically one or two specific steps contain mistakes while the rest are fine. Retrying the whole thing risks breaking the good steps. Targeted correction fixes only what's broken.
Making the sub-problem simpler is equally important. AI is much more reliable on easy, focused questions than on complex multi-step ones. By extracting "What is 5 × 2?" from a larger problem, you're playing to the model's strengths.
Find the exact step that's wrong. Extract it as a simpler problem. Solve it. Plug the fix back in. If the sub-problem is still too hard, go deeper. Surgical precision instead of blind retrying.
R-CoF is a more structured approach to the same goal as Reflexion: improving through self-correction. Reflexion works across entire episodes (attempt, reflect, retry), while R-CoF works within a single solution (find the broken step, fix just that piece). They can even be combined: Reflexion for episode-level learning, R-CoF for within-episode step-level correction.
It also relates to Least-to-Most prompting, which decomposes problems into easier sub-problems from the start. R-CoF uses decomposition after failure — only breaking things down when and where errors actually occur, rather than preemptively.