Don't just say "try again." Find the exact step that went wrong, isolate it as a simpler problem, fix it, and rebuild. Surgical correction, not blind retrying.
Here's a surprising finding: when AI gets a multi-step problem wrong and you simply ask "try again," performance actually gets worse with each retry. Vague feedback like "that's wrong, fix it" causes the model to second-guess correct steps while failing to fix the actual error.
Recursive Chain-of-Feedback takes a targeted approach: find the exact step that went wrong, extract it as its own simpler problem, solve that smaller problem (which is much easier to get right), then plug the corrected answer back into the full solution. If the sub-problem is still too hard, decompose it further. It's self-correction that actually works.
This composition builds on:

- Check Your Work
- Loop Until Done

R-CoF combines self-evaluation (finding errors) with iterative improvement, but adds a crucial element: recursive decomposition. Instead of retrying the whole problem, it isolates and fixes the specific broken piece.
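A minimal sketch of one correction pass, assuming a hypothetical `llm(prompt)` helper that returns the model's reply as text; the prompt wording and the `solve_sub` hook are illustrative, not a fixed API:

```python
def rcof_single_pass(problem: str, llm, solve_sub=None) -> str:
    """One R-CoF correction pass: solve, locate the faulty step, fix it in
    isolation, then merge the fix back into the solution."""
    # How the extracted sub-problem gets answered; by default it is just asked directly.
    solve_sub = solve_sub or (lambda sub: llm(f"Answer this question on its own:\n{sub}"))

    # 1. Ask for a numbered solution so individual steps can be referenced.
    solution = llm(f"Solve step by step, numbering each step:\n{problem}")

    # 2. Self-evaluation: have the model name the single broken step and
    #    restate it as a standalone question.
    verdict = llm(
        "Review the solution below. If every step is correct, reply CORRECT. "
        "Otherwise restate the one incorrect step as a standalone question.\n"
        f"Problem: {problem}\nSolution:\n{solution}"
    )
    if verdict.strip().startswith("CORRECT"):
        return solution

    # 3. Solve the extracted sub-problem -- a smaller, easier question.
    sub_answer = solve_sub(verdict)

    # 4. Plug the corrected step back in and update only what depends on it.
    return llm(
        f"Problem: {problem}\nOriginal solution:\n{solution}\n"
        f"The faulty step has this corrected answer: {sub_answer}\n"
        "Keep the correct steps and update anything that depends on the fix."
    )
```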
With blind retrying:

- Attempt 1: 70% correct
- "Try again" → Attempt 2: 65% correct
- "Try again" → Attempt 3: 58% correct

Performance degrades because the model overthinks correct steps while missing the real error.

With targeted correction (R-CoF):

- Attempt 1: 70% correct
- Find error → Fix step → 85% correct
- Find error → Fix step → 92% correct

Performance improves because each correction is targeted and precise.
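To make the contrast concrete, here is roughly what the two kinds of feedback look like as prompts (the wording and the step number are illustrative):

```python
# Vague feedback: the model reworks everything and may break steps that were right.
vague_retry = "That's wrong. Try again."

# Targeted feedback: names the broken step and asks only for that fix.
targeted_fix = (
    "Step 3 of your solution is incorrect. "
    "Answer that step as a standalone question first, "
    "then update step 3 and anything that depends on it, leaving the other steps unchanged."
)
```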
Question: "A store sells apples for $2 each. John buys 5 apples and pays with a $20 bill. How much change does he get?"
Suppose the model's solution goes wrong at the step that totals the cost of the apples. R-CoF extracts just that step as its own question: "What is 5 × 2?" This simpler question is much easier to get right than retrying the whole problem, and the corrected value then feeds back into the final change calculation.
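A hypothetical trace of that extraction on the apples problem (the faulty first attempt is invented for illustration):

```python
# Hypothetical first attempt, wrong only at step 2:
#   Step 1: John buys 5 apples at $2 each.
#   Step 2: Total cost = 5 x 2 = $12      <- the single broken step
#   Step 3: Change = 20 - 12 = $8
#
# R-CoF extracts step 2 as its own question and plugs the answer back in:
sub_problem = "What is 5 x 2?"
sub_answer = 10                    # the model's answer to the isolated question
change = 20 - sub_answer           # downstream step recomputed: $10
```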
What if the sub-problem is also too hard? The technique applies itself recursively. Imagine a complex physics problem where the error is in a calculus step, and the calculus sub-problem has an algebra error. R-CoF would:

- Extract the faulty calculus step as its own sub-problem
- Discover that this sub-problem also goes wrong, at an algebra step, and extract that step as a sub-sub-problem
- Solve the algebra, plug the result back into the calculus step, then plug the corrected calculus back into the physics solution
Each level of recursion makes the problem simpler, until it's easy enough to solve correctly.
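Putting that together, the recursion is just the single pass from the earlier sketch calling itself on the extracted sub-problem via the `solve_sub` hook. A minimal sketch, with an illustrative `max_depth` guard against runaway recursion:

```python
def rcof(problem: str, llm, max_depth: int = 3) -> str:
    """Recursive Chain-of-Feedback: if the extracted sub-problem is still too
    hard, the same correction pass is applied to it, one level deeper."""
    if max_depth == 0:
        # Depth guard: at the bottom, just answer the (by now much simpler) question.
        return llm(f"Answer this question:\n{problem}")
    return rcof_single_pass(
        problem,
        llm,
        solve_sub=lambda sub: rcof(sub, llm, max_depth - 1),
    )
```

Each recursive call works on a strictly smaller question than the one above it, which is exactly the property that makes the bottom-level fixes reliable.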
The insight is that AI errors are usually local, not global. When a 10-step solution goes wrong, typically one or two specific steps contain mistakes while the rest are fine. Retrying the whole thing risks breaking the good steps. Targeted correction fixes only what's broken.
Making the sub-problem simpler is equally important. AI is much more reliable on easy, focused questions than on complex multi-step ones. By extracting "What is 5 × 2?" from a larger problem, you're playing to the model's strengths.
Find the exact step that's wrong. Extract it as a simpler problem. Solve it. Plug the fix back in. If the sub-problem is still too hard, go deeper. Surgical precision instead of blind retrying.
R-CoF is a more structured approach to the same goal as Reflexion: improving through self-correction. Reflexion works across entire episodes (attempt, reflect, retry), while R-CoF works within a single solution (find the broken step, fix just that piece). They can even be combined: Reflexion for episode-level learning, R-CoF for within-episode step-level correction.
It also relates to Least-to-Most prompting, which decomposes problems into easier sub-problems from the start. R-CoF uses decomposition after failure — only breaking things down when and where errors actually occur, rather than preemptively.