Stop writing prompts. Declare what you need, build it like software, and let a compiler find the best prompts and examples for you.
Traditional prompt engineering is like writing assembly code: you're hand-crafting exact text strings, testing them by feel, and hoping they work. When the model changes, your prompts break. When the task changes, you start over from scratch.
DSPy treats AI like software. You declare what each step should do ("given context and a question, produce an answer"), compose steps into modules, and then a compiler automatically discovers the best prompts, examples, and configurations. You never write a prompt string. You write a program, and DSPy figures out how to talk to the model.
This composition builds on two earlier ideas: prompt chaining ("Chain It") and automatic prompt engineering (APE). DSPy takes prompt chaining (composing multi-step pipelines) and APE-style automatic prompt optimization, and wraps them in a full programming framework with a compiler that optimizes everything together.
Declare what each step does: "context, question → answer." No prompt text — just input and output descriptions. The compiler handles the wording.
Compose signatures into pipelines. Chain-of-thought, retrieval, ReAct agents — all available as building blocks you snap together like LEGO.
Give the compiler examples and a quality metric. It automatically finds the best prompts, selects the best few-shot examples, and optimizes the whole pipeline together.
A worked example: a question-answering system that searches a knowledge base and then reasons about what it found.
No prompt text written. Just a semantic description of the transformation.
Two building blocks snapped together. Like calling functions — not crafting prompts.
| Manual prompt engineering | DSPy |
| --- | --- |
| Write prompt strings by hand | Declare signatures, never write prompts |
| Test by eye ("does this look right?") | Test with metrics (measured accuracy) |
| Breaks when models change | Portable across models (just recompile) |
| Each pipeline is a one-off | Modules are reusable building blocks |
| Optimization = intuition + trial and error | Optimization = automated compiler search |
Hand-writing prompts conflates two things: what you want (the semantic goal) and how to get it (the exact wording). DSPy separates them. You declare the what; the compiler discovers the how. This means when you switch models, you just recompile — the compiler finds new optimal prompts for the new model automatically.
The compiler also has a major advantage over manual tuning: it can search systematically across thousands of prompt/example combinations, finding configurations a human would never try. Small compiled models can even match the performance of expert-prompted large models.
Declare what each step should do. Compose steps into modules. Let a compiler find the best prompts and examples automatically. Programming for AI, not prompt-crafting for AI.
DSPy is the full realization of what APE started: automated prompt optimization. Where APE optimizes a single instruction, DSPy optimizes entire multi-step pipelines — prompts, examples, and module configurations all together.
It also takes a different approach from frameworks like LangChain. LangChain gives you tools to chain prompt calls together, but you still write the prompts yourself. DSPy replaces prompt-writing with declaration and compilation, a higher level of abstraction that trades fine-grained control for systematic optimization.