The Magic Words That Make AI Code Better

Deep domain knowledge, expressed through precise terminology, dramatically improves AI output. Here are the magic spells that actually work.

Antonio Agudo
IT Innovation Driver

I've spent months talking about how AI models aren't magic. They're statistics, pattern matching, next-token prediction at scale. But today we're going to talk about something that feels like magic.

People have started calling this metatextual literacy: the ability to shape AI output through precise language about language itself, to talk about the shape, style, and constraints of what you want, not just the topic.

Here's the observation that shouldn't be surprising but somehow still is: the quality of code you get from an AI agent depends less on how much context you stuff into the prompt and more on how precisely you can name what you want.

This drives people crazy. I've watched developers dismiss coding agents as "dice rolling" after a few frustrating attempts. They're not entirely wrong: if you approach it like you're ordering from a drive-through menu, you'll get drive-through-quality results. But here's the thing: the difference between useful and useless output isn't the model. It's whether you can speak the language of what you're building.

Naming Things Is Still the Hard Problem

This pattern extends beyond code.

A friend who works in architecture watched someone generate building concepts with Nano-Banana Pro. The amateur prompts produced generic "modern house with large windows." The professional prompts referenced Brutalist mass, Metabolist modularity, specific material palettes from Case Study House #22. Same model, radically different outputs. The difference wasn't technical knowledge of AI; it was fluency in architectural language.

In music production, a bedroom producer might ask Suno for "electronic drums, kind of industrial sounding." Someone who's actually worked in that space will specify 909 kicks, Simmons pads, gated reverb on the snare, specific dB reduction on the gate threshold. They're not prompting better because they understand transformers. They're prompting better because they can name what they want.

Deep domain knowledge, expressed through precise terminology, dramatically improves AI output quality. This isn't hyperbole. It's the difference between "make this code better" and "apply the strangler pattern with a thin adapter to keep the blast radius low."

The Code Problem: Everyone Thinks They Know What They Want

Developers should be better at this than anyone else. We've spent decades insisting that naming things is one of the two hard problems in computer science. We write style guides. We bikeshed variable names in code review. We understand that getUserData() and fetchUserProfile() communicate different intentions.

Yet somehow, when talking to an AI coding agent, that discipline evaporates. Developers who would never write a ticket saying "make the user thing work better" will prompt with "refactor this to be cleaner" and then act surprised when they get a generic response.

The evidence is everywhere. Browse any forum where people complain about AI coding tools:

  • "It rewrote my entire service layer when I just wanted to add one field"
  • "It invented three new abstractions I don't need"
  • "It broke existing tests that were working fine"
  • "It added dependencies I explicitly don't want"

Very often these are specification failures rather than model failures. The agent did exactly what an intelligent assistant would do when given vague instructions and incomplete constraints: it made reasonable assumptions based on common patterns, optimized for what usually works, and delivered something that technically satisfies the request.

The problem is "what usually works" isn't "what I specifically need in this specific codebase right now."

Magic Spells Are Just Precise Constraints

So here's what actually works: treat coding prompts like you're defining the search space for an optimization problem. Because that's largely what you're doing.

Every piece of terminology you use, every named pattern, every specific constraint narrows the solution space. It rules out entire categories of approaches the model might otherwise explore.

When you say "surgical change," you're not being poetic. You're nudging the model toward small diffs and away from broad refactors.

When you say "respect existing boundaries," you're eliminating all solutions that introduce cross-layer coupling.

Think of these as compile-time constraints for natural language. The model is going to search its training data for patterns that match your description. Your job is to describe the shape of the solution with enough precision that the search space doesn't include a bunch of stuff you definitely don't want.

Here's the toolbox. These aren't new concepts, they're patterns you already know from code review, refactoring books, and painful production incidents. The difference is that naming them explicitly in your prompts makes them actionable constraints instead of implicit tribal knowledge.


Note: The rest of this piece is a reference. You do not need all of it at once. Skim and steal the two or three spells that match what you are working on today.

1. Scope and Blast Radius

1.1 Low blast radius

What it does: Keeps changes confined to a small part of the codebase. Prevents wide-reaching refactors.

When to use it: Modifying legacy code, touching critical paths, or working near things you don't fully understand.

The spell:

Implement this with a low blast radius.
Only change code that is strictly needed for the new behavior.

Why it works: Without this, models default to "clean up while I'm here" behavior. Fine in greenfield projects. Catastrophic in production code where you don't know all the dependencies.


1.2 Surgical change / surgical edit

What it does: Forces the agent to find the minimal viable edit point. Discourages cascading modifications.

When to use it: You already know roughly where the change belongs and just need the implementation details right.

The spell:

Perform a surgical change in the existing implementation
rather than introducing new modules.

Why it works: Models trained on open source repos see a lot of "add new file" patterns. Sometimes you genuinely need that. Often you just need three lines in an existing function. This makes the latter explicit.


1.3 Change budget

What it does: Puts explicit numerical limits on scope, files touched, lines changed.

When to use it: When you want quantifiable boundaries. Useful when working with agents prone to over-engineering.

The spell:

Respect a change budget of at most 3 files
and about 80 changed lines total.

Why it works: Humans do this instinctively when reviewing PRs. We have internal alarms that go off when a "small bugfix" touches 47 files. Making it explicit pushes the agent into PR-review thinking instead of "fresh repo" thinking. The model will not hit these numbers exactly every time, but the constraint pulls its solution closer to what you want.


1.4 Localized impact

What it does: Keeps behavior and data flow changes contained. Prevents threading concerns across the application.

When to use it: Hot paths, critical modules, anywhere stability matters more than elegance.

The spell:

Keep the impact localized to this package.
Do not add dependencies from other layers.

Why it works: It's the difference between "this feature needs authentication" and "this feature should call the existing auth module." The former invites architectural changes. The latter enforces architectural boundaries.


1.5 Respect existing boundaries

What it does: Prevents shortcuts that bypass established layers, service layers, abstraction barriers, module boundaries.

When to use it: Your architecture already has clean separation and you want to keep it that way.

The spell:

Respect existing boundaries and layering.
Do not introduce cross-layer calls that break the current architecture.

Why it works: This stops the "oops I just had the controller write directly to the database because it was faster" problem. Models know common layering patterns. This tells them to detect and follow yours.


1.6 Vertical slice, not framework

What it does: Implements a thin end-to-end feature instead of inventing reusable infrastructure.

When to use it: When you need something that works today, not a generic platform for hypothetical future use cases.

The spell:

Implement this as a vertical slice, not a reusable framework.
Solve only this concrete use case.

Why it works: It's YAGNI as a constraint. You're explicitly ruling out "but what if we need to..." thinking, which is where half of all unnecessary complexity comes from.


2. Complexity and Abstraction Control

2.1 No new abstractions unless unavoidable

What it does: Stops premature abstraction. Prevents the agent from creating new base classes, interfaces, or patterns "for extensibility."

When to use it: Your codebase already has enough patterns. You don't need more.

The spell:

Prefer the current patterns.
Introduce no new abstractions unless unavoidable.

Why it works: Models are trained on a lot of "clean code" examples that love abstraction. Sometimes that's right. Often it's overkill. This sets the default to "use what exists."


2.2 Prefer existing patterns and conventions

What it does: Enforces local idioms over global best practices that might clash with the codebase style.

When to use it: Your project has its own conventions for error handling, logging, or configuration, and you want consistency.

The spell:

Prefer existing patterns and conventions in this module
over "best practice" rewrites.

Why it works: "Best practice" language is a warning sign. It usually means "I'm going to import my favorite pattern from another ecosystem regardless of whether it fits here." This blocks that.


2.3 YAGNI friendly

What it does: Cuts off speculative features before they start.

When to use it: Scope creep is likely. Requirements are fuzzy. You want the minimum.

The spell:

Keep the solution YAGNI friendly.
Implement only what is required for today's feature.

Why it works: You Ain't Gonna Need It is one of those principles everyone agrees with in theory and violates constantly in practice. Making it explicit in the prompt actually enforces it.


2.4 Minimal public surface area

What it does: Avoids expanding the public API. Keeps new helpers private or internal.

When to use it: You're designing new classes or modules and want to preserve future flexibility.

The spell:

Keep a minimal public surface area.
Make new helpers private or internal unless they must be public.

Why it works: Every public method is a contract. Every contract is technical debt. This minimizes both.


2.5 Keep reasoning local

What it does: Encourages cohesive functions and classes. Reduces tracing logic across multiple files.

When to use it: When you care about maintainability, onboarding, and not making future-you hate past-you.

The spell:

Keep reasoning local.
A reader should understand this feature by reading at most two files.

Why it works: Distributed reasoning is cognitive overhead. This caps it. Simple constraint, massive impact on code comprehension.


3. Safety, Tests, and Risk Management

3.1 No behavior change refactor

What it does: Separates cleanup from feature work. Forces explicit two-step changes.

When to use it: When you want to make code better without changing what it does: refactor first, then add behavior.

The spell:

First do a no behavior change refactor to clarify the code, with tests,
then add the new behavior.

Why it works: Mixed-mode changes are review nightmares. Separating them makes both easier to verify. This also surfaces whether your tests actually cover behavior or just structure.


3.2 Characterization tests first

What it does: Snapshots current behavior before modification. Creates a safety net in legacy code.

When to use it: Touching fragile code. Working in areas with poor test coverage. High-risk changes.

The spell:

Add characterization tests around the current behavior before changing it.
Then modify the code while keeping those tests passing.

Why it works: This is straight from Michael Feathers' "Working Effectively with Legacy Code." Agents respond well when you name the pattern. It transforms "scary legacy code" into "code with known behavior."
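
A minimal sketch of what the agent should produce first (the `legacy_discount` function here is invented for illustration): tests that pin down what the code does today, quirks included, before anything changes.

```python
# Hypothetical legacy function whose exact rules nobody remembers.
def legacy_discount(total, customer_years):
    if total > 100 and customer_years >= 2:
        return round(total * 0.9, 2)
    if customer_years >= 5:
        return round(total * 0.95, 2)
    return total

# Characterization tests: capture current behavior, not intended behavior.
# Note the second case documents a quirk (the first branch wins) rather
# than judging whether it is correct.
def test_characterize_legacy_discount():
    assert legacy_discount(150, 3) == 135.0
    assert legacy_discount(150, 5) == 135.0  # first branch wins, not the 5% one
    assert legacy_discount(50, 5) == 47.5
    assert legacy_discount(50, 1) == 50
```

Once these pass, any refactor that keeps them green provably preserves today's behavior.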


3.3 Test seam

What it does: Creates a stable boundary where behavior can be probed or mocked without massive refactoring.

When to use it: When code is hard to test as-is. You need isolation but can't afford a rewrite.

The spell:

Introduce a test seam so this logic can be tested in isolation,
but keep the rest of the module unchanged.

Why it works: Another Feathers concept. A seam is a place you can alter behavior without editing code. Dependency injection points, strategy patterns, configuration, all seams. Naming it makes it a specific architectural requirement.
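
One common shape for a seam, sketched here with an invented `is_expired` function: inject the dependency (the clock) with a default that preserves production behavior, so tests can substitute a fake without any further refactoring.

```python
import time

# The clock is now a seam: production callers pass nothing and get real
# time; tests inject a fake and get deterministic behavior.
def is_expired(issued_at, ttl_seconds, now=time.time):
    return now() - issued_at > ttl_seconds

# Production call site is unchanged: is_expired(ts, 3600)
# Test call sites freeze time through the seam:
assert is_expired(issued_at=0, ttl_seconds=60, now=lambda: 61) is True
assert is_expired(issued_at=0, ttl_seconds=60, now=lambda: 30) is False
```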


3.4 Guardrails and failure modes

What it does: Forces consideration of error paths, not just happy paths.

When to use it: When the feature touches IO, payments, external APIs, or anything where failure is expensive.

The spell:

Add simple guardrails and clear failure modes.
Fail safe, log clearly, do not crash the process.

Why it works: Default AI code tends toward optimistic execution. This adds defensive pessimism as a requirement.
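
A sketch of the shape this spell tends to produce (the order-enrichment scenario and field names are hypothetical): the risky call is wrapped, failure degrades to a safe default, and the incident is logged instead of crashing the process.

```python
import logging

logger = logging.getLogger("orders")

def enrich_order(order, lookup):
    # The lookup touches an external system, so failure is expected
    # eventually. Fail safe: log with context, degrade to a default.
    try:
        order["shipping_estimate"] = lookup(order["zip"])
    except Exception:
        logger.exception("shipping lookup failed for order %s", order.get("id"))
        order["shipping_estimate"] = None  # safe default, process survives
    return order

def flaky_lookup(zip_code):
    raise TimeoutError("upstream timed out")

result = enrich_order({"id": 7, "zip": "10115"}, flaky_lookup)
```

The order still flows through the system with a `None` estimate instead of taking the request path down with it.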


3.5 Feature flagged path

What it does: Makes new behavior optional and reversible. Enables gradual rollout.

When to use it: When you want easy rollback, A/B testing, or risk mitigation.

The spell:

Implement the new behavior behind a feature flag,
keeping the old path as the default.

Why it works: It's not just about the flag, it's about forcing the agent to design for coexistence. Old path and new path both need to work. That constraint shapes the entire implementation.
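
The coexistence constraint looks roughly like this in code (the flag store and pricing functions are invented; in a real codebase the flag would live in your config system or flag service):

```python
# Hypothetical flag store, off by default.
FLAGS = {"new_pricing": False}

def price_legacy(qty):
    return qty * 10

def price_new(qty):
    # New behavior: bulk discount above 10 units.
    return qty * 10 if qty <= 10 else qty * 9

def price(qty):
    # The flag only routes; it never rewrites the old path.
    if FLAGS["new_pricing"]:
        return price_new(qty)
    return price_legacy(qty)

assert price(20) == 200        # default: old behavior
FLAGS["new_pricing"] = True
assert price(20) == 180        # flipped: new behavior, instantly reversible
```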


4. Integration and Architecture Safety

4.1 Thin adapter

What it does: Isolates external system details. Prevents SDK types from leaking through your application.

When to use it: Integrating with external APIs, queues, databases, third-party services.

The spell:

Use a thin adapter around the external API
so that the rest of the codebase only sees our own types.

Why it works: Dependency inversion, anti-corruption layer, whatever you want to call it, the concept is keeping your domain clean. This makes it a structural requirement instead of an aspiration.
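
Sketched with an invented CRM payload: the adapter is the only place that knows the vendor's field names, and everything downstream sees only your own type.

```python
from dataclasses import dataclass

# Our own domain type: the only shape the rest of the codebase sees.
@dataclass
class Customer:
    id: str
    email: str

# Stand-in for a third-party CRM SDK call, with vendor-specific field
# names that should never leak past the adapter.
def fake_crm_fetch(customer_id):
    return {"CustomerID": customer_id, "EmailAddr": "a@example.com"}

def fetch_customer(customer_id, crm_call=fake_crm_fetch):
    # Thin adapter: translate the vendor payload into our domain type.
    raw = crm_call(customer_id)
    return Customer(id=raw["CustomerID"], email=raw["EmailAddr"])

customer = fetch_customer("c-42")
```

If the vendor renames `EmailAddr` tomorrow, one function changes; the rest of the codebase never notices.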


4.2 Strangler pattern

What it does: Lets new implementation coexist with old code. Routes traffic gradually. Caps risk.

When to use it: When replacing a subsystem, doing heavy refactors, or running migrations.

The spell:

Apply a strangler pattern. Keep the old implementation,
route only the new use case to the new code,
and keep the interface stable.

Why it works: Named after Martin Fowler's pattern. It's the opposite of big-bang rewrites. Agents understand incremental migration when you give them the vocabulary for it.
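
The routing facade at the heart of the pattern can be sketched in a few lines (handlers and paths here are hypothetical): the interface stays stable while the set of migrated routes grows one entry at a time.

```python
def legacy_handler(request):
    return f"legacy:{request['path']}"

def new_handler(request):
    return f"new:{request['path']}"

# Grows as confidence grows; shrinking it is the rollback plan.
MIGRATED_PATHS = {"/reports"}

def handle(request):
    # Strangler facade: callers see one stable entry point while
    # traffic moves to the new implementation route by route.
    if request["path"] in MIGRATED_PATHS:
        return new_handler(request)
    return legacy_handler(request)

assert handle({"path": "/reports"}) == "new:/reports"
assert handle({"path": "/orders"}) == "legacy:/orders"
```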


4.3 Keep it in process

What it does: Prevents over-engineering with new services, queues, or infrastructure.

When to use it: When agents love suggesting microservices and you don't need microservices.

The spell:

Keep it in process. Do not introduce new services or infrastructure.
This change must stay within the current application.

Why it works: Distributed systems are complex. This rules them out. Sometimes you need them. Most of the time you don't. Be explicit.


4.4 Implementation detail, not new contract

What it does: Keeps new logic private. Preserves future flexibility to change it.

When to use it: When you're uncertain about the design, requirements might shift, or you want room to iterate.

The spell:

Treat the new logic as an implementation detail, not a new public contract.
Avoid exporting it beyond this module.

Why it works: Public APIs are commitments. Private implementation is negotiable. This sets the visibility before any code is written.


How to Actually Use This

Here's a compact pattern you can drop into any coding agent prompt:

Implement X in the existing codebase as a vertical slice with a low blast radius.
Respect a change budget of at most 3 files and about 80 changed lines.
Prefer existing patterns, no new abstractions unless unavoidable,
and keep reasoning local.
Add characterization tests first and keep the change backwards compatible.

Adjust based on context. High-risk change? Add "feature flagged path" and "guardrails and failure modes." External APIs? Add "thin adapter." Legacy code? Use "surgical change" and "respect existing boundaries."

The point isn't to memorize all twenty patterns. The point is to recognize that you already know these concepts, you've just been leaving them implicit. Making them explicit transforms vague dissatisfaction with AI output into actionable constraints.

The Learning Curve Is Real

None of this is new. Surgical changes, YAGNI, strangler patterns, test seams, these are established ideas from refactoring books, design pattern catalogs, and painful production incidents.

What's new is how much large language models amplify their usefulness. An experienced developer can look at code and intuitively know "this change should touch three files, max." That intuition is accumulated pattern matching from years of code review. You can't explain it to a junior developer in a single sentence.

But you can explain it to a language model in a single sentence. "Respect a change budget of at most 3 files." That's the entire intuition, compressed into a constraint the model can actually optimize against.

This is why developers who dismiss coding agents as useless are often the ones who'd be best positioned to use them well. They have the domain knowledge. They can name the patterns. They just haven't realized that the naming itself is the interface.

The learning curve isn't about how transformers work or how context windows scale. It's about articulating what you already know with enough precision that a model can act on it. That's a different skill than writing code. Closer to writing good tickets, conducting effective code reviews, or mentoring junior developers.

If you think coding agents are dice rolling, you're probably right; you just haven't realized you can load the dice. The magic words already exist. You just have to say them.
