Context Engineering

Get the right actions from your Agent — not just the right answers — by giving it the right files to work from.

Context mindset

Move repeatable instructions out of chat

You can get a long way with your Agent by typing questions into the chat. At some point, though, you'll find yourself retyping the same instructions every session: the same definitions, the same data caveats, the same notes about how a report should read.

That repetition is the signal that you're ready for context engineering. By the end of this page, you'll know which files make up context engineering, when to reach for each, and how to keep them lean.

What should my Agent always know?

What instructions should live in a reusable plan?

What mistakes should the Agent avoid every time?

Context engineering is the practice of curating what your Agent sees, and in what order, so it produces the output you want without being told everything from scratch each time.

Context, as Phil Schmid puts it, is "everything the model can see before it answers" — not just the sentence you typed, but the instructions, examples, reference data, and prior turns around it. You shape that by putting durable instructions into files your Agent loads and references, rather than packing them into a single prompt.

What this looks like in practice

Consider the customer-experience digest a CX lead wants every month.

Without context engineering

Every month starts with the same long chat instruction.

“Build me the May customer experience digest. Use the system_session_prompt and unstructured_analysis tables. Compare to April. Pull three verbatim quotes for each top theme. Write it as a full analysis, not bullet points. Don't lean on the chart to do the storytelling…”

With context engineering

The same job lives in a small set of files: a plan file and a failure-modes file.

“Execute the customer experience digest plan for May.”

The instructions no longer depend on whoever happens to run the chat, so the digest comes out the same way every month. That contrast is the whole point. Below is what those files actually are.

From prompt engineering to context engineering

Prompt engineering was about wording your way to the right answer. It lived in the chat box — phrasing, framing, the occasional "imagine you're an expert." Context engineering is the successor to that idea, and it works at a different level: getting the right actions from an agent that does work on your behalf, not just the right text back.

This isn't only our framing; it's where the field landed through 2025. Shopify's Tobi Lütke described context engineering as "the art of providing all the context for the task to be plausibly solvable by the LLM." Andrej Karpathy called it "the delicate art and science of filling the context window with just the right information for the next step." The practical upshot, in LangChain's Harrison Chase's words: "most agent failures are not model failures anymore, they are context failures." When an answer comes back wrong, the first place to look is what your Agent could and couldn't see — not the model.

In practice, less is more. You don't need to talk your Agent into being good at its job, and you don't need to repeat yourself.

Role-playing instructions like "imagine you're an expert" mostly shape tone, not correctness, so we skip them. A well-built set of files means your Agent stops relearning your vocabulary and your data on every run.

When it's worth the effort

None of this is required on day one. Your Agent has a low floor: ask a plain question and you'll get something useful immediately. Context engineering is how you reach the higher ceiling, and it earns its keep specifically when a process is repeatable — a monthly report, a weekly digest, an analysis you'll run again with new dates.

Use chat when the question is one-off

Ask directly when you need a quick answer, a first look, or an exploratory thread.

Use context files when the process repeats

Move durable instructions into files when you're producing a monthly report, weekly digest, recurring analysis, or handoff another person needs to run.

Keep the context small

A model works within a finite attention budget, and as its working memory fills, recall degrades — Anthropic calls this "context rot." Good context engineering uses the smallest set of high-signal material that does the job.

The building blocks

Almost everything your Agent works from is, structurally, just a file. The industry has converged on a common shape for these — plain Markdown rather than JSON or YAML, scoped from broad to narrow, versioned over time, and editable before, during, and after a run. You don't need all of them to start. Reach for three first, in this order, and grow into the rest.

First comes a plan file: the analysis instructions for a report you'll run more than once. Next, a failure-modes file, added the first time you notice your Agent defaulting to a shape you don't want. Then an agent file, extracted once you realize you're typing the same business context into every plan. The rest are supporting players you add when a specific need shows up.

Start here

  1. Plan file

Use when you'll run the same analysis more than once.

  2. Failure-modes file

Use when output keeps defaulting to a shape you don't want.

  3. Agent file

Use when you're repeating the same business context across plans.

| File | What it holds | When to reach for it | Scope |
| --- | --- | --- | --- |
| Plan file | The full instructions for a report: objective, fields, join keys, output structure. | You'll run the same analysis more than once. | Per report |
| Failure-modes file | Behaviors to block: bullet overuse, charts carrying the story, drifting into dashboard aesthetics. | Output keeps defaulting to a shape you don't want. | Reusable across reports |
| Agent file | Standing business context the Agent should always assume: key terms, what matters, and the orientation it would otherwise ask for every time. | You're duplicating the same context into every plan, or the Agent keeps relearning your business. | Per workspace |
| Report execution guidelines | How a report should look: visualization choices, formatting, output expectations. | You want consistent presentation across runs. | Per report or workspace |
| Schema file | What your data fields mean. | The Agent needs to read your tables correctly. | Per workspace |
| Ontology file | Your domain vocabulary: the specific language your organization uses. | Your terminology is non-obvious or easy to misread. | Per workspace |
| Skill | A named, invocable file that can carry its own reference attachments, loaded only when needed. | A process is worth packaging and reusing by you, your team, or the org. | Personal / team / org |
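
To make the "broad to narrow" scoping concrete, here is a minimal sketch that orders a handful of context files by scope before they are handed to an agent. It is an illustration only, not a description of how your Agent loads files: the file names, the `ContextFile` type, and the two-level `SCOPE_RANK` mapping are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: a smaller rank means a broader scope. Only the two scopes
# used in this example are ranked; a real setup may have more levels.
SCOPE_RANK = {"workspace": 0, "report": 1}


@dataclass(frozen=True)
class ContextFile:
    name: str   # hypothetical file name
    scope: str  # "workspace" or "report"
    text: str   # file contents


def order_broad_to_narrow(files: list[ContextFile]) -> list[ContextFile]:
    """Return files sorted so workspace-wide context comes before per-report context."""
    # sorted() is stable, so files with the same scope keep their given order.
    return sorted(files, key=lambda f: SCOPE_RANK[f.scope])


files = [
    ContextFile("plan.md", "report", "Objective, fields, join keys, output structure."),
    ContextFile("schema.md", "workspace", "What each data field means."),
    ContextFile("ontology.md", "workspace", "Domain vocabulary."),
]

ordered = order_broad_to_narrow(files)
print([f.name for f in ordered])  # ['schema.md', 'ontology.md', 'plan.md']
```

The point of the ordering is that per-workspace definitions are in place before the per-report plan that relies on them.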

A useful way to picture the agent file: your Agent is a brilliant new analyst with no memory of yesterday. Every conversation starts cold (Karpathy likens it to the amnesia in Memento), but the analyst will follow written standing instructions faithfully when you give them. The agent file is the standing brief you hand that analyst each morning so it doesn't have to relearn your business from scratch. Keep it lean: frontier models follow on the order of 150 to 200 instructions reliably, so an agent file should hold as few as possible, ideally only the ones that are universally true for your workspace.

An ontology, for its part, is quick to build: have your Agent read your website and past conversations and assemble the vocabulary for you.
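
One way to keep an eye on the instruction budget mentioned above is to count the instruction-like lines in an agent file before you upload it. The sketch below is a rough heuristic under stated assumptions: it treats each Markdown bullet or numbered item as one instruction, and the ceiling of 150, the function names, and the sample file are hypothetical rather than anything built into your Agent.

```python
import re

# Assumption: one bullet or numbered list item counts as one instruction.
INSTRUCTION_LINE = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")

# Assumption: use the low end of the 150-to-200 range as a cautious ceiling.
INSTRUCTION_BUDGET = 150


def count_instructions(markdown_text: str) -> int:
    """Count lines that look like Markdown list items."""
    return sum(
        1 for line in markdown_text.splitlines() if INSTRUCTION_LINE.match(line)
    )


def is_over_budget(markdown_text: str, budget: int = INSTRUCTION_BUDGET) -> bool:
    """Return True when the file holds more list items than the budget allows."""
    return count_instructions(markdown_text) > budget


# Hypothetical agent file, written only for this example.
sample_agent_file = """\
# Agent file

## Key terms
- "Digest" means the monthly customer experience report.
- "Theme" means a recurring topic in customer feedback.

## Standing rules
1. Compare each month to the previous month.
2. Write full prose, not bullet points.
"""

print(count_instructions(sample_agent_file))  # 4
print(is_over_budget(sample_agent_file))      # False
```

A count like this is only a proxy, since a single bullet can pack in several instructions, but it is enough to notice when a file has started to sprawl.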

Failure patterns

Prompt stuffing

Cramming every instruction into the chat every time, instead of moving durable instructions into files.

Instruction sprawl

Letting an agent file grow into a niche encyclopedia no one keeps current — the failure the 150-to-200-instruction limit is meant to prevent.

Cargo-cult plans

Running a complex plan file you didn't build and can't maintain, the "wall of text" failure described at the bottom of this page.

Full templates (a real plan-file skeleton, an example failure-modes file, a sample report execution guidelines file) will live in the Context Files Reference (coming soon), which walks through each file type field by field rather than giving the overview here.

Working with these files

A few patterns make context engineering smoother in practice.

Practice 1

Iterate the plan before you run it

Spend ten or fifteen minutes reading the draft your Agent produces and asking for a round or two of revisions. Catching a wrong join key or a missing field upfront is far cheaper than discovering it mid-run.

Practice 2

Show, don't tell

Examples are the highest-leverage instruction you can give — a worked "here's what good looks like" steers output far more reliably than describing it in the abstract. A failure-modes file blocks the patterns you don't want; a couple of example outputs pin down the pattern you do.

Practice 3

Structure plans so updates only touch the top

Define your key terms in the first sections, then reference them by name everywhere else. When something changes, you edit the definition once and every reference picks up the new value. It works like variables in algebra: set the value once and it carries through each expression that uses it.
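
The define-once idea can be pictured with simple placeholder substitution. This is a minimal sketch, not how your Agent processes plan files: the term names (`report_month`, `comparison_month`, `quotes_per_theme`), the plan wording, and the use of Python's `string.Template` are illustrative assumptions.

```python
from string import Template

# Definitions live at the top. These term names are hypothetical.
definitions = {
    "report_month": "May",
    "comparison_month": "April",
    "quotes_per_theme": "three",
}

# The body refers to terms by name only and never hard-codes a value.
plan_body = Template(
    "Build the $report_month customer experience digest.\n"
    "Compare $report_month to $comparison_month.\n"
    "Pull $quotes_per_theme verbatim quotes for each top theme.\n"
)


def render_plan(defs: dict[str, str]) -> str:
    """Fill every named reference in the plan body from the definitions."""
    # substitute() raises KeyError if the body uses a term that is not defined.
    return plan_body.substitute(defs)


may_plan = render_plan(definitions)

# Next month, only the definitions change; the body is untouched.
june_plan = render_plan(
    {**definitions, "report_month": "June", "comparison_month": "May"}
)
print(june_plan)
```

Because the body never repeats a value, a monthly update touches two definitions and nothing else.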

Practice 4

Let your Agent edit the files for you

You don't need to hand-edit a plan to change the month or pull out a deprecated section. Tell your Agent what to change ("change this plan file to be May," "remove all references to the short-file output") and it rewrites the file in place.

Practice 5

Keep heavy output in files, not the chat

For anything that generates a lot of text, instruct your Agent to offload long summaries to a working file and assemble the report from that at the end. The thread stays readable, the intermediate work stays auditable, and you spend less of that finite attention budget on material you don't need in front of you.
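
As a rough picture of the offloading pattern, the sketch below appends long intermediate summaries to a working file and assembles the report from that file at the end. It assumes a local Markdown file as the working store; the file name `working_notes.md`, the helper names, and the section titles are hypothetical, and your Agent's own file handling may differ.

```python
import tempfile
from pathlib import Path


def append_section(working_file: Path, title: str, body: str) -> None:
    """Append one long intermediate summary to the working file."""
    with working_file.open("a", encoding="utf-8") as handle:
        handle.write(f"## {title}\n\n{body.strip()}\n\n")


def assemble_report(working_file: Path, report_title: str) -> str:
    """Build the final report from everything saved in the working file."""
    sections = working_file.read_text(encoding="utf-8").strip()
    return f"# {report_title}\n\n{sections}\n"


# A temporary directory stands in for wherever the working file would live.
with tempfile.TemporaryDirectory() as scratch:
    working = Path(scratch) / "working_notes.md"
    append_section(working, "Theme 1", "Long summary of the first theme...")
    append_section(working, "Theme 2", "Long summary of the second theme...")
    report = assemble_report(working, "Customer experience digest")

print(report)
```

The conversation only ever needs a short confirmation per section, while the full text accumulates in a file that can be reviewed later.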

Practice 6

Move work cleanly between threads

When a plan is ready to run, ask your Agent to write the short execution preamble you should paste into the new thread alongside it. When you want to hand a piece of work to a colleague, ask for a handoff file capturing what's been learned in the thread; they upload it in a new chat and pick up where you left off.

What's available now, and what's coming

Toolkit status

Most of the toolkit is usable today

Plan files, schema and ontology files, failure-modes files, uploading a reference file your Agent always consults, and having your Agent edit any of these in place are all usable today.

A formal agent.md, a persistent, per-workspace instructions file your Agent always references automatically, is in progress. It's modeled on the way coding agents use a standing instructions file (for example, Claude Code's CLAUDE.md), and it aligns with AGENTS.md, an emerging cross-industry standard used by more than 60,000 projects and supported across tools like OpenAI Codex and Cursor.

Today you can get most of the effect by uploading a reference file and telling your Agent to consult it. A list skills command is planned so you can discover available skills, mirroring the way you can already ask your Agent to list dimensions. Invoking a skill by a short command is the direction skills are headed.

A note on the "wall of text"

The first time you open a finished plan file, it can look intimidating — a long document dense with instructions. That reaction is normal, and worth naming.

A plan feels much easier when you built it yourself, because you remember the reasoning behind each section; handed a complete one cold, anyone sees a wall of text. The fix isn't to study the system deeply before you start. It's to make a couple of passes through the file, recognize the few moving parts that actually matter, and let your mental map fill in.

Picture your Agent as a market-research agency you've hired for a serious, bespoke project. You wouldn't turn them loose without a clear plan of what they're going to deliver. Context engineering is how you write that plan down once — and then never have to write it again.

Related

  • Ask vs Plan — when to use a single-thread question vs. a plan file.
  • Your First Session — if you're just opening a workspace and haven't asked anything yet.
  • Context Files Reference (coming soon) — full templates and a field-by-field walkthrough of each file type.

