The Context Engineering Stack: How to Structure Data for AI Agents

Hector Pettersen | March 31, 2026 | 5 min read

There’s a growing gap between teams that use AI agents and teams that get real value from them. The difference almost never comes down to which model they’re using. It comes down to what they’re feeding it.

Most teams dump a mix of documents, scraped web pages, and internal notes into a prompt and hope the agent figures it out. Sometimes it does. More often, it produces output that sounds coherent but is built on a shaky foundation of unverified claims, stale data, and missing context.

Context engineering is the discipline of structuring data so that AI agents can reliably reason with it. It’s not prompt engineering. It’s the layer underneath — the data architecture that determines whether your prompts even have the right material to work with.

The four layers

A well-structured context stack has four distinct layers. Each serves a different purpose, and collapsing them together is where most implementations go wrong.

Layer 1: Raw facts. These are verifiable, timestamped data points. “Competitor X raised $18M Series A on March 3, 2026.” “They have 47 open positions on LinkedIn as of this week.” “Their pricing page shows three tiers starting at $49/month.” No interpretation, no spin. Just what’s observable and when it was observed.

Layer 2: Structured interpretations. This is where facts get connected. “Competitor X’s 12 new enterprise AE postings, combined with their recent Series A, suggest an upmarket push.” The key difference from layer 1: these are explicitly labeled as interpretations, not facts. An agent that knows it’s working with an inference treats it differently than one that thinks it has confirmed data.

Layer 3: Confidence metadata. Every claim — whether fact or interpretation — carries a confidence indicator. “Confirmed: scraped from public pricing page, verified March 28.” “Inferred: based on job posting patterns, moderate confidence.” “Unverified: mentioned in a single industry blog post.” This lets the agent weight its responses appropriately instead of treating everything with equal certainty.

Layer 4: Source provenance. Where did each data point come from? A company’s own website, a news article, a job board scrape, an SEC filing? Source provenance lets your team trace any agent output back to its origin. When your sales rep uses an agent-generated competitive brief, they can check whether a specific claim came from a reliable source or a random forum post.
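
One way to picture the four layers is as a single record in which each layer gets its own field. The sketch below is illustrative only: the `ContextClaim` class, its field names, and the `render` helper are assumptions made for this example, not a schema the article prescribes. The year on the observation date is also assumed from the article's publication date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Literal


@dataclass(frozen=True)
class ContextClaim:
    """One data point, with each of the four layers kept in its own field."""
    claim: str                                    # the statement itself
    kind: Literal["fact", "interpretation"]       # layers 1 and 2, labeled explicitly
    confidence: Literal["confirmed", "inferred", "unverified"]  # layer 3
    source: str                                   # layer 4: where it came from
    observed_on: date                             # when it was observed or verified

    def render(self) -> str:
        """Format the claim as one labeled line for an agent's context."""
        return (
            f"[{self.kind} | {self.confidence} | {self.source} | "
            f"{self.observed_on.isoformat()}] {self.claim}"
        )


pricing = ContextClaim(
    claim="Pricing page shows three tiers starting at $49/month",
    kind="fact",
    confidence="confirmed",
    source="public pricing page",
    observed_on=date(2026, 3, 28),  # year assumed for illustration
)
print(pricing.render())
```

Keeping the label, confidence, and source as separate fields means none of them can be silently dropped when the claim text is edited.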

Why the layers matter

Without this separation, you get what most teams have today: a blob of text where facts, opinions, and outdated information are all mixed together. The agent processes it, produces output, and nobody can tell which parts are grounded in reality and which parts are the model filling gaps with plausible-sounding guesses.

Consider two scenarios. In the first, you give an agent a paragraph about a competitor: “Acme Corp is a fast-growing B2B SaaS company that recently raised funding and is expanding its enterprise offering.” The agent will use this confidently. But “fast-growing” is vague, “recently” could mean last month or last year, and “expanding its enterprise offering” could mean anything from a new pricing tier to a complete platform rebuild.

In the second scenario, the agent gets structured data: Acme's headcount grew from 34 to 52 in Q1 2026 (fact, confirmed, LinkedIn scrape). Acme posted 8 enterprise sales roles in March (fact, confirmed, job board). Its revenue is estimated at $4–6M ARR (interpretation, moderate confidence, based on headcount and pricing). The agent now has specific, weighted, traceable data to reason with. The output quality difference is dramatic.

{
  "claim": "Acme Corp posted 8 enterprise sales roles in March 2026",
  "type": "fact",
  "confidence": "confirmed",
  "source": "job board"
}

The format question

Structured context for AI agents should be machine-readable. JSON works well because major models handle it reliably, and it naturally supports nested structures, arrays, and key-value pairs that map cleanly to the four-layer model.

The structure doesn’t need to be complex. For each competitor or market signal, you need: the claim itself, whether it’s a fact or interpretation, a confidence level, the source, and a timestamp. That’s five fields per data point. Nest them under logical categories — product, team, funding, market position — and you have a context file that any agent can work with effectively.
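
The five-field structure described above can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the `build_context` helper, the `category` key, the exact field names, and the collection dates are hypothetical, and the claims are taken from the Acme example earlier in the article.

```python
import json
from collections import defaultdict

REQUIRED_FIELDS = ("claim", "type", "confidence", "source", "timestamp")


def build_context(records):
    """Group flat records under their category; reject any missing a field."""
    grouped = defaultdict(list)
    for record in records:
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"record missing fields: {missing}")
        # Copy only the five fields so the category is not repeated per record.
        grouped[record.get("category", "uncategorized")].append(
            {f: record[f] for f in REQUIRED_FIELDS}
        )
    return dict(grouped)


records = [
    {
        "category": "team",
        "claim": "Acme Corp posted 8 enterprise sales roles in March 2026",
        "type": "fact",
        "confidence": "confirmed",
        "source": "job board",
        "timestamp": "2026-03-31",  # hypothetical collection date
    },
    {
        "category": "market_position",
        "claim": "Revenue estimated at $4-6M ARR",
        "type": "interpretation",
        "confidence": "moderate",
        "source": "estimate from headcount and pricing",
        "timestamp": "2026-03-31",  # hypothetical collection date
    },
]

context = build_context(records)
print(json.dumps(context, indent=2))
```

Rejecting incomplete records at build time is a design choice in this sketch: a record without a source or timestamp never reaches the agent in the first place.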

PDF reports, slide decks, and Google Docs are formats designed for human consumption. They contain valuable information, but an agent can’t reliably distinguish between a header, a footnote, and a key finding. The format matters because it determines whether the agent is reasoning with data or guessing from text.

Common mistakes

Mixing facts and opinions in the same field. “Competitor X has strong enterprise traction and recently raised $20M.” The funding amount is a fact. “Strong enterprise traction” is an opinion. When these live in the same string, the agent treats both with equal weight.
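
As a sketch of the fix, the mixed sentence above can be split into one record per claim, each with its own label. The field names and the `split_by_type` helper are assumptions for illustration, not part of any particular tool.

```python
# The mixed string from the text, split into one record per claim.
records = [
    {"claim": "Competitor X raised $20M", "type": "fact"},
    {"claim": "Competitor X has strong enterprise traction", "type": "interpretation"},
]

ALLOWED_TYPES = {"fact", "interpretation"}


def split_by_type(records):
    """Return (facts, interpretations); raise if a record lacks a valid label."""
    facts, interpretations = [], []
    for record in records:
        kind = record.get("type")
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unlabeled or unknown type: {record!r}")
        (facts if kind == "fact" else interpretations).append(record["claim"])
    return facts, interpretations


facts, interpretations = split_by_type(records)
print("Facts:", facts)
print("Interpretations:", interpretations)
```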

No timestamps. Data without a collection date is data you can’t trust. An agent should treat a pricing observation from last week differently than one from six months ago. Without timestamps, it can’t.
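
A minimal sketch of how a timestamp makes this possible: before records reach the agent, separate the recent ones from those that are old or undated. The 30-day threshold, the `timestamp` field name, the helper names, and the sample records are all assumptions chosen for illustration.

```python
from datetime import date


def age_in_days(record, today):
    """Days elapsed since the record's ISO-format collection date."""
    observed = date.fromisoformat(record["timestamp"])
    return (today - observed).days


def split_fresh_and_stale(records, today, max_age_days=30):
    """Return (fresh, stale); undated records count as stale."""
    fresh, stale = [], []
    for record in records:
        if "timestamp" not in record:
            stale.append(record)  # no collection date, so age is unknowable
        elif age_in_days(record, today) > max_age_days:
            stale.append(record)
        else:
            fresh.append(record)
    return fresh, stale


today = date(2026, 3, 31)
records = [
    {"claim": "Pricing starts at $49/month", "timestamp": "2026-03-28"},
    {"claim": "Pricing starts at $39/month", "timestamp": "2025-09-30"},  # hypothetical older observation
    {"claim": "Pricing page lists three tiers"},  # hypothetical undated record
]

fresh, stale = split_fresh_and_stale(records, today)
print(len(fresh), "fresh;", len(stale), "stale or undated")
```

The stale list can double as a re-verification queue rather than being discarded outright.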

Over-engineering the schema. Some teams build elaborate ontologies with dozens of field types. This adds complexity without improving agent performance. Start with the four layers. Add fields only when you have a specific use case that requires them.

Treating context engineering as a one-time setup. Context files need to be updated continuously. A beautifully structured JSON file from three months ago is just as misleading as an unstructured one. The architecture is only valuable if the data flowing through it stays current.

The bottom line

The teams getting the most value from AI agents in 2026 aren’t the ones with the best prompts or the most expensive models. They’re the ones that solved the data problem first. They built structured, layered, continuously updated context that gives their agents real material to work with.

Prompt engineering gets you 20% of the way there. Context engineering gets you the other 80%.
