Chapter 2: Context Engineering
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." - Andrej Karpathy
This is the single most important skill for getting good results from AI tools. Master this chapter and everything else gets easier.
What Context Actually Means
In LLMs, context is everything the model "sees" before generating a response. Think of it as the AI's working memory.
Unlike a human, an AI doesn't truly remember past conversations unless you include them again. It only knows:
- What you feed it right now
- What it learned during training
//Context = Short-Term Memory
The context window is like RAM for the AI - it's what the model can actively consider when responding. This includes:
- Your current prompt and instructions
- Any code, files, or data you provide
- Conversation history (if included)
Clear, relevant context = focused, accurate answers. Poor or vague context = confused responses or hallucinations.
The Two Types of Context
Understanding this distinction will immediately improve your prompts:
//1. Intent Context (What You Want)
This is prescriptive - it tells the model what you're trying to accomplish:
- "Explain why this function is slow"
- "Refactor this to use async/await"
- "Write tests for this class"
//2. State Context (What Exists)
This is descriptive - it shows the model the current situation:
- The code you're working with
- Error messages and stack traces
- Configuration or environment details
//The Two Common Mistakes
Missing state context: You ask "why is this failing?" but don't include the error message or the code. The AI hallucinates a solution using generic knowledge.
Missing intent context: You dump a bunch of code but don't say what you want. The AI doesn't know what to do with it.
Good prompts have both. Show what exists, tell what you want.
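The two halves can be made concrete in code. This is a minimal sketch in plain Python (no real API call; `build_prompt` and the sample snippet are illustrative) that pairs state context with intent context before anything is sent to a model:

```python
def build_prompt(state: str, intent: str) -> str:
    """Combine descriptive state context with a prescriptive request."""
    return (
        "Here is the current situation:\n"
        f"{state}\n\n"
        f"Task: {intent}"
    )

# State: what exists (the code plus the actual error)
state = """def divide(a, b):
    return a / b

# Traceback: ZeroDivisionError: division by zero"""

# Intent: what you want done with it
intent = "Explain why this function raises and how to guard against it."

prompt = build_prompt(state, intent)
print(prompt)
```

If either argument is empty, you are back to one of the two common mistakes: code with no request, or a request with no evidence.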
Token Limits and Context Windows
Every LLM has a limit to how much context it can handle - the context window, measured in tokens (roughly 3/4 of a word each).
//Current Limits (approximate)
| Model | Context Window | ~Words |
|---|---|---|
| GPT-4 Turbo | 128k tokens | ~96,000 |
| Claude 3 | 200k tokens | ~150,000 |
| Gemini 1.5 | 1M+ tokens | ~750,000 |
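The "roughly 3/4 of a word per token" rule of thumb can be turned into a quick pre-flight check before you paste a large context in. This is only a heuristic sketch; exact counts require the model's own tokenizer, and the 128k default below is just the GPT-4 Turbo figure from the table:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: each token is ~3/4 of a word, so 1 word ~ 4/3 tokens."""
    return round(len(text.split()) * 4 / 3)

def fits_in_window(text: str, window: int = 128_000, headroom: float = 0.8) -> bool:
    """Leave headroom so the model still has room for its response."""
    return estimate_tokens(text) <= window * headroom

sample = "word " * 9000  # ~9,000 words -> ~12,000 tokens
print(estimate_tokens(sample), fits_in_window(sample))
```

A real tokenizer (such as the one shipped with your model's SDK) will disagree on code-heavy text, but the heuristic is good enough to know whether you are nowhere near, close to, or over the limit.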
Big windows mean you CAN include more context. But you still need to be smart about WHAT you include.
//More Isn't Always Better
Research shows that model accuracy can degrade as the context grows longer - a phenomenon known as "context rot." Even with huge windows:
- Details in the middle can get overlooked
- Irrelevant context dilutes the signal
- The model may fixate on the wrong parts
The goal is relevant context, not maximum context.
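One practical consequence: filter candidate snippets before stuffing them in. The keyword-overlap score below is a deliberately crude stand-in for whatever retrieval your tool actually uses, but it illustrates the principle of selecting by relevance rather than volume:

```python
import re

def relevance(question: str, snippet: str) -> int:
    """Crude relevance score: how many question words appear in the snippet."""
    q_words = set(re.findall(r"\w+", question.lower()))
    return sum(1 for w in re.findall(r"\w+", snippet.lower()) if w in q_words)

question = "why does process_payment return None"
snippets = [
    "def process_payment(order): ...",
    "CSS styles for the landing page header",
    "process_payment logs 'success' then falls through without a return",
]

# Keep only snippets that share vocabulary with the question
relevant = [s for s in snippets if relevance(question, s) >= 1]
print(relevant)
```

The CSS snippet is dropped because it shares no words with the question; the two payment-related snippets survive. Real tools use embeddings or AST analysis instead, but the selection step is the same idea.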
The 10 Rules of Context Engineering
Based on Anthropic's framework for effective prompts. Use this as your checklist:
//1. Set the Stage with Task Context
Tell the model its role and goal upfront.
You are a senior Python developer helping debug a Django application.
Your goal is to identify why the API endpoint is returning 500 errors.
This anchors the model in the right mindset before you give it anything else.
//2. Define Tone and Boundaries
Tell it HOW to behave:
Be concise and technical. Only be confident when the evidence is clear.
If you're not sure, say so. Don't make assumptions about business logic.
//3. Provide Background Knowledge
Include stable facts the AI should know:
- Your tech stack and versions
- Architectural patterns you use
- Conventions in your codebase
This is perfect for Cursor rules or ChatGPT custom instructions.
//4. Spell Out Task Steps and Rules
Don't assume the AI will infer the procedure:
1. Read the error message and identify the exception type
2. Look at the stack trace to find the failing line
3. Check the function for obvious issues
4. If not obvious, suggest what additional context would help
//5. Show Examples (Few-Shot Prompting)
Examples are powerful. If you want a specific format or style, show it:
When you find an issue, format your response like this:
**Problem:** [one-line description]
**Location:** [file:line]
**Fix:** [code snippet]
**Why:** [brief explanation]
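In chat-style APIs, few-shot examples are usually sent as prior conversation turns rather than pasted into one prompt. The message list below uses the common `role`/`content` shape; the exact field names depend on your provider, and the worked example (including `billing.py:12`) is invented for illustration:

```python
# Few-shot: show the model one worked example before the real task.
messages = [
    {"role": "system", "content": "Report issues as Problem/Location/Fix/Why."},
    # Example turn: demonstrates the exact output format we want
    {"role": "user", "content": "Review: total = price * quantity  # quantity may be None"},
    {"role": "assistant", "content": (
        "**Problem:** multiplying by a possibly-None quantity\n"
        "**Location:** billing.py:12\n"
        "**Fix:** total = price * (quantity or 0)\n"
        "**Why:** None * int raises TypeError"
    )},
    # The real request follows the demonstrated pattern
    {"role": "user", "content": "Review: rate = amount / days"},
]

print(len(messages), "messages")
```

Because the model saw a complete example answer, it tends to imitate that structure for the final user turn.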
//6. Include Relevant History
If this task builds on previous work, include what matters:
- Previous conversation messages
- Decisions already made
- Context from earlier in the session
But only what's relevant - don't drag in unrelated history.
//7. Restate the Request at the End
After all your context, clearly state what you want:
[... lots of context above ...]
Now, given all the above, why is the `process_payment` function
returning None when the payment succeeds?
This focuses the model on the current task after absorbing the context.
//8. Let the Model Think Step-by-Step
For complex tasks, ask for reasoning:
Think through this step by step before giving your final answer.
Chain-of-thought prompting improves accuracy on difficult problems.
//9. Enforce a Clear Output Format
If you need structured output:
Respond with a JSON object containing:
- "issue": string describing the problem
- "severity": "low" | "medium" | "high"
- "fix": string with the code change
//10. Prefill When Appropriate
You can start the response for the model:
Based on my analysis, the bug is caused by:
The model will continue from there, skipping preamble.
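Taken together, the ten rules also describe an ordering for the prompt itself: role first, request restated near the end. The sketch below assembles the pieces in that order; every string here is a placeholder you would fill in per task:

```python
# Order matters: role comes first (rule 1), the request is restated late (rule 7).
SECTIONS = [
    ("role", "You are a senior Python developer debugging a Django app."),
    ("tone", "Be concise. Say so when you are unsure."),
    ("background", "Stack: Django 4.2, PostgreSQL, Redis."),
    ("steps", "1. Read the error. 2. Trace the failing line. 3. Propose a fix."),
    ("examples", "**Problem:** ... **Fix:** ..."),
    ("history", "We already ruled out the database connection."),
    ("request", "Now: why does process_payment return None on success?"),
    ("format", "Respond as JSON with issue/severity/fix."),
]

prompt = "\n\n".join(text for _, text in SECTIONS)
print(prompt)
```

Keeping the sections as named pairs makes it easy to drop or swap a section per task without disturbing the overall ordering.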
How Coding Tools Manage Context for You
Tools like Cursor handle a lot of context automatically:
//Automatic File Inclusion
Cursor pulls in files it thinks are relevant. If you're editing LoginForm.jsx, it includes that file and related code.
//@ References
You can explicitly include context:
- @file utils/helpers.js - include a specific file
- @folder src/components - include a folder
- @codebase - search the whole project
- @docs - include documentation
//System Messages and Rules
Cursor has built-in instructions that set the stage. Your custom rules add to this.
//The Takeaway
Tools help, but you still control what context goes in. Being explicit with @ references beats hoping the tool guesses right.
Practical Context Management
//Start Fresh for Each Task
Don't continue a long conversation across different tasks. Each new task should get a fresh chat with clean context.
Why? Because old, irrelevant context:
- Takes up tokens
- Can confuse the model
- May override your new instructions
//Be Selective
Ask yourself: "Does the model NEED this to answer my question?"
Include:
- The specific code being discussed
- Relevant error messages
- Necessary type definitions or interfaces
Skip:
- Unrelated files
- Entire codebases when you only need one function
- Long conversation history about other topics
//Summarize When Needed
For complex projects, summarize rather than include everything:
This is a React app using Redux for state management. The auth flow
uses JWTs stored in localStorage. Here's the relevant component:
[specific code]
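Summarizing can be mechanized as a budget decision: keep the specific code verbatim, compress everything else, and fall back to code alone if even that is too much. A rough sketch, reusing the rule of thumb that a token is about 3/4 of a word:

```python
def summarized_context(summary: str, code: str, max_tokens: int = 4000) -> str:
    """Prefer summary + exact code; fall back to code alone if over budget."""
    def est(text: str) -> int:
        return round(len(text.split()) * 4 / 3)  # ~3/4 word per token

    combined = f"{summary}\n\nHere's the relevant component:\n{code}"
    return combined if est(combined) <= max_tokens else code

summary = "React app, Redux state, JWT auth stored in localStorage."
code = "function LoginForm() { /* ... */ }"
print(summarized_context(summary, code))
```

The priority order encodes the chapter's point: the specific code is the state context the model actually needs, so it is the last thing to be cut.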
Key Takeaways
- Context is everything the model sees. It's the AI's working memory for your task.
- Include both intent (what you want) and state (what exists). Missing either leads to poor results.
- More context isn't always better. Relevance beats quantity.
- Use the 10-rule framework. It works for any AI tool, not just coding assistants.
- Start fresh for new tasks. Don't let old context pollute new conversations.
- Be explicit with what you include. @ references beat hoping the tool guesses right.
Next: Getting productive with Cursor - your primary AI coding tool.