Chapter 2: Context Engineering
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." - Andrej Karpathy
This is the single most important skill for getting good results from AI tools. Master this chapter and everything else gets easier.
What Context Actually Means
In LLMs, context is everything the model "sees" before generating a response. Think of it as the AI's working memory.
Unlike a human, an AI doesn't truly remember past conversations unless you include them again. It only knows:
- What you feed it right now
- What it learned during training
//Context = Short-Term Memory
The context window is like RAM for the AI - it's what the model can actively consider when responding. This includes:
- Your current prompt and instructions
- Any code, files, or data you provide
- Conversation history (if included)
Clear, relevant context = focused, accurate answers. Poor or vague context = confused responses or hallucinations.
The Two Types of Context
Understanding this distinction will immediately improve your prompts:
//1. Intent Context (What You Want)
This is prescriptive - it tells the model what you're trying to accomplish:
- "Explain why this function is slow"
- "Refactor this to use async/await"
- "Write tests for this class"
//2. State Context (What Exists)
This is descriptive - it shows the model the current situation:
- The code you're working with
- Error messages and stack traces
- Configuration or environment details
//The Two Common Mistakes
Missing state context: You ask "why is this failing?" but don't include the error message or the code. The AI hallucinates a solution using generic knowledge.
Missing intent context: You dump a bunch of code but don't say what you want. The AI doesn't know what to do with it.
Good prompts have both. Show what exists, tell what you want.
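The two halves can be made concrete in code. This is a minimal sketch in plain Python (no real API call; `build_prompt` and the sample snippet are illustrative) that pairs state context with intent context before anything is sent to a model:

```python
def build_prompt(state: str, intent: str) -> str:
    """Combine descriptive state context with a prescriptive request."""
    return (
        "Here is the current situation:\n"
        f"{state}\n\n"
        f"Task: {intent}"
    )

# State: what exists (the code plus the actual error)
state = """def divide(a, b):
    return a / b

# Traceback: ZeroDivisionError: division by zero"""

# Intent: what you want done with it
intent = "Explain why this function raises and how to guard against it."

prompt = build_prompt(state, intent)
print(prompt)
```

If either argument is empty, you are back to one of the two common mistakes: code with no request, or a request with no evidence.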
Token Limits and Context Windows
Every LLM has a limit to how much context it can handle - the context window, measured in tokens (roughly 3/4 of a word each).
//Current Limits (approximate)
| Model | Context Window | ~Words |
|---|---|---|
| GPT-4 Turbo | 128k tokens | ~96,000 |
| Claude 3 | 200k tokens | ~150,000 |
| Gemini 1.5 | 1M+ tokens | ~750,000 |
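The "roughly 3/4 of a word per token" rule of thumb can be turned into a quick pre-flight check before you paste a large context in. This is only a heuristic sketch; exact counts require the model's own tokenizer, and the 128k default below is just the GPT-4 Turbo figure from the table:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: each token is ~3/4 of a word, so 1 word ~ 4/3 tokens."""
    return round(len(text.split()) * 4 / 3)

def fits_in_window(text: str, window: int = 128_000, headroom: float = 0.8) -> bool:
    """Leave headroom so the model still has room for its response."""
    return estimate_tokens(text) <= window * headroom

sample = "word " * 9000  # ~9,000 words -> ~12,000 tokens
print(estimate_tokens(sample), fits_in_window(sample))
```

A real tokenizer (such as the one shipped with your model's SDK) will disagree on code-heavy text, but the heuristic is good enough to know whether you are nowhere near, close to, or over the limit.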
Big windows mean you CAN include more context. But you still need to be smart about WHAT you include.
//More Isn't Always Better
Research shows that model accuracy can degrade as the context grows longer - a phenomenon known as "context rot." Even with huge windows:
- Details in the middle can get overlooked
- Irrelevant context dilutes the signal
- The model may fixate on the wrong parts
The goal is relevant context, not maximum context.
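One practical consequence: filter candidate snippets before stuffing them in. The keyword-overlap score below is a deliberately crude stand-in for whatever retrieval your tool actually uses, but it illustrates the principle of selecting by relevance rather than volume:

```python
import re

def relevance(question: str, snippet: str) -> int:
    """Crude relevance score: how many question words appear in the snippet."""
    q_words = set(re.findall(r"\w+", question.lower()))
    return sum(1 for w in re.findall(r"\w+", snippet.lower()) if w in q_words)

question = "why does process_payment return None"
snippets = [
    "def process_payment(order): ...",
    "CSS styles for the landing page header",
    "process_payment logs 'success' then falls through without a return",
]

# Keep only snippets that share vocabulary with the question
relevant = [s for s in snippets if relevance(question, s) >= 1]
print(relevant)
```

The CSS snippet is dropped because it shares no words with the question; the two payment-related snippets survive. Real tools use embeddings or AST analysis instead, but the selection step is the same idea.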
The 10 Rules of Context Engineering
Based on Anthropic's framework for effective prompts. Use this as your checklist:
//1. Set the Stage with Task Context
Tell the model its role and goal upfront.
You are a senior Python developer helping debug a Django application.
Your goal is to identify why the API endpoint is returning 500 errors.
This anchors the model in the right mindset before you give it anything else.
//2. Define Tone and Boundaries
Tell it HOW to behave:
Be concise and technical. Only be confident when the evidence is clear.
If you're not sure, say so. Don't make assumptions about business logic.
//3. Provide Background Knowledge
Include stable facts the AI should know:
- Your tech stack and versions
- Architectural patterns you use
- Conventions in your codebase
This is perfect for Cursor rules or ChatGPT custom instructions.
//4. Spell Out Task Steps and Rules
Don't assume the AI will infer the procedure:
1. Read the error message and identify the exception type
2. Look at the stack trace to find the failing line
3. Check the function for obvious issues
4. If not obvious, suggest what additional context would help
//5. Show Examples (Few-Shot Prompting)
Examples are powerful. If you want a specific format or style, show it:
When you find an issue, format your response like this:
**Problem:** [one-line description]
**Location:** [file:line]
**Fix:** [code snippet]
**Why:** [brief explanation]
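In chat-style APIs, few-shot examples are usually sent as prior conversation turns rather than pasted into one prompt. The message list below uses the common `role`/`content` shape; the exact field names depend on your provider, and the worked example (including `billing.py:12`) is invented for illustration:

```python
# Few-shot: show the model one worked example before the real task.
messages = [
    {"role": "system", "content": "Report issues as Problem/Location/Fix/Why."},
    # Example turn: demonstrates the exact output format we want
    {"role": "user", "content": "Review: total = price * quantity  # quantity may be None"},
    {"role": "assistant", "content": (
        "**Problem:** multiplying by a possibly-None quantity\n"
        "**Location:** billing.py:12\n"
        "**Fix:** total = price * (quantity or 0)\n"
        "**Why:** None * int raises TypeError"
    )},
    # The real request follows the demonstrated pattern
    {"role": "user", "content": "Review: rate = amount / days"},
]

print(len(messages), "messages")
```

Because the model saw a complete example answer, it tends to imitate that structure for the final user turn.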
//6. Include Relevant History
If this task builds on previous work, include what matters:
- Previous conversation messages
- Decisions already made
- Context from earlier in the session
But only what's relevant - don't drag in unrelated history.
//7. Restate the Request at the End
After all your context, clearly state what you want:
[... lots of context above ...]
Now, given all the above, why is the `process_payment` function
returning None when the payment succeeds?
This focuses the model on the current task after absorbing the context.
//8. Let the Model Think Step-by-Step
For complex tasks, ask for reasoning:
Think through this step by step before giving your final answer.
Chain-of-thought prompting improves accuracy on difficult problems.
//9. Enforce a Clear Output Format
If you need structured output:
Respond with a JSON object containing:
- "issue": string describing the problem
- "severity": "low" | "medium" | "high"
- "fix": string with the code change
//10. Prefill When Appropriate
You can start the response for the model:
Based on my analysis, the bug is caused by:
The model will continue from there, skipping preamble.
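Taken together, the ten rules also describe an ordering for the prompt itself: role first, request restated near the end. The sketch below assembles the pieces in that order; every string here is a placeholder you would fill in per task:

```python
# Order matters: role comes first (rule 1), the request is restated late (rule 7).
SECTIONS = [
    ("role", "You are a senior Python developer debugging a Django app."),
    ("tone", "Be concise. Say so when you are unsure."),
    ("background", "Stack: Django 4.2, PostgreSQL, Redis."),
    ("steps", "1. Read the error. 2. Trace the failing line. 3. Propose a fix."),
    ("examples", "**Problem:** ... **Fix:** ..."),
    ("history", "We already ruled out the database connection."),
    ("request", "Now: why does process_payment return None on success?"),
    ("format", "Respond as JSON with issue/severity/fix."),
]

prompt = "\n\n".join(text for _, text in SECTIONS)
print(prompt)
```

Keeping the sections as named pairs makes it easy to drop or swap a section per task without disturbing the overall ordering.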
How Coding Tools Manage Context for You
Tools like Cursor handle a lot of context automatically:
//Automatic File Inclusion
Cursor pulls in files it thinks are relevant. If you're editing LoginForm.jsx, it includes that file and related code.
//@ References
You can explicitly include context:
- @file utils/helpers.js - include a specific file
- @folder src/components - include a folder
- @codebase - search the whole project
- @docs - include documentation
//System Messages and Rules
Cursor has built-in instructions that set the stage. Your custom rules add to this.
//The Takeaway
Tools help, but you still control what context goes in. Being explicit with @ references beats hoping the tool guesses right.
Practical Context Management
//Start Fresh for Each Task
Don't continue a long conversation across different tasks. Each new task should get a fresh chat with clean context.
Why? Because old, irrelevant context:
- Takes up tokens
- Can confuse the model
- May override your new instructions
//Be Selective
Ask yourself: "Does the model NEED this to answer my question?"
Include:
- The specific code being discussed
- Relevant error messages
- Necessary type definitions or interfaces
Skip:
- Unrelated files
- Entire codebases when you only need one function
- Long conversation history about other topics
//Summarize When Needed
For complex projects, summarize rather than include everything:
This is a React app using Redux for state management. The auth flow
uses JWTs stored in localStorage. Here's the relevant component:
[specific code]
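Summarizing can be mechanized as a budget decision: keep the specific code verbatim, compress everything else, and fall back to code alone if even that is too much. A rough sketch, reusing the rule of thumb that a token is about 3/4 of a word:

```python
def summarized_context(summary: str, code: str, max_tokens: int = 4000) -> str:
    """Prefer summary + exact code; fall back to code alone if over budget."""
    def est(text: str) -> int:
        return round(len(text.split()) * 4 / 3)  # ~3/4 word per token

    combined = f"{summary}\n\nHere's the relevant component:\n{code}"
    return combined if est(combined) <= max_tokens else code

summary = "React app, Redux state, JWT auth stored in localStorage."
code = "function LoginForm() { /* ... */ }"
print(summarized_context(summary, code))
```

The priority order encodes the chapter's point: the specific code is the state context the model actually needs, so it is the last thing to be cut.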
Key Takeaways
- Context is everything the model sees. It's the AI's working memory for your task.
- Include both intent (what you want) and state (what exists). Missing either leads to poor results.
- More context isn't always better. Relevance beats quantity.
- Use the 10-rule framework. It works for any AI tool, not just coding assistants.
- Start fresh for new tasks. Don't let old context pollute new conversations.
- Be explicit with what you include. @ references beat hoping the tool guesses right.
Next: Getting productive with Cursor - your primary AI coding tool.