Context Engineering: Sessions and Memory


“Context engineering is what separates agents that forget mid-conversation from agents that remember you for years.”


The Problem

Your agent works great in testing. Single-turn queries? Perfect answers.

Then users have conversations:

  • “What did we discuss yesterday?”
  • “Update the recommendations based on what I told you earlier.”
  • “Remember my preferences for next time.”

Your agent draws a blank. Every conversation starts from zero.

| Failure Mode | Root Cause |
| --- | --- |
| 🧠 Mid-Conversation Amnesia | No session management |
| 📅 No Cross-Session Memory | No persistent storage |
| 🔀 Context Overflow | Conversation exceeds the token limit |
| 🎭 Lost Personalization | User preferences not retained |

The Shift: Prompt Engineering → Context Engineering

Key Insight: What information reaches the model matters more than how you phrase the prompt.

Prompt Engineering focuses on crafting the perfect instruction.

Context Engineering focuses on curating the optimal information for each moment:

  • What does the model need to know right now?
  • What should be loaded on-demand vs. pre-loaded?
  • What should persist across conversations?

```mermaid
flowchart TD
    subgraph PromptEng["❌ Prompt Engineering"]
        P["Craft perfect prompt"]
    end
    subgraph ContextEng["✅ Context Engineering"]
        S["📋 Session State"]
        M["🧠 Long-term Memory"]
        T["🔧 Tool Results"]
        R["📚 Retrieved Knowledge"]
    end
    P --> LLM1["🤖 Model"]
    S --> C["Context Window"]
    M --> C
    T --> C
    R --> C
    C --> LLM2["🤖 Model"]
```
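
As a minimal sketch of that curation step, the function below assembles the context window from several sources rather than from a single hand-crafted prompt. The function and parameter names are illustrative, not tied to any particular agent framework.

```python
def build_context(system_prompt: str,
                  session_history: list[str],
                  memories: list[str],
                  retrieved_docs: list[str],
                  tool_results: list[str],
                  user_query: str) -> str:
    """Assemble the context window for this turn from curated parts."""
    sections = [
        ("System instructions", [system_prompt]),
        ("Long-term memory", memories),
        ("Retrieved knowledge", retrieved_docs),
        ("Tool results", tool_results),
        ("Recent conversation", session_history),
        ("Current message", [user_query]),
    ]
    blocks = []
    for title, items in sections:
        if items:  # include a section only when it has content for this turn
            blocks.append(f"## {title}\n" + "\n".join(items))
    return "\n\n".join(blocks)
```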

Part 1: Sessions — Short-Term Memory

What is a Session?

A session is the complete context for a single conversation:

  • User messages
  • Agent responses
  • Tool calls and results
  • Working state (e.g., items in a cart)

```mermaid
flowchart TD
    subgraph Session["📋 Session"]
        E1["Event 1: User message"]
        E2["Event 2: Agent response"]
        E3["Event 3: Tool call"]
        E4["Event 4: Tool result"]
        E5["Event 5: Agent response"]
        ST["State: cart items, preferences"]
    end
    E1 --> E2 --> E3 --> E4 --> E5
```
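
A minimal sketch of how a session might be represented in code, assuming a plain in-process data structure (production frameworks provide their own session objects):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

@dataclass
class Event:
    role: str        # "user", "agent", "tool_call", or "tool_result"
    content: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class Session:
    session_id: str
    user_id: str
    events: list[Event] = field(default_factory=list)      # ordered, append-only history
    state: dict[str, Any] = field(default_factory=dict)    # working state, e.g. {"cart": [...]}

    def append(self, role: str, content: str) -> None:
        """Append an event, preserving chronological order."""
        self.events.append(Event(role, content))
```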

The Session Lifecycle

```mermaid
stateDiagram-v2
    [*] --> Created: User starts conversation
    Created --> Active: First message
    Active --> Active: Messages exchanged
    Active --> Paused: User inactive (timeout)
    Paused --> Active: User returns
    Active --> Archived: TTL expires or user ends
    Archived --> [*]
```

Production Session Requirements

| Requirement | Why It Matters |
| --- | --- |
| Strict Isolation | User A cannot see User B’s session |
| Persistence | Survive server restarts |
| Ordering | Events must be chronological |
| TTL Policy | Sessions expire after inactivity |
| PII Redaction | Remove sensitive data before storage |
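
A hedged sketch of how these requirements might surface in a session store. The in-memory dict stands in for a durable backend, and the PII pattern list is illustrative only:

```python
import re
import time

# Illustrative PII pattern (US-style SSNs); real deployments use a dedicated
# screening service such as the Model Armor redaction mentioned later.
PII_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]

def redact(text: str) -> str:
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

class SessionStore:
    """In-memory stand-in for a durable backend (the real store must survive restarts)."""

    def __init__(self, ttl_seconds: int = 3600):
        self.ttl = ttl_seconds
        self._data: dict[tuple[str, str], dict] = {}

    def save_event(self, user_id: str, session_id: str, content: str) -> None:
        key = (user_id, session_id)                   # isolation: every key is user-scoped
        record = self._data.setdefault(key, {"events": [], "touched": time.time()})
        record["events"].append(redact(content))      # redact PII before storage
        record["touched"] = time.time()

    def load(self, user_id: str, session_id: str) -> list[str]:
        record = self._data.get((user_id, session_id))
        if record is None or time.time() - record["touched"] > self.ttl:
            return []                                 # unknown or expired session
        return record["events"]                       # events kept in insertion order
```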

Part 2: Memory Types — Long-Term Knowledge

Google’s research defines three types of long-term memory:

The Memory Taxonomy

| Memory Type | What It Stores | Example | Time Horizon |
| --- | --- | --- | --- |
| 🧠 Semantic | Facts, knowledge | “The user is a vegetarian” | Permanent |
| 📋 Procedural | How-to knowledge | “How to deploy to production” | Stable |
| 📔 Episodic | Past experiences | “Last week we debugged the login issue” | Decaying |

```mermaid
flowchart TD
    subgraph Memory["🧠 Long-Term Memory"]
        SEM["📚 Semantic (Facts & Knowledge)"]
        PROC["📋 Procedural (How-To)"]
        EPIS["📔 Episodic (Past Events)"]
    end
    subgraph Examples["Examples"]
        S1["User preferences"]
        S2["Company policies"]
        P1["Coding standards"]
        P2["Deploy procedures"]
        E1["Past conversations"]
        E2["Previous decisions"]
    end
    SEM --> S1
    SEM --> S2
    PROC --> P1
    PROC --> P2
    EPIS --> E1
    EPIS --> E2
```
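
A small sketch of how the taxonomy might be represented in code; the record fields and sources are illustrative:

```python
from dataclasses import dataclass
from enum import Enum

class MemoryType(Enum):
    SEMANTIC = "semantic"      # facts and knowledge
    PROCEDURAL = "procedural"  # how-to knowledge
    EPISODIC = "episodic"      # past experiences

@dataclass
class MemoryRecord:
    kind: MemoryType
    content: str
    source: str                # e.g. "user_profile", "skill_file", "session_summary"

memories = [
    MemoryRecord(MemoryType.SEMANTIC, "The user is a vegetarian", "user_profile"),
    MemoryRecord(MemoryType.PROCEDURAL, "How to deploy to production", "runbook"),
    MemoryRecord(MemoryType.EPISODIC, "Last week we debugged the login issue", "session_summary"),
]

episodic_only = [m for m in memories if m.kind is MemoryType.EPISODIC]
```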

Semantic Memory (Facts)

What the agent knows about the world and the user.

| Source | Examples |
| --- | --- |
| User Profile | Name, role, preferences, timezone |
| Domain Knowledge | Product catalog, company policies |
| External Knowledge | Via RAG from documents |

Storage: User profiles, vector databases, knowledge graphs.

Procedural Memory (How-To)

What the agent knows how to do.

This maps directly to Skills (see Article 3):

  • Coding standards
  • Review procedures
  • Deployment workflows

Storage: Skill files (.agent/skills/), runbooks, SOPs.
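
As a sketch, procedural memory can be as simple as lazily loading a skill file when a task calls for it. The directory layout below assumes the .agent/skills/ location above and the SKILL.md convention mentioned in the FAQ:

```python
from pathlib import Path

SKILLS_DIR = Path(".agent/skills")

def load_skill(name: str) -> str | None:
    """Lazily load a procedural-memory skill file when a task calls for it."""
    path = SKILLS_DIR / name / "SKILL.md"
    if path.exists():
        return path.read_text(encoding="utf-8")
    return None  # fall back to general instructions if the skill is missing
```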

Episodic Memory (Past Events)

What the agent remembers from past interactions.

| Pattern | Implementation |
| --- | --- |
| Conversation Summaries | Compress old sessions into key points |
| Decision Logs | “On Jan 15, we chose option B because…” |
| Preference Learning | “User consistently prefers concise answers” |

Storage: Summarized session archives, decision logs.
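
A sketch of the “Decaying” time horizon: rank episodes by a recency-weighted score before injecting them into context. The half-life value and the sample log entry are illustrative:

```python
from datetime import datetime, timedelta, timezone

def recall_episodes(episodes: list[dict], half_life_days: float = 30.0,
                    top_k: int = 3) -> list[dict]:
    """Rank past episodes by a recency-decayed weight before adding them to context."""
    now = datetime.now(timezone.utc)

    def weight(ep: dict) -> float:
        age_days = (now - ep["timestamp"]).total_seconds() / 86_400
        return 0.5 ** (age_days / half_life_days)      # exponential decay with age

    return sorted(episodes, key=weight, reverse=True)[:top_k]

episode_archive = [
    {"timestamp": datetime.now(timezone.utc) - timedelta(days=7),
     "summary": "Debugged the login issue with the user"},
]
print(recall_episodes(episode_archive))
```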


Part 3: Managing the Context Window

The Context Budget

Every model has a finite context window. You must budget it:

```mermaid
pie title Context Window Budget (32K tokens)
    "System Prompt" : 500
    "Recent History" : 2000
    "Retrieved Knowledge" : 1500
    "Tool Definitions" : 800
    "Working Memory" : 500
    "Available for Response" : 26700
```
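
The same budget as a sketch in code. The figures mirror the 32K-token example above; count_tokens is a placeholder for whatever tokenizer your model SDK provides:

```python
CONTEXT_WINDOW = 32_000
BUDGET = {
    "system_prompt": 500,
    "recent_history": 2_000,
    "retrieved_knowledge": 1_500,
    "tool_definitions": 800,
    "working_memory": 500,
}
RESERVED_FOR_RESPONSE = CONTEXT_WINDOW - sum(BUDGET.values())   # 26,700 tokens

def fits_budget(section: str, text: str, count_tokens) -> bool:
    """Check a candidate section against its allocation before adding it."""
    return count_tokens(text) <= BUDGET[section]
```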

Context Overflow Strategies

When history exceeds your budget:

| Strategy | How It Works | Trade-off |
| --- | --- | --- |
| Truncation | Keep last N messages | Loses early context |
| Summarization | LLM summarizes old messages | Loses detail, costs tokens |
| Sliding Window | Fixed window that moves | Simple, may miss key context |
| Semantic Selection | Keep most relevant messages | Complex, more accurate |
| Query-Aware Compression | Compress based on current task relevance | Best quality, requires planning |

💡 2025 Update: The Sentinel Framework (May 2025) introduces lightweight, query-aware context compression that outperforms simple summarization. Key insight: compress based on what the model needs now, not just recency.
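
The Sentinel method itself is beyond this article; as a toy illustration of query-aware selection, the sketch below scores stored messages by lexical overlap with the current query (a real system would use embeddings or a learned compressor):

```python
def select_relevant(messages: list[str], query: str, keep: int = 5) -> list[str]:
    """Keep the messages that share the most vocabulary with the current query."""
    query_terms = set(query.lower().split())

    def score(msg: str) -> int:
        return len(query_terms & set(msg.lower().split()))

    ranked = sorted(range(len(messages)), key=lambda i: score(messages[i]), reverse=True)
    kept_indices = sorted(ranked[:keep])          # restore chronological order
    return [messages[i] for i in kept_indices]
```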

The Summarization Pattern

```mermaid
flowchart LR
    H["📜 Full History (10,000 tokens)"] --> S["🤖 Summarize"]
    S --> C["📝 Compressed (500 tokens)"]
    C --> N["➕ New Messages"]
    N --> CTX["📋 Context Window"]
```

When to Summarize (a trigger sketch follows this list):

  • When history reaches 70% of context budget
  • At conversation milestones (topic changes)
  • Before archiving a session
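
A minimal trigger sketch, assuming injected summarize and count_tokens callables and a fixed history budget:

```python
def maybe_summarize(history: list[str], summarize, count_tokens,
                    budget: int = 2_000, threshold: float = 0.7) -> list[str]:
    """Once history crosses ~70% of its token budget, compress older turns."""
    used = sum(count_tokens(m) for m in history)
    if used < threshold * budget or len(history) <= 4:
        return history                             # still within budget, or too short to split
    old, recent = history[:-4], history[-4:]       # keep the last few turns verbatim
    summary = summarize("\n".join(old))            # an LLM call in a real system
    return [f"Summary of earlier conversation: {summary}", *recent]
```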

Part 4: Multi-Agent Context Sharing

In multi-agent systems, context becomes more complex.

Shared vs. Private Context

| Context Type | Who Sees It | Examples |
| --- | --- | --- |
| Global | All agents | User identity, session goals |
| Shared | Agent subsets | Research results, intermediate data |
| Private | Single agent | Internal reasoning, tool credentials |

```mermaid
flowchart TD
    subgraph Global["🌐 Global Context"]
        G1["User ID"]
        G2["Session Goal"]
    end
    subgraph Shared["🔗 Shared Context"]
        S1["Research Results"]
        S2["Draft Document"]
    end
    subgraph Private["🔒 Private"]
        P1["Agent A Reasoning"]
        P2["Agent B Credentials"]
    end
    A1["🤖 Agent A"] --> Global
    A1 --> Shared
    A1 --> P1
    A2["🤖 Agent B"] --> Global
    A2 --> Shared
    A2 --> P2
```
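
One way to sketch this partitioning in code (field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    global_ctx: dict = field(default_factory=dict)    # user identity, session goal
    shared_ctx: dict = field(default_factory=dict)    # research results, drafts
    private_ctx: dict = field(default_factory=dict)   # internal reasoning, credentials

    def visible_to_peer(self) -> dict:
        """What another agent may see: global plus shared, never private."""
        return {**self.global_ctx, **self.shared_ctx}
```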

The Handoff Pattern

When Agent A hands off to Agent B (a code sketch follows this list):

  1. Summarize Agent A’s work
  2. Transfer relevant context (not everything)
  3. Preserve the user’s original intent
  4. Clear Agent A’s private state
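
Continuing the AgentContext sketch above, a hedged version of the four handoff steps (summarize is again an injected callable, not a specific framework API):

```python
def handoff(agent_a: AgentContext, summarize, original_intent: str) -> AgentContext:
    """Transfer work from Agent A to Agent B following the four steps above."""
    work_summary = summarize(agent_a.shared_ctx)                 # 1. summarize A's work
    agent_b = AgentContext(
        global_ctx={**agent_a.global_ctx,
                    "original_intent": original_intent},         # 3. preserve the user's intent
        shared_ctx={"handoff_summary": work_summary},            # 2. transfer only what's relevant
    )
    agent_a.private_ctx.clear()                                  # 4. clear A's private state
    return agent_b
```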

Part 5: Production Best Practices

Security & Privacy

| Practice | Implementation |
| --- | --- |
| PII Redaction | Remove before storage (Model Armor) |
| Strict Isolation | ACLs per user session |
| Encryption | At rest and in transit |
| Audit Logging | Track all context access |

Data Lifecycle

| Stage | Policy |
| --- | --- |
| Active Session | Full context in working memory |
| Paused Session | Persist to durable storage |
| Archived Session | Summarize + move to cold storage |
| Expired Session | Delete per retention policy |

Performance Optimization

| Technique | Benefit |
| --- | --- |
| Lazy Loading | Load memories only when needed |
| Caching | Cache frequent retrievals |
| Prefetching | Anticipate likely context needs |
| Compression | Summarize before archiving |

The Context Engineering Checklist

For Every Agent

  • Session Management: How is conversation history persisted?
  • Memory Strategy: What’s stored permanently vs. session-scoped?
  • Overflow Handling: What happens when context exceeds limits?
  • Privacy Controls: Is PII redacted before storage?
  • TTL Policies: When do sessions expire?

For Multi-Agent Systems

  • Shared State: What context do agents share?
  • Handoff Protocol: How is context transferred between agents?
  • Isolation: What’s private to each agent?

Industry Applications

Context engineering patterns apply across all domains:

Memory Types by Industry

| Memory Type | 🏦 Banking | 🛒 Retail | 🎓 Education |
| --- | --- | --- | --- |
| Semantic | Account preferences, risk profile | Purchase history, size preferences | Learning style, accessibility needs |
| Procedural | KYC verification steps, dispute resolution | Return processing, loyalty rewards | Grading rubrics, lesson planning |
| Episodic | “Last month we discussed refinancing” | “You bought this item before” | “We covered fractions last week” |

Session Examples

🏦 Banking: Customer returns after 3 days. Session restored with: prior questions, account context, and the loan application they started. No need to re-explain intent.

🛒 Retail: Shopper returns to abandoned cart. Session recalls: items, applied coupons, shipping preference. Seamless checkout resume.

🎓 Education: Student returns to tutoring session. Context includes: current topic, recent mistakes, learning pace. Agent picks up exactly where they left off.


Key Takeaways

  • Sessions = Short-term: Current conversation state.
  • Memory = Long-term: Semantic (facts), Procedural (how-to), Episodic (past events).
  • Budget your context: Allocate tokens intentionally across system prompt, history, and knowledge.
  • Summarize, don’t truncate: Preserve important context by compressing, not cutting.
  • In multi-agent systems: Define what’s global, shared, and private.
  • Security first: Redact PII, enforce isolation, encrypt storage.

What’s Next


References

  1. Google Cloud Research, Context Engineering: Sessions & Memory (2025). The primary reference for memory types and session management.

  2. Anthropic, Building Effective Agents (2024). Emphasizes context curation over prompt crafting.

  3. Google Cloud Research, Introduction to Agents (2025). Defines the role of context in the agentic loop.

  4. Tulving, E., Episodic and Semantic Memory (1972). The foundational cognitive science research on memory types.

❓ Frequently Asked Questions

What is context engineering for AI agents?

Context engineering is the discipline of managing an agent's entire context window—including conversation history, tool outputs, retrieved documents, and long-term memory—to optimize reasoning quality across multi-turn sessions.

What are the three types of long-term memory for agents?

Semantic Memory (facts and knowledge via RAG), Procedural Memory (how-to skills via SKILL.md), and Episodic Memory (past interactions for personalization).

How do I handle context window overflow?

Use strategies like summarization (compress old context), sliding window (keep recent N turns), or selective pruning (remove low-relevance content). Never silently truncate important information.
