Skills: Progressive Context Disclosure


Your system prompt is 5,000 tokens and growing. Every new feature makes your agent slower, more expensive, and dumber. There’s a better way.


The Problem

You start with a simple agent. A few rules. A persona. It works.

Then requirements grow:

  • “Add database query syntax.”
  • “Add our coding standards.”
  • “Add the API documentation.”
  • “Add error handling patterns.”

Before you know it, you’ve created The Prompt Blob Monster—a 10,000-token system prompt that tries to do everything.

| The Enterprise Risk | What Happens |
| --- | --- |
| 💸 Cost Explosion | Every request pays for tokens the agent doesn’t need right now. |
| 🐢 Latency Creep | More tokens mean a slower time to first token, especially at scale. |
| 🧠 Context Rot | Research on the “lost in the middle” effect shows LLMs use information in the middle of long contexts less reliably. |
| 🎯 Instruction Fog | Too many rules = the model forgets which ones matter now. |

The villain isn’t the LLM. It’s the architecture.


The Concept

Skills are procedural memory—loaded on demand.

Instead of stuffing everything into one system prompt, you organize knowledge into discrete Skill files. The agent loads only what it needs, when it needs it.

💡 The Key Insight: Google’s Context Engineering guide defines this as Procedural Memory—“How-to” knowledge that’s retrieved just-in-time, not pre-loaded.

Think of it like a senior engineer’s bookshelf:

  • They don’t memorize every API doc.
  • They know where to look when they need it.
  • The knowledge is available, not active.

This is Progressive Context Disclosure.


How It Works

The Skills Architecture

flowchart TD
    subgraph System["🎯 Always Active"]
        P["🎭 Persona"]
    end
    subgraph OnDemand["📚 Loaded When Needed"]
        S1["Skill: Git"]
        S2["Skill: Database"]
        S3["Skill: Code Review"]
    end
    U["👤 User: 'Review this PR'"] --> A["🤖 Agent"]
    A --> P
    A -.->|loads| S3
    S3 --> R["✅ Review Complete"]

The SKILL.md Pattern

Each skill lives in its own folder with a SKILL.md file:

.agent/skills/
├── code-review/
│   └── SKILL.md
├── database/
│   └── SKILL.md
└── git-operations/
    └── SKILL.md
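
As a rough illustration of how an agent runtime might find these files, the sketch below scans a skills root for `SKILL.md` files. The function name `discover_skills` is hypothetical and not part of any particular framework; it only assumes the one-folder-per-skill layout shown above.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map each skill folder name to the path of its SKILL.md file.

    Assumption: every skill is a direct child folder of `root` and
    contains a file named exactly SKILL.md. Other folders are ignored.
    """
    skills: dict[str, Path] = {}
    for skill_file in sorted(root.glob("*/SKILL.md")):
        skills[skill_file.parent.name] = skill_file
    return skills
```

Calling `discover_skills(Path(".agent/skills"))` on the tree above would return one entry per skill folder. Nothing is read into the context at this point; the result is only an index of what is available.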

Anatomy of a SKILL.md

---
name: code-review
description: Guidelines for reviewing pull requests
---

# Code Review Skill

## When to Apply
- User asks to review code, a PR, or a diff.

## Key Guidelines
1. Check for security vulnerabilities first.
2. Verify error handling is present.
3. Look for performance anti-patterns.

## Output Format
- Use inline comments for specific issues.
- Summarize overall assessment at the end.

The frontmatter (name, description) helps the agent decide when to load the skill.
The body contains the actual procedural knowledge.
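
To make the frontmatter/body split concrete, here is a minimal sketch of a parser. It assumes the frontmatter consists only of simple `key: value` lines between two `---` lines, as in the example above; real YAML allows much more, and a production loader would likely use a proper YAML library. The name `parse_skill` is made up for this example.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, body).

    Assumption: frontmatter is a block of plain `key: value` lines
    delimited by `---` lines at the very top of the file.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter: treat everything as body
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # delimiter never closed: treat everything as body


SKILL_TEXT = """---
name: code-review
description: Guidelines for reviewing pull requests
---

# Code Review Skill

## When to Apply
- User asks to review code, a PR, or a diff.
"""

meta, body = parse_skill(SKILL_TEXT)
print(meta["name"])          # code-review
print(body.splitlines()[0])  # # Code Review Skill
```

The point of the split is that `meta` is small enough to keep in context all the time, while `body` is only added once the skill is selected.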

The Loading Pattern

  1. User makes a request → “Review this pull request.”
  2. Agent checks available skills → Sees code-review matches.
  3. Agent loads the skill → SKILL.md content joins the context.
  4. Agent executes with specialized knowledge → Review follows guidelines.

The key: Most skills stay unloaded most of the time.
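
The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any specific agent framework does it: the `Skill` class, `select_skill`, and `build_context` are hypothetical names, the descriptions for the database and git skills are invented, and the word-overlap matching is a crude stand-in for the model itself deciding which skill fits based on the descriptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str
    body: str


def words(text: str) -> set[str]:
    """Lowercased alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_skill(request: str, skills: list[Skill]) -> Optional[Skill]:
    """Return the skill sharing the most words with the request, if any.

    Stand-in heuristic: a real agent would usually let the model choose.
    """
    request_words = words(request)
    best, best_score = None, 0
    for skill in skills:
        score = len(request_words & words(f"{skill.name} {skill.description}"))
        if score > best_score:
            best, best_score = skill, score
    return best


def build_context(persona: str, request: str, skills: list[Skill]) -> str:
    """Assemble persona + (at most one) matching skill body + request."""
    parts = [persona]
    skill = select_skill(request, skills)
    if skill is not None:
        parts.append(skill.body)  # only the matching skill joins the context
    parts.append(request)
    return "\n\n".join(parts)


SKILLS = [
    Skill("code-review", "Guidelines for reviewing pull requests",
          "# Code Review Skill"),
    Skill("database", "Query syntax notes", "# Database Skill"),       # invented
    Skill("git-operations", "Common git workflows", "# Git Skill"),    # invented
]

context = build_context("You are a careful engineer.",
                        "Review this pull request.", SKILLS)
```

Here `context` contains the code-review body but neither of the other two skills, which is the whole point: unmatched skills cost nothing for this request.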


When to Use It

The Skill Threshold

Litmus Test: If knowledge is needed sometimes but not always, it’s a Skill.

| Context Type | When Needed | Where It Goes |
| --- | --- | --- |
| Core identity, values | Always | 🎭 Persona (System Prompt) |
| Procedures, workflows | Sometimes | 📚 Skills (On-demand) |
| Facts, documents | Per-query | 📖 RAG (Retrieved) |
| Live system state | Real-time | 🔌 MCP (Connected) |

Real-World Examples

| Scenario | ❌ Blob Approach | ✅ Skills Approach |
| --- | --- | --- |
| Multi-language support | 5,000 tokens of Python + TypeScript + Go syntax in every request | Load only the language skill matching the current file |
| Database operations | All SQL dialects in the prompt | Load postgres.md or mysql.md based on detected connection |
| Code review | Review guidelines always present | Load code-review skill only when reviewing |
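
For the multi-language row, the trigger can be as simple as the file extension. The sketch below is one possible approach; the mapping and the skill names (`python`, `typescript`, `go`) are assumptions made for illustration.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical mapping from file extension to language skill name.
LANGUAGE_SKILLS = {".py": "python", ".ts": "typescript", ".go": "go"}


def language_skill_for(filename: str) -> Optional[str]:
    """Return the language skill for the current file, or None if unknown."""
    return LANGUAGE_SKILLS.get(PurePath(filename).suffix.lower())
```

With this, editing `src/app.py` would load only the Python skill, and a file with no matching extension would load no language skill at all.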

The Token Math

Consider an agent with 10 specialized capabilities:

  • Blob approach: 10 × 500 tokens = 5,000 tokens every request
  • Skills approach: 500 tokens base + 500 tokens for the one loaded skill = 1,000 tokens on a typical request

Result: an 80% token reduction in this example (1,000 tokens instead of 5,000). Faster. Cheaper. Sharper focus.
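
The arithmetic is easy to check, and to rerun with your own numbers. The sketch below uses the figures from the example above; the function names and the `loaded` parameter (how many skills a request pulls in) are assumptions added for illustration.

```python
def blob_tokens(num_skills: int, tokens_per_skill: int) -> int:
    """Every capability is in the prompt on every request."""
    return num_skills * tokens_per_skill


def skills_tokens(base_tokens: int, tokens_per_skill: int, loaded: int = 1) -> int:
    """Base prompt plus only the skills loaded for this request."""
    return base_tokens + loaded * tokens_per_skill


blob = blob_tokens(10, 500)       # 5000
skills = skills_tokens(500, 500)  # 1000
reduction = (blob - skills) / blob
print(f"{reduction:.0%}")         # 80%
```

The saving shrinks as more skills are loaded per request: with `loaded=2` the same formula gives 1,500 tokens, a 70% reduction.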


Industry Applications

Skills work the same way across domains—load procedural knowledge only when needed:

| Industry | Skills Examples | When Loaded |
| --- | --- | --- |
| 🏦 Banking | fraud-investigation.md, loan-underwriting.md, kyc-verification.md | When task type detected |
| 🛒 Retail | return-processing.md, price-matching.md, shipping-policy.md | When customer request matches |
| 🎓 Education | grading-rubric.md, lesson-planning.md, accessibility-guidelines.md | When teaching scenario identified |

The Pattern Repeats

🏦 Banking Agent: Customer asks about wire transfer → loads wire-transfer-procedure.md with compliance steps and limits. Doesn’t load mortgage skills.

🛒 Retail Agent: Customer wants to return an item → loads return-processing.md with policy rules. Doesn’t load inventory or shipping skills.

🎓 Education Agent: Student struggles with fractions → loads the remedial-math.md skill with a step-by-step teaching approach. Doesn’t load advanced calculus.


Key Takeaways

  • Skills = Procedural Memory: “How-to” knowledge loaded on demand, not pre-stuffed.
  • Folder structure > prompt engineering: Organize knowledge into files, not longer prompts.
  • Progressive disclosure reduces cost: Only pay for context you’re actually using.
  • Focus sharpens reasoning: Fewer instructions = clearer execution.
  • Scalability by design: Add 100 skills without bloating every request.

References

  1. Google Cloud Research, Context Engineering: Sessions & Memory (2025). Defines Procedural Memory as “How-to” knowledge distinct from Semantic Memory (facts).

  2. Anthropic, CLAUDE.md Pattern. The inspiration for skill-based context organization.

  3. Galileo, The “Lost in the Middle” Phenomenon. Research on context degradation in long prompts.

❓ Frequently Asked Questions

What is progressive context disclosure for AI agents?

Loading only the knowledge an agent needs for the current task, rather than stuffing everything into the system prompt. This reduces costs, improves focus, and scales efficiently.

What is the SKILL.md pattern?

A structured Markdown file with YAML frontmatter (name, description) followed by a body of procedural knowledge: when to apply the skill, the guidelines to follow, and the expected output format. Skills are loaded on demand based on task context.

When should I use Skills vs RAG?

Use Skills for stable, procedural HOW-TO knowledge (e.g., deployment steps). Use RAG for dynamic, factual WHAT knowledge that changes frequently (e.g., product catalogs).
