Memory Types and Scopes

Memory in AI agents varies by persistence, storage mechanism, and scope. Understanding these distinctions helps you choose the right approach for a given problem.

By Persistence

Short-Term Memory (Context Window)

The context window holds everything the model "sees" during a conversation: your messages, tool outputs, file contents, and its own responses. Current frontier models support 128K-200K tokens with full attention over the entire window.

This is working memory. It disappears when the session ends, so project context must be reloaded each time.
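
Because everything the agent sees shares the same token budget, it can help to estimate how much of the window a given file will consume. A minimal sketch using the tiktoken tokenizer; the encoding name and the 200K budget are assumptions, since tokenizers and limits vary by model:

```python
# Sketch: estimate how much of the context window a file consumes.
# Assumptions: tiktoken's "cl100k_base" encoding and a 200K-token budget;
# real models use their own tokenizers and limits.
import tiktoken

CONTEXT_BUDGET = 200_000  # assumed window size in tokens

def tokens_in(path: str) -> int:
    enc = tiktoken.get_encoding("cl100k_base")
    with open(path, encoding="utf-8") as f:
        return len(enc.encode(f.read()))

if __name__ == "__main__":
    n = tokens_in("AGENTS.md")
    print(f"AGENTS.md uses {n} tokens ({n / CONTEXT_BUDGET:.2%} of the window)")
```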

Long-Term Memory (External Storage)

Anything that persists outside the context window and can be read into it. Two main approaches:

File-based: Markdown files in your repo (AGENTS.md, skills, specs). Version-controlled, portable across tools, and loaded automatically.

Non-file: Vector databases, knowledge graphs, memory services. Better suited to large-scale retrieval or cross-user personalization, but they require additional infrastructure.

Parametric Memory (Model Weights)

Knowledge encoded in weights during training. The model "knows" Python because billions of parameters learned it from training data. This knowledge shapes every generation without consuming context tokens.

The tradeoff: weights are frozen at training time. The model can't incorporate anything it learns from your project into its weights.

Long-Term Memory Approaches

File-Based Memory

Markdown files stored in your repository:

Type     Examples                              When Loaded
Rules    AGENTS.md, CLAUDE.md, .cursorrules    Session start
Skills   .skills/deploy/SKILL.md               On invocation
Specs    specs/auth.md                         When referenced

Advantages: version-controlled with your code, portable across tools via the AGENTS.md standard, no external dependencies. Your team can review changes like any other code.

For coding agents, this is the primary approach. See File-Based Memory for organization patterns.

Vector Databases

Store embeddings of documents or past conversations. Retrieve via semantic similarity search. Examples: Pinecone, Weaviate, Chroma, pgvector.
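
As a concrete illustration of the store-and-retrieve loop, here is a minimal sketch using Chroma's in-memory client; the collection name, documents, and query are made up for the example:

```python
# Sketch: store past notes as embeddings and retrieve them by semantic similarity.
# Uses Chroma's default in-memory client and built-in embedding function;
# collection name and documents are illustrative.
import chromadb

client = chromadb.Client()
memories = client.create_collection(name="agent_memories")

memories.add(
    ids=["m1", "m2"],
    documents=[
        "The deploy script requires AWS_PROFILE=staging.",
        "Integration tests live in tests/integration and need Docker.",
    ],
)

# Retrieve the memory most relevant to the current task.
results = memories.query(query_texts=["How do I run the deploy?"], n_results=1)
print(results["documents"][0][0])
```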

Knowledge Graphs

Store entities and relationships. Agents traverse connections rather than searching flat text. Examples: Neo4j, Amazon Neptune.
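
To illustrate traversal over relationships rather than flat-text search, here is a minimal in-memory sketch; the entities and relations are made up, and a production setup would express the same query against a graph database such as Neo4j:

```python
# Sketch: a tiny entity-relationship store the agent can traverse.
# Entities and relations are illustrative; a real system would back this
# with a graph database (e.g. Neo4j) rather than a dict.
graph = {
    ("api", "DEPENDS_ON"): ["auth-service", "postgres"],
    ("auth-service", "OWNED_BY"): ["platform-team"],
    ("postgres", "OWNED_BY"): ["infra-team"],
}

def related(entity: str, relation: str) -> list[str]:
    return graph.get((entity, relation), [])

# Follow two hops: what the api depends on, and who owns each dependency.
for dep in related("api", "DEPENDS_ON"):
    print(dep, "owned by", related(dep, "OWNED_BY"))
```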

Memory Services

Managed layers that handle memory extraction, consolidation, and retrieval:

Mem0: Extracts relevant information from conversations, consolidates it across sessions, and retrieves context automatically.

Letta (formerly MemGPT): Implements memory hierarchy inspired by operating systems. Agents manage their own memory, moving data between in-context "RAM" and external "disk" storage.

These services handle memory management but add infrastructure dependencies.
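
The pattern these services automate is roughly extract, consolidate, retrieve. A schematic sketch of that loop follows; it is not any particular service's API, and the keyword-based extraction and substring retrieval are trivial placeholders for what a real service does with LLM calls and embeddings:

```python
# Sketch of the extract -> consolidate -> retrieve loop that memory
# services automate. Schematic only: the extraction and retrieval rules
# here are placeholders, not Mem0's or Letta's actual behavior.
store: list[str] = []  # stands in for the service's persistent memory

def extract(conversation: list[str]) -> list[str]:
    # Placeholder: keep lines that look like durable facts or decisions.
    return [line for line in conversation if line.startswith(("Decision:", "Fact:"))]

def consolidate(new_items: list[str]) -> None:
    # Placeholder: deduplicate against what is already stored.
    for item in new_items:
        if item not in store:
            store.append(item)

def retrieve(query: str) -> list[str]:
    # Placeholder: naive keyword match instead of semantic search.
    return [item for item in store if any(w in item.lower() for w in query.lower().split())]

consolidate(extract([
    "Fact: the staging database is reset every night.",
    "Can you rerun the tests?",
    "Decision: use pytest for all new tests.",
]))
print(retrieve("which test framework?"))
```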

For detailed evaluation of when these approaches are justified, see Advanced Memory Approaches.

Memory Scopes

For file-based memory, scope determines where rules apply and how specific they are.

User-Level

Preferences that apply across all projects for a specific user.

~/.claude/CLAUDE.md

Put personal preferences here: testing frameworks, communication style, frequently used tools.

Project-Level

Team-shared conventions in version control.

project/
├── AGENTS.md
├── CLAUDE.md
└── .cursorrules

Project-level memory captures architecture decisions, naming conventions, deployment procedures, domain vocabulary.

Folder-Level

Service-specific context for monorepos.

project/
├── AGENTS.md              # Project-wide
├── packages/
│   ├── api/
│   │   └── AGENTS.md      # API-specific
│   └── web/
│       └── AGENTS.md      # Frontend-specific

When working in packages/api/, the agent loads both the root and folder-specific rules.
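
A sketch of that loading order: walk from the repository root down to the working directory, collecting each AGENTS.md along the way so folder-specific rules layer on top of project-wide ones. The paths are illustrative:

```python
# Sketch: collect AGENTS.md files from the repo root down to the working
# directory, so folder-level rules are read after (and can refine)
# project-level ones. Assumes working_dir is inside repo_root.
from pathlib import Path

def collect_rules(repo_root: Path, working_dir: Path) -> list[Path]:
    rules = []
    current = working_dir.resolve()
    repo_root = repo_root.resolve()
    # Gather AGENTS.md from working_dir up to the root, then reverse so
    # the root file comes first and the most specific file comes last.
    while True:
        candidate = current / "AGENTS.md"
        if candidate.exists():
            rules.append(candidate)
        if current == repo_root or current == current.parent:
            break
        current = current.parent
    return list(reversed(rules))

# e.g. collect_rules(Path("project"), Path("project/packages/api"))
# -> [project/AGENTS.md, project/packages/api/AGENTS.md]
```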

Session-Level

What the model learns during the current conversation. Files read, errors seen, decisions made. Exists only in the context window and disappears when the session ends.

Choosing Scope

Information Type        Scope     Rationale
Personal coding style   User      Same across all projects
Team conventions        Project   Shared via version control
Service-specific APIs   Folder    Only relevant in that directory
Current task state      Session   Ephemeral by nature

Use the narrowest scope that covers the use case. Service-specific context placed at the project level adds noise for everyone working outside that service.