Advanced Memory Approaches¶
This page covers vector databases, knowledge graphs, and memory services. For coding agents, these approaches are rarely justified. The problems they solve are either not problems coding agents have, or are better solved with file-based tools.
Core Problem¶
These technologies emerged to solve retrieval at scale: millions of documents, thousands of users, queries where keyword search fails. Coding agents operate in a different context:
- Single codebase, single team
- Code is searchable by pattern (function names, imports, error messages), and context windows now hold 128K-200K tokens
- The agent can reason about what to search for
An agent with grep and glob can search most codebases effectively. Adding vector indices or graph databases introduces infrastructure overhead without proportional retrieval improvement.
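To make that concrete, here is a minimal sketch of pattern search as a tool, in pure Python. The `grep` helper, the `src/` layout, and the example pattern are illustrative stand-ins for whatever search tools the agent actually has:

```python
import re
from pathlib import Path

def grep(pattern: str, root: str = "src") -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every regex match under root."""
    regex = re.compile(pattern)
    matches = []
    for path in Path(root).rglob("*.py"):  # glob narrows by file type
        for n, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
            if regex.search(line):
                matches.append((str(path), n, line.strip()))
    return matches

# The agent drives the loop: search, read what it found, refine, search again.
for path, n, line in grep(r"def authenticate_user"):
    print(f"{path}:{n}: {line}")
```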
Vector Databases and RAG¶
Vector databases store embeddings and retrieve via semantic similarity. RAG (retrieval-augmented generation) uses this retrieval to augment prompts.
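The shape of that pattern is small enough to sketch. The toy `embed()` below is a stand-in for a real embedding model, and the documents and vocabulary are made up; only the structure matters: documents are embedded once, the query is embedded, and the top match is pasted into the prompt in a single shot.

```python
import math

def embed(text: str) -> list[float]:
    # Toy bag-of-words vector over a tiny fixed vocabulary; a real model
    # returns dense learned vectors, but the retrieval logic is identical.
    vocab = ["auth", "token", "user", "database", "cache"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

docs = [
    "auth token validation flow",
    "database cache eviction policy",
    "user session handling",
]
query = "how is the auth token checked"

# One-shot retrieval: rank by similarity to the original query and paste
# the top chunk into the prompt. No follow-up search happens after this.
best = max(docs, key=lambda d: cosine(embed(d), embed(query)))
prompt = f"Context:\n{best}\n\nQuestion: {query}"
print(prompt)
```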
RAG Underperforms for Code¶
RAG was a workaround for small context windows, and its retrieval is static - fixed at query time. Neither fits modern coding agents, which have large context windows and can search iteratively.
Chunking destroys context. Functions get split across chunk boundaries, and import statements end up separated from the code that uses them.
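A toy demonstration of the failure, assuming fixed-size character chunks (real pipelines chunk by token count, with the same failure mode):

```python
source = '''from auth import verify_token

def authenticate_user(request):
    token = request.headers["Authorization"]
    return verify_token(token)
'''

chunk_size = 80  # characters here; real pipelines chunk by tokens
chunks = [source[i:i + chunk_size] for i in range(0, len(source), chunk_size)]
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ---\n{chunk}")
# The import lands in chunk 0 while `return verify_token(token)` lands in
# chunk 1, so a retriever can surface the call without the import that
# explains where verify_token comes from - and one line is split mid-word.
```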
Embeddings miss code semantics. Code meaning lives in structure, not prose. "authenticate_user" and "verify_credentials" may do the same thing but embed differently.
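The lexical gap is easy to demonstrate: the two names share no tokens, so a name-sensitive embedding has nothing on the surface to connect them. A deliberately crude check (real models are subtler, but they still see prose-like surface features rather than call graphs or types):

```python
a = set("authenticate_user".split("_"))   # {"authenticate", "user"}
b = set("verify_credentials".split("_"))  # {"verify", "credentials"}
print(a & b)  # set() - zero lexical overlap between the names
```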
Retrieval cannot iterate. An agent can grep for an error message, read the file, notice it's a wrapper, grep for the underlying function, and continue. RAG retrieves based on embedding similarity to the original query - it cannot adapt mid-search.
Reindexing creates friction. Code changes constantly, and keeping embeddings current adds CI overhead and compute cost.
Large Documentation?¶
A common justification: "What if documentation exceeds context limits?"
Consider what this means. With 200K tokens, you can fit roughly 150K words (at a typical ~0.75 words per token) - several books' worth of documentation. If your docs exceed this, the problem is documentation organization, not retrieval technology.
An agent can search documentation with grep, read relevant sections, and search again based on what it learns. This iterative approach adapts to the query. Vector retrieval returns a fixed set of chunks based on initial embedding similarity.
When Vector Search Serves a Purpose¶
Vector databases solve real problems - they're not useless technology. But those problems are different from what coding agents face:
- Cross-user personalization: Products serving thousands of users need to retrieve user-specific context efficiently
- Semantic search over unstructured prose (customer support tickets, research papers, legal documents)
- Multi-tenant knowledge bases with isolated data per customer
These are product engineering problems, not coding agent problems.
Examples: Pinecone, Weaviate, Chroma, pgvector.
Knowledge Graphs¶
Knowledge graphs store entities and relationships, so queries traverse connections rather than searching flat text.
Multi-Hop Reasoning?¶
A common justification: "Knowledge graphs allow multi-hop reasoning - finding services that depend on services that use deprecated API X."
But an agent already does multi-hop reasoning:
- Grep for deprecated API usage
- Read files to find which services use it
- Grep for imports of those services, then read to find dependents
This is slower than a graph query, but it requires no infrastructure, no schema maintenance, no entity extraction pipeline. The agent reasons about each step rather than relying on pre-computed relationships.
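A hedged sketch of that sequence, assuming a hypothetical monorepo laid out as `services/<name>/...` and a deprecated call spelled `legacy_client.fetch` (both names are made up for illustration); system `grep` does the searching:

```python
import subprocess

def grep(pattern: str, root: str) -> list[str]:
    """Thin wrapper over system grep; returns path:line:text matches."""
    result = subprocess.run(
        ["grep", "-rn", "-E", pattern, root],
        capture_output=True, text=True,
    )
    return result.stdout.splitlines()

# Hop 1: which files call the deprecated API?
callers = grep(r"legacy_client\.fetch\(", "services")
using = {line.split("/")[1] for line in callers}  # services/<name>/...

# Hop 2: which files import those services? (The agent would read the
# hop-1 results before choosing this pattern - that is the adaptive step.)
for svc in sorted(using):
    for hit in grep(rf"from services\.{svc} import", "services"):
        print(f"dependent of {svc}: {hit.split(':')[0]}")
```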
Maintenance Burden¶
Knowledge graphs require:
- Schema design upfront, plus ongoing evolution
- Entity extraction (often error-prone for code) and relationship inference
- Query language expertise (Cypher, SPARQL)
- Dedicated infrastructure
Code changes constantly. Keeping graph relationships current means running extraction on every commit. Stale graphs return wrong answers with high confidence.
When Graphs Serve a Purpose¶
Like vector databases, knowledge graphs solve real problems:
- Compliance and audit: Regulatory queries requiring provable relationship chains
- Large enterprise architectures with hundreds of services, maintained by dedicated platform teams
- Domain modeling where entity relationships are the primary concern (healthcare records, financial instruments)
These require dedicated teams maintaining the graph as a first-class artifact. A coding agent on a single codebase doesn't have this, and building it would cost more than the retrieval benefit is worth.
Examples: Neo4j, Amazon Neptune.
Memory Services¶
Memory services are managed layers that extract and consolidate memory automatically.
Mem0¶
Mem0 extracts relevant information from conversations and consolidates it across sessions.
Built for products serving many users who need personalized experiences across conversations. A coding agent serving a single team gets the same personalization from an AGENTS.md file that the team maintains directly.
Letta (formerly MemGPT)¶
Letta implements a memory hierarchy in which agents manage their own memory, moving data between in-context and external storage.
Built for long-running agents with evolving state over extended periods. Coding agents typically run in sessions - context reloads each time. The session-based model means file-based memory (which persists between sessions) already provides what Letta's architecture offers.
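What file-based persistence amounts to is small enough to sketch directly; the `memory/notes.md` path and helper names below are illustrative, not a prescribed layout:

```python
from pathlib import Path

NOTES = Path("memory/notes.md")

def load_memory() -> str:
    """Read persisted notes into context at the start of a session."""
    return NOTES.read_text() if NOTES.exists() else ""

def remember(note: str) -> None:
    """Append an observation so future sessions can load it."""
    NOTES.parent.mkdir(parents=True, exist_ok=True)
    with NOTES.open("a") as f:
        f.write(f"- {note}\n")

remember("payments service wraps verify_token; see services/auth")
print(load_memory())
```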
Benchmark Results¶
Letta's own evaluation (Dec 2025) tested memory approaches on the LoCoMo benchmark:
| Approach | Score |
|---|---|
| Filesystem storage | 74% |
| Mem0 | 68.5% |
Simple file-based tools outperformed specialized memory infrastructure. Their conclusion: "Memory is more about how agents manage context than the exact retrieval mechanism."
Summary¶
Vector databases, knowledge graphs, and memory services solve real problems. Those problems are:
- Retrieval across millions of documents, or queries where keyword search genuinely fails
- Multi-tenant products with user-specific context
- Enterprise architectures with dedicated platform teams maintaining the graph
Coding agents on a single codebase face different problems. An agent with file-based search tools and the ability to reason about results handles most retrieval needs. The infrastructure cost of these advanced approaches exceeds their retrieval benefit.
Default to file-based memory. Add infrastructure only when you have measured evidence that file-based tools are insufficient: actual failures in practice, not hypothetical scenarios.
Related Topics¶
- File-Based Memory - Organization patterns
- Memory Types - Overview of memory categories