## Sessions vs Memory
| Aspect | Session | Memory |
|--------|---------|--------|
| Scope | Single conversation | ALL conversations |
| Duration | While chatting | Forever |
| Content | Raw messages | Extracted facts |
| Search | Sequential | Semantic |
| Token cost | Grows linearly | Fixed per query |
Analogy:
- Session = What you talked about in the current meeting
- Memory = Your notes from all past meetings
## Why Memory Matters
Sessions have limits:
- One conversation only
- Token costs grow with conversation length
- Cannot reference past conversations
Memory solves this:
- Persists across all conversations
- Fixed cost to retrieve relevant memories (see the sketch after this list)
- Agent can reference anything from the past
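To make the cost difference concrete, here is a rough back-of-the-envelope sketch (the 4-characters-per-token rule and all counts are illustrative):

```python
# Rough token math: ~4 characters per token (illustrative, not exact)
def tokens(text: str) -> int:
    return len(text) // 4

# Session: the whole transcript is resent on every turn, so cost grows
history = ["a typical user/assistant exchange of a few hundred characters " * 5
           for _ in range(100)]
session_cost = sum(tokens(m) for m in history)

# Memory: only the top-k retrieved facts are sent, regardless of history length
facts = ["Pablo prefers dark mode", "Pablo works in fintech in Dubai"]
memory_cost = sum(tokens(f) for f in facts)

print(session_cost)  # thousands of tokens after 100 turns, still growing
print(memory_cost)   # a handful of tokens, roughly constant
```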
## Simple Memory Store
A basic memory system with keyword search:
memory_store.py

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Memory:
    content: str
    timestamp: datetime
    category: str
    importance: float = 0.5  # 0 to 1


@dataclass
class MemoryStore:
    memories: list[Memory] = field(default_factory=list)

    def add(self, content: str, category: str, importance: float = 0.5):
        """Add a new memory."""
        self.memories.append(Memory(
            content=content,
            timestamp=datetime.now(),
            category=category,
            importance=importance
        ))

    def search(self, query: str, limit: int = 5) -> list[Memory]:
        """Search memories by keyword overlap, weighted by importance."""
        query_words = set(query.lower().split())
        scored = []
        for mem in self.memories:
            mem_words = set(mem.content.lower().split())
            overlap = len(query_words & mem_words)
            if overlap > 0:
                score = overlap * mem.importance
                scored.append((score, mem))
        scored.sort(reverse=True, key=lambda x: x[0])
        return [m for _, m in scored[:limit]]

    def get_by_category(self, category: str) -> list[Memory]:
        """Get all memories in a category."""
        return [m for m in self.memories if m.category == category]
```
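A quick check of the store on its own (the facts and importance values are illustrative):

```python
from memory_store import MemoryStore

store = MemoryStore()
store.add("Pablo prefers dark mode", category="preference", importance=0.8)
store.add("Pablo works in fintech in Dubai", category="work", importance=0.5)

# Keyword overlap: "dark" and "mode" match the first memory only
for mem in store.search("dark mode settings"):
    print(f"[{mem.category}] {mem.content}")
# [preference] Pablo prefers dark mode
```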
## Agent with Memory Tools
Give the agent tools to remember and recall across sessions:
agent_memory.py

```python
from dataclasses import dataclass, field

from pydantic_ai import Agent, RunContext

from memory_store import MemoryStore


@dataclass
class AgentContext:
    user_id: str
    memory: MemoryStore = field(default_factory=MemoryStore)


memory_agent = Agent(
    'openai:gpt-4o',
    deps_type=AgentContext,
    instructions='''You have long-term memory.
Use remember_fact to store important information about the user.
Use recall to find relevant memories before answering.
Personalize responses based on what you remember.'''
)


@memory_agent.tool
def remember_fact(
    ctx: RunContext[AgentContext],
    fact: str,
    category: str,
    importance: float = 0.5
) -> str:
    """Store an important fact in long-term memory.

    Args:
        fact: The fact to remember
        category: Category (personal, preference, work, relationship)
        importance: How important (0.0 to 1.0)
    """
    ctx.deps.memory.add(fact, category, importance)
    return f"Remembered: {fact}"


@memory_agent.tool
def recall(ctx: RunContext[AgentContext], query: str) -> list[str]:
    """Search memories for relevant information.

    Args:
        query: What to search for
    """
    memories = ctx.deps.memory.search(query)
    if not memories:
        return ["No relevant memories found."]
    return [f"[{m.category}] {m.content}" for m in memories]


# Usage across multiple sessions
context = AgentContext(user_id="pablo")

# Session 1
result1 = memory_agent.run_sync(
    "I'm Pablo, I work in fintech in Dubai. I prefer dark mode.",
    deps=context
)

# Session 2 (later, new conversation)
result2 = memory_agent.run_sync(
    "What do you remember about me?",
    deps=context
    # No message_history - this is a NEW conversation
)
print(result2.output)
# Output: I remember you are Pablo, you work in fintech in Dubai,
# and you prefer dark mode.
```
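Note that this MemoryStore lives in process memory, so it only persists across conversations within a single run. To survive restarts, serialize it to durable storage. Here is a minimal JSON-file sketch; the save_store/load_store helpers and the memories.json path are illustrative, and any file, SQLite, or Redis backend would do:

```python
import json
from datetime import datetime
from pathlib import Path

from memory_store import Memory, MemoryStore


def save_store(store: MemoryStore, path: str = "memories.json") -> None:
    """Write all memories to a JSON file between runs."""
    data = [
        {
            "content": m.content,
            "timestamp": m.timestamp.isoformat(),
            "category": m.category,
            "importance": m.importance,
        }
        for m in store.memories
    ]
    Path(path).write_text(json.dumps(data, indent=2))


def load_store(path: str = "memories.json") -> MemoryStore:
    """Rebuild a MemoryStore from the JSON file, or start empty."""
    store = MemoryStore()
    if Path(path).exists():
        for item in json.loads(Path(path).read_text()):
            store.memories.append(Memory(
                content=item["content"],
                timestamp=datetime.fromisoformat(item["timestamp"]),
                category=item["category"],
                importance=item["importance"],
            ))
    return store
```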
## Automatic Memory Extraction
Instead of relying on the agent to save memories, you can automatically extract facts after each conversation:
auto_extract.py

```python
from pydantic import BaseModel
from pydantic_ai import Agent

from memory_store import MemoryStore


class ExtractedFacts(BaseModel):
    # Parallel lists: categories[i] is the category for facts[i]
    facts: list[str]
    categories: list[str]


extractor = Agent(
    'openai:gpt-4o-mini',
    output_type=ExtractedFacts,
    instructions='''Extract important facts from the conversation.
Focus on: names, preferences, jobs, locations, relationships.
Return each fact separately with its category.'''
)


async def auto_extract_memories(conversation: str, memory_store: MemoryStore):
    """Run after each conversation to extract and store memories."""
    result = await extractor.run(f'Extract facts from:\n{conversation}')
    for fact, category in zip(result.output.facts, result.output.categories):
        memory_store.add(fact, category, importance=0.7)
```
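Wiring this up after a session might look like the following; the transcript is hard-coded for illustration, and the extracted facts depend entirely on the model:

```python
import asyncio

from memory_store import MemoryStore

conversation = """
User: I'm Pablo, I work in fintech in Dubai. I prefer dark mode.
Assistant: Nice to meet you, Pablo! I'll keep that in mind.
"""

store = MemoryStore()
asyncio.run(auto_extract_memories(conversation, store))

for m in store.memories:
    print(f"[{m.category}] {m.content}")
# e.g. [personal] User's name is Pablo
#      [work] Works in fintech in Dubai
#      [preference] Prefers dark mode
```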
## Vector Database for Production
For production, use a vector database for semantic search. Popular options:
- Pinecone (managed)
- Weaviate (self-hosted or managed)
- ChromaDB (local, good for development)
- pgvector (PostgreSQL extension)
vector_memory.py

```python
# Conceptual example - actual implementation depends on your vector DB
class VectorMemoryStore:
    def __init__(self, embedding_model, vector_db):
        self.embedding_model = embedding_model
        self.vector_db = vector_db

    def add(self, content: str, metadata: dict):
        embedding = self.embedding_model.embed(content)
        self.vector_db.insert(embedding, content, metadata)

    def search(self, query: str, limit: int = 5) -> list[dict]:
        query_embedding = self.embedding_model.embed(query)
        return self.vector_db.search(query_embedding, limit=limit)
```
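As a concrete starting point, a ChromaDB version might look like the sketch below. The collection name and metadata fields are arbitrary, and Chroma embeds documents with a default local model unless you configure another:

```python
import chromadb

client = chromadb.Client()  # in-memory; use chromadb.PersistentClient(path=...) to keep data
memories = client.create_collection("memories")

memories.add(
    ids=["mem-1", "mem-2"],
    documents=["Pablo prefers dark mode", "Pablo works in fintech in Dubai"],
    metadatas=[{"category": "preference"}, {"category": "work"}],
)

# Semantic search matches on meaning, not exact keywords:
# "UI theme settings" shares no words with "dark mode" but should rank it first
results = memories.query(query_texts=["UI theme settings"], n_results=1)
print(results["documents"][0])  # ['Pablo prefers dark mode']
```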
## Key Takeaways
1. Memory is not the same as session. Sessions are one conversation; memory spans all conversations.
2. Store facts, not raw messages. Extract important information, not entire chat logs.
3. Categories help organization. Personal, work, preferences, etc.
4. Search by relevance. When recalling, find the most relevant memories for the current context.
5. Use vector DBs in production. Semantic search finds relevant memories even with different wording.