Chapter 9

Building a Memory System

Memory Systems for Agents

Introduction

It's time to bring everything together. In this section, we'll build a complete memory system that combines short-term conversation memory, long-term vector storage, and knowledge-graph relationships behind a single interface an agent can use.

What We're Building: A production-ready memory system with working memory, semantic long-term storage, entity tracking, and session continuity.

System Architecture

Our memory system has four main components:

πŸ“architecture.txt
1MEMORY SYSTEM ARCHITECTURE
2
3β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
4β”‚                      AGENT MEMORY                           β”‚
5β”‚                                                             β”‚
6β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
7β”‚  β”‚              WORKING MEMORY (in-process)              β”‚  β”‚
8β”‚  β”‚  β€’ Current conversation messages                      β”‚  β”‚
9β”‚  β”‚  β€’ Active task state                                  β”‚  β”‚
10β”‚  β”‚  β€’ Scratchpad for reasoning                          β”‚  β”‚
11β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
12β”‚                           β”‚                                 β”‚
13β”‚                           β–Ό                                 β”‚
14β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
15β”‚  β”‚           RETRIEVAL LAYER (orchestrates)              β”‚  β”‚
16β”‚  β”‚  β€’ Decides what to retrieve                           β”‚  β”‚
17β”‚  β”‚  β€’ Combines results from different stores             β”‚  β”‚
18β”‚  β”‚  β€’ Formats context for LLM                           β”‚  β”‚
19β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
20β”‚              β”‚                           β”‚                  β”‚
21β”‚              β–Ό                           β–Ό                  β”‚
22β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
23β”‚  β”‚  SEMANTIC MEMORY    β”‚    β”‚  STRUCTURED MEMORY          β”‚β”‚
24β”‚  β”‚  (Vector Store)     β”‚    β”‚  (Knowledge Graph)          β”‚β”‚
25β”‚  β”‚  β€’ Facts            β”‚    β”‚  β€’ Entity relationships     β”‚β”‚
26β”‚  β”‚  β€’ Preferences      β”‚    β”‚  β€’ Explicit connections     β”‚β”‚
27β”‚  β”‚  β€’ Summaries        β”‚    β”‚  β€’ Hierarchy / ownership    β”‚β”‚
28β”‚  β”‚  β€’ Experiences      β”‚    β”‚                             β”‚β”‚
29β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
30β”‚                                                             β”‚
31β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
32β”‚  β”‚              SESSION STORE (persistence)              β”‚  β”‚
33β”‚  β”‚  β€’ User profiles                                      β”‚  β”‚
34β”‚  β”‚  β€’ Session histories                                  β”‚  β”‚
35β”‚  β”‚  β€’ Cross-session summaries                           β”‚  β”‚
36β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
37β”‚                                                             β”‚
38β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
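
Before implementing the full retrieval layer, its core job (combine results from different stores, format context for the LLM) can be sketched as a small pure function. The names here are illustrative, not part of the final implementation:

```python
def format_context(semantic_hits: list[str], graph_facts: list[str]) -> str:
    """Merge results from the vector store and knowledge graph
    into one labeled prompt section."""
    parts = []
    if semantic_hits:
        parts.append("RELEVANT MEMORIES:")
        parts.extend(f"- {hit}" for hit in semantic_hits)
    if graph_facts:
        parts.append("KNOWN RELATIONSHIPS:")
        parts.extend(f"- {fact}" for fact in graph_facts)
    return "\n".join(parts)

print(format_context(
    ["User prefers dark mode"],
    ["Alice --[manages]--> Bob"],
))
```

The real retrieval layer in `memory_manager.py` follows this same pattern, with the inputs coming from actual store queries.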

Complete Implementation

Let's implement each component:

Core Data Structures

🐍memory_types.py
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional

class MemoryType(Enum):
    FACT = "fact"
    PREFERENCE = "preference"
    EXPERIENCE = "experience"
    SUMMARY = "summary"
    ENTITY = "entity"

@dataclass
class Memory:
    """A single memory entry."""
    id: str
    content: str
    memory_type: MemoryType
    importance: float  # 0.0 to 1.0
    created_at: datetime
    last_accessed: datetime
    access_count: int = 0
    metadata: dict = field(default_factory=dict)

@dataclass
class Entity:
    """An entity in the knowledge graph."""
    id: str
    name: str
    entity_type: str
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    """A relationship between entities."""
    source_id: str
    target_id: str
    relation: str
    properties: dict = field(default_factory=dict)

@dataclass
class ConversationTurn:
    """A single turn in conversation."""
    role: str
    content: str
    timestamp: datetime = field(default_factory=datetime.now)
    tool_calls: list = field(default_factory=list)
    tool_results: list = field(default_factory=list)

@dataclass
class Session:
    """A conversation session."""
    session_id: str
    user_id: str
    started_at: datetime
    ended_at: Optional[datetime] = None
    summary: Optional[str] = None
    key_facts: list[str] = field(default_factory=list)

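One detail worth noting in these dataclasses: the mutable defaults (`metadata`, `tool_calls`, `key_facts`) use `field(default_factory=...)` because Python evaluates ordinary default values once, so a bare mutable default would be shared by every instance; dataclasses reject such defaults outright. A minimal demonstration:

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    content: str
    tool_calls: list = field(default_factory=list)  # fresh list per instance

a = Turn("hi")
b = Turn("hello")
a.tool_calls.append("search")

print(b.tool_calls)  # → [] (b is unaffected)
```

Writing `tool_calls: list = []` instead would raise a `ValueError` at class definition time.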
Working Memory

🐍working_memory.py
from collections import deque
from typing import Optional

import tiktoken

from memory_types import ConversationTurn, Entity

class WorkingMemory:
    """In-process memory for the current conversation."""

    def __init__(
        self,
        max_tokens: int = 100000,
        model: str = "claude-3-5-sonnet-latest"
    ):
        self.max_tokens = max_tokens
        self.model = model
        self.turns: deque[ConversationTurn] = deque()
        self.current_task: Optional[dict] = None
        self.scratchpad: str = ""
        self.entities_mentioned: dict[str, Entity] = {}
        # cl100k_base is an approximation: Claude's tokenizer isn't
        # public, but it's close enough for budget enforcement
        self._encoder = tiktoken.get_encoding("cl100k_base")

    def add_turn(
        self,
        role: str,
        content: str,
        tool_calls: Optional[list] = None,
        tool_results: Optional[list] = None
    ) -> None:
        """Add a conversation turn."""
        turn = ConversationTurn(
            role=role,
            content=content,
            tool_calls=tool_calls or [],
            tool_results=tool_results or []
        )
        self.turns.append(turn)
        self._trim_to_fit()

    def _trim_to_fit(self) -> None:
        """Drop oldest turns if over the token limit, keeping at least two."""
        while self._count_tokens() > self.max_tokens and len(self.turns) > 2:
            self.turns.popleft()

    def _count_tokens(self) -> int:
        """Count total tokens in working memory."""
        total = 0
        for turn in self.turns:
            total += len(self._encoder.encode(turn.content))
            for tc in turn.tool_calls:
                total += len(self._encoder.encode(str(tc)))
            for tr in turn.tool_results:
                total += len(self._encoder.encode(str(tr)))
        total += len(self._encoder.encode(self.scratchpad))
        return total

    def get_messages(self) -> list[dict]:
        """Get messages in LLM format."""
        return [{"role": turn.role, "content": turn.content} for turn in self.turns]

    def set_task(self, task: dict) -> None:
        """Set the current active task."""
        self.current_task = task

    def update_scratchpad(self, content: str) -> None:
        """Update the reasoning scratchpad."""
        self.scratchpad = content

    def track_entity(self, entity: Entity) -> None:
        """Track a mentioned entity."""
        self.entities_mentioned[entity.id] = entity

    def clear(self) -> None:
        """Clear working memory."""
        self.turns.clear()
        self.current_task = None
        self.scratchpad = ""
        self.entities_mentioned.clear()

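The trimming policy is the heart of this class: drop the oldest turns until the budget fits, but never go below two turns. Here is the same logic as a standalone sketch, with tiktoken swapped for a whitespace word count so it runs without dependencies:

```python
from collections import deque

MAX_TOKENS = 8  # tiny budget for demonstration

def count_tokens(turns: deque) -> int:
    # Stand-in for the tiktoken count in WorkingMemory
    return sum(len(t.split()) for t in turns)

def trim_to_fit(turns: deque) -> None:
    # Evict oldest-first, but always keep at least two turns
    while count_tokens(turns) > MAX_TOKENS and len(turns) > 2:
        turns.popleft()

turns = deque(["one two three", "four five six", "seven eight", "nine ten"])
trim_to_fit(turns)
print(list(turns))  # → ['four five six', 'seven eight', 'nine ten']
```

The "keep at least two" floor guarantees the model always sees the latest user/assistant exchange, even if a single turn blows the budget.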
Semantic Memory (Vector Store)

🐍semantic_memory.py
import uuid
from datetime import datetime

import chromadb
from chromadb.utils import embedding_functions

from memory_types import Memory, MemoryType

class SemanticMemory:
    """Long-term semantic memory using a vector store."""

    def __init__(
        self,
        persist_path: str = "./memory_db",
        embedding_model: str = "text-embedding-3-small"
    ):
        self.client = chromadb.PersistentClient(path=persist_path)

        # Set up embedding function (reads OPENAI_API_KEY from the environment)
        self.embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
            model_name=embedding_model
        )

        # Single collection; memory type is tracked in metadata
        self.memories = self.client.get_or_create_collection(
            name="memories",
            embedding_function=self.embedding_fn,
            metadata={"hnsw:space": "cosine"}
        )

    def store(
        self,
        user_id: str,
        content: str,
        memory_type: MemoryType,
        importance: float = 0.5,
        metadata: dict = None
    ) -> str:
        """Store a new memory."""
        memory_id = str(uuid.uuid4())[:8]

        full_metadata = {
            "user_id": user_id,
            "memory_type": memory_type.value,
            "importance": importance,
            "created_at": datetime.now().isoformat(),
            "last_accessed": datetime.now().isoformat(),
            "access_count": 0,
            **(metadata or {})
        }

        self.memories.add(
            ids=[memory_id],
            documents=[content],
            metadatas=[full_metadata]
        )

        return memory_id

    def retrieve(
        self,
        user_id: str,
        query: str,
        limit: int = 10,
        memory_types: list[MemoryType] = None,
        min_importance: float = 0.0
    ) -> list[Memory]:
        """Retrieve relevant memories."""
        # Chroma expects a single top-level key in `where`, so
        # multiple conditions must be combined under $and
        conditions = [{"user_id": user_id}]

        if memory_types:
            conditions.append(
                {"memory_type": {"$in": [mt.value for mt in memory_types]}}
            )

        if min_importance > 0:
            conditions.append({"importance": {"$gte": min_importance}})

        where_filter = conditions[0] if len(conditions) == 1 else {"$and": conditions}

        results = self.memories.query(
            query_texts=[query],
            n_results=limit,
            where=where_filter,
            include=["documents", "metadatas"]
        )

        memories = []
        for i in range(len(results["ids"][0])):
            meta = results["metadatas"][0][i]

            # Update access stats
            self._update_access(results["ids"][0][i])

            memories.append(Memory(
                id=results["ids"][0][i],
                content=results["documents"][0][i],
                memory_type=MemoryType(meta["memory_type"]),
                importance=meta["importance"],
                created_at=datetime.fromisoformat(meta["created_at"]),
                last_accessed=datetime.now(),
                access_count=meta.get("access_count", 0) + 1,
                metadata=meta
            ))

        return memories

    def _update_access(self, memory_id: str) -> None:
        """Update access time and count."""
        result = self.memories.get(ids=[memory_id], include=["metadatas"])
        if result["metadatas"]:
            meta = result["metadatas"][0]
            meta["last_accessed"] = datetime.now().isoformat()
            meta["access_count"] = meta.get("access_count", 0) + 1
            self.memories.update(ids=[memory_id], metadatas=[meta])

    def delete_user_memories(self, user_id: str) -> int:
        """Delete all memories for a user (GDPR compliance)."""
        results = self.memories.get(
            where={"user_id": user_id},
            include=[]
        )

        if results["ids"]:
            self.memories.delete(ids=results["ids"])
            return len(results["ids"])

        return 0

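A caveat when composing the `where` filter: recent Chroma versions expect a filter to have exactly one top-level key, so combining a `user_id` equality with type or importance conditions requires wrapping them in an explicit `$and`. A small standalone helper showing the construction (`build_where` is a hypothetical name, not part of the Chroma API):

```python
def build_where(user_id: str, memory_types=None, min_importance: float = 0.0) -> dict:
    """Combine conditions under $and, since Chroma wants one top-level key."""
    conditions = [{"user_id": user_id}]
    if memory_types:
        conditions.append({"memory_type": {"$in": list(memory_types)}})
    if min_importance > 0:
        conditions.append({"importance": {"$gte": min_importance}})
    # A single condition can be passed bare; two or more need $and
    return conditions[0] if len(conditions) == 1 else {"$and": conditions}

print(build_where("alice"))
# → {'user_id': 'alice'}
print(build_where("alice", ["fact"], 0.5))
# → {'$and': [{'user_id': 'alice'},
#             {'memory_type': {'$in': ['fact']}},
#             {'importance': {'$gte': 0.5}}]}
```

Passing a flat multi-key dict instead tends to fail with "Expected where to have exactly one operator".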
Entity Memory (Knowledge Graph)

🐍entity_memory.py
import json
from typing import Optional

import networkx as nx

from memory_types import Entity, Relationship

class EntityMemory:
    """Knowledge graph for entity relationships."""

    def __init__(self, persist_path: str = "./entity_graph.json"):
        self.persist_path = persist_path
        self.graph = nx.MultiDiGraph()
        self._load()

    def _load(self) -> None:
        """Load graph from disk."""
        try:
            with open(self.persist_path, 'r') as f:
                data = json.load(f)
                self.graph = nx.node_link_graph(data)
        except FileNotFoundError:
            self.graph = nx.MultiDiGraph()

    def _save(self) -> None:
        """Save graph to disk."""
        data = nx.node_link_data(self.graph)
        with open(self.persist_path, 'w') as f:
            json.dump(data, f)

    def add_entity(self, entity: Entity, user_id: str) -> None:
        """Add or update an entity."""
        self.graph.add_node(
            entity.id,
            name=entity.name,
            type=entity.entity_type,
            user_id=user_id,
            **entity.properties
        )
        self._save()

    def add_relationship(self, rel: Relationship) -> None:
        """Add a relationship between entities."""
        self.graph.add_edge(
            rel.source_id,
            rel.target_id,
            relation=rel.relation,
            **rel.properties
        )
        self._save()

    def get_entity(self, entity_id: str) -> Optional[Entity]:
        """Get an entity by ID."""
        if entity_id not in self.graph:
            return None

        data = self.graph.nodes[entity_id]
        return Entity(
            id=entity_id,
            name=data.get("name", entity_id),
            entity_type=data.get("type", "unknown"),
            properties={k: v for k, v in data.items()
                        if k not in ("name", "type", "user_id")}
        )

    def find_by_name(self, name: str, user_id: str) -> Optional[Entity]:
        """Find an entity by name (case-insensitive)."""
        for node_id, data in self.graph.nodes(data=True):
            if (data.get("name", "").lower() == name.lower() and
                    data.get("user_id") == user_id):
                return self.get_entity(node_id)
        return None

    def get_related(
        self,
        entity_id: str,
        relation: Optional[str] = None,
        direction: str = "both"
    ) -> list[tuple[Entity, str]]:
        """Get related entities, optionally filtered by relation type."""
        related = []

        if direction in ("out", "both"):
            for _, target, data in self.graph.out_edges(entity_id, data=True):
                if relation is None or data.get("relation") == relation:
                    entity = self.get_entity(target)
                    if entity:
                        related.append((entity, data.get("relation")))

        if direction in ("in", "both"):
            for source, _, data in self.graph.in_edges(entity_id, data=True):
                if relation is None or data.get("relation") == relation:
                    entity = self.get_entity(source)
                    if entity:
                        related.append((entity, data.get("relation")))

        return related

    def get_subgraph_context(
        self,
        entity_id: str,
        hops: int = 2
    ) -> str:
        """Get a text description of the subgraph around an entity."""
        if entity_id not in self.graph:
            return ""

        # Collect nodes within N hops, following edges in both directions
        # (out_edges/in_edges yield (u, v) pairs when data isn't requested)
        nodes = {entity_id}
        frontier = {entity_id}

        for _ in range(hops):
            new_frontier = set()
            for n in frontier:
                for _, target in self.graph.out_edges(n):
                    new_frontier.add(target)
                for source, _ in self.graph.in_edges(n):
                    new_frontier.add(source)
            nodes.update(new_frontier)
            frontier = new_frontier

        # Build description
        lines = []
        for node_id in nodes:
            data = self.graph.nodes[node_id]
            lines.append(f"- {data.get('name', node_id)} ({data.get('type', 'entity')})")

        lines.append("\nRelationships:")
        for source, target, data in self.graph.edges(data=True):
            if source in nodes and target in nodes:
                source_name = self.graph.nodes[source].get('name', source)
                target_name = self.graph.nodes[target].get('name', target)
                lines.append(f"- {source_name} --[{data.get('relation')}]--> {target_name}")

        return "\n".join(lines)

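The hop expansion in `get_subgraph_context` is a plain breadth-first frontier walk. The same idea on a bare adjacency dict, following only outgoing edges for brevity (the class version walks both directions):

```python
def nodes_within_hops(adj: dict[str, list[str]], start: str, hops: int) -> set[str]:
    """Collect all nodes reachable from `start` in at most `hops` steps."""
    nodes, frontier = {start}, {start}
    for _ in range(hops):
        # Next frontier: all neighbors of the current frontier
        frontier = {nbr for n in frontier for nbr in adj.get(n, [])}
        nodes |= frontier
    return nodes

adj = {"alice": ["bob"], "bob": ["acme"], "acme": ["nyc"]}
print(sorted(nodes_within_hops(adj, "alice", 2)))  # → ['acme', 'alice', 'bob']
```

With `hops=2`, `nyc` is excluded because it is three edges away from `alice`; raising `hops` widens the context (and the prompt cost) accordingly.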
Unified Memory Manager

🐍memory_manager.py
import json
import uuid
from typing import Optional

from entity_memory import EntityMemory
from memory_types import Entity, MemoryType
from semantic_memory import SemanticMemory
from working_memory import WorkingMemory

class MemoryManager:
    """Unified interface for all memory components."""

    def __init__(
        self,
        persist_dir: str = "./agent_memory",
        llm=None  # For extraction and summarization
    ):
        self.working = WorkingMemory()
        self.semantic = SemanticMemory(persist_path=f"{persist_dir}/vectors")
        self.entities = EntityMemory(persist_path=f"{persist_dir}/graph.json")
        self.llm = llm
        self.current_user_id: Optional[str] = None
        self.current_session_id: Optional[str] = None

    async def start_session(self, user_id: str) -> str:
        """Start a new session for a user."""
        self.current_user_id = user_id
        self.current_session_id = str(uuid.uuid4())[:8]
        self.working.clear()

        # Load user preferences into the working memory scratchpad
        prefs = self.semantic.retrieve(
            user_id=user_id,
            query="user preferences and settings",
            limit=5,
            memory_types=[MemoryType.PREFERENCE]
        )

        if prefs:
            pref_text = "\n".join(f"- {p.content}" for p in prefs)
            self.working.update_scratchpad(f"User preferences:\n{pref_text}")

        return self.current_session_id

    async def process_turn(
        self,
        role: str,
        content: str,
        tool_calls: Optional[list] = None,
        tool_results: Optional[list] = None
    ) -> None:
        """Process a conversation turn."""
        # Add to working memory
        self.working.add_turn(role, content, tool_calls, tool_results)

        # Extract entities from user messages
        if role == "user" and self.llm:
            await self._extract_and_track_entities(content)

    async def _extract_and_track_entities(self, content: str) -> None:
        """Extract entities from a user message."""
        prompt = f"""Extract entities from this message.
Return JSON: [{{"id": "short_id", "name": "Name", "type": "person|place|thing|org"}}]
Only include clearly identifiable entities.

Message: {content}"""

        try:
            response = await self.llm.generate(prompt)
            entities_data = json.loads(response)

            for e_data in entities_data:
                entity = Entity(
                    id=e_data["id"],
                    name=e_data["name"],
                    entity_type=e_data["type"]
                )
                self.working.track_entity(entity)
                self.entities.add_entity(entity, self.current_user_id)
        except Exception:
            pass  # Entity extraction is best-effort

    def get_context_for_llm(self) -> str:
        """Get formatted context for the LLM prompt."""
        parts = []

        # Add scratchpad if present
        if self.working.scratchpad:
            parts.append(f"CONTEXT:\n{self.working.scratchpad}")

        # Retrieved long-term memories are added separately
        # via retrieve_relevant_memories

        return "\n\n".join(parts) if parts else ""

    async def retrieve_relevant_memories(
        self,
        query: str,
        include_entities: bool = True
    ) -> str:
        """Retrieve relevant memories for the current context."""
        parts = []

        # Semantic memory search
        memories = self.semantic.retrieve(
            user_id=self.current_user_id,
            query=query,
            limit=5
        )

        if memories:
            parts.append("RELEVANT MEMORIES:")
            for mem in memories:
                parts.append(f"- [{mem.memory_type.value}] {mem.content}")

        # Entity context
        if include_entities:
            for entity_id, entity in self.working.entities_mentioned.items():
                context = self.entities.get_subgraph_context(entity_id, hops=1)
                if context:
                    parts.append(f"\nABOUT {entity.name}:\n{context}")

        return "\n".join(parts) if parts else ""

    async def end_session(self) -> None:
        """End the session and consolidate memories."""
        if not self.llm:
            return

        messages = self.working.get_messages()
        if len(messages) < 2:
            return

        # Generate session summary
        conv_text = "\n".join(f"{m['role']}: {m['content'][:300]}" for m in messages)

        summary_prompt = f"""Summarize this conversation in 2-3 sentences:
{conv_text}"""

        summary = await self.llm.generate(summary_prompt)

        # Store summary
        self.semantic.store(
            user_id=self.current_user_id,
            content=summary,
            memory_type=MemoryType.SUMMARY,
            importance=0.6,
            metadata={"session_id": self.current_session_id}
        )

        # Extract and store key facts
        facts_prompt = f"""Extract key facts to remember from this conversation.
Return JSON array of strings. Only include important, lasting facts.

{conv_text}"""

        try:
            facts = json.loads(await self.llm.generate(facts_prompt))
            for fact in facts:
                self.semantic.store(
                    user_id=self.current_user_id,
                    content=fact,
                    memory_type=MemoryType.FACT,
                    importance=0.7
                )
        except Exception:
            pass  # Fact extraction is best-effort

        # Clear working memory
        self.working.clear()

    def remember(
        self,
        content: str,
        memory_type: MemoryType = MemoryType.FACT,
        importance: float = 0.5
    ) -> str:
        """Explicitly store a memory."""
        return self.semantic.store(
            user_id=self.current_user_id,
            content=content,
            memory_type=memory_type,
            importance=importance
        )

    def forget_user(self, user_id: str) -> dict:
        """Delete all memories for a user (GDPR)."""
        semantic_count = self.semantic.delete_user_memories(user_id)
        # The entity graph would also need cleaning in production
        return {"memories_deleted": semantic_count}

Integrating with an Agent

Here's how to use the memory system with an agent:

🐍memory_agent.py
import asyncio

import anthropic

from memory_manager import MemoryManager
from memory_types import MemoryType

class MemoryAgent:
    """Agent with an integrated memory system."""

    def __init__(
        self,
        memory_dir: str = "./agent_memory",
        model: str = "claude-sonnet-4-20250514"
    ):
        self.client = anthropic.Anthropic()
        self.model = model
        self.memory = MemoryManager(persist_dir=memory_dir, llm=self)

    async def generate(self, prompt: str) -> str:
        """Generate text (used by the memory manager)."""
        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.content[0].text

    async def start_session(self, user_id: str) -> None:
        """Start a new conversation session."""
        await self.memory.start_session(user_id)

    async def chat(self, message: str) -> str:
        """Process a user message and generate a response."""
        # Add user message to working memory
        await self.memory.process_turn("user", message)

        # Retrieve relevant long-term memories
        memory_context = await self.memory.retrieve_relevant_memories(message)

        # Build system prompt with memory context
        system_prompt = self._build_system_prompt(memory_context)

        # Get conversation history
        messages = self.memory.working.get_messages()

        # Generate response
        response = self.client.messages.create(
            model=self.model,
            max_tokens=4096,
            system=system_prompt,
            messages=messages
        )

        assistant_message = response.content[0].text

        # Add assistant response to working memory
        await self.memory.process_turn("assistant", assistant_message)

        # Check for explicit memory requests
        await self._handle_memory_requests(message, assistant_message)

        return assistant_message

    def _build_system_prompt(self, memory_context: str) -> str:
        """Build a system prompt with memory context."""
        base = """You are a helpful AI assistant with memory capabilities.
You can remember information across conversations and recall relevant context.

When the user asks you to remember something, acknowledge it.
Use your memory context to personalize responses."""

        if memory_context:
            return f"{base}\n\nMEMORY CONTEXT:\n{memory_context}"

        return base

    async def _handle_memory_requests(
        self,
        user_message: str,
        assistant_response: str
    ) -> None:
        """Handle explicit 'remember this' requests."""
        remember_keywords = ["remember that", "don't forget", "keep in mind"]

        if any(kw in user_message.lower() for kw in remember_keywords):
            # Extract what to remember
            prompt = f"""What fact should be remembered from this message?
Return just the fact to remember, nothing else.

User: {user_message}"""

            fact = await self.generate(prompt)
            self.memory.remember(
                content=fact.strip(),
                memory_type=MemoryType.FACT,
                importance=0.8
            )

    async def end_session(self) -> None:
        """End the current session."""
        await self.memory.end_session()


# Usage
async def main():
    agent = MemoryAgent()

    # Start a session for the user
    await agent.start_session(user_id="alice")

    # Conversation
    print(await agent.chat("Hi! My name is Alice and I prefer concise responses."))
    print(await agent.chat("Remember that I'm working on a Python project."))
    print(await agent.chat("What do you know about me?"))

    # End session (saves memories)
    await agent.end_session()

    # New session: memories persist
    await agent.start_session(user_id="alice")
    print(await agent.chat("What was I working on?"))

if __name__ == "__main__":
    asyncio.run(main())

Testing the Memory System

🐍test_memory.py
import pytest
import pytest_asyncio

from memory_manager import MemoryManager
from memory_types import Entity, MemoryType, Relationship

@pytest_asyncio.fixture
async def memory():
    """Create a fresh memory manager for each test."""
    mm = MemoryManager(persist_dir="./test_memory")
    yield mm
    # Cleanup
    mm.forget_user("test_user")

@pytest.mark.asyncio
async def test_store_and_retrieve(memory):
    """Test basic memory storage and retrieval."""
    await memory.start_session("test_user")

    # Store a memory
    memory_id = memory.remember(
        "User prefers dark mode",
        memory_type=MemoryType.PREFERENCE,
        importance=0.8
    )

    assert memory_id is not None

    # Retrieve it
    memories = memory.semantic.retrieve(
        user_id="test_user",
        query="display preferences",
        limit=5
    )

    assert len(memories) > 0
    assert "dark mode" in memories[0].content.lower()

@pytest.mark.asyncio
async def test_session_continuity(memory):
    """Test that memories persist across sessions."""
    # Session 1
    await memory.start_session("test_user")
    memory.remember("Working on Project Alpha", MemoryType.FACT)
    await memory.end_session()

    # Session 2
    await memory.start_session("test_user")
    context = await memory.retrieve_relevant_memories("What project?")

    assert "Project Alpha" in context

@pytest.mark.asyncio
async def test_entity_tracking(memory):
    """Test entity storage and relationships."""
    await memory.start_session("test_user")

    # Manually add entities
    alice = Entity(id="alice", name="Alice", entity_type="person")
    bob = Entity(id="bob", name="Bob", entity_type="person")

    memory.entities.add_entity(alice, "test_user")
    memory.entities.add_entity(bob, "test_user")
    memory.entities.add_relationship(
        Relationship(source_id="alice", target_id="bob", relation="manages")
    )

    # Query relationships
    related = memory.entities.get_related("alice", relation="manages")

    assert len(related) == 1
    assert related[0][0].name == "Bob"

@pytest.mark.asyncio
async def test_gdpr_deletion(memory):
    """Test that user data can be fully deleted."""
    await memory.start_session("gdpr_user")

    # Add various memories
    memory.remember("Personal fact 1", MemoryType.FACT)
    memory.remember("Preference 1", MemoryType.PREFERENCE)
    await memory.end_session()

    # Delete all data
    result = memory.forget_user("gdpr_user")

    assert result["memories_deleted"] >= 2

    # Verify deletion
    memories = memory.semantic.retrieve(
        user_id="gdpr_user",
        query="anything",
        limit=10
    )

    assert len(memories) == 0

Production Considerations

For production, add: encryption for sensitive memories, rate limiting on memory operations, background consolidation jobs, memory size quotas per user, and monitoring for retrieval quality.
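
As one concrete illustration of the quota idea, a check like the following could gate the store path. The limit and names are assumptions for the sketch; the current count would come from the vector store in practice:

```python
class QuotaError(Exception):
    """Raised when a user has hit their memory quota."""

MAX_MEMORIES_PER_USER = 1000  # assumed limit; tune per deployment

def check_quota(current_count: int, limit: int = MAX_MEMORIES_PER_USER) -> None:
    """Raise before storing if the user is at their memory quota."""
    if current_count >= limit:
        raise QuotaError(f"memory quota reached ({limit})")

check_quota(999)  # under quota: no error
try:
    check_quota(1000)
except QuotaError as e:
    print(e)  # → memory quota reached (1000)
```

A production version might instead evict the lowest-importance memories rather than reject the write outright.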

Summary

We've built a complete memory system with:

  1. Working memory: Fast, in-process storage for current conversation
  2. Semantic memory: Vector-based long-term storage for facts and experiences
  3. Entity memory: Knowledge graph for structured relationships
  4. Session management: Continuity across conversations
  5. Unified interface: Single MemoryManager for all components
  6. Agent integration: Memory-augmented agent with context retrieval

Chapter Complete: You now have a production-ready foundation for agent memory. In the next chapter, we'll explore planning and reasoning: how agents break down and solve complex tasks.