# Implementing Stateful AI Agents: How to Build Anthropic's Memory Store and Dreaming Architecture in Python

It's a blueprint production-ready for implementing a stateful, three-layer memory architecture for AI agents. Inspired by Anthropic's managed agent memory framework, this approach uses local open-source LLMs (such as Llama 3 via Ollama) and Python to resolve context window bloat, reduce inference latency, and prevent memory drift.

## What is Anthropic's Composable Agent Memory Architecture?

Standard LLM sessions are stateless. Directly injecting raw chat logs into a prompt creates a critical bottleneck: context window degradation and rising token costs.

Anthropic’s architecture decouples active reasoning from database maintenance using three layers:

| Layer | Type | Frequency | Core Function |
|---|---|---|---|
| **1. Session** | Ephemeral | Active | Manages single-turn user prompts and immediate execution context. |
| **2. Memory Store** | Persistent | Real-time | Reads/writes raw markdown files within a mounted local workspace directory. |
| **3. Dreaming** | Asynchronous | Scheduled | Runs offline to consolidate files, resolve contradictions, and rebuild a structural index. |

Anthropic's managed agent memory model solves this through three distinct, composable layers:

```
  ┌──────────────────────────────────────────────────────────┐
  │ 1. SESSION (Ephemeral conversation thread)               │
  └──────────────────────────┬───────────────────────────────┘
                             │ Mounts (Read/Write)
  ┌──────────────────────────▼───────────────────────────────┐
  │ 2. MEMORY STORE (Persistent local Markdown directory)    │
  └──────────────────────────┬───────────────────────────────┘
                             │ Consolidated by
  ┌──────────────────────────▼───────────────────────────────┐
  │ 3. DREAMING (Async batch processing & index rebuilding)  │
  └──────────────────────────────────────────────────────────┘Code language: Bash (bash)
```

## Prerequisites and System Requirements

To run this implementation locally, ensure your environment meets the following conditions:

- **Python**: Version 3.8 or higher.
- **Libraries**: requests (for local LLM API orchestration).
- **Local LLM Runner**: Ollama with Llama 3 loaded (ollama run llama3).

## Step-by-Step Implementation Guide

```
┌─────────────────────────────────────────────────────────────┐
│                      LOCAL SYSTEM WORKSPACE                 │
│                                                             │
│   ┌───────────────────┐               ┌─────────────────┐   │
│   │   data/           │               │  src/           │   │
│   │   ├── transcripts/│               │  ├── llm.py     │   │
│   │   ├── input_mem/  │◄─────────────►│  ├── agent.py   │   │
│   │   └── output_mem/ │               │  └── dream.py   │   │
│   └───────────────────┘               └─────────────────┘   │
└─────────────────────────────────────────────────────────────┘Code language: Bash (bash)
```

### Step 1: Establish the Ephemeral Session and Mount File Storage

The live session agent requires a persistent filesystem path (a "mounted memory store") to track active variables.

Implement the LiveSessionAgent to check the local storage directory (data/input\_memory) on initialization. It aggregates any existing .md files to provide context directly to the system prompt.

```
# src/agent.py
import os
from .llm import LocalLLM

class LiveSessionAgent:
    def __init__(self, memory_dir: str, llm: LocalLLM):
        self.memory_dir = memory_dir
        self.llm = llm
        os.makedirs(self.memory_dir, exist_ok=True)Code language: Python (python)
```

### Step 2: Implement Real-Time Memory Reading and Writing

Rather than forcing the agent to rewrite the entire memory store on each turn, write operations append delta updates directly to specific markdown files (such as sessions.md). This architecture limits context window bloat during active loops.

```
def execute_session(self, user_message: str) -> str:
        # Read the current contents of the mounted file storage
        existing_facts = []
        for file in os.listdir(self.memory_dir):
            if file.endswith(".md"):
                with open(os.path.join(self.memory_dir, file), "r") as f:
                    existing_facts.append(f"[{file}]:\n{f.read()}")
        
        context = "\n".join(existing_facts)
        system_prompt = (
            "You are a stateful agent with access to a mounted filesystem storage folder. "
            "Evaluate current storage files and reply to the user.\n\n"
            f"--- CURRENT STORAGE DATA ---\n{context}"
        )
        
        # Query Llama 3 via Ollama
        response = self.llm.query(system_prompt, user_message)
        
        # Append new session facts directly to sessions.md
        with open(os.path.join(self.memory_dir, "sessions.md"), "a") as f:
            f.write(f"\n- Fact: {user_message}\n")
            
        return responseCode language: Python (python)
```

### Step 3: Configure the Asynchronous Dreaming Pipeline

To optimize performance over time, run an offline "Dreaming" process to handle index maintenance.

The DreamingOrchestrator runs asynchronously (e.g., as a scheduled background worker). It scans historical chat transcripts from completed sessions (data/transcripts) and raw memory stores, then triggers an LLM orchestration loop to merge duplicates, resolve contradictions, and output clean files to data/output\_memory.

```
# src/dream.py
import os
from .llm import LocalLLM

class DreamingOrchestrator:
    def __init__(self, input_store: str, transcripts_dir: str, output_store: str, llm: LocalLLM):
        self.input_store = input_store
        self.transcripts_dir = transcripts_dir
        self.output_store = output_store
        self.llm = llm
        os.makedirs(self.output_store, exist_ok=True)Code language: Python (python)
```

### Step 4: Compile and Consolidate Memory Files

During the dreaming cycle, the LLM analyzes the raw text data. It acts as an offline editor to reorganize messy notes, produce structured files, and generate a primary map file.

```
def run_dream_cycle(self) -> str:
        # Compile inputs: raw transcripts + existing memory notes
        payload = "=== CURRENT SYSTEM MEMORY ===\n"
        for file in os.listdir(self.input_store):
            if file.endswith(".md"):
                with open(os.path.join(self.input_store, file), "r") as f:
                    payload += f"[{file}]:\n{f.read()}\n\n"

        payload += "=== HISTORICAL UNPROCESSED TRANSCRIPTS ===\n"
        for file in os.listdir(self.transcripts_dir):
            if file.endswith(".json"):
                with open(os.path.join(self.transcripts_dir, file), "r") as f:
                    payload += f"[{file}]:\n{f.read()}\n\n"

        system_prompt = (
            "You are a Dreaming Memory Consolidator. Read current memory files "
            "and historical transcripts. Merge duplicate records, resolve factual contradictions, "
            "and write structured markdown memory files along with an index mapping file."
        )

        consolidated_output = self.llm.query(system_prompt, payload)
        
        # Write clean output to output storage
        with open(os.path.join(self.output_store, "sessions.md"), "w") as f:
            f.write(consolidated_output)
        
        # Auto-generate a primary index to map context paths
        with open(os.path.join(self.output_store, "_index.md"), "w") as f:
            f.write(
                "# Consolidated Memory Index\n"
                "- sessions.md: Automatically reconciled records of previous sessions.\n"
            )
            
        return "Dream cycle executed. Consolidated files written to output memory store."Code language: Python (python)
```

All the source codes you'll find in the repo below.

---

## Github

You can download the codes on our Github repo, just download and follow the README steps. If need help you can always reach out our team on [Discord](https://discord.gg/gVcxQz7Y) 🤙

[Download the Code](https://github.com/regolo-ai/tutorials/tree/main/dreaming_agents_that_remember)

---

St**art your free 30-day trial at [regolo.ai](https://regolo.ai/) and deploy LLMs with complete privacy by design.**

👉 [Talk with our Engineers](https://regolo.ai/contacts/) or [Start your 30 days free →](https://regolo.ai/pricing)

---

- [Discord](https://discord.gg/ZzZvuR2y) - Share your thoughts
- [GitHub Repo](https://github.com/regolo-ai/) - Code of blog articles ready to start
- Follow Us on X [@regolo\_ai](https://x.com/regolo_ai)
- Open discussion on our [Subreddit Community](https://www.reddit.com/r/regolo_ai/)

---

*Built with ❤️ by the Regolo team. Questions? [regolo.ai/contact](https://regolo.ai/contact)* or chat with us on [Discord](https://discord.gg/ZzZvuR2y)