Try LlamaIndex on Regolo now
Searching through thousands of emails is tedious, slow, and often inaccurate with traditional keyword-based tools. Daniele Scasciafratte's demo at the "Build your AI" event in Rome showed a better approach: a privacy-preserving email RAG (Retrieval Augmented Generation) system that indexes emails locally and answers natural language queries without sending your data to external cloud providers.
This solution combines LlamaIndex for orchestration, local vector storage, and Regolo’s European LLM infrastructure to deliver fast, GDPR-compliant email intelligence.
The Problem with Traditional Email Search
Email clients typically offer keyword matching: inflexible and unable to understand context or semantic meaning. Enterprise users face additional constraints:
- Privacy concerns: cloud-based AI search sends email content to third-party servers (often extra-EU).
- Compliance risks: GDPR Article 32 requires data minimization; uploading full inboxes conflicts with this.
- Latency: round-trip API calls for each search slow results.
How the System Works
The architecture has three phases: indexing, storage, and querying.
Phase 1: Email Ingestion via IMAP
The system connects to your email account using the IMAP protocol, fetching all messages with LlamaIndex’s ImapReader:
from llama_index.readers.imap import ImapReader

reader = ImapReader(
    username=USER_EMAIL,
    password=USER_PASSWORD,
    host=IMAP_SERVER,  # e.g., imap.gmail.com
)
emails = list(reader.load_data(search_criteria="ALL"))
This works with Gmail, Outlook, or any IMAP-enabled service. Credentials stay local in a .env file and are never sent externally.
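Loading those credentials is not shown in the snippet above. The demo may well use python-dotenv's load_dotenv for this; here is a dependency-free stand-in (the function name load_env is my own) that parses KEY=VALUE lines from a local .env file:

```python
import os

def load_env(path=".env"):
    # Parse KEY=VALUE lines from a local .env file; credentials never leave the machine.
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
                os.environ.setdefault(key.strip(), value.strip())
    return values
```

Call it once at startup, then read `config["USER_EMAIL"]`, `config["USER_PASSWORD"]`, and `config["IMAP_SERVER"]` from the returned dict.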
Phase 2: Chunking and Embedding Generation
Emails are split into 512-token chunks with 20-token overlap using SentenceSplitter, balancing context preservation and processing efficiency:
from llama_index.core.node_parser import SentenceSplitter
node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)
nodes = node_parser.get_nodes_from_documents(emails)
Each chunk is embedded via regolo.ai’s OpenAI-compatible embedding endpoint:
from llama_index.embeddings.openai_like import OpenAILikeEmbedding

embeddings = OpenAILikeEmbedding(
    model_name=OPENAI_EMBEDDING_MODEL,
    api_key=REGOLO_AI_API_KEY,
    api_base=OPENAI_HOST,
)

for node in nodes:
    text = node.get_text()
    embedding = embeddings.get_text_embedding(text)
Embeddings are 768-dimensional vectors capturing semantic meaning: similar emails cluster together in vector space.
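To see why clustering in vector space helps, here is cosine similarity in plain Python, with toy 3-dimensional vectors standing in for real embeddings (the vector values are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); closer to 1.0 means more semantically similar.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Two invoice emails should score higher together than an invoice vs. a lunch invite.
invoice_a = [0.9, 0.1, 0.0]
invoice_b = [0.8, 0.2, 0.1]
lunch = [0.0, 0.1, 0.9]

print(cosine_similarity(invoice_a, invoice_b))  # high
print(cosine_similarity(invoice_a, lunch))      # low
```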
Phase 3: Local Vector Store Persistence
Embeddings and text are saved to index_storage/vector_store.json as a lightweight JSON file, enabling offline queries:
import json

def save_vector_store(file_path, nodes):
    with open(file_path, "w") as f:
        json.dump(
            [
                {
                    "text": node["text"],
                    "embedding": node["embedding"],
                    "id": node["id"],
                }
                for node in nodes
            ],
            f,
        )
This design avoids dependencies on external vector databases like Pinecone or Weaviateācritical for air-gapped or compliance-sensitive environments.
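For completeness, the query side needs to read the same file back. A minimal loader matching the save_vector_store schema above (the function name load_vector_store is my own, not necessarily the demo's):

```python
import json

def load_vector_store(file_path):
    # Returns the list of {"text", "embedding", "id"} records written by save_vector_store.
    with open(file_path) as f:
        return json.load(f)
```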
Querying: Natural Language to Answers
When you ask “What are the outstanding invoices from Q4?”, the system:
- Embeds your query using the same model.
- Searches the local vector store for the top-3 most similar email chunks via cosine similarity:
from llama_index.core.vector_stores import VectorStoreQuery

query_embedding = embeddings.get_text_embedding(query)
vs_query = VectorStoreQuery(
    query_embedding=query_embedding,
    similarity_top_k=3,
)
response = index.query(vs_query)
- Constructs a prompt with retrieved emails as context:
prompt = f"""
You are an assistant who responds using EXCLUSIVELY the provided emails.

EMAIL:
{context}

QUESTION:
{query}

INSTRUCTIONS:
Use only the information contained in the emails.
If the emails do not contain the answer, state it clearly.
"""
- Generates an answer via regolo.ai’s LLM (Qwen, Llama, etc.):
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model=OPENAI_MODEL,
    api_key=REGOLO_AI_API_KEY,
    api_base=OPENAI_HOST,
    context_window=8192,
)
response = llm.complete(prompt)
The LLM only sees the three relevant chunks, not your entire inbox, minimizing data exposure.
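The steps above can be sketched end-to-end as one function. Everything here is a simplified illustration: `cosine`, `top_k_chunks`, and the `embed_fn`/`llm_fn` callables are hypothetical stand-ins for the embedding and LLM calls shown earlier, and the record schema matches the save_vector_store example.

```python
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def top_k_chunks(query_embedding, records, k=3):
    # Brute-force scan: score every stored chunk, keep the k most similar.
    return heapq.nlargest(k, records, key=lambda r: cosine(query_embedding, r["embedding"]))

def answer_query(query, records, embed_fn, llm_fn, k=3):
    # 1. Embed the question with the same model used at indexing time.
    query_embedding = embed_fn(query)
    # 2. Retrieve only the k most relevant chunks; the LLM never sees the rest.
    context = "\n\n".join(r["text"] for r in top_k_chunks(query_embedding, records, k))
    # 3. Ground the answer strictly in the retrieved emails.
    prompt = (
        "You are an assistant who responds using EXCLUSIVELY the provided emails.\n"
        f"EMAIL:\n{context}\n\nQUESTION:\n{query}\n\n"
        "INSTRUCTIONS:\nUse only the information contained in the emails.\n"
        "If the emails do not contain the answer, state it clearly."
    )
    return llm_fn(prompt)
```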
Why Regolo for Email RAG
| Feature | regolo.ai | OpenAI/Anthropic |
|---|---|---|
| Data Residency | EU-only servers | US-based |
| GDPR Compliance | Native, ACN-ready | Requires DPA |
| Embedding Latency | <500ms (EU proximity) | 1-2s (transatlantic) |
| Cost per 1M tokens | Transparent pricing | Variable, higher |
| Local JSON Storage | Supported out-of-the-box | Requires workarounds |
Regolo's OpenAI-compatible API lets you swap api_base without rewriting code, keeping the system portable and vendor-agnostic.
Getting Started
Clone the demo repository and install dependencies:
git clone https://github.com/regolo-ai/llamaindex-email-demo
cd llamaindex-email-demo
pip install -r requirements.txt
Create .env with your credentials:
REGOLO_AI_API_KEY=your_regolo_key
OPENAI_HOST=https://api.regolo.ai/v1
OPENAI_MODEL=qwen-3-32b-it
OPENAI_EMBEDDING_MODEL=bge-m3-embedding
USER_EMAIL=you@example.com
USER_PASSWORD=your_app_password
IMAP_SERVER=imap.gmail.com
Run the Streamlit interface:
streamlit run app.py
Click “Index Emails”, wait for processing (~30s per 100 emails), then query: “Summarize threads from John last week”.
Privacy and Compliance Advantages
This architecture ensures:
- No third-party data sharing: Emails stay on your machine; only embeddings go to regolo.ai’s inference API (EU-hosted).
- Right to erasure: Delete vector_store.json to remove all indexed data instantly.
- Data minimization: the LLM sees at most 3 chunks per query, not full mailboxes.
- Auditability: JSON storage enables grep-based compliance checks.
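As a concrete example of that grep-style audit, here is a short Python check that reports which indexed chunks still mention a given term (the function name audit_index is hypothetical; the record schema matches the vector store examples above):

```python
import json

def audit_index(file_path, term):
    # Return the ids of every stored chunk whose text still contains `term`.
    with open(file_path) as f:
        records = json.load(f)
    return [r["id"] for r in records if term.lower() in r["text"].lower()]
```

An empty result for a data subject's name is quick evidence that an erasure request was honored.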
For healthcare (HIPAA) or finance (PCI-DSS), deploy Regolo on-premise or private cloud instances.
Benchmarks:
On a 5,000-email inbox, you can expect:
- Indexing time: 4 minutes (with bge-m3-embedding on regolo.ai).
- Query latency: 2-5 seconds (including LLM generation).
- Storage footprint: 120 MB JSON (vs. ~500 MB for raw emails).
- Accuracy: 90%+ for factual queries verified against manual search.
Why This Matters for European Enterprises
EU regulations (GDPR, AI Act) penalize non-compliant data transfers. Traditional SaaS email AI tools (Superhuman, SaneBox) route data through US clouds, risking fines under Schrems II.
This local-first + Regolo approach provides enterprise-grade intelligence while satisfying:
- GDPR Article 32: Technical measures for data security.
- AI Act Article 10: Transparency in AI decision-making (you control the index).
- ISO 27001: Auditable data flows.
The slides presented at Build Your AI in Rome [ITA]
Build Your AI: Start for free
GitHub Code
You can download the code from our GitHub repo. If you need help, you can always reach out to our team on Discord.
Resources & Community
Official Documentation:
- Regolo Python Client – Package reference
- Regolo Models Library – Available models
- Regolo API Docs – API reference
Related Guides:
- Rerank Models have landed on Regolo
- Supercharging Retrieval with Qwen and LlamaIndex
- Chat with ALL your Documents with Regolo + Elysia
Join the Community:
- Regolo Discord – Share your RAG builds
- GitHub Repo – Contribute examples
- Follow Us on X @regolo_ai – Show your RAG pipelines!
- Open discussion on our Subreddit Community
Ready to scale?
Built with ❤️ by the Regolo team. Questions? support@regolo.ai or chat with us on Discord