v2.8.0 — Temporal Graph & Privacy

Memory for Hermes

The native memory system for Hermes Agent. SQLite-backed, sub-millisecond, zero dependencies. No cloud. No API keys. Just pure speed.

Migrate from Zep, Mem0, Honcho, or Hindsight in one command — see migration docs

<1 ms query latency
0 dependencies
98.9% on LongMemEval
100% local and private
Simple

Three lines. Infinite memory.

No configuration files. No environment variables. No cloud accounts. Import, remember, recall. That is all.

  • pip install mnemosyne-memory
  • Zero external services required
  • Works offline, always
  • Hermes Agent integration built-in
from mnemosyne import remember, recall

# Store a memory
remember(
    "User prefers dark mode",
    importance=0.9,
    scope="global"
)

# Retrieve relevant context
results = recall("user preferences")
# => [{"content": "User prefers dark mode", ...}]
Features

Everything you need. Nothing you do not.

Built from the ground up for AI agents that need fast, reliable, persistent memory.

Sub-Millisecond Latency

Direct SQLite access delivers <1ms queries. No network overhead. No HTTP roundtrips.

100% Private

All data stays on your machine. No cloud services. No data leaves your device, ever.

Native Vector Search

sqlite-vec integration for semantic search. Hybrid ranking: 50% vector + 30% FTS + 20% importance.
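
The 50/30/20 weighting above can be illustrated with a small sketch. It assumes each signal is already normalized to the range 0 to 1; the function name `hybrid_score` and the candidate fields are hypothetical and are not taken from the Mnemosyne API.

```python
def hybrid_score(vector_sim: float, fts_score: float, importance: float) -> float:
    """Blend three signals, each assumed to be normalized to [0, 1]."""
    return 0.5 * vector_sim + 0.3 * fts_score + 0.2 * importance


# Hypothetical candidates with precomputed, normalized signals.
candidates = [
    {"content": "User lives in Berlin", "vector_sim": 0.4, "fts": 0.0, "importance": 0.5},
    {"content": "User prefers dark mode", "vector_sim": 0.9, "fts": 0.5, "importance": 0.9},
]

# Highest blended score first.
ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c["vector_sim"], c["fts"], c["importance"]),
    reverse=True,
)

for c in ranked:
    score = hybrid_score(c["vector_sim"], c["fts"], c["importance"])
    print(f"{score:.2f}  {c['content']}")
```

A memory that matches on meaning, on keywords, and on importance rises to the top, while one strong signal alone is not enough to dominate.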

Beam Architecture

Three-tier memory: working_memory for hot context, episodic_memory for long-term, scratchpad for reasoning.

Auto Consolidation

Old working memories are automatically summarized and moved to episodic storage via sleep cycles. Configurable auto_sleep intervals.

Hybrid Search

Combines vector similarity, full-text search, and importance scoring for the best recall accuracy.

Streaming & DeltaSync

Real-time incremental memory updates via DeltaSync. Stream results as they arrive — no more waiting for full batches.
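
One common way to get incremental updates is a cursor that remembers the last row already seen and fetches only newer rows. The sketch below shows that general technique with plain SQLite; it is not the DeltaSync protocol itself, and the `memories` table and `stream_new_memories` function are hypothetical.

```python
import sqlite3
from typing import Iterator


def stream_new_memories(conn: sqlite3.Connection, last_seen_id: int) -> Iterator[tuple]:
    """Yield (id, content) rows added after last_seen_id, one at a time."""
    cursor = conn.execute(
        "SELECT id, content FROM memories WHERE id > ? ORDER BY id",
        (last_seen_id,),
    )
    for row in cursor:
        yield row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
with conn:
    conn.executemany("INSERT INTO memories (content) VALUES (?)", [("first",), ("second",)])

# First sync: everything is new.
batch = list(stream_new_memories(conn, last_seen_id=0))
last_seen = batch[-1][0]

# A new memory arrives; the next sync returns only the delta.
with conn:
    conn.execute("INSERT INTO memories (content) VALUES (?)", ("third",))
delta = list(stream_new_memories(conn, last_seen))
```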

Smart Filtering

ignore_patterns blocks noisy or irrelevant content from entering memory. Keep your context window clean and focused.
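
To make the idea concrete, here is a minimal sketch of pattern-based filtering applied before anything is stored. It assumes the patterns are regular expressions matched case-insensitively; whether Mnemosyne's `ignore_patterns` uses regex, globs, or something else is not stated here, and `should_ignore` is a hypothetical helper.

```python
import re


def should_ignore(text: str, ignore_patterns: list[str]) -> bool:
    """Return True if any pattern matches the text (case-insensitive)."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in ignore_patterns)


# Hypothetical patterns: blank input, bare acknowledgements, heartbeat noise.
ignore_patterns = [
    r"^\s*$",
    r"^(ok|thanks|got it)[.!]?$",
    r"heartbeat",
]

incoming = [
    "User prefers dark mode",
    "ok",
    "   ",
    "HEARTBEAT ping 42",
    "Deploys happen on Fridays",
]

# Only messages that match no pattern are kept for storage.
kept = [text for text in incoming if not should_ignore(text, ignore_patterns)]
```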

Speed

Numbers that speak

Measured on CPU with sqlite-vec + FTS5. No GPU required.

Write: 0.81 ms (56x faster)
Read: 0.076 ms (500x faster)
Search: 1.2 ms (43x faster)
Cold start: 0 ms (instant)
Operation  | Honcho | Zep    | Mem0   | Mnemosyne
Write      | 45 ms  | 85 ms  | 50 ms  | 0.81 ms
Read       | 38 ms  | 62 ms  | 45 ms  | 0.076 ms
Search     | 52 ms  | 78 ms  | 60 ms  | 1.2 ms
Cold start | 500 ms | 800 ms | 300 ms | 0 ms
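
Latency figures like these depend on hardware and data size, so it helps to measure on your own machine. The sketch below times a primary-key read against an in-memory SQLite table using only the standard library. It is not the benchmark harness behind the numbers above; the `memories` table and `median_latency_ms` helper are hypothetical.

```python
import sqlite3
import statistics
import time


def median_latency_ms(fn, repeats: int = 1000) -> float:
    """Call fn repeatedly and return the median wall-clock latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO memories (content) VALUES (?)",
        [(f"memory {i}",) for i in range(10_000)],
    )


def read_one():
    # Primary-key lookup: the simplest possible read path.
    return conn.execute("SELECT content FROM memories WHERE id = ?", (5000,)).fetchone()


read_ms = median_latency_ms(read_one)
print(f"median primary-key read: {read_ms:.4f} ms")
```

The median is used rather than the mean so that a few slow outliers do not distort the result.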

BEAM Benchmark (ICLR 2026)

End-to-end memory retrieval at scale. LLM-as-judge against published baselines.

100K context: 35.4% (retrieval from 100K-token conversations)
500K context: 19.3% (retrieval from 500K-token conversations)
1M context: 19.2% (retrieval from 1M-token conversations)
Compare

Mnemosyne vs. cloud memory providers

See exactly what you gain — and what you trade — when you switch.

Feature | Mnemosyne | Honcho | Zep | Mem0
Cost | Free forever | $$$ Paid (credits) | $$$ Paid (Flex+) | Freemium ($0–$249/mo)
Hosting | Local (your machine) | Cloud only | Cloud / BYOC | Cloud only
Privacy | 100% local, zero exfiltration | External API calls | External API calls | External API calls
Offline mode | Yes (airplane mode) | No | No | No
Setup | pip install | Docker + API keys | Docker + Postgres | API key + signup
Vector store | sqlite-vec (built-in) | pgvector (external) | pgvector (external) | pgvector (external)
Full-text search | FTS5 (built-in) | Separate service | Separate service | Separate service
Auth required | None | Supabase auth | OAuth / API key | API key
Rate limits | Unlimited | Plan-dependent | Credit-based | Plan-dependent
Data ownership | You own the SQLite file | Vendor-hosted | Vendor-hosted | Vendor-hosted
Export / import | One JSON file | Limited | Limited | Limited
Dependencies | Python stdlib + ONNX | Docker, Postgres | Docker, Postgres | pip + API key
Memory architecture | BEAM (3-tier) | Session + facts | Graph RAG + facts | Session + facts
Auto-consolidation | Sleep cycles built-in | Manual / paid | Manual | Manual
Temporal triples | Native with validity | No | No | No
LongMemEval | 98.9% Recall@All@5 | Not published | Not published | Not published
BEAM (100K / 500K / 1M) | 35.4% / 19.3% / 19.2% | Not published | Not published | Not published
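
For readers unfamiliar with temporal triples, the sketch below shows the general idea: each subject-predicate-object fact carries a validity interval, so a question can be answered as of a given time. The schema and the `assert_fact` and `fact_at` helpers are hypothetical and only illustrate the concept; they are not Mnemosyne's storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE triples (
    subject    TEXT NOT NULL,
    predicate  TEXT NOT NULL,
    object     TEXT NOT NULL,
    valid_from REAL NOT NULL,
    valid_to   REAL            -- NULL means the fact is still valid
)
""")


def assert_fact(conn, subject, predicate, obj, at):
    """Record a new fact at time `at`, closing any currently open fact it replaces."""
    with conn:
        conn.execute(
            "UPDATE triples SET valid_to = ? "
            "WHERE subject = ? AND predicate = ? AND valid_to IS NULL",
            (at, subject, predicate),
        )
        conn.execute(
            "INSERT INTO triples VALUES (?, ?, ?, ?, NULL)",
            (subject, predicate, obj, at),
        )


def fact_at(conn, subject, predicate, at):
    """Return the object that was valid at time `at`, or None."""
    row = conn.execute(
        "SELECT object FROM triples "
        "WHERE subject = ? AND predicate = ? "
        "AND valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)",
        (subject, predicate, at, at),
    ).fetchone()
    return row[0] if row else None


assert_fact(conn, "user", "prefers_theme", "light", at=100.0)
assert_fact(conn, "user", "prefers_theme", "dark", at=200.0)
```

Old facts are closed rather than deleted, so the history of a preference stays queryable.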

Switching from Honcho

You gain

500x faster reads, zero monthly bill, 100% offline, no Docker, no credit system

You lose

Cloud dashboard, managed scaling, team sharing

Switching from Zep

You gain

43x faster search, no PostgreSQL to maintain, no deployment overhead, instant cold start

You lose

Graph RAG viz, SOC 2 certs, managed BYOC

Switching from Mem0

You gain

Sub-millisecond everything, no rate limits, no vendor lock-in, full data portability

You lose

Managed platform, 90K+ community, YC ecosystem

Switching from Hindsight

You gain

Zero dependency, no network calls, SQLite-native, BEAM architecture

You lose

Cloud sync, managed inference, web dashboard

The bottom line

  • Speed: 43–500x faster than cloud alternatives — zero HTTP roundtrips.
  • Privacy: Data never leaves your machine. No API calls. No telemetry.
  • Cost: Zero ongoing cost. No credits. No tiers. No "contact sales."
  • Simplicity: One pip install. No Docker. No config. No signup.

Trade-off: You manage your own backup (one SQLite file). No web dashboard or team collaboration — Mnemosyne is built for individual developers and local agents.
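
Because the store is a single SQLite file, a backup can be as simple as the sketch below, which uses the online backup API in Python's standard `sqlite3` module. The file names are hypothetical; the location of Mnemosyne's database file is not stated here, so substitute your own path.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(source_path, dest_path) -> None:
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)  # produces a consistent snapshot of the source
    finally:
        dst.close()
        src.close()


# Demo against a throwaway database in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "memory.db"            # hypothetical file name
    copy = Path(tmp) / "memory.backup.db"     # hypothetical file name

    conn = sqlite3.connect(live)
    with conn:
        conn.execute("CREATE TABLE memories (content TEXT)")
        conn.execute("INSERT INTO memories VALUES ('User prefers dark mode')")
    conn.close()

    backup_sqlite(live, copy)

    check = sqlite3.connect(copy)
    restored = check.execute("SELECT content FROM memories").fetchall()
    check.close()
```

The backup API is preferable to copying the file by hand, since a plain file copy taken during a write can capture a half-finished transaction.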

Beam

Bilevel Episodic-Associative Memory (Beam)

Three SQLite tables working in harmony. Working memory for hot context auto-injected into prompts. Episodic memory for long-term storage with native vector + FTS5 search. Scratchpad for temporary agent reasoning.

Working Memory

Hot, recent context — auto-injected into prompts. Session-scoped by default, global scope available.

Episodic Memory

Long-term storage with sqlite-vec + FTS5. Hybrid ranking for semantic + text search.

Scratchpad

Temporary agent reasoning workspace. Cleared per session.
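
A three-table layout of this kind might look like the sketch below. The column names and the `open_store` and `clear_scratchpad` helpers are assumptions for illustration, not Mnemosyne's real schema, and the vector and FTS5 indexes on episodic memory are left out to keep the example dependency-free.

```python
import sqlite3
import time

# Hypothetical schema: one table per tier.
SCHEMA = """
CREATE TABLE IF NOT EXISTS working_memory (
    id         INTEGER PRIMARY KEY,
    session_id TEXT,                -- NULL stands in for global scope
    content    TEXT NOT NULL,
    importance REAL NOT NULL DEFAULT 0.5,
    created_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS episodic_memory (
    id         INTEGER PRIMARY KEY,
    content    TEXT NOT NULL,
    source     TEXT,
    importance REAL NOT NULL DEFAULT 0.5,
    created_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS scratchpad (
    id         INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    content    TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure all three tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def clear_scratchpad(conn: sqlite3.Connection, session_id: str) -> None:
    """Drop a session's temporary reasoning notes when the session ends."""
    with conn:
        conn.execute("DELETE FROM scratchpad WHERE session_id = ?", (session_id,))


conn = open_store()
with conn:
    conn.execute(
        "INSERT INTO working_memory (session_id, content, created_at) VALUES (?, ?, ?)",
        ("s1", "User prefers dark mode", time.time()),
    )
    conn.execute(
        "INSERT INTO scratchpad (session_id, content) VALUES (?, ?)",
        ("s1", "draft plan: check theme settings"),
    )
clear_scratchpad(conn, "s1")
```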

# `beam` is assumed to be an already-initialized Beam memory object (setup not shown)

# Working memory: auto-injected into prompts
beam.remember("User prefers dark mode")

# Episodic — long-term with embedding
beam.remember(
    content="Detailed project context...",
    source="conversation",
    importance=0.8
)

# Hybrid recall across both tiers
results = beam.recall("user preferences")

# Consolidation — move old to episodic
beam.sleep()  # Compress & summarize
Install

One command. Zero configuration.

Get started in seconds. No setup required.

# Basic install
pip install mnemosyne-memory

# With all features (dense retrieval + local LLM)
# Quotes keep shells such as zsh from expanding the brackets
pip install "mnemosyne-memory[all]"

# As Hermes MemoryProvider
python -m mnemosyne.install
Trusted

Built for production

"Been running it today (replaced mem0) and so far I am really impressed. Well done on building this!"

Community user, migrated from Mem0

"Mnemosyne replaced our entire memory infrastructure. From 50ms average latency to sub-millisecond. Unreal."

Production deployment, Hermes Agent integration

"The Beam architecture just makes sense. Working memory for context, episodic for long-term, automatic consolidation."

Open source community, GitHub contributors
Begin

Give your agent a memory

Join the growing number of developers who have replaced cloud memory services with something faster, simpler, and completely private.