Skip to main
Omar Nagy.

Open sourcecase 07 / 08

Mnemonic.

Self-hosted memory for AI agents: a FastAPI server on mem0 and Qdrant that auto-categorizes what it stores and serves a tiered context tree instead of the whole pile. MIT.

PythonFastAPIQdrantmem0

Mnemonic: self-hosted AI memory server, terminal-style thumbnail on a violet-ink stage.

01 · a map, not a dump

A context tree, not a context dump.

Most agent memory dumps everything into a vector store and hopes, or carries one history that truncates. Mnemonic serves a tiered tree: a compact map of what the agent knows, then detail only down the branch the question goes.

L1Category summaries
L2The relevant slice
L3Full detail on demand

02 · the contradictory fact

Resolved, not duplicated.

The failure mode that bites real agents: vegetarian last week, chicken this week. mem0 reasons add, update, or delete underneath; an explicit detector is the opt-in stricter layer, being wired into /add as the first open issue.

Two facts, one user
stored · last week
“User is vegetarian.”
incoming · this week
“User ordered chicken.”
Keep both, and the agent is confused. Hover or tap to resolve.
Resolved, not duplicated
1 mem0 reasons about the new fact against the stored one: add, update, or delete
2 a contradicting preference updates the stale fact in place, instead of leaving two that fight
3 the explicit detector (keep-old / keep-new / merge) is the opt-in stricter layer, being wired into /add

03 · the architecture without the lease

The same architecture, on your own box.

The cloud memory APIs want your agents’ history on their server and bill around $20 a month per developer. Self-hosted on a small VPS with Qdrant, it runs roughly $2 a month and the history stays on yours.

~$2/moself-hosted vs $20+/mo cloud (estimate, not a billing screenshot)

Mnemonic · self-hosted on your VPS, history stays yours~$2/mo
Cloud memory APIs · history on a vendor server$20+/mo

Self-hosted is free: grab it from GitHub, point it at Qdrant, plug in your OpenAI key. Want it wired into a real agent fleet without a week of plumbing? Find the leak ($950, one week) picks the right memory shape for your domain and ships a working prototype.

04 · what the server does

The layers a raw library leaves to you.

mem0 owns the vector add/update/delete reasoning and Qdrant the storage. On top, Mnemonic adds the parts a raw library leaves to you, all self-hosted and licensed MIT.

  1. 01CategorizeEvery memory auto-sorted into one of seven categories with an importance score.
  2. 02ExtractPreferences and decisions pulled from a conversation, no “remember this” prompt.
  3. 03Retrieve + reflectSimilarity scored against recency and importance; /reflect synthesizes one answer, not a list.
  4. 04CompactNear the token limit, /compact saves the working context before old turns fall off.
  5. 05ExploreA graph, a timeline, per-category views: a memory layer you can see and debug.

the receipts

~$2/mo
Self-hosted run cost
7
Memory categories
MIT
License
Qdrant
Vector backend