HOUSE OF JAYDR
MemPalace: The AI Memory System That Broke the Internet — And Ignited a War Over Benchmarks

On April 5, 2026, actress Milla Jovovich pushed a GitHub repository into public view. Within 48 hours, the developer internet had taken sides. Here is everything you need to know about what MemPalace is, how it works, why it matters, and where the numbers are contested.

  • 19.5K+ GitHub stars in 72 hours
  • 2.3K+ repository forks
  • 96.6% on LongMemEval (raw mode)
  • $0 cost — MIT licensed
  • 1.5M+ launch tweet impressions
  • 100% local — no cloud required

What Is MemPalace?

MemPalace is an open-source AI memory system designed to give large language models — Claude, ChatGPT, Cursor, local Llama models, and others — persistent memory across sessions. That means your AI assistant remembers conversations you had six months ago, not just the last few exchanges before a context window snaps shut.

The core problem MemPalace addresses is what its creators call "AI amnesia." Every session with an AI assistant starts from scratch. Months of debugging decisions, architectural trade-offs, creative reasoning, personal context — gone the moment you close the tab. Existing memory solutions tried to fix this by having an AI decide what was worth keeping, extracting key facts and discarding the rest. MemPalace takes the opposite position: store everything verbatim, then build a structure that makes it findable.

Technically, MemPalace is a local Python application that indexes your conversation history into ChromaDB (a vector database), uses SQLite for metadata and knowledge graph storage, and exposes a set of 19 tools via the Model Context Protocol (MCP) — the open standard that lets AI assistants call external tools at runtime. Once installed, your AI agent can search and retrieve from your memory palace automatically during conversations, without you manually triggering anything.

The repository lives at github.com/milla-jovovich/mempalace. It is MIT-licensed, free to use, free to modify, and runs entirely on your local machine with zero external API dependencies in its default mode.

The Origin Story: How a Hollywood Actress Ended Up on GitHub

The story of MemPalace begins, somewhat improbably, with Milla Jovovich — the actress known globally for her role as Alice in the Resident Evil franchise and as Leeloo in Luc Besson's The Fifth Element. By late 2025, Jovovich had become a daily user of AI tools, accumulating thousands of conversations with ChatGPT and Claude over months of creative and business work.

The frustration that sparked the project was straightforward: no session remembered the last. She described the core insight in her Instagram launch video on April 7, 2026:

"I realized after months of meticulous filing, AI is just not great at finding things, even if you keep the best files." — Milla Jovovich, Instagram Reel, April 7, 2026

Jovovich tried existing memory tools — Mem0 and Zep, the two dominant products in the space — but hit a philosophical wall. Both systems use AI to decide what's worth remembering, summarizing conversations and discarding the parts the model deems unimportant. For Jovovich, that was exactly the wrong approach: the nuance, the dead ends, the reasoning behind decisions — that was precisely what she needed to keep, and precisely what these tools threw away.

The conceptual breakthrough came from an unexpected direction: reading about how ancient Greek orators memorized entire speeches. The technique, known as the method of loci or "memory palace," involves mentally placing pieces of information in specific rooms of an imagined building, then walking through it to retrieve them. Jovovich recognized the structural parallel to what she wanted AI memory to do: not a flat list of extracted facts, but a navigable architecture.

She brought the idea to developer Ben Sigman, CEO of Bitcoin lending platform Libre Labs. Sigman had the engineering experience to translate the architectural concept into working software. Over several months, the two built MemPalace using Anthropic's Claude Code — an AI-assisted coding tool — iterating on the design until it hit what they believed were benchmark-breaking results.

On April 5, 2026, Jovovich pushed the first commit to GitHub under her personal account. On April 6, Sigman announced the project on X, where his launch post — "My friend Milla Jovovich and I spent months creating an AI memory system with Claude. It just posted a perfect score on the standard benchmark" — crossed 1.5 million impressions within 24 hours. Sigman's follow-up tweet read simply: "Multipass." Resident Evil fans and developers alike understood the reference. The internet ignited.

Project Identity Card

  • Full name: MemPalace
  • Repository: github.com/milla-jovovich/mempalace
  • Website: mempalace.tech
  • Creators: Milla Jovovich (concept, architecture) & Ben Sigman (engineering)
  • Launch date: April 5–6, 2026
  • Language: Python (TypeScript-compatible via MCP)
  • License: MIT — completely free
  • Storage: ChromaDB (vector) + SQLite (metadata/graph)
  • Compatible clients: Claude Code, ChatGPT, Cursor, Claude Desktop, any MCP client
  • GitHub stars (72h): 19,500+

How MemPalace Works: The Architecture

The name is not decoration. MemPalace's organizational structure maps directly onto the ancient memory palace metaphor, creating a hierarchical taxonomy for stored memories that improves retrieval by giving the vector search engine contextual scaffolding to work with.

The Hierarchy: Wings, Rooms, Halls, Closets, Drawers

Every piece of memory in MemPalace lives somewhere in this five-level structure:

Wings
Top-level containers for projects or people. Each major domain of your work or life gets its own wing — "Orion project," "client Maya," "personal finance," etc.
Rooms
Sub-topics within a wing. A software project wing might have rooms for "auth," "deployment," "billing," "architecture decisions."
Halls
Corridors that cut across all wings by memory type: facts, events, discoveries, preferences, milestones, problems. A hall groups memories by what kind of thing they are, not where they came from.
Closets
Compressed summaries of room contents, stored in AAAK format (see below). The AI reads the closet first to get oriented before diving into full documents.
Drawers
The original verbatim files. Never deleted, never summarized. The source of truth that all other layers point back to.

This structure is not just aesthetic. According to independent benchmark analysis, the hierarchical organization alone contributes a 34% accuracy improvement in retrieval compared to flat search indexes, without any algorithmic sophistication. You get better results simply by knowing where to look before you look.
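The five-level layout can be sketched as ordinary data structures. The class and field names below are illustrative, not MemPalace's actual internals; they only show how wings contain rooms while halls index across wings by memory type.

```python
from dataclasses import dataclass, field

@dataclass
class Drawer:            # original verbatim file -- the source of truth
    path: str
    text: str

@dataclass
class Closet:            # compressed room summary the agent reads first
    summary: str
    drawers: list = field(default_factory=list)   # list[Drawer]

@dataclass
class Room:              # sub-topic within a wing, e.g. "auth"
    name: str
    closet: Closet = field(default_factory=lambda: Closet(summary=""))

@dataclass
class Wing:              # top-level project or person
    name: str
    rooms: dict = field(default_factory=dict)     # room name -> Room

# Halls cut ACROSS wings by memory type, so they reference memories
# rather than containing rooms.
halls = {"facts": [], "events": [], "decisions": []}

orion = Wing("Orion project")
orion.rooms["auth"] = Room("auth")
orion.rooms["auth"].closet.summary = "JWT chosen over sessions, Jan 2026"
```

The point of the hierarchy is exactly the "know where to look" effect described above: a query about authentication narrows to one wing and one room before any vector search runs.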

Four-Layer Loading: Why 170 Tokens Is Enough

One of the more technically interesting aspects of MemPalace is how little context it requires at startup. Most memory systems front-load the AI's prompt with everything they know about you, burning tokens before a single message is exchanged. MemPalace loads in four layers on demand:

  • L0: The palace map — what wings, halls, and rooms exist (~50 tokens)
  • L1: Agent orientation — who the AI is working with, current projects (~120 tokens total at startup)
  • L2: Closet summaries — retrieved when a hall or room is relevant to the query
  • L3: Full drawer contents — original verbatim files, retrieved only when the specific content is needed

The practical result: a session starts with roughly 170 tokens of memory overhead and scales on demand, rather than dumping a 10,000-token memory blob into every conversation regardless of what you're working on.
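The layering can be illustrated with a minimal sketch. The function names, palace structure, and token counts here are hypothetical, assumed only for illustration; the real implementation is not structured this way verbatim.

```python
# Hypothetical sketch of four-layer, on-demand context loading.

def load_l0_map(palace):
    """L0: just the map -- which wings, halls, and rooms exist."""
    return {"wings": list(palace["wings"]), "halls": palace["halls"]}

def load_l1_orientation(palace):
    """L1: who the agent is working with, current projects."""
    return palace["orientation"]

def load_l2_closet(palace, wing, room):
    """L2: closet summary, fetched only when a room becomes relevant."""
    return palace["wings"][wing][room]["closet"]

def load_l3_drawer(palace, wing, room, drawer):
    """L3: full verbatim file, fetched only when the content is needed."""
    return palace["wings"][wing][room]["drawers"][drawer]

palace = {
    "wings": {"Orion": {"auth": {
        "closet": "JWT over sessions; see decision record",
        "drawers": {"adr-12.md": "Full decision record text..."},
    }}},
    "halls": ["facts", "events", "decisions"],
    "orientation": "User: Milla. Active project: Orion.",
}

# Session start: only L0 + L1 are loaded (~170 tokens in MemPalace's
# accounting); L2 and L3 are pulled in later, per query.
context = [load_l0_map(palace), load_l1_orientation(palace)]
context.append(load_l2_closet(palace, "Orion", "auth"))
```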

AAAK: The Compression Dialect

AAAK is MemPalace's proprietary compression format — described by its creators as "an AI-readable shorthand" that achieves approximately 30x compression of repeated entities. The name comes from its creators and is, per the README, "a whole story of its own."

The design principle is elegant: AAAK is essentially structured English abbreviations, not binary encoding. Any large language model — Claude, GPT-4o, Llama, Mistral, Gemini — can read it natively without a decoder, because it reads as heavily abbreviated text rather than a format the model needs to be trained on. The system trains your agent on AAAK shorthand each time it wakes up, which takes seconds because the dialect is recognizably linguistic.

However, and this matters: AAAK is explicitly marked as experimental in the repository. In its current implementation, using AAAK compression reduces LongMemEval accuracy from 96.6% (raw verbatim mode) to approximately 84.2%. The creators acknowledge this openly in the README and say they are iterating. The closet system currently uses AAAK for summaries, not for the primary storage layer, which preserves the strong retrieval scores for full-document search.
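The AAAK dialect itself is not publicly specified, but the general idea of entity abbreviation with an inline legend can be shown with a toy compressor. Everything below is a hypothetical illustration of the principle, not the actual format.

```python
# Toy sketch of entity-abbreviation compression in the spirit of AAAK:
# repeated entities become short codes, with a legend that stays
# readable as plain text, so any LLM can expand it without a decoder.

def abbreviate(text, entities):
    legend = {}
    for i, name in enumerate(entities, start=1):
        code = f"@{i}"
        legend[code] = name
        text = text.replace(name, code)
    header = "; ".join(f"{c}={n}" for c, n in legend.items())
    return f"[{header}] {text}"

note = ("auth-migration blocked: auth-migration needs review by Maya, "
        "and Maya is out until February.")
compressed = abbreviate(note, ["auth-migration", "Maya"])
# The legend "[@1=auth-migration; @2=Maya]" travels with the text,
# which is why the dialect is "recognizably linguistic".
```

Real savings come from long entity names repeated across thousands of memories; the trade-off, as the accuracy drop above suggests, is that lossy summarization in the closets can hurt retrieval.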

The Temporal Knowledge Graph

Beyond the conversation storage system, MemPalace includes a SQLite-backed temporal knowledge graph — a feature uncommon in this class of tool. The knowledge graph stores facts as triples with validity windows:

kg.add_triple("Maya", "assigned_to", "auth-migration", valid_from="2026-01-15")
kg.invalidate("Maya", "assigned_to", "auth-migration", ended="2026-02-01")
# Historical query
kg.query_entity("Maya", as_of="2026-01-20")
# → [Maya → assigned_to → auth-migration (active)]

This solves a specific, persistent problem in AI assistants: stale information. When a team member leaves a project, when a technology decision changes, when a deadline moves — most memory systems keep the outdated fact and silently conflict with new information. MemPalace's graph marks facts as invalid rather than deleting them, preserving historical accuracy while ensuring current queries return current truth. The graph also includes contradiction detection: if two facts conflict, the system flags it rather than silently accepting both.
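The snippet above can be backed by a very small SQLite table. This is an illustrative reimplementation under an assumed schema, not MemPalace's actual code; it shows the key move of ending a fact's validity window instead of deleting the row.

```python
import sqlite3

class TemporalKG:
    """Minimal temporal triple store: facts carry validity windows."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("""CREATE TABLE IF NOT EXISTS triples (
            subj TEXT, pred TEXT, obj TEXT,
            valid_from TEXT, valid_to TEXT)""")

    def add_triple(self, subj, pred, obj, valid_from):
        self.db.execute("INSERT INTO triples VALUES (?,?,?,?,NULL)",
                        (subj, pred, obj, valid_from))

    def invalidate(self, subj, pred, obj, ended):
        # Mark the fact as ended; never delete, so history survives.
        self.db.execute("""UPDATE triples SET valid_to=?
            WHERE subj=? AND pred=? AND obj=? AND valid_to IS NULL""",
            (ended, subj, pred, obj))

    def query_entity(self, subj, as_of):
        # Facts that were valid on the given date (ISO strings compare
        # correctly as text in SQLite).
        return self.db.execute("""SELECT pred, obj FROM triples
            WHERE subj=? AND valid_from<=?
              AND (valid_to IS NULL OR valid_to>?)""",
            (subj, as_of, as_of)).fetchall()

kg = TemporalKG()
kg.add_triple("Maya", "assigned_to", "auth-migration", "2026-01-15")
kg.invalidate("Maya", "assigned_to", "auth-migration", "2026-02-01")
print(kg.query_entity("Maya", as_of="2026-01-20"))
# → [('assigned_to', 'auth-migration')]
print(kg.query_entity("Maya", as_of="2026-02-02"))
# → []
```

The same `query_entity` call answers both "what is true now" and "what was true then", which is the property most flat memory stores lack.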

MCP Integration: 19 Tools, Auto-Discovered

Once installed, MemPalace exposes 19 tools to compatible AI clients via the Model Context Protocol. Claude Code, Cursor, and Claude Desktop all support MCP natively, meaning the AI can read, write, and search your memory palace automatically during conversations — no manual commands needed. The tool set covers search, storage, knowledge graph queries, agent diaries, and session hooks.

The Three Mining Modes

To populate the palace from your existing conversation history, MemPalace offers three mining modes:

  • Projects mode: Indexes code, documentation, and notes from a file directory
  • Conversations mode: Parses exported conversation histories from Claude, ChatGPT, and Slack
  • General mode: Auto-classifies content into decisions, preferences, milestones, problems, and emotional context

The Benchmark Controversy: What the Numbers Actually Say

MemPalace launched with an aggressive marketing claim: "The highest-scoring AI memory system ever benchmarked. First perfect score on LongMemEval. 500/500 questions. Every category at 100%." The developer community, characteristically, started digging within hours.

The LongMemEval Benchmark

LongMemEval is the standard academic benchmark for AI memory systems, testing models across five categories: information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and abstention. A score of 100% means the system correctly identified the relevant source material for every one of 500 test questions.

MemPalace's claimed scores and what independent reviewers found:

Mode | Claimed score | Independent assessment | Notes
Raw (no API, no reranking) | 96.6% | Credible | Generally accepted by critics as a genuine result
Hybrid (with LLM reranking) | 100% | Contested | Requires cloud LLM call; team disclosed 3 targeted fixes
LoCoMo benchmark | 100% | Disputed | top_k=50 exceeds the candidate pool, effectively bypassing retrieval

The most damaging technical critique came from the Penfield Labs Substack, which published a detailed code-level analysis. Their core finding on the LongMemEval runner: the benchmark evaluator never actually generates an answer — it only checks whether the correct source session appears in the top-5 retrieved results. More specifically, it checks the softer "recall_any@5" metric (does at least one correct session appear?) rather than the stricter "recall_all@5" (do all correct sessions appear?). The system never had to demonstrate that it could answer the question, only that it retrieved the right document.
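The gap between the two recall metrics is easy to state in code. This is a generic sketch of the metrics themselves, not the MemPalace evaluator:

```python
# For each question, `retrieved` is the top-5 session ids returned by
# the memory system and `gold` the session(s) that actually contain
# the answer.

def recall_any_at_k(retrieved, gold):
    """Softer metric: at least one correct session was retrieved."""
    return bool(set(retrieved) & set(gold))

def recall_all_at_k(retrieved, gold):
    """Stricter metric: every correct session was retrieved."""
    return set(gold) <= set(retrieved)

retrieved = ["s3", "s7", "s1", "s9", "s4"]
gold = ["s3", "s8"]     # multi-session question: needs s3 AND s8
print(recall_any_at_k(retrieved, gold))   # → True  (s3 was found)
print(recall_all_at_k(retrieved, gold))   # → False (s8 is missing)
```

For multi-session reasoning questions, a system can score 100% on the softer metric while lacking the evidence needed to actually answer, which is the crux of the critique.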

On LoCoMo, the critique was sharper. MemPalace runs its LoCoMo test with top_k=50. The test dataset has conversations with 19–32 sessions each. Setting top_k=50 against a pool that maxes out at 32 means the system retrieves every session — the "retrieval" step is bypassed entirely, and the LLM is essentially doing reading comprehension over the complete text. The Penfield Labs analysis called this "cat *.txt | claude." It is not a memory system being tested; it is a reading comprehension test with an open book.
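Why top_k=50 trivializes the test is mechanical: once top_k meets or exceeds the candidate pool, the ranking no longer matters. A hypothetical retriever makes the point:

```python
# Illustration only: any retriever, however it scores, returns the
# whole pool when top_k >= pool size.

def retrieve(sessions, query, top_k):
    # Rank by naive word overlap with the query (the scoring function
    # is irrelevant to the point being made).
    qwords = set(query.split())
    ranked = sorted(sessions,
                    key=lambda s: -len(qwords & set(s.split())))
    return ranked[:top_k]

sessions = [f"session {i} text" for i in range(32)]  # max LoCoMo size
out = retrieve(sessions, "any query at all", top_k=50)
print(len(out) == len(sessions))  # → True: everything was "retrieved"
```

At that point the downstream LLM sees the full conversation regardless of what the memory layer did.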

What the Critics Said

Penfield Labs (Substack): "None of the benchmark scores are real... The LongMemEval 100% was achieved after targeted fixes on specific failing questions. The LoCoMo 100% setting retrieves the entire conversation — the memory architecture contributes nothing to the score."

X Community Notes on Ben Sigman's posts: "Jovovich's involvement in MemPalace appears conceptual or promotional" — added within hours of launch by users questioning her technical contribution.

Alleged "mystery developer Lu": An X-based AI commentator claimed Sigman and Jovovich hired an unlisted third developer — referenced internally as "Lu" — to write the actual code, citing Jovovich's two-day GitHub history and seven commits. The post received 660,000 views. Sigman has not publicly confirmed or denied the claim.

USC Professor Sean Ren (CEO, Sahara AI), speaking to Yahoo Tech: "That's not proven. We need to wait to see how the community reacts when deploying it in real systems."

The MemPalace team responded directly in the README with what they called "a note from Milla & Ben — April 7, 2026," acknowledging the community's concerns, correcting the primary claimed score to 96.6% (raw), and laying out a public roadmap of fixes. The note reads: "Brutal honest criticism is exactly what makes open source work, and it's what we asked for... We'd rather be right than impressive." They then listed specific GitHub issues they were addressing, including a shell injection vulnerability in the hooks system, a macOS ARM64 segfault, and ChromaDB version pinning.

Separately, it is worth noting that MemPalace is not alone in the AI memory benchmark controversy. Zep, Mem0, and Letta have all published mutual accusations of misconfigured benchmarks and inflated scores — a pattern Penfield Labs acknowledged in their analysis. The entire AI memory benchmarking space has documented reproducibility problems. MemPalace amplified an existing mess.

What People Are Saying: The Community Response

The reaction to MemPalace split along predictable lines: enthusiasm from the broader tech-adjacent public, skepticism and forensic scrutiny from working developers. Both camps were loud.

Brian Roemmele, tech commentator and founder of "The Zero-Human Company" — a real firm that runs with zero full-time human employees — announced he had deployed MemPalace to 79 AI employees within hours of the launch. His endorsement, widely shared on X, lent the project significant credibility among non-technical audiences.

On Hacker News, the launch was noted with the observation: "Yes, that Milla Jovovich (Resident Evil actress). This was definitely not on my 2026 Bingo Card. Missed opportunity to call it Resident Eval." The thread reflected a mix of genuine technical curiosity and bemusement at the celebrity dimension.

On X (Twitter), the reaction was bifurcated by the algorithm. Ben Sigman's launch post passed 1.5 million impressions. The Penfield Labs critique circulated widely among developers. A separate viral X post accusing Sigman and Jovovich of using a ghostcoder drew 660,000 views and hundreds of comments.

On Threads, reactions ranged from straightforward technical interest to the memorable: "Milla Jovovich launching an AI memory system was NOT on my 2026 bingo card." (Wayne Sutton, @WayneSutton). The reaction tweet most shared in non-developer circles was @am_will's: "Milla Jovovich has a GitHub. She's co-developed the highest-scoring AI memory system. What a boss."

In Chinese-language developer communities, the project received serious technical attention — GitHub Issue #37 documents a deep-dive analysis thread from Chinese developers, reflecting how rapidly the project circulated across language communities.

A blog in the European bootstrapper community (blog.mean.ceo) published one of the more grounded assessments: the architecture is genuinely different, the 96.6% raw score is honest and competitive, and the tool is worth evaluating for any organization currently spending on Mem0 or Zep subscriptions. The broader lesson they drew — that you don't need to be a career developer to ship something useful with AI-assisted coding tools — resonated with the wider vibe-coding moment MemPalace landed in.

How to Install MemPalace

MemPalace requires Python 3.9 or later. The full setup — install, initialization, first data import, and MCP connection — takes approximately five to ten minutes.

Step 1: Install

pip install mempalace

If you use pipx (recommended for tools you want globally accessible without activating a virtual environment each time):

pipx install mempalace

On macOS with Homebrew, install pipx first if you don't have it:

brew install pipx
pipx install mempalace

Step 2: Initialize Your Palace

mempalace init ~/projects/myproject

The init command walks you through setting up your first wing — defining what entities (people, projects) and rooms (sub-topics) it should contain. This is where you name the spaces that your memories will live in.

Step 3: Mine Your Data

Mining is how you populate the palace from existing files and conversation exports:

# Import a code project (indexes code, docs, notes)
mempalace mine ~/projects/myproject

# Import conversation exports (Claude, ChatGPT, Slack)
mempalace mine ~/chats/ --mode convos

# Auto-classify into decisions, milestones, preferences, problems
mempalace mine ~/chats/ --mode convos --extract general

One practical note from early users: mining an entire large project directory can be overwhelming in the first version, as the file filtering mechanism is still maturing. The Chinese developer community on GitHub Issue #37 recommended an alternative approach — have the AI create and add records itself, starting with key documents and gradually filling the palace over time, rather than bulk-importing everything at once.

Step 4: Connect to Your AI Client

For Claude Code:

mempalace mcp

Then add the following to your Claude Code MCP settings:

{
  "mcpServers": {
    "mempalace": {
      "command": "mempalace",
      "args": ["mcp"]
    }
  }
}

If you installed via pipx on macOS, you may need to specify the full path:

{
  "mempalace": {
    "command": "~/.local/pipx/venvs/mempalace/bin/python",
    "args": ["-m", "mempalace.mcp_server"]
  }
}

For Cursor, ChatGPT, and Claude Desktop, the same MCP config block applies — each client has its own settings file where you add the server configuration. The mempalace.tech setup guide documents the exact file paths for each client.

Step 5: Verify and Search

mempalace status
mempalace search "why did we switch to GraphQL"

Once connected, you can search from the command line or let your AI agent search automatically. You can also ask your AI directly: "Remember that our production database is on us-east-1." It will confirm storage and file it in the appropriate room.

Known issues at launch: A shell injection vulnerability in the hooks system (Issue #110), a macOS ARM64 segfault (Issue #74), and a ChromaDB version pinning problem (Issue #100) were all flagged by the community within 48 hours. The team has acknowledged all three and listed them as priorities. Check the GitHub issues page before deploying to a production environment.

What You Can Actually Use MemPalace For

Set aside the celebrity story and the benchmark fight for a moment. The practical question is: where does this tool earn its place in a real workflow?

Long-Running Software Projects

Six months of architectural decisions — why you chose Postgres over SQLite, why you switched from REST to GraphQL, which rate-limiting approach you tested and abandoned — MemPalace makes these retrievable. Not as a bullet-pointed summary, but as the actual conversation where you reasoned it out. This matters because the reasoning is often as important as the conclusion.

Cross-Project Pattern Recognition

MemPalace's cross-wing search lets you find patterns across projects simultaneously. The command mempalace search "rate limiting approach" returns your answer from Project Orion and Project Nova in the same query, showing you the differences between the two approaches. No project manager or second brain tool currently does this for AI-assisted work.

Team Knowledge Infrastructure

The temporal knowledge graph is particularly useful in team contexts. Who is assigned to which project, as of which date? What was the status of the auth migration in January? When did Kai leave the Orion project? These are the questions that create coordination failures in distributed teams, and the knowledge graph answers them with temporal precision — including historical queries that tell you what was true at a specific past date.

Personal AI Context Management

For individual power users — developers, researchers, writers, operators — MemPalace functions as a second brain that your AI can actually navigate. Instead of manually re-briefing Claude or ChatGPT at the start of every session, the palace map loads automatically and the agent queries for relevant context on demand. The 170-token startup overhead is low enough to leave running permanently without meaningful cost.

Conversation Archive and Retrieval

MemPalace supports importing exports from Claude, ChatGPT, and Slack. This means years of existing conversations can be indexed and made searchable — a use case that has nothing to do with real-time AI memory and everything to do with building a personal knowledge base from the work you've already done.

Storage at Scale

With AAAK compression (in the closet layer, not the primary storage), a typical six months of daily AI conversations — approximately 19.5 million tokens raw — compresses to around 650,000 tokens of stored closet data, using 50–100MB of disk space. The verbatim drawers grow larger, but storage is local and cheap. ChromaDB and SQLite are both mature, production-tested technologies handling millions of records without complaint.
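The stated figures are internally consistent, as a back-of-envelope check shows:

```python
# Sanity check of the stated compression figures.
raw_tokens = 19_500_000      # ~6 months of daily AI conversations
compression = 30             # claimed ~30x for repeated entities
closet_tokens = raw_tokens // compression
print(closet_tokens)         # → 650000, matching the ~650K figure
```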

How MemPalace Compares to the Competition

System | LongMemEval | Cost | Local | Storage approach | MCP support
MemPalace (raw) | 96.6% | Free (MIT) | Yes | Verbatim + vector search | Yes — 19 tools
Mem0 | ~85% | $19–249/mo | Cloud-first | AI extraction + discard | Via API
Zep | ~85% | $25+/mo | Cloud-first | AI extraction + discard | Via API
Supermemory | ~99% | Paid | Cloud only | Hybrid | Limited
CLAUDE.md (flat file) | N/A | Free | Yes | Plain text, manual | No

The philosophical difference between MemPalace and Mem0/Zep is worth emphasizing: Mem0 and Zep both use an AI to decide what's worth keeping before it's stored. MemPalace stores first, decides what's relevant at retrieval time. The debate is essentially "compression at write" vs. "compression at read" — and the benchmark results suggest the latter approach preserves enough additional information to outperform on recall tests, at the cost of more disk space and longer indexing times.

The Bigger Picture: What MemPalace Represents

Beyond its specific functionality, MemPalace is interesting as a cultural artifact. It is one of the earliest high-profile examples of a non-developer building genuinely functional software infrastructure using AI-assisted coding tools — not a toy app or a landing page generator, but a Python library with ChromaDB integration, SQLite schema management, MCP server implementation, and benchmark tooling.

Whether Jovovich wrote every line of code herself (she clearly did not), or whether a ghostcoder named Lu contributed substantially (disputed, unconfirmed), or whether the collaboration with Sigman and Claude Code represents a legitimate new model of software authorship — these questions matter less than what the result demonstrates: the floor for who can ship functional developer tooling has moved significantly lower.

Sean Ren, the USC computer science professor and Sahara AI CEO who reviewed the project, put the architectural premise clearly: the memory palace structure is a general method for organizing information that does not depend on any AI-specific magic. It could, in principle, scale to any agent framework. The retrieval improvement from structure alone — that 34% gain from simply knowing where to look before you look — is a transferable insight regardless of whether MemPalace's specific benchmark numbers hold up to scrutiny.

The controversy, meanwhile, illustrates something equally important: the developer community will forensically inspect claims made in the AI memory space, because the benchmark landscape is already a documented mess of mutual accusations and methodology disputes between Zep, Mem0, and Letta. MemPalace walked into that environment with aggressive marketing and a celebrity co-creator. The scrutiny was guaranteed and, ultimately, productive — it surfaced real bugs, a shell injection vulnerability, and documentation inconsistencies within 48 hours of launch, all of which the team has committed to fixing.

"We'd rather be right than impressive." — Milla Jovovich & Ben Sigman, GitHub README, April 7, 2026

That line — added in response to community pressure, not written in anticipation of it — is the most credible thing in the MemPalace repository. Open source tools live or die on whether the people behind them can take criticism and act on it. The first 72 hours suggested they can.

Bottom Line

MemPalace is a real, working, MIT-licensed AI memory system with a genuinely different architectural approach — verbatim storage with structured retrieval — that outperforms cloud-based competitors on the benchmark that matters most, at zero cost. The 100% headline score is marketing; the 96.6% raw score is honest and impressive. The knowledge graph, the layered loading architecture, and the AAAK compression direction are all interesting engineering regardless of the celebrity wrapper.

If you are an individual developer or researcher who currently pays for Mem0 or Zep, it is worth evaluating now. If you are building production infrastructure, wait for the shell injection fix and the ChromaDB pinning patch before deploying. If you just want AI that remembers your work across sessions without sending your data to a cloud: this is the most capable free option available as of April 2026.


All quotes sourced from publicly available GitHub READMEs, X posts, Substack articles, and news coverage. Links current as of April 8, 2026.