How OpenClaw Persists Memory (ClawdBot/MoltBot) — Files, Sessions, and Search

OpenClaw’s memory is file-first: durable facts live in Markdown, sessions are stored as JSONL, and semantic recall is a rebuildable index. Here’s how it actually persists across restarts.

OpenClaw (formerly ClawdBot, then MoltBot) has one core philosophy that makes its “memory” feel real:

Durable memory is file-first.

That means the canonical source of truth is what’s written to disk (Markdown files in your workspace). Everything else—indexes, embeddings, caches—is derived and can be rebuilt.

This post summarizes the most important persistence mechanisms so you understand what survives across:

  • message turns
  • restarts
  • long-running operations
  • context compaction

TL;DR

OpenClaw persists memory through three durable surfaces:

  1. Workspace memory (Markdown) — what you want the assistant to remember
  2. Session persistence (JSONL transcripts) — what was said + tool calls + compaction events
  3. Memory search index (SQLite / vector) — an accelerant built from your Markdown (and optionally sessions)

If you only remember one thing: if it’s not written to disk, it’s not truly remembered.

1) Workspace memory (canonical, human-auditable)

This is the “real memory.” It’s plain text.

Typical layout:

  • memory/YYYY-MM-DD.md — daily logs (append-oriented)
  • MEMORY.md — curated long-term memory (often only loaded in the private/main session)
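The append-oriented daily log can be sketched in a few lines. This is an illustration of the file layout described above, not OpenClaw's actual API; the helper name `append_memory` is hypothetical:

```python
from datetime import date
from pathlib import Path

def append_memory(workspace: Path, fact: str) -> Path:
    """Append one durable fact to today's daily log (memory/YYYY-MM-DD.md)."""
    log = workspace / "memory" / f"{date.today().isoformat()}.md"
    log.parent.mkdir(parents=True, exist_ok=True)  # create memory/ on first write
    with log.open("a", encoding="utf-8") as f:     # append-only, never rewrite
        f.write(f"- {fact}\n")
    return log
```

Because the log is append-only plain text, every write is a small, reviewable diff.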

Why this is powerful:

  • It’s diffable (you can review changes like code)
  • It’s easy to back up (git works great—just don’t commit secrets)
  • You can deliberately curate what matters instead of letting the model hallucinate “memory”

Practical implication:

  • When something matters (a decision, a preference, a TODO), write it.

2) Session persistence (what happened, stored on disk)

OpenClaw’s Gateway owns session state and persists it on the host machine.

There are two key pieces:

sessions.json — metadata

A small mutable store mapping sessionKey → SessionEntry (things like current session file, timestamps, counters, compaction state).
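A minimal sketch of what such a metadata store can look like. The field names here are illustrative assumptions, not OpenClaw's real schema:

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class SessionEntry:
    session_file: str        # current <sessionId>.jsonl on disk
    updated_at: float        # last-write timestamp (epoch seconds)
    turn_count: int = 0      # running message counter
    compacted: bool = False  # whether compaction has occurred

def save_sessions(path: Path, entries: dict[str, SessionEntry]) -> None:
    """Serialize the sessionKey -> SessionEntry map to sessions.json."""
    path.write_text(json.dumps({k: asdict(v) for k, v in entries.items()}, indent=2))

def load_sessions(path: Path) -> dict[str, SessionEntry]:
    """Rehydrate the map from disk."""
    return {k: SessionEntry(**v) for k, v in json.loads(path.read_text()).items()}
```

The store stays small because transcripts live elsewhere; this file only points at them.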

<sessionId>.jsonl — append-only transcript

An append-only JSONL log containing:

  • user + assistant messages
  • tool calls (and outputs)
  • compaction summaries
  • tree structure via id + parentId

This is not “just chat logs.” It’s a typed event stream that the system can use to reconstruct context.
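The id + parentId tree structure can be reconstructed with a single pass over the log. A sketch, assuming each JSONL line carries `id` and `parentId` fields as described above:

```python
import json

def build_tree(jsonl_lines):
    """Index an append-only JSONL transcript as parentId -> [child ids]."""
    children: dict = {}
    for line in jsonl_lines:
        ev = json.loads(line)
        children.setdefault(ev.get("parentId"), []).append(ev["id"])
    return children
```

Root events (no parent) land under the `None` key; from there any branch of the conversation can be replayed in order.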

3) Memory search (derived index over Markdown)

For semantic recall, OpenClaw builds a search index over your memory files.

By default, it indexes:

  • MEMORY.md
  • memory/**/*.md
  • optionally, extra markdown paths (configured)

The important design choice:

The index is not canonical.

It exists to speed up retrieval and can be rebuilt from Markdown.

Hybrid retrieval: lexical + semantic

OpenClaw’s memory search is typically hybrid:

  • a lexical retriever (BM25-style ranking)
  • a semantic retriever (embeddings)
  • results merged into a ranked list

If you want to understand BM25 quickly, start with the Okapi BM25 overview.
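One common way to merge a lexical ranking with a semantic one is reciprocal rank fusion (RRF). OpenClaw's exact merge strategy may differ; this is a generic sketch of the technique:

```python
def rrf_merge(lexical: list[str], semantic: list[str], k: int = 60) -> list[str]:
    """Merge two ranked lists of document IDs with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in (lexical, semantic):
        for rank, doc_id in enumerate(ranking):
            # A document scores higher the closer to the top it appears
            # in either ranking; k damps the influence of any single list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.__getitem__, reverse=True)
```

RRF needs only ranks, not raw scores, so it sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales.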

SQLite + optional vector acceleration

OpenClaw stores the derived index per agent in SQLite and can accelerate vector search via sqlite-vec when available.
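Setting sqlite-vec aside, the lexical half of such an index can be sketched with SQLite's built-in FTS5 extension, which ranks matches with BM25. The table name and columns below are illustrative, not OpenClaw's actual schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Full-text index over markdown chunks; FTS5 provides bm25() ranking for free.
con.execute("CREATE VIRTUAL TABLE mem_chunks USING fts5(path, content)")
con.executemany(
    "INSERT INTO mem_chunks VALUES (?, ?)",
    [("MEMORY.md", "prefers dark mode and tabs"),
     ("memory/2025-01-01.md", "decided to ship the SQLite index first")],
)
# bm25() returns lower values for better matches, so ascending order is best-first.
rows = con.execute(
    "SELECT path FROM mem_chunks WHERE mem_chunks MATCH ? ORDER BY bm25(mem_chunks)",
    ("sqlite",),
).fetchall()
```

Because every row is derived from a Markdown file, dropping and rebuilding the whole table is always safe.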

Compaction: why “memory flush” matters

Long conversations eventually exceed model context limits. OpenClaw can compact sessions (summarize/trim) to keep operating.

The key persistence-friendly behavior:

Pre-compaction memory flush.

Before compaction, the system can write durable items to workspace memory files so they survive even if older transcript turns are summarized away.

If you’re running agents long-term, this is the difference between:

  • “we talked about it once”
  • “we wrote it down, so it stays true forever”
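The flush-then-compact ordering can be sketched as follows. The function name, the `durable` flag, and the event shape are hypothetical; the point is that the file write happens before any turns are summarized away:

```python
def flush_then_compact(transcript, memory_path, summarize):
    """Persist durable items to a memory file, then compact the transcript."""
    durable = [ev["text"] for ev in transcript if ev.get("durable")]
    with open(memory_path, "a", encoding="utf-8") as f:
        for item in durable:
            f.write(f"- {item}\n")
    # Only after the flush is it safe to replace old turns with a summary.
    return [{"type": "summary", "text": summarize(transcript)}]
```

If the order were reversed, a crash mid-compaction could lose facts that existed only in the trimmed turns.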

Memory plugins: memory-core vs memory-lancedb

Only one memory plugin owns memory tools at a time (slot-based ownership). Two relevant modes:

  • memory-core — default, file-first Markdown + derived search index
  • memory-lancedb — uses LanceDB as an embedded vector DB for long-term memory recall/capture

Even with alternate backends, the guiding idea stays the same: durable memory should be explicit and inspectable.

Security & privacy (the important reality check)

Two things can be true at once:

  • Your workspace + transcripts live locally on disk.
  • Your model provider still sees what you send it.

If you care about privacy, be deliberate about what gets written into memory files and what gets sent upstream.

If you’re setting this up on Windows, we have a practical guide covering WSL2 and common fixes.

Practical recommendations (what actually works)

  1. Write durable facts to Markdown (daily log + curated memory)
  2. Treat sessions as history, not memory (use them for auditing, not long-term truth)
  3. Let search do the work (keep canonical memory clean; rely on retrieval)
  4. Back up your workspace (git repo is ideal; avoid secrets)

If you’re building a product and you want your AI to compound over time, file-first memory is the simplest reliable approach.

If you’re curious what we’re building on-chain (and why we care about durable workflows): onchainsite.xyz