Nightly QMD Reindex
Keep search indexes fresh with a nightly reindex cron job
What QMD Is
QMD is a local search engine for markdown files. It indexes your files, generates vector embeddings for them, and gives your agent the ability to semantically search across everything -- not just keyword matching, but the "find me notes related to this concept" kind of search.
Your agent uses QMD when it needs to find something in your files. If you ask "what did we decide about the discovery platform architecture?" your agent can search across your vault, memory files, and workspace to find relevant notes instead of reading through hundreds of files manually. It supports keyword search (qmd search), semantic vector search (qmd vsearch), and hybrid search with reranking (qmd query).
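To get a feel for how the three modes differ, it can help to run the same question through each of them. The following is a minimal sketch: the subcommand names are the ones mentioned above, while the `compare_search_modes` helper and the example query are only illustrative, and the sketch assumes each subcommand takes the query as a single quoted argument.

```shell
# Run one query through each of QMD's three search modes.
# compare_search_modes is a hypothetical helper name, not part of QMD.
compare_search_modes() {
    query="$1"

    echo "== keyword (qmd search) =="
    qmd search "$query"

    echo "== semantic (qmd vsearch) =="
    qmd vsearch "$query"

    echo "== hybrid with reranking (qmd query) =="
    qmd query "$query"
}

# Usage:
#   compare_search_modes "discovery platform architecture"
```

Comparing the three result lists side by side is a quick way to see when plain keyword search is enough and when the semantic or hybrid modes find notes that the keywords miss.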
What Gets Indexed
QMD works with collections -- groups of markdown files from different directories. In my setup, I have three:
- workspace -- the OpenClaw agent workspace (~/.openclaw/workspace/). All the agent config files, project docs, notes. About 4,300 files.
- memory -- the agent's daily memory logs (~/.openclaw/workspace/memory/). Session logs, what happened each day. About 90 files and growing.
- slipbox -- my LogSeq vault (~/slipbox/). All my zettelkasten notes, literature notes, reference notes, journal entries. About 4,300 files.
That's around 8,800 files total with ~38,000 embedded vectors.
Setting Up Collections
QMD doesn't automatically know what to index -- you tell it by creating collections. Each collection points to a directory and a file pattern. You can ask your agent to set these up:
Install QMD (bun install -g github:tobi/qmd) and set up three
collections:
1. "workspace" pointing to ~/.openclaw/workspace/ for all .md files
2. "memory" pointing to ~/.openclaw/workspace/memory/ for all .md files
3. "slipbox" pointing to ~/slipbox/ for all .md files
Then run qmd update and qmd embed to do the initial indexing.
Under the hood, the agent runs qmd collection add <path> --name <name> --mask "**/*.md" for each one. After that, qmd update and qmd embed know which directories to scan. You only need to set up collections once -- the nightly cron job just refreshes them.
If you want to add more collections later (say, a projects folder or a second vault), just ask your agent to add another collection and it'll be included in the nightly reindex automatically.
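If you would rather run the setup by hand than go through the agent, the collection commands described above could be sketched roughly as follows. The `collection add` form is the one quoted in this section; the `setup_collections` helper name is illustrative, and the paths are the ones from my setup, so adjust them to yours.

```shell
# Register the three collections, then build the initial index.
# setup_collections is a hypothetical helper name, not part of QMD.
setup_collections() {
    # One collection per directory, each matching every .md file beneath it.
    qmd collection add "$HOME/.openclaw/workspace/" --name workspace --mask "**/*.md" || return 1
    qmd collection add "$HOME/.openclaw/workspace/memory/" --name memory --mask "**/*.md" || return 1
    qmd collection add "$HOME/slipbox/" --name slipbox --mask "**/*.md" || return 1

    # Scan the collections for files, then generate embeddings for them.
    qmd update || return 1
    qmd embed
}

# Usage:
#   setup_collections
```

The function stops at the first failing command, so a mistyped path does not lead to a half-built index being embedded.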
Why Reindex Nightly
New files get created constantly -- journal entries, memory logs, vault notes, project docs. If you don't reindex, QMD's search results go stale. The nightly job runs qmd update to scan for new and changed files, then qmd embed to generate embeddings for anything new. This way when you ask your agent to search for something the next morning, it can find last night's notes.
Setup Prompt (Cron Job)
Create a cron job called "Nightly QMD Reindex" at 2:00 AM every day.
Run these commands:
qmd update
qmd embed
This updates all QMD collections and regenerates embeddings for any
new or changed files. Log results (file counts, any errors) in
today's memory file. If there are errors, message me. Otherwise
complete silently.
- Schedule: 0 2 * * * (2:00 AM daily)
- Timeout: 600 seconds (embedding can be slow with large vaults)
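For comparison, here is roughly what the same job might look like as a plain shell function run from system cron instead of the agent's scheduler. This is a sketch under assumptions: the log path, the `REINDEX_LOG` variable, and the `nightly_reindex` name are all hypothetical, and only `qmd update` and `qmd embed` come from the prompt above. It does not enforce the 600-second timeout and it writes to its own log rather than to the memory file.

```shell
# Sketch of the nightly job for plain system cron.
# REINDEX_LOG and nightly_reindex are hypothetical names; the log path is
# an assumption, not where the agent writes its memory files.
REINDEX_LOG="${REINDEX_LOG:-$HOME/qmd-reindex.log}"

nightly_reindex() {
    # Append a timestamped header plus all command output to the log.
    # qmd embed only runs if qmd update succeeded.
    {
        echo "--- reindex $(date '+%Y-%m-%d %H:%M:%S') ---"
        qmd update && qmd embed
    } >>"$REINDEX_LOG" 2>&1
    status=$?

    # Stay silent on success; report failures on stderr.
    if [ "$status" -ne 0 ]; then
        echo "QMD reindex failed (exit $status), see $REINDEX_LOG" >&2
    fi
    return "$status"
}

# Example crontab entry (2:00 AM daily), assuming this function is saved in
# a script at a hypothetical path that calls nightly_reindex at the end:
#   0 2 * * * /path/to/qmd-reindex.sh
```

Printing only on failure mirrors the "complete silently" behavior of the prompt: cron typically mails any output a job produces, so a quiet run means nothing went wrong.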