mdb-lucene

absorb complexity into the platform · build convenience into the query language: a strategy MongoDB has been running, one capability at a time, since 2009

the macro-strategy — in one sentence

It isn't the fastest vector retrieval.
It isn't even the database that added vectors to its query language.
It's that MongoDB Atlas absorbs complexity into the platform
and builds convenience into the query language — and has been doing it, one capability at a time, since 2009.

Lucene proves the vector engine is commodity → therefore the only game left is what the database absorbs beyond the engine.

Premise on the left = engine equivalence to four decimals (Atlas $vectorSearch ≡ raw Lucene HNSW on the same vectors). Conclusion on the right = the composable aggregation pipeline (one query, one round trip, one ops surface). The arrow in the middle is the entire post.

OK but what does any of this actually mean? Picture an actual AI agent. A user types "change my Tuesday flight to Thursday." Behind the scenes, five things have to happen:

  1. Look up your booking. A normal database query against the airline's records.
  2. Read what's been said in this chat so far. Conversation history.
  3. Look up the airline's reschedule policy. Vector search: this is the step the term "vector database" was named for.
  4. Recall what you usually prefer. Long-term memory.
  5. Check what worked last time. Tool-call traces.

A vector database handles step 3. That's it. One out of five. The other four steps live in completely different systems — a Postgres for step 1, a Redis for step 2, a DynamoDB for step 4, a ClickHouse for step 5 — or they live in your application code held together with HTTP calls and prayers, plus a CDC pipeline keeping all of them in sync.
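
To make the multi-system layout concrete, here is a minimal sketch of the glue code it implies. The five fetch functions are hypothetical stubs standing in for five separate clients (they are not part of the demo), and the returned shape is an assumption chosen for illustration.

```javascript
// Hypothetical stubs, one per system. Real code would call five different clients.
const fetchBooking   = async (userId)    => ({ _id: "BK-7724", user_id: userId });
const fetchChat      = async (sessionId) => [{ session_id: sessionId, text: "change my Tuesday flight to Thursday" }];
const searchPolicies = async (question)  => [{ title: "Reschedule policy", matched: question }];
const fetchMemory    = async (userId)    => ({ user_id: userId, preferences: {} });
const fetchLastTrace = async (userId)    => ({ user_id: userId, tool: "airline_reschedule_api" });

// One agent turn: five network calls to five systems, stitched together by hand.
async function assembleContext(userId, sessionId, question) {
  const calls = [
    fetchBooking(userId),      // step 1: operational store
    fetchChat(sessionId),      // step 2: session store
    searchPolicies(question),  // step 3: vector store
    fetchMemory(userId),       // step 4: long-term memory store
    fetchLastTrace(userId),    // step 5: trace store
  ];
  const [booking, chat, policy, memory, lastAttempt] = await Promise.all(calls);
  return { booking, chat, policy, memory, lastAttempt, roundTrips: calls.length };
}

assembleContext("u_8421", "sess_3f9c", "rescheduling rules")
  .then((ctx) => console.log(ctx.roundTrips)); // prints 5
```

Even with the calls issued in parallel, each one has its own client, auth, failure mode, and consistency window. That is the cost the rest of this page sets against a single pipeline.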

An agentic database is shorthand for "the document model + the aggregation framework — the same one that absorbed geo, graph, time-series, full-text, vector, stream processing, and encrypted-yet-searchable BSON, one capability at a time — applied to all five of those state types at once." That's MongoDB Atlas.

A vector database is to an agentic database what an engine is to a car. The engine is critical. It's where the horsepower lives. It's expensive. It also doesn't have wheels, seats, brakes, AC, headlights, or a steering wheel. You can't drive an engine. You drive the car. When you ship an AI agent, you're shipping a car. The vector store is the engine.

The 5+1 punchline · the same five steps, then one MQL pipeline

Each step on its own is a boring everyday query. Watch what happens at the bottom when the same five steps become five stages of one aggregation pipeline.

1 · "Look up your booking." · operational record · findOne

db.bookings.findOne({
  user_id:  "u_8421",
  depart_at: { $gte: ISODate("2026-05-19"),
               $lt:  ISODate("2026-05-20") }
})
// → { _id: "BK-7724", flight_no: "AA1132", seat: "27A",
//     fare_class: "Q", origin: "JFK", destination: "SFO" }

2 · "Read what's been said in this chat so far." · session state · scoped find + sort

db.chats.find(
  { session_id: "sess_3f9c", user_id: "u_8421" }
).sort({ ts: 1 })

3 · "Look up the airline's reschedule policy." · vector recall · $vectorSearch with pre-filter pushdown

db.policies.aggregate([
  { $vectorSearch: {
      index:         "policy_index",
      path:          "embedding",
      queryVector:   embed("rescheduling rules"),  // embed() = your embedding call, not a mongosh built-in
      numCandidates: 100,                 // required for ANN search; 100 is an illustrative value
      filter:        { airline: "AA" },   // pushed INTO HNSW, not after
      limit:         3
  }},
  { $project: { title: 1, body: 1,
                _score: { $meta: "vectorSearchScore" } }}
])
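
Why it matters that the filter is applied before ranking rather than after: a minimal in-memory sketch, assuming a brute-force cosine ranking in place of HNSW and a toy four-document corpus invented for illustration. It is not how Atlas implements the pushdown; it only shows the difference in results.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na  += a[i] * a[i];
    nb  += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Exact top-k by cosine similarity (brute force; HNSW approximates this).
function topK(docs, queryVector, k) {
  return docs
    .map((d) => ({ ...d, score: cosine(d.embedding, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy corpus (illustrative): the two nearest vectors belong to another airline.
const docs = [
  { title: "UA change fees",      airline: "UA", embedding: [1.0, 0.0] },
  { title: "UA same-day change",  airline: "UA", embedding: [0.9, 0.1] },
  { title: "AA reschedule rules", airline: "AA", embedding: [0.7, 0.3] },
  { title: "AA baggage rules",    airline: "AA", embedding: [0.1, 0.9] },
];
const query = [1, 0];
const isAA = (d) => d.airline === "AA";

// Post-filter: rank everything, keep the top 2, then drop non-AA documents.
const postFiltered = topK(docs, query, 2).filter(isAA);

// Pre-filter: restrict the candidates to AA first, then rank.
const preFiltered = topK(docs.filter(isAA), query, 2);

console.log(postFiltered.length);               // 0
console.log(preFiltered.map((d) => d.title));   // [ 'AA reschedule rules', 'AA baggage rules' ]
```

With post-filtering, the AA policy never reaches the agent because both top slots went to another airline's documents. With pre-filtering, the limit is spent only on documents the caller is allowed to see.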

4 · "Recall what you usually prefer." · long-term memory · findOne on the user's preferences

db.user_memory.findOne(
  { user_id: "u_8421" },
  { preferences: 1 }
)

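5 · "Check what worked last time." · tool traces · time-series · most recent call

// Returns the most recent call since the cutoff, successful or not.
// To restrict it to successful calls, add a filter on whatever outcome
// field your traces record (none is shown in this schema).
db.tool_traces.find({
  user_id: "u_8421",
  tool:    "airline_reschedule_api",
  ts:      { $gte: ISODate("2026-04-01") }
}).sort({ ts: -1 }).limit(1)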

the punchline: the same five steps, composed into one pipeline, in one round trip

// One agent turn. One aggregation pipeline. One round trip.
db.bookings.aggregate([

  // Step 1 — operational record (the seed for everything else)
  { $match: {
      user_id:  "u_8421",
      depart_at: { $gte: ISODate("2026-05-19"),
                   $lt:  ISODate("2026-05-20") }
  }},

  // Step 4 — long-term memory, joined onto the user
  { $lookup: {
      from: "user_memory", localField: "user_id",
      foreignField: "user_id", as: "memory"
  }},

  // Step 2 — session/chat state, ordered, scoped to this session
  { $lookup: {
      from: "chats",
      let:  { uid: "$user_id" },
      pipeline: [
        { $match: { $expr: { $and: [
            { $eq: ["$user_id",    "$$uid"] },
            { $eq: ["$session_id", "sess_3f9c"] }
        ]}}},
        { $sort: { ts: 1 } }
      ],
      as: "chat"
  }},

  // Step 3 — vector recall over the policy corpus, pre-filter pushed INTO HNSW
  { $lookup: {
      from: "policies",
      pipeline: [
        { $vectorSearch: {
            index:         "policy_index",
            path:          "embedding",
            queryVector:   embed("rescheduling rules"),
            numCandidates: 100,   // required for ANN search; illustrative value
            filter:        { airline: "AA" },
            limit:         3
        }},
        { $project: { title: 1, body: 1 } }
      ],
      as: "policy"
  }},

  // Step 5: most recent call to this tool for this user (time-series; no success filter applied)
  { $lookup: {
      from: "tool_traces",
      let:  { uid: "$user_id" },
      pipeline: [
        { $match: { $expr: { $eq: ["$user_id", "$$uid"] },
                    tool: "airline_reschedule_api",
                    ts:   { $gte: ISODate("2026-04-01") } }},
        { $sort:  { ts: -1 } },
        { $limit: 1 }
      ],
      as: "last_attempt"
  }},

  // Final shape — exactly what the agent's reasoning step needs as input.
  { $project: {
      booking:      { _id: "$_id", flight_no: "$flight_no",
                      seat: "$seat", origin: "$origin",
                      destination: "$destination" },
      memory:       { $first: "$memory" },
      chat:         1,
      policy:       1,
      last_attempt: { $first: "$last_attempt" }
  }}

])
// → ONE document containing operational + memory + chat + policy + traces.
//   ONE round trip. ONE query language. ONE ops surface. ONE auth model.
//   ONE backup. ONE SOC2 audit. ONE on-call rotation.

That's it. $match + four $lookups + a $project: operators that have existed in MongoDB since long before "AI agent" was a job title. The $vectorSearch stage runs inside one of the four lookups. Vector recall is a stage. The aggregation framework already knew how to do four other kinds of stages; it now also knows how to do this one, and it composes them all without you needing to write any glue.
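
Because a pipeline is just an array of plain objects, the hard-coded values in the example above can be lifted into parameters. The sketch below is one way to do that. The function name, the parameter names, and the `numCandidates` default of 100 are all assumptions for illustration. Collection and index names follow the example. Whether `$vectorSearch` is accepted inside a `$lookup` sub-pipeline depends on your MongoDB version, so check that against your deployment.

```javascript
// Sketch: build the agent-turn pipeline as plain data from parameters.
// Parameter names are illustrative; collection and index names follow the page.
function buildAgentTurnPipeline(p) {
  return [
    // Step 1: operational record (the seed document)
    { $match: { user_id: p.userId,
                depart_at: { $gte: p.dayStart, $lt: p.dayEnd } } },

    // Step 4: long-term memory
    { $lookup: { from: "user_memory", localField: "user_id",
                 foreignField: "user_id", as: "memory" } },

    // Step 2: chat history for this session, oldest first
    { $lookup: {
        from: "chats",
        let: { uid: "$user_id" },
        pipeline: [
          { $match: { $expr: { $and: [
              { $eq: ["$user_id", "$$uid"] },
              { $eq: ["$session_id", p.sessionId] }
          ] } } },
          { $sort: { ts: 1 } }
        ],
        as: "chat"
    } },

    // Step 3: vector recall; numCandidates is an assumed tuning value
    { $lookup: {
        from: "policies",
        pipeline: [
          { $vectorSearch: {
              index: "policy_index",
              path: "embedding",
              queryVector: p.queryVector,
              numCandidates: p.numCandidates ?? 100,
              filter: { airline: p.airline },
              limit: 3
          } },
          { $project: { title: 1, body: 1 } }
        ],
        as: "policy"
    } },

    // Step 5: most recent trace for this tool
    { $lookup: {
        from: "tool_traces",
        let: { uid: "$user_id" },
        pipeline: [
          { $match: { $expr: { $eq: ["$user_id", "$$uid"] },
                      tool: p.tool,
                      ts: { $gte: p.tracesSince } } },
          { $sort: { ts: -1 } },
          { $limit: 1 }
        ],
        as: "last_attempt"
    } },

    // Final shape for the agent's reasoning step
    { $project: {
        booking: { _id: "$_id", flight_no: "$flight_no", seat: "$seat",
                   origin: "$origin", destination: "$destination" },
        memory: { $first: "$memory" },
        chat: 1,
        policy: 1,
        last_attempt: { $first: "$last_attempt" }
    } }
  ];
}

const pipeline = buildAgentTurnPipeline({
  userId: "u_8421",
  sessionId: "sess_3f9c",
  dayStart: new Date("2026-05-19"),
  dayEnd: new Date("2026-05-20"),
  queryVector: [0.1, 0.2, 0.3],   // placeholder; a real vector comes from your embedding model
  airline: "AA",
  tool: "airline_reschedule_api",
  tracesSince: new Date("2026-04-01")
});

console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(" -> "));
// $match -> $lookup -> $lookup -> $lookup -> $lookup -> $project
```

In mongosh or a driver, the returned array can be passed straight to `db.bookings.aggregate(...)`.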

The unbroken thread · what MongoDB has been absorbing since 2009

Vector search is one expression of a strategy MongoDB has been running, one capability at a time, since 2009. Every "you'll need a separate database for that" wave got the same answer: extend the query layer; never rebuild the foundation.

year  capability               the JSON-shape observation
2009  document model           JSON in code = JSON on disk
2010  geo · 2dsphere           a coordinate is a nested array
2014  ACID · WiredTiger        document-level locking + transactions across shards
2018  $graphLookup             a graph edge is a reference
2018  change streams           an event is a document; the oplog already had them
2021  time-series              a reading is a timestamped document
2022  $search · Lucene         Lucene tails the oplog · no separate writer
2023  $vectorSearch            Lucene HNSW · same oplog tail · embedding is just an array of floats
2024  queryable encryption     encrypted-yet-searchable BSON
2025  Atlas Stream Processing  aggregation pipelines on data in motion
2026  $rankFusion · $rerank    hybrid + rerank as native pipeline stages
2026  autoEmbed · Voyage       embedding model lives next to the documents it embeds

Every node above was the same answer to the same question. Same answer. Same framework. Same model. Sixteen years of compounding. That's the macro-strategy: absorb complexity into the platform; build convenience into the query language. Vector search is the latest expression of it.

This is the static landing page · run the live demo for the interactive panels

The full mdb-lucene demo includes two interactive panels:

  1. The composable-pipeline panel — one MongoDB Atlas aggregation pipeline ($vectorSearch + $lookup + $addFields + $project + $sort) against movies + reviews data, side-by-side with the equivalent raw Lucene + glue column that quantifies the cost of the alternative.
  2. The engine-equivalence panel — Atlas $vectorSearch versus raw KnnFloatVectorQuery against the same MiniLM embeddings on the same 20 docs, agreeing to four decimal places. The proof that the engine is commodity, therefore the contest is at the layers above and around it.
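
What "agreeing to four decimal places" can mean in practice: a minimal sketch of a comparison between two ranked result lists. The function name, the result shape (`id`, `score`), and the sample scores are assumptions made up for illustration. They are not the demo's actual code or output.

```javascript
// True when both lists rank the same ids in the same order and each pair of
// scores differs by less than half a unit in the last compared decimal place.
function agreeToDecimals(resultsA, resultsB, decimals = 4) {
  if (resultsA.length !== resultsB.length) return false;
  const tolerance = 0.5 * 10 ** -decimals;
  return resultsA.every((a, i) => {
    const b = resultsB[i];
    return a.id === b.id && Math.abs(a.score - b.score) < tolerance;
  });
}

// Illustrative values only, standing in for the two engines' outputs.
const fromAtlas  = [{ id: "m1", score: 0.812345 }, { id: "m2", score: 0.790012 }];
const fromLucene = [{ id: "m1", score: 0.812349 }, { id: "m2", score: 0.790015 }];
const drifted    = [{ id: "m1", score: 0.813000 }, { id: "m2", score: 0.790015 }];

console.log(agreeToDecimals(fromAtlas, fromLucene)); // true
console.log(agreeToDecimals(fromAtlas, drifted));    // false
```

A tolerance check is used here instead of comparing rounded strings, so that two scores sitting on either side of a rounding boundary are not reported as a mismatch.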

To run the live demo:
git clone https://github.com/fabian-valle-simmons/mdb-lucene
cd mdb-lucene
docker compose up

Then open http://localhost:8088. Five minutes. One Docker host. Real MongoDB Atlas Local + a real Lucene HNSW service. Run your own queries against the live engines.

Why static here? The interactive panels rely on a Python embedding model (MiniLM) that turns free-form prose into 384-d vectors at query time — that lives in the demo's Python process, not in your browser. The static landing page above is the strategic argument; the live demo is the working proof on a laptop.

The honest position · when a vector database is the right answer

This page argues vector databases are the wrong shape for AI agents. That argument is true. But specialized vector databases ship in production today against workloads where they're absolutely the right pick.

Use a vector database when your product is vector search. Use an agentic database when your product is an agent.

✓ vector DB is right

  • Recommendation systems · image-similarity · face search · plagiarism · fraud-by-similarity · chemistry · audio fingerprinting
  • Pure semantic-search features (no agent reasoning around them)
  • 1B+ vectors / 10K+ QPS with serious cost-per-query pressure
  • Brownfield: existing OLTP can't be migrated, vector recall needed now
  • 2-week prototype where speed-to-ship beats architectural purity

✗ vector DB is wrong (use an agentic DB)

  • AI agents — anything that reads operational records + chat + memory + traces on every turn
  • Greenfield AI app, no incumbent OLTP, choosing now
  • Anything where the "rescheduling your flight" walkthrough has 3+ steps that aren't step 3
  • Per-tenant scoped retrieval where filter pushdown into HNSW matters for correctness
  • Anything where "we'll just CDC the vectors over" is on a slide somewhere