MongoDB as Your Vector Database: Python Guide to Semantic & Hybrid Search

MongoDB Atlas now doubles as a vector database. You can store embeddings alongside your documents, run $vectorSearch for semantic retrieval, blend it with full-text for hybrid search, and scale on Search Nodes—no extra system to run. Recent updates add View support (GA) and built-in smarts via the Voyage AI acquisition.
Why MongoDB for vectors?
One place for app + AI data: Keep embeddings in the same collection as your source records, then query with $vectorSearch (ANN/ENN). Less glue code, fewer moving parts.
Hybrid search: Combine BM25 full-text with vectors (RRF, semantic boosting) to improve relevance—best of both worlds.
Scales cleanly: Atlas Search Nodes isolate capacity for search/vector workloads; multi-region options and big perf gains reported.
Fresh in 2025: View support GA for Atlas Search/Vector Search—pre-shape and pre-filter data before indexing.
Better embeddings: MongoDB acquired Voyage AI; models (text, multimodal, rerankers) integrate into Atlas to boost retrieval quality and cost-efficiency.
What you can build (fast)
Semantic search over product docs, tickets, logs, or knowledge bases.
RAG chatbots that ground LLM answers on your collections.
Hybrid search experiences (keyword + meaning) with tunable weights.
Quick start (Python): index → embed → query
```python
# pip install "pymongo>=4.7"  # plus your embedding library
import os

from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

ATLAS_URI = os.getenv("ATLAS_URI")
client = MongoClient(ATLAS_URI)
col = client.kb.articles

# 1) Create a Vector Search index (embedding: 1024 dims, cosine)
vector_index = SearchIndexModel(
    name="vec_articles",
    type="vectorSearch",
    definition={
        "fields": [
            {"type": "vector", "path": "embedding", "numDimensions": 1024, "similarity": "cosine"},
            {"type": "filter", "path": "tenant_id"},  # optional pre-filter field(s)
        ]
    },
)
col.create_search_index(vector_index)

# 2) Upsert a doc with an embedding (stub shown; swap in your embedding call)
def embed(text: str) -> list[float]:
    # e.g., any embedding service—just return a list of floats sized to your index
    return [0.0] * 1024

doc = {
    "_id": "kb-001",
    "title": "Reset 2FA on iOS",
    "body": "Steps to recover access when you lost your authenticator.",
    "tenant_id": "acme",
    "embedding": embed("Reset 2FA on iOS Steps..."),
}
col.replace_one({"_id": doc["_id"]}, doc, upsert=True)

# 3) Semantic search with $vectorSearch (+ optional pre-filter)
user_query = "I can't log in because I lost my phone authenticator"
query_vec = embed(user_query)
pipeline = [
    {"$vectorSearch": {
        "index": "vec_articles",
        "path": "embedding",
        "queryVector": query_vec,
        "numCandidates": 200,
        "limit": 5,
        "filter": {"tenant_id": "acme"},
    }},
    {"$project": {
        "title": 1,
        "score": {"$meta": "vectorSearchScore"},
    }},
]
results = list(col.aggregate(pipeline))
print(results[:2])
```
$vectorSearch is an aggregation stage; tune numCandidates and limit for recall/latency trade-offs. Build the vector index with the same dimensions and similarity metric as your embedding model.
You can also create indexes via Atlas UI, CLI, mongosh (`db.collection.createSearchIndex()`), or driver APIs.
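For small collections or relevance debugging, $vectorSearch also supports exact nearest-neighbor (ENN) search: set "exact" to true and omit numCandidates. A sketch, reusing the `col` handle and `embed` stub from the quick start (the helper name is ours, not a MongoDB API):

```python
# ENN variant: exhaustive scan instead of ANN; higher recall, higher latency.
# Assumes the `col` collection handle and `embed` helper from the quick start.
def exact_search_pipeline(query_vec: list[float], tenant: str) -> list[dict]:
    return [
        {"$vectorSearch": {
            "index": "vec_articles",
            "path": "embedding",
            "queryVector": query_vec,
            "exact": True,  # ENN: numCandidates is not used in exact mode
            "limit": 5,
            "filter": {"tenant_id": tenant},
        }},
        {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

# results = list(col.aggregate(exact_search_pipeline(embed(user_query), "acme")))
```

ENN is a good baseline to measure how much recall the ANN settings give up.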
Hybrid search (keyword + vectors)
Blend BM25 full-text with vectors for higher quality. A common pattern is Reciprocal Rank Fusion (RRF) to merge results. MongoDB docs walk through RRF/semantic boosting with $search + $vectorSearch.
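One client-side version of this pattern: run a $search pipeline and a $vectorSearch pipeline separately, then fuse the two ranked _id lists with RRF, where score(doc) = Σ 1/(k + rank) across lists. A minimal sketch—the k=60 constant and sample _ids are illustrative assumptions, not a MongoDB API:

```python
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# e.g., _ids returned by a BM25 $search pipeline and a $vectorSearch pipeline:
keyword_hits = ["kb-002", "kb-001", "kb-007"]
vector_hits = ["kb-001", "kb-005", "kb-002"]
fused = rrf_merge([keyword_hits, vector_hits])
# docs appearing high in both lists (kb-001, kb-002) float to the top
```

Because RRF only uses ranks, you avoid normalizing BM25 and cosine scores against each other; weights can be added by multiplying each list's contribution.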
RAG on Atlas (simple path)
MongoDB’s RAG guides show ingest → index → retrieve with $vectorSearch → generate with your LLM. There’s even a local RAG tutorial if you want to avoid external APIs while prototyping.
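The retrieve → generate step can be sketched as follows; `build_rag_prompt` is our illustrative helper, and `generate()` stands in for whatever LLM client you use:

```python
def build_rag_prompt(question: str, contexts: list[dict]) -> str:
    """Ground the LLM by inlining retrieved documents into the prompt."""
    context_block = "\n\n".join(f"[{d['title']}]\n{d['body']}" for d in contexts)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {question}"
    )

# contexts = list(col.aggregate(pipeline))  # $vectorSearch pipeline projecting title/body
# answer = generate(build_rag_prompt(user_query, contexts))  # generate() = your LLM call
```

Keeping retrieval as a plain aggregation pipeline makes it easy to swap in hybrid search later without touching the generation side.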
Performance & scale notes
Search Nodes: dedicate & scale search/vector compute separately from your database replica set (multi-region available).
Views (GA): index a view to pre-filter/transform documents before vector indexing—handy for multi-tenant or curated catalogs.
Quantization & dim choices: fewer dimensions and lower precision reduce storage/latency.
ANN under the hood: Atlas Vector Search uses HNSW for approximate nearest-neighbor (fast and accurate for semantic search).
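Acting on the quantization note, Atlas lets you request automatic quantization in the vector field definition at index time. A hedged sketch—verify the option against current Atlas docs for your cluster tier:

```python
# Vector field with scalar quantization requested at index time.
# Atlas docs also describe "binary" for a smaller footprint; check availability.
quantized_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "numDimensions": 1024,
            "similarity": "cosine",
            "quantization": "scalar",  # compress stored vectors to cut storage/latency
        },
        {"type": "filter", "path": "tenant_id"},
    ]
}
# col.create_search_index(SearchIndexModel(
#     name="vec_articles_q", type="vectorSearch", definition=quantized_definition))
```

Pairing quantization with a lower-dimensional embedding model compounds the savings, at some cost in retrieval quality—benchmark on your own queries.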
When to use MongoDB vs a dedicated vector DB
MongoDB Atlas: best when you want operational + vector data together, need hybrid search, or already run Atlas and want to keep infra minimal.
Alt options: Cloud-compatible “MongoDB-like” services also ship vector search (e.g., Amazon DocumentDB, Azure Cosmos DB for MongoDB vCore). Consider them if you’re locked into those clouds.
Wrap-up
If you’re already on Atlas, you don’t need a separate vector store. Create a vector index, push embeddings, and start with $vectorSearch. Layer hybrid search for quality, use Views to pre-shape data, and move heavy traffic to Search Nodes. Voyage AI’s models + MongoDB’s native integration round out a practical, production-ready stack for RAG and semantic features.






