So you’ve decided you need a vector database for your RAG pipeline. Maybe you tried Pinecone and the bill got scary. Maybe pgvector started choking past 10M rows. The question usually comes down to this: what’s the best vector DB you can actually self-host without setting your weekend on fire?
For most teams, the honest answer in 2026 is Qdrant. Apache 2.0 licensed, written in Rust, and the Docker image works on the first try – assuming you don’t fall into one of the three install traps below. This guide walks through deploying Qdrant v1.17.1 (latest stable as of March 26, 2026, per the GitHub releases page) on your own hardware, with the gotchas other tutorials skip.
Why Qdrant wins for self-hosting
Two releases changed the calculus. ACORN landed in v1.16 – it’s an algorithm that traverses neighbors-of-neighbors in the HNSW graph when a filter eliminates direct neighbors, which fixes the classic “filter kills my recall” problem. Before that, heavy payload filtering could quietly crater search quality with no obvious error. The v1.16 release post has the details if you want the graph-traversal mechanics.
Then v1.15 added 2-bit and 1.5-bit binary quantization – 16× compression at 2 bits, 24× at 1.5 bits, per the v1.15 release post. Combined with v1.17’s storage overhaul (more on that in the upgrade section), the RAM story is genuinely different from two years ago. By end of 2025 the project crossed 27,000 GitHub stars – a decent proxy for production adoption.
System requirements (as of 2026)
Qdrant itself is light. The vectors are what eat your RAM.
| Resource | Minimum | Recommended (~10M vectors, 768-dim) |
|---|---|---|
| CPU | 2 cores | 8 cores |
| RAM | 2 GB | 16-32 GB (or use quantization) |
| Disk | 10 GB SSD | NVMe SSD, 100+ GB |
| OS | Linux, macOS, Windows (WSL2) | Linux (Ubuntu 22.04+) |
| Docker | 20.10+ | Engine 25+ / Compose v2+ |
The RAM number is the tricky one. A naive deployment of 10M 768-dimensional float32 vectors is ~30 GB before any index overhead. According to the official Docker Hub page, Qdrant’s built-in vector quantization can cut that by up to 97% – but it’s not on by default. You have to enable it per collection.
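The back-of-envelope math is worth making explicit. A minimal sketch – pure arithmetic, no Qdrant required – of the raw vector memory for the recommended workload, and what the quoted "up to 97%" quantization savings would leave you with:

```python
def raw_vector_bytes(n_vectors: int, dims: int, bytes_per_component: int = 4) -> int:
    """Raw storage for float32 vectors, before any HNSW index overhead."""
    return n_vectors * dims * bytes_per_component

raw = raw_vector_bytes(10_000_000, 768)
print(f"raw float32:   {raw / 1024**3:.1f} GiB")   # ~28.6 GiB, i.e. ~30 GB

# The Docker Hub page quotes up to 97% savings with quantization enabled.
print(f"after 97% cut: {raw * 0.03 / 1024**3:.2f} GiB")
```

Index overhead and payload storage come on top of this, so treat it as a floor, not a budget.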
Install Qdrant in Docker (the actually-correct command)
Pull the image from Docker Hub. Don’t pin :latest in production – pin the exact tag.
```shell
docker pull qdrant/qdrant:v1.17.1
```
Now the run command. Every tutorial on the internet shows you this:
```shell
# DON'T DO THIS
docker run -p 6333:6333 qdrant/qdrant
```
It boots. The dashboard works. Then you wire up a Python client and it hangs. The catch: Qdrant runs both REST on port 6333 and gRPC on port 6334 – the web console uses REST, but the Python client defaults to gRPC on 6334 when prefer_grpc=True. Expose only 6333 and every official Python tutorial breaks silently. A GitHub discussion (#2195) documents exactly this: a user followed the install docs, exposed only 6333, and hit a connection error they couldn’t trace back to the missing port.
The correct command exposes both ports and persists storage:
```shell
# Generate the key into a variable first, so you can actually save it
QDRANT_API_KEY=$(openssl rand -hex 32)
echo "$QDRANT_API_KEY"   # keep this somewhere safe

docker run -d \
  --name qdrant \
  --restart unless-stopped \
  -p 6333:6333 \
  -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  -e QDRANT__SERVICE__API_KEY="$QDRANT_API_KEY" \
  qdrant/qdrant:v1.17.1
```
Save the generated API key – you’ll need it for every client call. Without it, any process that can reach port 6333 has full database read/write access. The QDRANT__SERVICE__API_KEY environment variable is the documented way to close that hole.
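The QDRANT__SERVICE__API_KEY naming isn’t arbitrary: Qdrant maps nested config keys to environment variables by upper-casing the path segments and joining them with double underscores under a QDRANT__ prefix. A tiny helper – the function name is mine, not Qdrant’s – makes the convention concrete:

```python
def qdrant_env_var(config_path: str) -> str:
    """Map a dotted Qdrant config path to its env-var override name.

    Follows the documented convention: QDRANT__ prefix, segments
    upper-cased and joined with double underscores.
    """
    return "QDRANT__" + "__".join(s.upper() for s in config_path.split("."))

print(qdrant_env_var("service.api_key"))         # QDRANT__SERVICE__API_KEY
print(qdrant_env_var("storage.snapshots_path"))  # QDRANT__STORAGE__SNAPSHOTS_PATH
```

Any key you’d set in config.yaml can be overridden the same way, which keeps your Compose file free of a mounted config for simple deployments.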
Docker Compose version (production-leaning)
If you want something you can actually git commit, use Compose.
```yaml
services:
  qdrant:
    image: qdrant/qdrant:v1.17.1
    container_name: qdrant
    restart: unless-stopped
    ports:
      - "6333:6333"   # REST
      - "6334:6334"   # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
      - ./qdrant_snapshots:/qdrant/snapshots
    environment:
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
    deploy:
      resources:
        limits:
          memory: 4G
```
A note on the memory limit. Qdrant will happily use everything you give it during index builds. If you don’t cap it, an OOM on a shared host will silently kill the container – and Docker’s restart policy will bring it back up mid-flush. That’s how data corruption stories start.
Verify it works
Three checks, in order. If any fail, fix it before writing code.
- Container is up: docker ps | grep qdrant
- REST API responds: curl -H "api-key: $QDRANT_API_KEY" http://localhost:6333/healthz should return "healthz check passed"
- Readiness: curl http://localhost:6333/readyz should return a status response
For the gRPC port, easiest test is a one-liner from Python:
```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6334", api_key="YOUR_KEY", prefer_grpc=True)
print(client.get_collections())
```
If that prints an empty collections list instead of timing out, you’re done.
Common install errors and what they actually mean
These are the four that come up over and over in GitHub issues and Discord.
Before you debug anything, run docker logs qdrant | tail -50. Qdrant logs are unusually readable for a database – they’ll tell you exactly what’s missing, often in the first 10 lines after the ASCII-art banner.
“Connection refused” from Python client. You forgot -p 6334:6334. See above. This is the single most common Qdrant install bug.
Port 6333 already in use. Run lsof -i :6333 to see what’s occupying it, then remap in your compose file (e.g., "6380:6333"). Don’t kill whatever’s on 6333 without checking – on some Linux distros it’s a different service entirely.
Data disappears after container restart on Windows. Qdrant’s official installation docs state that using Docker/WSL on Windows with bind mounts has known file-system problems that cause data loss. The fix: use a named Docker volume instead – -v qdrant_data:/qdrant/storage rather than -v $(pwd)/qdrant_storage:/qdrant/storage.
Permission denied writing to /qdrant/storage. The container runs as a non-root qdrant user. If your host directory is owned by root, the mount won’t be writable. Fix: sudo chown -R 1000:1000 ./qdrant_storage before starting the container.
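Since the two-port trap tops that list, a quick preflight check from the client machine can save a debugging session. A minimal sketch using only the standard library:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Both ports must be reachable for the official Python client defaults to work.
for port, proto in [(6333, "REST"), (6334, "gRPC")]:
    status = "open" if port_open("localhost", port) else "CLOSED - check your -p flags"
    print(f"{proto} port {port}: {status}")
```

If 6333 is open but 6334 is closed, you’ve found the missing -p 6334:6334 before the client ever hangs on you.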
The upgrade trap nobody warns you about
This one bites people running older self-hosted deployments. v1.17 removed RocksDB entirely in favor of Gridstore – and that means a direct jump from v1.15.x to v1.17.x simply won’t work. The v1.17 release notes are explicit: direct upgrade from v1.15.x into v1.17.x is not possible. Your container will refuse to start and your storage directory will be left mid-migration.
The correct path is one minor version at a time: 1.14 → 1.15 → 1.16 → 1.17. Take a snapshot at each step:
```shell
# Snapshot before each upgrade
curl -X POST -H "api-key: $QDRANT_API_KEY" \
  http://localhost:6333/collections/your_collection/snapshots

# Then bump the image tag, recreate, verify, repeat
docker compose down
# edit image: qdrant/qdrant:v1.16.x in compose.yml
docker compose up -d
```
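The one-minor-at-a-time rule is easy to script around. A sketch – the helper is illustrative, not a Qdrant tool – that expands a jump like 1.14 → 1.17 into the required intermediate stops:

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Expand a minor-version jump into single-step upgrades.

    Qdrant requires moving one minor version at a time,
    e.g. 1.14 -> 1.15 -> 1.16 -> 1.17.
    """
    cur_major, cur_minor = (int(x) for x in current.split("."))
    tgt_major, tgt_minor = (int(x) for x in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("major-version jumps need their own migration plan")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]

print(upgrade_path("1.14", "1.17"))  # ['1.15', '1.16', '1.17']
```

Loop over that list – snapshot, bump tag, recreate, verify – and the migration dance becomes mechanical instead of memorized.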
If that sounds tedious, this is exactly the case Qdrant Cloud’s auto-upgrade handles for you. The free tier covers small projects; for production it’s worth doing the math on managed vs. self-hosted ops cost – the version-migration dance is real engineering time.
Uninstall and cleanup
```shell
docker compose down                    # stops and removes the container
docker volume rm qdrant_data           # only if you used a named volume
rm -rf ./qdrant_storage ./qdrant_snapshots
docker rmi qdrant/qdrant:v1.17.1
```
If you ran it without Compose: docker stop qdrant && docker rm qdrant, then remove the storage directory. Qdrant doesn’t touch /etc or your shell profile – no system-wide state to chase down.
FAQ
Is Qdrant really better than Pinecone or Weaviate?
Depends on one thing: whether you want to manage infra. Pinecone is faster to start and you never touch a server. Qdrant wins on cost and control at scale – Weaviate is the closer feature competitor. Neither is universally “better.”
Do I need Qdrant Cloud, or can I really just self-host?
Self-hosted is genuinely fine for a side project or internal tool – a single Docker container on a low-cost VPS will handle a few million vectors without issue. The math shifts when you need things you’d otherwise build yourself. Zero-downtime upgrades, automated backups, and the version-migration handling described above all take real engineering hours. If your team is running on-call rotations for a customer-facing system and nobody’s primary job is infra, the Cloud pricing probably pays for itself in avoided incidents before you get to month three.
What about Qdrant Edge – when would I use that instead?
Qdrant Edge is a different deployment model, not just a smaller version. It runs inside the application process – no separate server, no network hop. Data is stored and queried locally, with the option to sync with a Qdrant server instance. That makes it useful for desktop apps, mobile, or offline-capable agents where spinning up a Docker container isn’t an option. For a standard server deployment, you want the regular image; Edge is specifically for resource-constrained or embedded environments where client-server architecture is the constraint, not the goal.
Next step: run the Docker Compose snippet above against a throwaway VPS, then load 100k vectors from your actual embedding model. If query latency at p99 stays under 50ms, you’ve got your answer about whether Qdrant is the right vector DB for your workload – without trusting a single benchmark blog.