When does pgvector stop being enough?

Above ~50M vectors with strict latency budgets. Below that, pgvector + the rest of Postgres beats dedicated vector DBs on operational simplicity.

Pinecone or self-host?

Pinecone for fastest time-to-prod with zero ops. Self-host (Qdrant / Weaviate / Milvus) for cost control above ~100M vectors or compliance requirements.

Embedding Databases in 2026 — pgvector, Qdrant, Weaviate, Milvus, Pinecone

The vector DB market matured in 2026. Most apps don’t need a dedicated one. Some do. This post is the practical comparison.

The contenders

Database	Type	Strengths
pgvector	Postgres extension	One DB, joins, RLS, transactions
Qdrant	Self-host or cloud	Rust, fast, payload filtering
Weaviate	Self-host or cloud	Built-in vectorization, GraphQL
Milvus	Self-host	Billion-scale; complex ops
Pinecone	Managed only	Simplest path; pricey at scale
Cloudflare Vectorize	Managed	Edge, cheap, integrated with Workers
Turso (libsql vector)	Managed	SQLite + vectors
LanceDB	Embedded	DuckDB-style local

When pgvector wins

Under 50M vectors with reasonable latency budget.
You already have Postgres (see Build a RAG App with pgvector).
Joins between embeddings and business data matter.
Per-tenant scoping via row-level security (RLS).
One backup story.

For 95% of products: pgvector.
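A minimal sketch of what "joins plus tenant scoping" can look like, written as a Python helper that only builds the SQL and its parameters. The table and column names (chunks, documents, document_id, tenant_id) are hypothetical; `<=>` is pgvector's cosine-distance operator. With RLS enabled, the explicit tenant condition could live in a policy instead.

```python
def build_tenant_search(tenant_id: int, query_embedding: list[float], k: int = 10):
    """Return (sql, params) for a tenant-scoped nearest-neighbour query.

    Schema names are assumptions for illustration, not from any real app.
    """
    sql = (
        "SELECT c.id, d.title, c.embedding <=> %s::vector AS distance "
        "FROM chunks c "
        "JOIN documents d ON d.id = c.document_id "
        "WHERE d.tenant_id = %s "
        "ORDER BY c.embedding <=> %s::vector "
        "LIMIT %s"
    )
    # pgvector accepts the text form '[0.1,0.2,...]' for a vector value.
    vector_literal = "[" + ",".join(repr(float(x)) for x in query_embedding) + "]"
    return sql, (vector_literal, tenant_id, vector_literal, k)


sql, params = build_tenant_search(42, [0.1, 0.2, 0.3])
print(sql)
print(params)
```

Pass the pair to whatever Postgres driver you already use; the embedding and the business data come back from one query, inside one transaction.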

When dedicated DB wins

>50M vectors with sub-100ms p99.
Complex filtering on vector results.
Built-in hybrid search that combines keyword (BM25-style) and vector results; Qdrant and Weaviate both offer it.
Specific deployment constraints (edge with Vectorize, Workers integration).
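To make "hybrid search" concrete, here is a rough sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. It is an illustration only: the constant k=60 and the document IDs are assumptions, and it does not reproduce how any particular engine implements fusion.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 query
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from a nearest-neighbour query
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is the behaviour you are paying for when hybrid search is built in.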

Qdrant

Rust. Fast. Payload-aware filtering — important for multi-tenant.

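from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance
results = client.query_points(  # newer replacement for client.search
    collection_name="docs",
    query=embedding,  # query vector from your embedding model
    query_filter=models.Filter(
        must=[models.FieldCondition(key="tenant_id", match=models.MatchValue(value=42))]
    ),
    limit=10,
)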

Filter while searching, not after. Big quality difference for filtered results.
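A toy brute-force sketch of why this matters. It is not how any engine filters inside its index; it only shows the effect on the result set. The data layout (100 points on a line, every 10th belonging to tenant 42) is made up for the example.

```python
def top_k(points, query, k, predicate=None):
    """Brute-force nearest neighbours by squared Euclidean distance."""
    candidates = [p for p in points if predicate is None or predicate(p)]
    candidates.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p["vector"], query)))
    return candidates[:k]


# Hypothetical data: only every 10th point belongs to tenant 42.
points = [
    {"id": i, "tenant_id": 42 if i % 10 == 0 else 7, "vector": [float(i)]}
    for i in range(100)
]
query = [0.0]
k = 5


def is_tenant_42(point):
    return point["tenant_id"] == 42


# Post-filter: take the global top-k, then drop other tenants.
post_filtered = [p for p in top_k(points, query, k) if is_tenant_42(p)]
# Pre-filter: restrict to the tenant first, then take top-k.
pre_filtered = top_k(points, query, k, predicate=is_tenant_42)

print(len(post_filtered), "result(s) when filtering after search")
print(len(pre_filtered), "result(s) when filtering during search")
```

Filtering afterwards leaves one usable hit out of the five requested; filtering during the search returns all five.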

Self-host or Qdrant Cloud. Production-strong.

Weaviate

Has built-in modules to vectorize text (you don’t need a separate embedding step). Strong for “just give me search.”

{
  Get {
    Product(
      nearText: { concepts: ["leather wallet"] }
      where: { path: ["price"], operator: LessThan, valueNumber: 50 }
    ) { title price }
  }
}

GraphQL surface. Different model. Lock-in higher than alternatives.

Pinecone

Managed; no ops; pricier at scale.

index.query(vector=embedding, top_k=10, filter={"tenant_id": 42})

Best for: ship fast, don’t think about ops, willing to pay.

Milvus

For billion-row workloads. Complex ops; powerful at scale. Used by larger teams with dedicated infra.

Vectorize (Cloudflare)

Built into Workers. Cheap. Tight integration.

const matches = await env.VECTORIZE.query(embedding, { topK: 10 });

Pair with Workers AI for end-to-end edge RAG.

Turso libsql vector

CREATE TABLE chunks (
  id INTEGER PRIMARY KEY,
  content TEXT,
  embedding F32_BLOB(1536)
);

CREATE INDEX chunks_emb_idx ON chunks(libsql_vector_idx(embedding));

SQLite + vectors. Sub-millisecond reads via embedded replicas. See SQLite at the Edge.

Decision matrix

Need	Pick
Already on Postgres, <50M vectors	pgvector
TS-first, edge-deployed	Vectorize
Self-host, fast, payload filter	Qdrant
Managed, simple, willing to pay	Pinecone
Per-tenant SQLite	Turso
Billion+ vectors	Milvus
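The matrix can also be read as a small rule function. This is a sketch only: the parameter names and the order in which rules are checked are assumptions, since the table does not say which need wins when several apply.

```python
def pick_store(
    vectors: int,
    on_postgres: bool = False,
    edge: bool = False,
    per_tenant_sqlite: bool = False,
    self_host: bool = False,
) -> str:
    """Map the needs from the decision matrix to a pick (precedence is assumed)."""
    if vectors >= 1_000_000_000:
        return "Milvus"
    if on_postgres and vectors < 50_000_000:
        return "pgvector"
    if edge:
        return "Vectorize"
    if per_tenant_sqlite:
        return "Turso"
    if self_host:
        return "Qdrant"
    return "Pinecone"  # managed, simple, willing to pay


print(pick_store(5_000_000, on_postgres=True))
print(pick_store(80_000_000, on_postgres=True, self_host=True))
```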

Performance

Rough numbers, 10M vectors, top-10 query:

Engine	p50	p99
pgvector HNSW	5–15ms	20–40ms
Qdrant	3–10ms	15–30ms
Pinecone	10–30ms	50–100ms (network)
Milvus	3–10ms	15–30ms
Vectorize	30–60ms	80–150ms (edge round-trip)

Managed services add a network round-trip to every query. A self-hosted engine running next to your app usually wins on p99.
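If you want p50/p99 for your own setup rather than rough figures, a small timing loop is enough to start. This is a sketch: `fake_query` is a placeholder for a real call to your vector store, and 200 samples is an arbitrary choice.

```python
import statistics
import time


def measure_latency_ms(run_query, n: int = 200) -> dict:
    """Call run_query n times and report p50/p99 wall-clock latency in ms."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points; index 98 is p99
    return {"p50": statistics.median(samples), "p99": cuts[98]}


def fake_query():
    # Placeholder workload; swap in a real top-10 query against your store.
    sum(i * i for i in range(1000))


stats = measure_latency_ms(fake_query)
print(stats)
```

Run it from the same network location as your app, otherwise you are measuring a different round-trip than production will see.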

Cost

For 10M vectors at 1M queries/month:

pgvector: included in Postgres bill (~$50–200/month).
Qdrant Cloud: ~$200–500/month.
Pinecone: ~$300–800/month.
Vectorize: $0–100/month.
Self-host Qdrant: ~$100/month VM + ops.
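A quick sizing check helps when reading these figures, since memory is a large part of the bill. The sketch below computes raw float32 storage only; index overhead, metadata, and replicas are not included, and the two dimension counts are examples rather than a recommendation.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors (float32 by default), excluding any index."""
    return num_vectors * dims * bytes_per_value


for dims in (768, 1536):
    gb = raw_vector_bytes(10_000_000, dims) / 1e9
    print(f"{dims} dims: {gb:.2f} GB")
```

For 10M vectors this prints 30.72 GB at 768 dimensions and 61.44 GB at 1536, before any index is built.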

Embedding model selection is covered separately.

Common mistakes

1. Picking dedicated DB before needing it

Operational overhead. Dual storage. For the data sizes most products handle, pgvector is enough.

2. No filtering at the vector layer

Filter after retrieval → low recall on filtered queries. Use payload filters (Qdrant) or partial indexes (pgvector).

3. Wrong dimension

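Vectors stored at 768 dimensions, queries embedded at 1536. This typically shows up after swapping embedding models, and it does not always fail loudly. Pin the dimension in the schema and check it on every write and query.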
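One cheap way to lock this down is a guard in application code, in front of every insert and query. A minimal sketch; `EXPECTED_DIM` and the helper name are hypothetical and should mirror whatever your column or collection declares.

```python
EXPECTED_DIM = 1536  # assumption: matches the column/collection definition


def checked(vector: list[float], expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Return the vector unchanged, or raise if its dimension is wrong."""
    if len(vector) != expected_dim:
        raise ValueError(f"embedding has {len(vector)} dims, expected {expected_dim}")
    return vector


checked([0.0] * 1536)  # passes through
try:
    checked([0.0] * 768)
except ValueError as err:
    print(err)
```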

4. No reranker

Bare vector search usually returns lower-quality results than vector search followed by a reranker. See Rerankers in RAG.
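The shape of the pattern is simple: over-fetch candidates from the vector store, then re-score them against the query. In the sketch below, `term_overlap` is a toy stand-in for a real reranking model, and the candidate strings are made up.

```python
def rerank(query: str, candidates: list[str], score_fn, top_n: int = 3) -> list[str]:
    """Re-order retrieved candidates by a (query, document) relevance score."""
    ranked = sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_n]


def term_overlap(query: str, doc: str) -> float:
    """Toy scorer: fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)


# Pretend these came back from a vector search, in retrieval order.
candidates = [
    "canvas tote bag",
    "brown leather wallet with coin pocket",
    "leather belt",
    "wallet chain",
]
print(rerank("leather wallet", candidates, term_overlap, top_n=2))
```

In a real pipeline the scorer would be a cross-encoder or a hosted rerank endpoint; the retrieve-then-rescore structure stays the same.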

5. Choosing on benchmarks not real workload

Benchmarks differ from your data. Test on your corpus, your filter shapes, your latency budget.
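Latency is only half of that test; approximate indexes also trade away recall. A minimal sketch of recall@k, comparing an index's results against exact brute-force results for the same query. The ID lists here are invented for illustration.

```python
def recall_at_k(approx_ids: list[int], exact_ids: list[int], k: int) -> float:
    """Share of the true top-k that the approximate search also returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


exact = [3, 7, 1, 9, 4]   # ground truth from a brute-force scan
approx = [3, 1, 7, 8, 2]  # what the index returned
print(recall_at_k(approx, exact, k=5))
```

Average this over a sample of real queries, with your real filters applied, before comparing engines.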

Read this next

If you want a multi-DB benchmark harness on your data, it’s at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
