Why Postgres for hybrid search?

One database, one backup story, one ACL. pgvector + tsvector handle hybrid retrieval up to 50M+ chunks. Beyond that, dedicated search engines might win, but most apps never need them.

How does RRF fusion work?

Reciprocal Rank Fusion combines multiple ranked lists by summing 1/(k + rank). Order-of-magnitude improvement over picking either list alone. No tuning required.

Hybrid Search with Postgres in 2026 — pgvector + FTS + Reranker

Most RAG systems in 2026 still ship vector-only. They lose to hybrid search by 10–25% on retrieval quality. Postgres can do hybrid in one query. This post is the working pattern.

The ingredients

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunks (
    id BIGSERIAL PRIMARY KEY,
    document_id BIGINT,
    content TEXT NOT NULL,
    embedding vector(1536) NOT NULL,
    tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);

CREATE INDEX chunks_embedding_hnsw
  ON chunks USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

CREATE INDEX chunks_tsv_gin ON chunks USING GIN (tsv);

Two indexes. One for semantic. One for lexical.

For pgvector tuning .

The hybrid query

WITH vec AS (
    SELECT id, 1 - (embedding <=> $1) AS score
    FROM chunks
    ORDER BY embedding <=> $1
    LIMIT 30
),
lex AS (
    SELECT id, ts_rank(tsv, websearch_to_tsquery('english', $2)) AS score
    FROM chunks
    WHERE tsv @@ websearch_to_tsquery('english', $2)
    ORDER BY score DESC
    LIMIT 30
),
ranked AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank FROM vec
    UNION ALL
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank FROM lex
),
fused AS (
    SELECT id, SUM(1.0 / (60 + rank)) AS rrf_score
    FROM ranked
    GROUP BY id
)
SELECT c.id, c.content, f.rrf_score
FROM chunks c
JOIN fused f USING (id)
ORDER BY f.rrf_score DESC
LIMIT 8;

Three CTEs:

Vector top-30.
Lexical top-30.
RRF fusion.
Final order.

Runs in tens of ms on 1M chunks with the right indexes.

For the build-a-RAG-app pattern.

Why RRF wins

RRF (Reciprocal Rank Fusion) is sum(1 / (k + rank)) for each item across ranked lists. The k=60 constant smooths it.

Item ranked 1 in vector list, 5 in lexical: 1/61 + 1/65 = 0.0317.
Item ranked 1 only in vector: 1/61 = 0.0164.
The first wins; co-occurrence in both lists matters.

No tuning needed. Robust across domains.

Adding a reranker

Top-30 fused → reranker → top-8:

fused = await db.fetch(HYBRID_QUERY, q_emb, q_text)
candidates = [r['content'] for r in fused]

# Cohere or BGE reranker
reranked = co.rerank(query=q_text, documents=candidates, top_n=8)

return [fused[r.index] for r in reranked.results]

Reranker improves quality 10–25% over fused alone. See Rerankers in RAG .

Filtering

WHERE tenant_id = $3 AND status = 'active' AND ...

Goes in both subqueries. With proper indexes (composite on (tenant_id, status) + the GIN), filtered hybrid stays fast.

For deep pgvector filtering (partial indexes, iterative scan).

Performance numbers

For 10M chunks:

Vector top-30: ~10ms (HNSW with ef_search=100).
Lexical top-30: ~10ms (GIN index).
RRF fusion in CTE: ~1ms.
Reranker (Cohere top-8): ~80ms.
Total: ~100ms.

Fast enough for interactive RAG.

Tuning

ef_search per session for vector recall.
default_statistics_target higher on the chunks table for better plans.
work_mem higher (32–64 MB) for the GIN scan.

For Postgres tuning .

When this stops scaling

>50M chunks: maybe move to a dedicated vector DB.
Very high QPS (>1k searches/sec): dedicated cluster.
Multi-region with low latency required: replication latency might bite.

For most apps, Postgres handles billions of chunks with proper partitioning and indexing.

What I’d ship today

Postgres + pgvector + tsvector for hybrid.
HNSW with m=16, ef_construction=64.
GIN with default ops class.
RRF fusion in a single SQL query.
Cohere or BGE reranker for the top-8.
Per-tenant scoping in WHERE.

Production-grade RAG search. No new infra.

Read this next

If you want a complete hybrid-search FastAPI service, it’s at rajpoot.dev .

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

The ingredients#

The hybrid query#

Why RRF wins#

Adding a reranker#

Filtering#

Performance numbers#

Tuning#

When this stops scaling#

What I’d ship today#

Read this next#