Search is a deceptively rich problem. “Find docs matching this query” hides a tower of relevance, ranking, freshness, and ops. This post is the working design at every scale.
Phase 1: Postgres FTS
For up to ~10M docs:
-- tsvector column
ALTER TABLE posts ADD COLUMN search_vec tsvector;
UPDATE posts SET search_vec =
setweight(to_tsvector('english', title), 'A') ||
setweight(to_tsvector('english', body), 'B');
CREATE INDEX posts_search_idx ON posts USING GIN (search_vec);
-- Auto-update via trigger
CREATE TRIGGER posts_search_update BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE FUNCTION ...;
-- Query
SELECT title, ts_rank(search_vec, q) AS rank
FROM posts, websearch_to_tsquery('english', $1) q
WHERE search_vec @@ q
ORDER BY rank DESC LIMIT 20;
websearch_to_tsquery handles user-entered queries with or, not, "phrases". Rank with ts_rank. See Postgres Full-Text Search
.
Phase 2: dedicated search index
When you outgrow Postgres FTS:
| Strengths | |
|---|---|
| Elasticsearch | Mature, scales high, complex |
| OpenSearch | OSS fork; same API |
| Meilisearch | Fast, simple, typo-tolerant |
| Typesense | Similar; great DX |
| Algolia | Hosted; fast; expensive |
| Quickwit | Cheap object-storage backend |
For typical SaaS: Meilisearch / Typesense — much simpler than Elastic, fast typo tolerance, good DX.
For massive scale or complex queries: Elastic / OpenSearch.
Phase 3: vector search
Add semantic search for “what’s similar in meaning”:
-- pgvector
CREATE TABLE post_embeddings (
post_id bigint PRIMARY KEY,
embedding vector(1024)
);
CREATE INDEX ON post_embeddings USING hnsw (embedding vector_cosine_ops);
-- Query
SELECT p.title, p.body, e.embedding <=> $1 AS distance
FROM posts p JOIN post_embeddings e ON p.id = e.post_id
ORDER BY distance LIMIT 20;
For dedicated: Qdrant, Weaviate, Pinecone. See Embedding Databases .
Phase 4: hybrid
Real query: “postgres jsonb gin index”. Keyword search nails “postgres” “jsonb” “gin”. Vector finds posts about “indexing JSON in Postgres” that don’t use the word “gin.”
async def hybrid_search(query):
embedding = await embed(query)
bm25_results = await pg_fts(query, k=50)
vector_results = await pg_vector(embedding, k=50)
# Reciprocal Rank Fusion
scores = defaultdict(float)
for rank, doc in enumerate(bm25_results):
scores[doc.id] += 1 / (60 + rank)
for rank, doc in enumerate(vector_results):
scores[doc.id] += 1 / (60 + rank)
return sorted(scores.items(), key=lambda x: -x[1])[:20]
RRF combines without needing scaled-comparable scores. Strong default.
For higher quality: use a reranker (cross-encoder) on the top-N candidates from RRF.
Reranking
async def rerank(query, candidates):
scores = await cross_encoder.score([(query, c.body) for c in candidates])
return sorted(zip(candidates, scores), key=lambda x: -x[1])
Cohere’s rerank API or self-hosted bge-reranker. Adds 50–200ms; quality jump is significant for non-trivial queries.
Indexing pipeline
[Source DB] → [Change events / CDC] → [Indexer] → [Search index]
↑
[Embedder]
Don’t dual-write from app code. Subscribe to changes; let an indexer keep the search index in sync.
For Postgres: logical replication or pg_notify. For Mongo: change streams. For “everything”: Debezium → Kafka → indexer.
Freshness
User publishes a post; expects it searchable immediately.
- Sync indexing: write to DB and search index in same transaction. Risky (partial failures, latency).
- Async: write to DB; CDC propagates within seconds. Standard.
- Optimistic: include the new doc in result locally for the user who wrote it.
For most apps: async with sub-second propagation is fine.
Ranking discipline
Beyond raw relevance, ranking incorporates:
- Recency boost: newer docs higher.
- Authority: popular docs higher.
- Personalization: user’s past clicks.
- Business rules: featured items, sponsored.
score = 0.6 * relevance + 0.2 * recency_decay + 0.2 * authority
Weights tuned via A/B tests. Don’t ship un-evaluated ranking changes.
A/B testing search
def search_variant(user, query):
bucket = hash(user.id) % 100
if bucket < 50:
return search_v1(query)
return search_v2(query)
Track click-through, dwell time, conversion. Find which variant wins.
Operational concerns
- Index size: full-text + vectors = often larger than the source DB.
- Replication / sharding for high availability.
- Backup the index OR reindex from source on disaster (often faster).
- Update lag monitoring: alert if propagation > N seconds.
- Slow queries: index + analyze; cap query complexity.
Common mistakes
1. LIKE ‘%query%’
Sequential scan; slow on large tables; no relevance ranking. Always use FTS.
2. Vector-only search
Misses keyword precision. “Show me ‘postgres’” returns posts about “databases.” Use hybrid.
3. Reindexing in foreground
Reindex 100M docs in the app process; OOM. Use a background pipeline.
4. No relevance evaluation
Ship search changes without testing. Quality regressions go unnoticed. Build an eval set; track metrics.
5. Trusting users to write good queries
“postgres jsonb” gets 0 results because user typed “postgres jsonb’s”? Apply spell correction, synonyms, query expansion.
What I’d ship today
For a new app:
- Postgres FTS if <10M docs, moderate volume.
- Meilisearch / Typesense if more volume or want typo tolerance and great DX.
- Add pgvector for semantic when needed.
- Hybrid (RRF) when queries are non-trivial.
- Reranker on top-N for quality bump.
- Async indexing pipeline via CDC.
- A/B test ranking changes.
- Eval set in CI.
Read this next
- Postgres Full-Text Search 2026
- Embedding Databases 2026
- Embeddings & Semantic Search 2026
- RAG Patterns 2026
If you want my hybrid search starter (Postgres FTS + pgvector + RRF), it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .