Search is a deceptively rich problem. “Find docs matching this query” hides a tower of relevance, ranking, freshness, and ops. This post is the working design at every scale.

Phase 1: Postgres FTS

For up to ~10M docs:

-- tsvector column
ALTER TABLE posts ADD COLUMN search_vec tsvector;
UPDATE posts SET search_vec = 
    setweight(to_tsvector('english', title), 'A') ||
    setweight(to_tsvector('english', body), 'B');

CREATE INDEX posts_search_idx ON posts USING GIN (search_vec);

-- Auto-update via trigger
CREATE TRIGGER posts_search_update BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE FUNCTION ...;

-- Query
SELECT title, ts_rank(search_vec, q) AS rank
FROM posts, websearch_to_tsquery('english', $1) q
WHERE search_vec @@ q
ORDER BY rank DESC LIMIT 20;

websearch_to_tsquery handles user-entered queries with or, not, "phrases". Rank with ts_rank. See Postgres Full-Text Search .

Phase 2: dedicated search index

When you outgrow Postgres FTS:

Strengths
ElasticsearchMature, scales high, complex
OpenSearchOSS fork; same API
MeilisearchFast, simple, typo-tolerant
TypesenseSimilar; great DX
AlgoliaHosted; fast; expensive
QuickwitCheap object-storage backend

For typical SaaS: Meilisearch / Typesense — much simpler than Elastic, fast typo tolerance, good DX.

For massive scale or complex queries: Elastic / OpenSearch.

Add semantic search for “what’s similar in meaning”:

-- pgvector
CREATE TABLE post_embeddings (
    post_id bigint PRIMARY KEY,
    embedding vector(1024)
);

CREATE INDEX ON post_embeddings USING hnsw (embedding vector_cosine_ops);

-- Query
SELECT p.title, p.body, e.embedding <=> $1 AS distance
FROM posts p JOIN post_embeddings e ON p.id = e.post_id
ORDER BY distance LIMIT 20;

For dedicated: Qdrant, Weaviate, Pinecone. See Embedding Databases .

Phase 4: hybrid

Real query: “postgres jsonb gin index”. Keyword search nails “postgres” “jsonb” “gin”. Vector finds posts about “indexing JSON in Postgres” that don’t use the word “gin.”

async def hybrid_search(query):
    embedding = await embed(query)
    
    bm25_results = await pg_fts(query, k=50)
    vector_results = await pg_vector(embedding, k=50)
    
    # Reciprocal Rank Fusion
    scores = defaultdict(float)
    for rank, doc in enumerate(bm25_results):
        scores[doc.id] += 1 / (60 + rank)
    for rank, doc in enumerate(vector_results):
        scores[doc.id] += 1 / (60 + rank)
    
    return sorted(scores.items(), key=lambda x: -x[1])[:20]

RRF combines without needing scaled-comparable scores. Strong default.

For higher quality: use a reranker (cross-encoder) on the top-N candidates from RRF.

Reranking

async def rerank(query, candidates):
    scores = await cross_encoder.score([(query, c.body) for c in candidates])
    return sorted(zip(candidates, scores), key=lambda x: -x[1])

Cohere’s rerank API or self-hosted bge-reranker. Adds 50–200ms; quality jump is significant for non-trivial queries.

Indexing pipeline

[Source DB] → [Change events / CDC] → [Indexer] → [Search index]
                                                 [Embedder]

Don’t dual-write from app code. Subscribe to changes; let an indexer keep the search index in sync.

For Postgres: logical replication or pg_notify. For Mongo: change streams. For “everything”: Debezium → Kafka → indexer.

Freshness

User publishes a post; expects it searchable immediately.

  • Sync indexing: write to DB and search index in same transaction. Risky (partial failures, latency).
  • Async: write to DB; CDC propagates within seconds. Standard.
  • Optimistic: include the new doc in result locally for the user who wrote it.

For most apps: async with sub-second propagation is fine.

Ranking discipline

Beyond raw relevance, ranking incorporates:

  • Recency boost: newer docs higher.
  • Authority: popular docs higher.
  • Personalization: user’s past clicks.
  • Business rules: featured items, sponsored.
score = 0.6 * relevance + 0.2 * recency_decay + 0.2 * authority

Weights tuned via A/B tests. Don’t ship un-evaluated ranking changes.

def search_variant(user, query):
    bucket = hash(user.id) % 100
    if bucket < 50:
        return search_v1(query)
    return search_v2(query)

Track click-through, dwell time, conversion. Find which variant wins.

Operational concerns

  • Index size: full-text + vectors = often larger than the source DB.
  • Replication / sharding for high availability.
  • Backup the index OR reindex from source on disaster (often faster).
  • Update lag monitoring: alert if propagation > N seconds.
  • Slow queries: index + analyze; cap query complexity.

Common mistakes

1. LIKE ‘%query%’

Sequential scan; slow on large tables; no relevance ranking. Always use FTS.

Misses keyword precision. “Show me ‘postgres’” returns posts about “databases.” Use hybrid.

3. Reindexing in foreground

Reindex 100M docs in the app process; OOM. Use a background pipeline.

4. No relevance evaluation

Ship search changes without testing. Quality regressions go unnoticed. Build an eval set; track metrics.

5. Trusting users to write good queries

“postgres jsonb” gets 0 results because user typed “postgres jsonb’s”? Apply spell correction, synonyms, query expansion.

What I’d ship today

For a new app:

  1. Postgres FTS if <10M docs, moderate volume.
  2. Meilisearch / Typesense if more volume or want typo tolerance and great DX.
  3. Add pgvector for semantic when needed.
  4. Hybrid (RRF) when queries are non-trivial.
  5. Reranker on top-N for quality bump.
  6. Async indexing pipeline via CDC.
  7. A/B test ranking changes.
  8. Eval set in CI.

Read this next

If you want my hybrid search starter (Postgres FTS + pgvector + RRF), it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .