Most apps that need search reach for Elasticsearch on day one and regret it on day 30. Running, monitoring, and scaling a separate search cluster is a real cost — and for most use cases (under a few million documents, English-language text, no fancy aggregations), Postgres can do it natively.
This post is a practical guide to PostgreSQL’s full-text search: what it is, how to use it, how to make it fast, and when to actually graduate to a dedicated search service.
The basic idea
Postgres represents searchable text as a tsvector — a sorted list of normalized words (called “lexemes”) with their positions. You search using a tsquery — a small expression of search terms.
SELECT to_tsvector('english', 'The quick brown foxes jumped');
-- → 'brown':3 'fox':4 'jump':5 'quick':2
Notice what happened:
Thewas discarded as a stop word.foxeswas stemmed tofox.jumpedwas stemmed tojump.- Positions are tracked (1, 2, 3…) for ranking later.
The 'english' argument tells Postgres which language config to use — that controls stop words, stemming rules, and dictionaries. Use 'simple' if you want no stemming at all.
A working example
Let’s give a posts table real search:
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL,
published BOOLEAN NOT NULL DEFAULT false
);
INSERT INTO posts (title, body, published) VALUES
('Introducing Django', 'Django is a Python web framework with batteries included.', true),
('FastAPI in 2026', 'FastAPI is the modern async-first framework for Python.', true),
('PostgreSQL JSONB', 'JSONB lets you store arbitrary JSON in Postgres efficiently.', true),
('Async Python', 'Async/await is great for I/O-bound code.', true);
A naive search:
SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'python');
Result:
id | title
----+--------------------
1 | Introducing Django
2 | FastAPI in 2026
4 | Async Python
Three hits — including “Async Python” via stemming. Notice we never put python into the database explicitly; the search just figures it out.
Operators
@@— “matches”; used between atsvectorand atsquery.&— AND.to_tsquery('python & async')matches documents containing both.|— OR.to_tsquery('django | fastapi').!— NOT.to_tsquery('python & !django').<->— phrase / “follows by”.to_tsquery('quick <-> brown')matches “quick brown” exactly.
For user input, use plainto_tsquery (treats it as plain words) or websearch_to_tsquery (handles "phrase queries", -exclusions, and OR operators the way Google does):
SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body)
@@ websearch_to_tsquery('english', 'python -django');
websearch_to_tsquery is what you almost always want for user-facing search.
Storing the tsvector (the right way)
Building the tsvector on every query works but it’s wasteful — Postgres has to recompute it for every row each time. Better: store it as a generated column:
ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;
Two things are happening:
- The
tsvectoris computed automatically when rows are inserted/updated, then stored. setweight(..., 'A')flags title words as more important than body words ('B'). This will matter when we rank.
The index — GIN
Without an index, you still scan every row. Add a GIN index for the search column:
CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);
Now searches are fast. Queries become:
SELECT id, title
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');
Postgres uses the GIN index, your laptop fan stays quiet, and your users get sub-millisecond response times.
Ranking
By default, results come back in whatever order Postgres feels like. To rank by relevance:
SELECT
id,
title,
ts_rank(search_vector, query) AS rank
FROM posts, websearch_to_tsquery('english', 'python framework') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;
ts_rank returns a score per row; we sort by it. Higher is more relevant.
A few useful variants:
ts_rank_cd— “cover density” ranking; takes word proximity into account.- Combine with weights —
ts_rank(search_vector, query, 32)normalizes by document length, so short documents don’t always win.
For a “best results first” experience, ts_rank is the workhorse. Combine it with setweight(..., 'A'/'B'/'C'/'D') (above) to give important fields more weight.
Highlighting matches
If you want to show the user which words matched (like Google’s bold snippets), use ts_headline:
SELECT
id,
title,
ts_headline('english', body, websearch_to_tsquery('english', 'python'),
'StartSel=<mark>, StopSel=</mark>, MaxFragments=2') AS snippet
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');
Returns a snippet of the body with <mark>...</mark> tags around matching words. Wire that straight into your frontend.
Phrase queries and exact matches
websearch_to_tsquery already handles quoted phrases:
SELECT * FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', '"web framework"');
Internally this becomes web <-> framework — words must be adjacent.
Searching across multiple fields with different weights
The setweight pattern earlier already does this, but here’s a fuller example:
ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B') ||
setweight(to_tsvector('english', coalesce(tags::text, '')), 'C')
) STORED;
Now ts_rank will treat title matches as ~10× more important than body matches, and tag matches somewhere in between.
Using it from Python
From Django
django.contrib.postgres.search has nice ergonomics:
from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
results = (
Post.objects
.annotate(rank=SearchRank(F("search_vector"), SearchQuery("python", search_type="websearch")))
.filter(search_vector=SearchQuery("python", search_type="websearch"))
.order_by("-rank")
)
If you didn’t store search_vector as a column, you can build it inline:
.annotate(search=SearchVector("title", "body"))
.filter(search=SearchQuery("python", search_type="websearch"))
But generated columns + GIN indexes are dramatically faster for any non-trivial corpus.
From SQLAlchemy
from sqlalchemy import func, select
query = (
select(
Post,
func.ts_rank(Post.search_vector, func.websearch_to_tsquery("english", "python")).label("rank"),
)
.where(Post.search_vector.op("@@")(func.websearch_to_tsquery("english", "python")))
.order_by(text("rank DESC"))
)
A bit more verbose but full-featured.
When to graduate to Elasticsearch
Postgres FTS is not a complete replacement for Elasticsearch. Reach for a dedicated search engine when you need:
- Multi-language stemming on the same corpus (Postgres FTS picks one language per index).
- Fuzzy matching / typo tolerance at scale (Postgres has
pg_trgmfor this, but it’s slower than Elasticsearch’s BM25). - Faceted aggregations across millions of docs in real time.
- Custom analyzers / tokenizers for unusual text (e.g. CJK languages, code search, chemical formulas).
- Tens or hundreds of millions of documents — Postgres FTS can handle this, but ES is purpose-built.
For everything else — blog search, internal docs, e-commerce search, “find a user by name” — Postgres is genuinely enough.
Common pitfalls
- Stop words can bite you.
to_tsquery('english', 'the')returns nothing becausetheis a stop word. Usesimpleconfig when you need literal matching. to_tsqueryrequires properly-formatted input. Useplainto_tsqueryorwebsearch_to_tsqueryfor user input — they sanitize.ILIKE '%term%'is NOT a substitute for full-text search at any scale beyond a few thousand rows.- Different languages need different configs. Run
SELECT cfgname FROM pg_ts_config;to see what’s available. - Generated columns are indexable, computed columns aren’t. Make sure you write
STORED(default) and notVIRTUAL.
Conclusion
PostgreSQL’s full-text search is one of the most underrated features in the database. For most apps, you can deliver fast, ranked, language-aware search without standing up a separate service, paying for a hosted Elasticsearch cluster, or operating Lucene shards. The complexity savings are real — and the search is genuinely good.
Start with a generated tsvector column + GIN index + websearch_to_tsquery. Add ranking, weights, and highlighting as you need them. Graduate to a dedicated search engine only when Postgres genuinely runs out of room — which, for most teams, is never.
Want more on Postgres? Read:
Happy searching!
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .