Most apps that need search reach for Elasticsearch on day one and regret it on day 30. Running, monitoring, and scaling a separate search cluster is a real cost — and for most use cases (under a few million documents, English-language text, no fancy aggregations), Postgres can do it natively.

This post is a practical guide to PostgreSQL’s full-text search: what it is, how to use it, how to make it fast, and when to actually graduate to a dedicated search service.

The basic idea

Postgres represents searchable text as a tsvector — a sorted list of normalized words (called “lexemes”) with their positions. You search using a tsquery — a small expression of search terms.

SELECT to_tsvector('english', 'The quick brown foxes jumped');
-- → 'brown':3 'fox':4 'jump':5 'quick':2

Notice what happened:

  • The was discarded as a stop word.
  • foxes was stemmed to fox.
  • jumped was stemmed to jump.
  • Positions are tracked (1, 2, 3…) for ranking later.

The 'english' argument tells Postgres which language config to use — that controls stop words, stemming rules, and dictionaries. Use 'simple' if you want no stemming at all.

A working example

Let’s give a posts table real search:

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    published BOOLEAN NOT NULL DEFAULT false
);

INSERT INTO posts (title, body, published) VALUES
    ('Introducing Django',  'Django is a Python web framework with batteries included.', true),
    ('FastAPI in 2026',     'FastAPI is the modern async-first framework for Python.', true),
    ('PostgreSQL JSONB',    'JSONB lets you store arbitrary JSON in Postgres efficiently.', true),
    ('Async Python',        'Async/await is great for I/O-bound code.', true);

A naive search:

SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'python');

Result:

 id |       title
----+--------------------
  1 | Introducing Django
  2 | FastAPI in 2026
  4 | Async Python

Three hits — including “Async Python” via stemming. Notice we never put python into the database explicitly; the search just figures it out.

Operators

  • @@ — “matches”; used between a tsvector and a tsquery.
  • & — AND. to_tsquery('python & async') matches documents containing both.
  • | — OR. to_tsquery('django | fastapi').
  • ! — NOT. to_tsquery('python & !django').
  • <-> — phrase / “follows by”. to_tsquery('quick <-> brown') matches “quick brown” exactly.

For user input, use plainto_tsquery (treats it as plain words) or websearch_to_tsquery (handles "phrase queries", -exclusions, and OR operators the way Google does):

SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body)
   @@ websearch_to_tsquery('english', 'python -django');

websearch_to_tsquery is what you almost always want for user-facing search.

Storing the tsvector (the right way)

Building the tsvector on every query works but it’s wasteful — Postgres has to recompute it for every row each time. Better: store it as a generated column:

ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;

Two things are happening:

  1. The tsvector is computed automatically when rows are inserted/updated, then stored.
  2. setweight(..., 'A') flags title words as more important than body words ('B'). This will matter when we rank.

The index — GIN

Without an index, you still scan every row. Add a GIN index for the search column:

CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);

Now searches are fast. Queries become:

SELECT id, title
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');

Postgres uses the GIN index, your laptop fan stays quiet, and your users get sub-millisecond response times.

Ranking

By default, results come back in whatever order Postgres feels like. To rank by relevance:

SELECT
    id,
    title,
    ts_rank(search_vector, query) AS rank
FROM posts, websearch_to_tsquery('english', 'python framework') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;

ts_rank returns a score per row; we sort by it. Higher is more relevant.

A few useful variants:

  • ts_rank_cd — “cover density” ranking; takes word proximity into account.
  • Combine with weightsts_rank(search_vector, query, 32) normalizes by document length, so short documents don’t always win.

For a “best results first” experience, ts_rank is the workhorse. Combine it with setweight(..., 'A'/'B'/'C'/'D') (above) to give important fields more weight.

Highlighting matches

If you want to show the user which words matched (like Google’s bold snippets), use ts_headline:

SELECT
    id,
    title,
    ts_headline('english', body, websearch_to_tsquery('english', 'python'),
                'StartSel=<mark>, StopSel=</mark>, MaxFragments=2') AS snippet
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');

Returns a snippet of the body with <mark>...</mark> tags around matching words. Wire that straight into your frontend.

Phrase queries and exact matches

websearch_to_tsquery already handles quoted phrases:

SELECT * FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', '"web framework"');

Internally this becomes web <-> framework — words must be adjacent.

Searching across multiple fields with different weights

The setweight pattern earlier already does this, but here’s a fuller example:

ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')),       'A') ||
    setweight(to_tsvector('english', coalesce(body, '')),        'B') ||
    setweight(to_tsvector('english', coalesce(tags::text, '')),  'C')
) STORED;

Now ts_rank will treat title matches as ~10× more important than body matches, and tag matches somewhere in between.

Using it from Python

From Django

django.contrib.postgres.search has nice ergonomics:

from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector

results = (
    Post.objects
        .annotate(rank=SearchRank(F("search_vector"), SearchQuery("python", search_type="websearch")))
        .filter(search_vector=SearchQuery("python", search_type="websearch"))
        .order_by("-rank")
)

If you didn’t store search_vector as a column, you can build it inline:

.annotate(search=SearchVector("title", "body"))
.filter(search=SearchQuery("python", search_type="websearch"))

But generated columns + GIN indexes are dramatically faster for any non-trivial corpus.

From SQLAlchemy

from sqlalchemy import func, select


query = (
    select(
        Post,
        func.ts_rank(Post.search_vector, func.websearch_to_tsquery("english", "python")).label("rank"),
    )
    .where(Post.search_vector.op("@@")(func.websearch_to_tsquery("english", "python")))
    .order_by(text("rank DESC"))
)

A bit more verbose but full-featured.

When to graduate to Elasticsearch

Postgres FTS is not a complete replacement for Elasticsearch. Reach for a dedicated search engine when you need:

  • Multi-language stemming on the same corpus (Postgres FTS picks one language per index).
  • Fuzzy matching / typo tolerance at scale (Postgres has pg_trgm for this, but it’s slower than Elasticsearch’s BM25).
  • Faceted aggregations across millions of docs in real time.
  • Custom analyzers / tokenizers for unusual text (e.g. CJK languages, code search, chemical formulas).
  • Tens or hundreds of millions of documents — Postgres FTS can handle this, but ES is purpose-built.

For everything else — blog search, internal docs, e-commerce search, “find a user by name” — Postgres is genuinely enough.

Common pitfalls

  • Stop words can bite you. to_tsquery('english', 'the') returns nothing because the is a stop word. Use simple config when you need literal matching.
  • to_tsquery requires properly-formatted input. Use plainto_tsquery or websearch_to_tsquery for user input — they sanitize.
  • ILIKE '%term%' is NOT a substitute for full-text search at any scale beyond a few thousand rows.
  • Different languages need different configs. Run SELECT cfgname FROM pg_ts_config; to see what’s available.
  • Generated columns are indexable, computed columns aren’t. Make sure you write STORED (default) and not VIRTUAL.

Conclusion

PostgreSQL’s full-text search is one of the most underrated features in the database. For most apps, you can deliver fast, ranked, language-aware search without standing up a separate service, paying for a hosted Elasticsearch cluster, or operating Lucene shards. The complexity savings are real — and the search is genuinely good.

Start with a generated tsvector column + GIN index + websearch_to_tsquery. Add ranking, weights, and highlighting as you need them. Graduate to a dedicated search engine only when Postgres genuinely runs out of room — which, for most teams, is never.

Want more on Postgres? Read:

Happy searching!


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .