PostgreSQL Full-Text Search Without an Extra Service

Most apps that need search reach for Elasticsearch on day one and regret it on day 30. Running, monitoring, and scaling a separate search cluster is a real cost — and for most use cases (under a few million documents, English-language text, no fancy aggregations), Postgres can do it natively.

This post is a practical guide to PostgreSQL’s full-text search: what it is, how to use it, how to make it fast, and when to actually graduate to a dedicated search service.

The basic idea

Postgres represents searchable text as a tsvector — a sorted list of normalized words (called “lexemes”) with their positions. You search using a tsquery — a small expression of search terms.

SELECT to_tsvector('english', 'The quick brown foxes jumped');
-- → 'brown':3 'fox':4 'jump':5 'quick':2

Notice what happened:

The was discarded as a stop word.
foxes was stemmed to fox.
jumped was stemmed to jump.
Positions are tracked (1, 2, 3…) for ranking later.

The 'english' argument tells Postgres which language config to use — that controls stop words, stemming rules, and dictionaries. Use 'simple' if you want no stemming at all.

A working example

Let’s give a posts table real search:

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    published BOOLEAN NOT NULL DEFAULT false
);

INSERT INTO posts (title, body, published) VALUES
    ('Introducing Django',  'Django is a Python web framework with batteries included.', true),
    ('FastAPI in 2026',     'FastAPI is the modern async-first framework for Python.', true),
    ('PostgreSQL JSONB',    'JSONB lets you store arbitrary JSON in Postgres efficiently.', true),
    ('Async Python',        'Async/await is great for I/O-bound code.', true);

A naive search:

SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'python');

Result:

 id |       title
----+--------------------
  1 | Introducing Django
  2 | FastAPI in 2026
  4 | Async Python

Three hits — including “Async Python” via stemming. Notice we never put python into the database explicitly; the search just figures it out.

Operators

@@ — “matches”; used between a tsvector and a tsquery.
& — AND. to_tsquery('python & async') matches documents containing both.
| — OR. to_tsquery('django | fastapi').
! — NOT. to_tsquery('python & !django').
<-> — phrase / “follows by”. to_tsquery('quick <-> brown') matches “quick brown” exactly.

For user input, use plainto_tsquery (treats it as plain words) or websearch_to_tsquery (handles "phrase queries", -exclusions, and OR operators the way Google does):

SELECT id, title
FROM posts
WHERE to_tsvector('english', title || ' ' || body)
   @@ websearch_to_tsquery('english', 'python -django');

websearch_to_tsquery is what you almost always want for user-facing search.

Storing the `tsvector` (the right way)

Building the tsvector on every query works but it’s wasteful — Postgres has to recompute it for every row each time. Better: store it as a generated column:

ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;

Two things are happening:

The tsvector is computed automatically when rows are inserted/updated, then stored.
setweight(..., 'A') flags title words as more important than body words ('B'). This will matter when we rank.

The index — GIN

Without an index, you still scan every row. Add a GIN index for the search column:

CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);

Now searches are fast. Queries become:

SELECT id, title
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');

Postgres uses the GIN index, your laptop fan stays quiet, and your users get sub-millisecond response times.

Ranking

By default, results come back in whatever order Postgres feels like. To rank by relevance:

SELECT
    id,
    title,
    ts_rank(search_vector, query) AS rank
FROM posts, websearch_to_tsquery('english', 'python framework') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;

ts_rank returns a score per row; we sort by it. Higher is more relevant.

A few useful variants:

ts_rank_cd — “cover density” ranking; takes word proximity into account.
Combine with weights — ts_rank(search_vector, query, 32) normalizes by document length, so short documents don’t always win.

For a “best results first” experience, ts_rank is the workhorse. Combine it with setweight(..., 'A'/'B'/'C'/'D') (above) to give important fields more weight.

Highlighting matches

If you want to show the user which words matched (like Google’s bold snippets), use ts_headline:

SELECT
    id,
    title,
    ts_headline('english', body, websearch_to_tsquery('english', 'python'),
                'StartSel=<mark>, StopSel=</mark>, MaxFragments=2') AS snippet
FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', 'python');

Returns a snippet of the body with <mark>...</mark> tags around matching words. Wire that straight into your frontend.

Phrase queries and exact matches

websearch_to_tsquery already handles quoted phrases:

SELECT * FROM posts
WHERE search_vector @@ websearch_to_tsquery('english', '"web framework"');

Internally this becomes web <-> framework — words must be adjacent.

Searching across multiple fields with different weights

The setweight pattern earlier already does this, but here’s a fuller example:

ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')),       'A') ||
    setweight(to_tsvector('english', coalesce(body, '')),        'B') ||
    setweight(to_tsvector('english', coalesce(tags::text, '')),  'C')
) STORED;

Now ts_rank will treat title matches as ~10× more important than body matches, and tag matches somewhere in between.

Using it from Python

From Django

django.contrib.postgres.search has nice ergonomics:

from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector

results = (
    Post.objects
        .annotate(rank=SearchRank(F("search_vector"), SearchQuery("python", search_type="websearch")))
        .filter(search_vector=SearchQuery("python", search_type="websearch"))
        .order_by("-rank")
)

If you didn’t store search_vector as a column, you can build it inline:

.annotate(search=SearchVector("title", "body"))
.filter(search=SearchQuery("python", search_type="websearch"))

But generated columns + GIN indexes are dramatically faster for any non-trivial corpus.

From SQLAlchemy

from sqlalchemy import func, select


query = (
    select(
        Post,
        func.ts_rank(Post.search_vector, func.websearch_to_tsquery("english", "python")).label("rank"),
    )
    .where(Post.search_vector.op("@@")(func.websearch_to_tsquery("english", "python")))
    .order_by(text("rank DESC"))
)

A bit more verbose but full-featured.

When to graduate to Elasticsearch

Postgres FTS is not a complete replacement for Elasticsearch. Reach for a dedicated search engine when you need:

Multi-language stemming on the same corpus (Postgres FTS picks one language per index).
Fuzzy matching / typo tolerance at scale (Postgres has pg_trgm for this, but it’s slower than Elasticsearch’s BM25).
Faceted aggregations across millions of docs in real time.
Custom analyzers / tokenizers for unusual text (e.g. CJK languages, code search, chemical formulas).
Tens or hundreds of millions of documents — Postgres FTS can handle this, but ES is purpose-built.

For everything else — blog search, internal docs, e-commerce search, “find a user by name” — Postgres is genuinely enough.

Common pitfalls

Stop words can bite you. to_tsquery('english', 'the') returns nothing because the is a stop word. Use simple config when you need literal matching.
to_tsquery requires properly-formatted input. Use plainto_tsquery or websearch_to_tsquery for user input — they sanitize.
ILIKE '%term%' is NOT a substitute for full-text search at any scale beyond a few thousand rows.
Different languages need different configs. Run SELECT cfgname FROM pg_ts_config; to see what’s available.
Generated columns are indexable, computed columns aren’t. Make sure you write STORED (default) and not VIRTUAL.

Conclusion

PostgreSQL’s full-text search is one of the most underrated features in the database. For most apps, you can deliver fast, ranked, language-aware search without standing up a separate service, paying for a hosted Elasticsearch cluster, or operating Lucene shards. The complexity savings are real — and the search is genuinely good.

Start with a generated tsvector column + GIN index + websearch_to_tsquery. Add ranking, weights, and highlighting as you need them. Graduate to a dedicated search engine only when Postgres genuinely runs out of room — which, for most teams, is never.

Want more on Postgres? Read:

Happy searching!

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

The basic idea#

A working example#

Operators#

Storing the tsvector (the right way)#

The index — GIN#

Ranking#

Highlighting matches#

Phrase queries and exact matches#

Searching across multiple fields with different weights#

Using it from Python#

From Django#

From SQLAlchemy#

When to graduate to Elasticsearch#

Common pitfalls#

Conclusion#

The basic idea

A working example

Operators

Storing the `tsvector` (the right way)

The index — GIN

Ranking

Highlighting matches

Phrase queries and exact matches

Searching across multiple fields with different weights

Using it from Python

From Django

From SQLAlchemy

When to graduate to Elasticsearch

Common pitfalls

Conclusion