Cheatsheet for Postgres FTS via SQLAlchemy.

Setup (generated tsvector column)

from sqlalchemy import Index, Computed
from sqlalchemy.dialects.postgresql import TSVECTOR

class Post(Base):
    __tablename__ = "posts"
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str]
    body: Mapped[str]
    search_vec: Mapped[str] = mapped_column(
        TSVECTOR,
        Computed(
            "setweight(to_tsvector('english', coalesce(title, '')), 'A') || "
            "setweight(to_tsvector('english', coalesce(body, '')), 'B')",
            persisted=True,
        ),
    )
    
    __table_args__ = (
        Index("ix_posts_search_gin", "search_vec", postgresql_using="gin"),
    )

Generated column kept in sync by Postgres. setweight boosts title over body in ranking.

Query

from sqlalchemy import func

q = "postgres jsonb"

q_ts = func.websearch_to_tsquery("english", q)
stmt = (
    select(Post)
    .where(Post.search_vec.op("@@")(q_ts))
    .order_by(func.ts_rank(Post.search_vec, q_ts).desc())
    .limit(20)
)

websearch_to_tsquery parses user-friendly queries: "phrase", or, -not.

Ranking variants

# Frequency-based
func.ts_rank(Post.search_vec, q_ts)

# Cover-density (proximity-aware) — usually more relevant
func.ts_rank_cd(Post.search_vec, q_ts)

# Normalization (0 = none, 1 = log doc length, 2 = doc length, 4 = harmonic, 8 = doc unique terms)
func.ts_rank_cd(Post.search_vec, q_ts, 32)

Highlighting snippets

stmt = select(
    Post,
    func.ts_headline(
        "english", Post.body, q_ts,
        "MaxWords=30, MinWords=10, MaxFragments=2",
    ).label("snippet"),
)

Returns body with matched terms wrapped in <b>...</b>.

Multi-language

Store language per row:

class Post(Base):
    language: Mapped[str] = mapped_column(default="english")
    search_vec: Mapped[str] = mapped_column(
        TSVECTOR,
        Computed("to_tsvector(language::regconfig, body)", persisted=True),
    )

phraseto_tsquery (exact phrase)

q_ts = func.phraseto_tsquery("english", "exact phrase")

Or include in websearch with quotes: "exact phrase".

Trigram fuzzy (typo tolerance)

CREATE EXTENSION pg_trgm;
__table_args__ = (
    Index(
        "ix_posts_title_trgm",
        "title",
        postgresql_using="gin",
        postgresql_ops={"title": "gin_trgm_ops"},
    ),
)

# Similarity threshold
stmt = select(Post).where(Post.title.op("%")(q))   # uses default threshold

# LIKE uses trigram index
stmt = select(Post).where(Post.title.ilike(f"%{q}%"))

Hybrid (FTS + trigram)

fts_ids = select(Post.id).where(Post.search_vec.op("@@")(q_ts))
trgm_ids = select(Post.id).where(Post.title.op("%")(q))

stmt = select(Post).where(Post.id.in_(fts_ids.union(trgm_ids)))

Or use RRF (Reciprocal Rank Fusion) to combine rankings.

Faceted

stmt = (
    select(Post)
    .where(Post.search_vec.op("@@")(q_ts))
    .where(Post.category == category)
    .where(Post.published == True)
)

For composite index across FTS + B-tree column: needs btree_gin extension:

CREATE EXTENSION btree_gin;
__table_args__ = (
    Index(
        "ix_posts_cat_search",
        "category", "search_vec",
        postgresql_using="gin",
    ),
)

Backfilling

def upgrade():
    op.add_column("posts", sa.Column("search_vec", postgresql.TSVECTOR()))
    op.execute("""
        UPDATE posts SET search_vec =
            setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
            setweight(to_tsvector('english', coalesce(body, '')), 'B')
    """)
    op.execute("CREATE INDEX CONCURRENTLY ix_posts_search ON posts USING gin (search_vec)")

For huge tables: batch the backfill.

RUM index (3rd-party, ranking-aware)

CREATE EXTENSION rum;

Faster ts_rank queries with metadata in the index. Niche.

When to use Postgres FTS

  • Up to ~10M docs.
  • Moderate query rates.
  • Single language (or per-row config).
  • Don’t need aggressive typo tolerance.

Beyond: Meilisearch / Typesense / Elasticsearch. See Search System Design .

Common mistakes

  • tsvector without GIN — seq scans on millions of rows.
  • Forgetting coalesce(...) — NULL fields make whole tsvector NULL.
  • Using to_tsvector('english', ...) on non-English content.
  • Trying to compete with Elasticsearch on aggressive ranking.

Read this next

If you want my Postgres FTS + trigram hybrid template, it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .