Vector Search

Rerankers in RAG — The Underrated Quality Multiplier in 2026

Rerankers turn ‘pretty good RAG’ into ‘great RAG’ for the price of one extra API call. Covers cross-encoders, Cohere Rerank vs BGE-Reranker vs Jina, two-stage retrieval architecture, latency, cost, and implementation.

Embedding Models in 2026 — OpenAI, Voyage, Cohere, BGE, and How to Pick

A practical 2026 guide to picking an embedding model: OpenAI text-embedding-3 vs Voyage vs Cohere vs open BGE / Nomic. Compares quality on MTEB, cost, dimensions, and multilingual support, and shows how to evaluate on your own data.

pgvector Deep Dive — HNSW, IVFFlat, and Tuning Postgres for Vector Search

Everything you need to make pgvector fast in production: HNSW vs IVFFlat, distance operators, m and ef_construction at build time, ef_search at query time, partial indexes for multi-tenant data, and benchmarking your real workload.

Build a Production RAG App with pgvector and FastAPI in 2026

A complete, end-to-end RAG backend built on PostgreSQL + pgvector and FastAPI. Real chunking, real embeddings, hybrid (vector + BM25) retrieval, prompt assembly, citations, and production gotchas.