AI Engineering

Posts on AI engineering, the discipline of building real products on top of LLMs. Practical writing on RAG, agents, prompt engineering, vector databases, evaluations, and the production realities of shipping AI features that don’t fall apart in week three.

LLM Evaluations — How to Test Prompts and Agents Like a Pro

A practical guide to LLM evaluations — what to measure, building eval sets, LLM-as-judge done right, RAG-specific metrics, and integrating evals into CI so you stop shipping silent regressions.

Prompt Engineering Patterns That Survive Production

The prompt patterns I keep reaching for in production LLM apps — system prompt structure, role separation, structured output, few-shot, chain-of-thought, prompt caching, and the anti-patterns to skip.

Anthropic Claude API + Tool Use — A Practical Guide for 2026

How to actually use the Anthropic Claude API in production. Messages format, tool use, prompt caching (up to 90% off cached input tokens), structured outputs, streaming, and the gotchas worth knowing.

AI Agents with LangGraph in 2026 — A Practical Tutorial

A from-scratch tutorial on building AI agents with LangGraph. Tools, persistent state, conditional routing, human-in-the-loop, and the production patterns most demos skip.

Build a Production RAG App with pgvector and FastAPI in 2026

A complete, end-to-end RAG backend built on PostgreSQL + pgvector and FastAPI. Real chunking, real embeddings, hybrid (vector + BM25) retrieval, prompt assembly, citations, and production gotchas.