LLM Evaluations — How to Test Prompts and Agents Like a Pro

A practical guide to LLM evaluations: what to measure, how to build eval sets, how to do LLM-as-judge right, which RAG-specific metrics matter, and how to integrate evals into CI so you stop shipping silent regressions.

April 28, 2026 · 7 min · 1352 words · Manvendra Rajpoot

Prompt Engineering Patterns That Survive Production

The prompt patterns I keep reaching for in production LLM apps — system prompt structure, role separation, structured output, few-shot, chain-of-thought, prompt caching, and the anti-patterns to skip.

April 28, 2026 · 7 min · 1409 words · Manvendra Rajpoot

Anthropic Claude API + Tool Use — A Practical Guide for 2026

How to actually use the Anthropic Claude API in production. Covers the Messages format, tool use, prompt caching (up to 90% cheaper on cached input tokens), structured outputs, streaming, and the gotchas worth knowing.

April 28, 2026 · 6 min · 1207 words · Manvendra Rajpoot

AI Agents with LangGraph in 2026 — A Practical Tutorial

A from-scratch tutorial on building AI agents with LangGraph. Tools, persistent state, conditional routing, human-in-the-loop, and the production patterns most demos skip.

April 28, 2026 · 7 min · 1294 words · Manvendra Rajpoot

Build a Production RAG App with pgvector and FastAPI in 2026

A complete, end-to-end RAG backend built on PostgreSQL + pgvector and FastAPI. Real chunking, real embeddings, hybrid (vector + BM25) retrieval, prompt assembly, citations, and production gotchas.

April 28, 2026 · 8 min · 1679 words · Manvendra Rajpoot