<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>AI Engineering on Manvendra Rajpoot</title>
    <link>https://blog.rajpoot.dev/posts/ai/</link>
    <description>Recent content in AI Engineering on Manvendra Rajpoot</description>
    <image>
      <title>Manvendra Rajpoot</title>
      <url>https://blog.rajpoot.dev/img/personal/cover.png</url>
      <link>https://blog.rajpoot.dev/posts/ai/</link>
    </image>
    <generator>Hugo</generator>
    <language>en</language>
    <copyright>Manvendra Rajpoot</copyright>
    <atom:link href="https://blog.rajpoot.dev/posts/ai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Self-Hosting LLMs in 2026 — When the Math Actually Works</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-self-host-economics-2026/</link>
      <pubDate>Tue, 05 May 2026 08:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-self-host-economics-2026/</guid>
      <description>Self-hosting LLMs in 2026 — vLLM, GPU economics, break-even, and when self-host beats API.</description>
    </item>
    <item>
      <title>Anthropic API Best Practices in 2026 — Caching, Tool Use, Streaming, and Production Patterns</title>
      <link>https://blog.rajpoot.dev/posts/ai/anthropic-api-best-practices-2026/</link>
      <pubDate>Tue, 05 May 2026 07:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/anthropic-api-best-practices-2026/</guid>
      <description>Anthropic API best practices in 2026 — prompt caching, tool use, streaming, batch API, and production patterns from real Claude apps.</description>
    </item>
    <item>
      <title>Evaluating AI Coding Tools in 2026 — Benchmarks That Matter and Ones That Don&#39;t</title>
      <link>https://blog.rajpoot.dev/posts/ai/ai-coding-evals-2026/</link>
      <pubDate>Tue, 05 May 2026 06:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/ai-coding-evals-2026/</guid>
      <description>Evaluating AI coding tools in 2026 — SWE-bench, real-world tasks, and what&#39;s actually predictive of productivity gains.</description>
    </item>
    <item>
      <title>Synthetic Data with LLMs in 2026 — Use Cases, Risks, and the Patterns That Work</title>
      <link>https://blog.rajpoot.dev/posts/ai/synthetic-data-2026/</link>
      <pubDate>Tue, 05 May 2026 06:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/synthetic-data-2026/</guid>
      <description>Synthetic data generation with LLMs in 2026 — when it helps, model collapse risk, eval set generation, and production patterns.</description>
    </item>
    <item>
      <title>Voice Agents in 2026 — STT, LLM, TTS, and Latency That Doesn&#39;t Hurt</title>
      <link>https://blog.rajpoot.dev/posts/ai/voice-agents-2026/</link>
      <pubDate>Tue, 05 May 2026 06:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/voice-agents-2026/</guid>
      <description>Building voice AI agents in 2026 — streaming STT, LLM, TTS pipelines, latency budgets, interruption, and real-world architectures.</description>
    </item>
    <item>
      <title>Model Context Protocol (MCP) in 2026 — What It Solved, What It Didn&#39;t</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-mcp-protocol-2026/</link>
      <pubDate>Mon, 04 May 2026 06:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-mcp-protocol-2026/</guid>
      <description>MCP in 2026 — protocol overview, server / client patterns, ecosystem, and an honest take on where MCP fits in agent infrastructure.</description>
    </item>
    <item>
      <title>LLM Tool Use Patterns in 2026 — Schemas, Validation, and the Loop</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-tool-use-patterns-2026/</link>
      <pubDate>Mon, 04 May 2026 06:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-tool-use-patterns-2026/</guid>
      <description>LLM tool use in 2026 — designing tool schemas, parallel calls, error handling, and the patterns from production agents.</description>
    </item>
    <item>
      <title>Agentic Coding in 2026 — Claude Code, Cursor, and the Real Workflow</title>
      <link>https://blog.rajpoot.dev/posts/ai/agentic-coding-2026/</link>
      <pubDate>Sun, 03 May 2026 08:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/agentic-coding-2026/</guid>
      <description>Agentic coding in 2026 — Claude Code, Cursor, Aider, and how AI coding agents actually fit into senior engineers&#39; workflows.</description>
    </item>
    <item>
      <title>LLM Batch Processing in 2026 — Anthropic / OpenAI Batch API for 50% Off</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-batch-processing-2026/</link>
      <pubDate>Sun, 03 May 2026 06:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-batch-processing-2026/</guid>
      <description>LLM batch APIs in 2026 — Anthropic, OpenAI, Bedrock batch processing for a 50% discount, when to use them, and the patterns that work.</description>
    </item>
    <item>
      <title>LLM Deployment Patterns in 2026 — Inference Servers, Routing, and Production Architectures</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-deployment-patterns-2026/</link>
      <pubDate>Sun, 03 May 2026 06:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-deployment-patterns-2026/</guid>
      <description>LLM deployment patterns in 2026 — vLLM, TGI, Ollama, hybrid API&#43;self-hosted, routing layers, and the production architectures that actually work.</description>
    </item>
    <item>
      <title>Prompt Engineering in 2026 — What Still Works, What Doesn&#39;t, and What Changed</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-prompt-engineering-2026/</link>
      <pubDate>Sun, 03 May 2026 06:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-prompt-engineering-2026/</guid>
      <description>Prompt engineering in 2026 — patterns that still work, what&#39;s been obsoleted by better models, structured prompts, and production discipline.</description>
    </item>
    <item>
      <title>LLM Agent Frameworks in 2026 — LangGraph, CrewAI, and the Bare-Metal Alternative</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-agent-frameworks-2026/</link>
      <pubDate>Sat, 02 May 2026 12:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-agent-frameworks-2026/</guid>
      <description>LLM agent frameworks in 2026 — LangGraph, CrewAI, OpenAI Agents SDK, AutoGen, and when bare-metal is better.</description>
    </item>
    <item>
      <title>Agent Memory Systems in 2026 — Episodic, Semantic, and the Patterns That Stick</title>
      <link>https://blog.rajpoot.dev/posts/ai/agent-memory-systems-2026/</link>
      <pubDate>Sat, 02 May 2026 11:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/agent-memory-systems-2026/</guid>
      <description>Agent memory systems in 2026 — episodic vs semantic memory, vector stores, working memory, and patterns from production agents.</description>
    </item>
    <item>
      <title>LLM Context Windows in 2026 — Long Context, Cache, and the Limits of &#39;Just Add More&#39;</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-context-windows-2026/</link>
      <pubDate>Sat, 02 May 2026 11:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-context-windows-2026/</guid>
      <description>LLM context windows in 2026 — what 200k / 1M context can and can&#39;t do, prompt caching, retrieval, and patterns from production.</description>
    </item>
    <item>
      <title>Multimodal LLMs in 2026 — Vision, Audio, and What&#39;s Actually Useful</title>
      <link>https://blog.rajpoot.dev/posts/ai/multimodal-llms-2026/</link>
      <pubDate>Sat, 02 May 2026 09:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/multimodal-llms-2026/</guid>
      <description>Multimodal LLMs in 2026 — vision input, audio input, generation, real-world use cases, and the patterns that work in production.</description>
    </item>
    <item>
      <title>Evaluating RAG Systems in 2026 — Retrieval Quality, Faithfulness, and the Metrics That Matter</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-rag-evaluation-2026/</link>
      <pubDate>Sat, 02 May 2026 09:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-rag-evaluation-2026/</guid>
      <description>RAG evaluation in 2026 — retrieval metrics (recall, MRR), generation metrics (faithfulness, relevance), Ragas, and the patterns from production RAG.</description>
    </item>
    <item>
      <title>LLM Observability in 2026 — Tracing, Evals, and the Things You Can&#39;t Skip</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-observability-2026/</link>
      <pubDate>Sat, 02 May 2026 07:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-observability-2026/</guid>
      <description>Production LLM observability in 2026 — distributed tracing, eval pipelines, Langfuse, Arize, and the patterns that turn black-box LLMs into operable systems.</description>
    </item>
    <item>
      <title>LLM Cost Optimization in 2026 — From Bills That Hurt to Bills That Don&#39;t</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-cost-optimization-2026/</link>
      <pubDate>Sat, 02 May 2026 07:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-cost-optimization-2026/</guid>
      <description>Cutting LLM costs in 2026 — prompt caching, routing, batching, fine-tunes, and the patterns that drop bills 5-20× without quality loss.</description>
    </item>
    <item>
      <title>LLM Guardrails in 2026 — Input Filtering, Output Validation, and Safety Nets</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-guardrails-content-safety-2026/</link>
      <pubDate>Fri, 01 May 2026 07:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-guardrails-content-safety-2026/</guid>
      <description>Practical LLM guardrails in 2026 — input filtering, output validation, NVIDIA NeMo, Guardrails AI, and the patterns that prevent embarrassments.</description>
    </item>
    <item>
      <title>Embedding Databases in 2026 — pgvector, Qdrant, Weaviate, Milvus, Pinecone</title>
      <link>https://blog.rajpoot.dev/posts/ai/embedding-databases-2026/</link>
      <pubDate>Fri, 01 May 2026 07:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/embedding-databases-2026/</guid>
      <description>Embedding databases compared in 2026 — pgvector, Qdrant, Weaviate, Milvus, Pinecone, Vectorize. When each fits.</description>
    </item>
    <item>
      <title>Fine-Tuning LLMs in 2026 — LoRA, QLoRA, and the Cheap Path to Specialized Models</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-fine-tuning-lora-qlora-2026/</link>
      <pubDate>Fri, 01 May 2026 06:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-fine-tuning-lora-qlora-2026/</guid>
      <description>Practical LLM fine-tuning in 2026 — LoRA, QLoRA, training data prep, evaluation, and the patterns from teams shipping fine-tuned models.</description>
    </item>
    <item>
      <title>LLM Agent Error Recovery in 2026 — Patterns That Don&#39;t Loop Forever</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-agent-error-recovery-2026/</link>
      <pubDate>Fri, 01 May 2026 04:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-agent-error-recovery-2026/</guid>
      <description>How to build LLM agents that recover from errors gracefully — retry policies, fallback paths, max-step caps, and the patterns that prevent runaway loops.</description>
    </item>
    <item>
      <title>OpenAI vs Anthropic vs Google for Production AI in 2026</title>
      <link>https://blog.rajpoot.dev/posts/ai/openai-vs-anthropic-vs-google-2026/</link>
      <pubDate>Fri, 01 May 2026 03:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/openai-vs-anthropic-vs-google-2026/</guid>
      <description>Honest comparison of OpenAI vs Anthropic vs Google for production LLM apps in 2026 — model quality, pricing, latency, ecosystem, and how to pick.</description>
    </item>
    <item>
      <title>Document AI in 2026 — Extracting Structured Data from PDFs and Images</title>
      <link>https://blog.rajpoot.dev/posts/ai/document-ai-pdf-extraction-2026/</link>
      <pubDate>Fri, 01 May 2026 01:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/document-ai-pdf-extraction-2026/</guid>
      <description>How to extract structured data from PDFs and images in 2026 — vision LLMs, OCR pipelines, layout-aware models, and the patterns that ship.</description>
    </item>
    <item>
      <title>LLM Prompt Caching Deep Dive — Anthropic, OpenAI, and the Patterns That Save 90%</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-prompt-caching-deep-dive-2026/</link>
      <pubDate>Fri, 01 May 2026 00:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-prompt-caching-deep-dive-2026/</guid>
      <description>Prompt caching mechanics in 2026 — Anthropic&#39;s ephemeral cache, OpenAI&#39;s automatic caching, breakpoint placement, hit-rate measurement, and the patterns that save real money.</description>
    </item>
    <item>
      <title>LLM Evaluation Frameworks in 2026 — Braintrust, LangSmith, Ragas, DeepEval</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-evaluation-frameworks-2026/</link>
      <pubDate>Thu, 30 Apr 2026 23:59:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-evaluation-frameworks-2026/</guid>
      <description>Picking an LLM evaluation framework in 2026 — Braintrust vs LangSmith vs Ragas vs DeepEval. What each does, when each fits.</description>
    </item>
    <item>
      <title>Designing Tools for AI Agents in 2026 — The Patterns That Work</title>
      <link>https://blog.rajpoot.dev/posts/ai/agent-tool-design-patterns-2026/</link>
      <pubDate>Thu, 30 Apr 2026 22:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/agent-tool-design-patterns-2026/</guid>
      <description>How to design tools that AI agents use correctly — naming, descriptions, schemas, error returns, and the patterns from production agent systems.</description>
    </item>
    <item>
      <title>Context Engineering for LLMs in 2026 — The Discipline Beyond Prompting</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-context-engineering-patterns-2026/</link>
      <pubDate>Thu, 30 Apr 2026 21:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-context-engineering-patterns-2026/</guid>
      <description>Context engineering for LLMs in 2026 — what to put in context, what to leave out, ordering, compression, and the patterns that make agents work.</description>
    </item>
    <item>
      <title>LLM Streaming with Cancellation — Patterns That Don&#39;t Waste Tokens</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-streaming-cancellation-patterns-2026/</link>
      <pubDate>Thu, 30 Apr 2026 20:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-streaming-cancellation-patterns-2026/</guid>
      <description>How to implement LLM streaming with proper cancellation in 2026 — SSE patterns, abort signals, server-side cancel, and not paying for tokens the user doesn&#39;t want.</description>
    </item>
    <item>
      <title>LLM Routing in 2026 — Use Haiku to Save 80% on Sonnet/Opus Bills</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-routing-classification-haiku-2026/</link>
      <pubDate>Thu, 30 Apr 2026 19:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-routing-classification-haiku-2026/</guid>
      <description>How LLM routing with a small classifier (Haiku) saves 80% on Sonnet / Opus / GPT-5 bills in 2026 — patterns, accuracy, and how to wire it in.</description>
    </item>
    <item>
      <title>Giving AI Agents Memory in 2026 — Mem0, Zep, and the Patterns That Work</title>
      <link>https://blog.rajpoot.dev/posts/ai/agents-with-memory-mem0-zep-2026/</link>
      <pubDate>Thu, 30 Apr 2026 13:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/agents-with-memory-mem0-zep-2026/</guid>
      <description>How to give AI agents long-term memory in 2026 — Mem0, Zep, hand-rolled memory layers, summary memory, and the architecture that scales.</description>
    </item>
    <item>
      <title>Sandboxed Code Execution for AI Agents — E2B, Modal, Daytona, and the 2026 Stack</title>
      <link>https://blog.rajpoot.dev/posts/ai/sandboxed-code-execution-agents-e2b-modal-2026/</link>
      <pubDate>Thu, 30 Apr 2026 13:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/sandboxed-code-execution-agents-e2b-modal-2026/</guid>
      <description>How AI agents run code safely in 2026 — E2B, Modal, Daytona, microVMs, and the patterns for sandboxed execution with internet access.</description>
    </item>
    <item>
      <title>AI Coding Assistants ROI in 2026 — The Honest Numbers</title>
      <link>https://blog.rajpoot.dev/posts/ai/ai-coding-assistants-cost-roi-2026/</link>
      <pubDate>Thu, 30 Apr 2026 13:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/ai-coding-assistants-cost-roi-2026/</guid>
      <description>Honest ROI numbers for AI coding assistants in 2026 — productivity gains, where they actually help, where they hurt, and the patterns of high-leverage adoption.</description>
    </item>
    <item>
      <title>1M-Token Context Windows in 2026 — When They Help, When They Hurt</title>
      <link>https://blog.rajpoot.dev/posts/ai/long-context-1m-tokens-2026/</link>
      <pubDate>Thu, 30 Apr 2026 13:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/long-context-1m-tokens-2026/</guid>
      <description>Practical guide to 1M-token context windows in 2026 — when long context replaces RAG, when it doesn&#39;t, prompt caching, and the cost reality.</description>
    </item>
    <item>
      <title>Agentic RAG in 2026 — When Retrieval Becomes a Tool, Not a Pipeline</title>
      <link>https://blog.rajpoot.dev/posts/ai/agentic-rag-2026/</link>
      <pubDate>Thu, 30 Apr 2026 13:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/agentic-rag-2026/</guid>
      <description>Agentic RAG explained — when the agent decides what and when to retrieve, multi-step reasoning, query rewriting, self-reflection, and the patterns that beat naive RAG.</description>
    </item>
    <item>
      <title>LLM Security in 2026 — Prompt Injection, Data Exfiltration, and Defense in Depth</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-security-prompt-injection-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-security-prompt-injection-2026/</guid>
      <description>How to defend against LLM-specific attacks in 2026 — prompt injection, indirect injection, data exfiltration, jailbreaks, and the layered defenses that work.</description>
    </item>
    <item>
      <title>LLM Observability in 2026 — LangSmith, Langfuse, Helicone, and OpenTelemetry</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-observability-tracing-langsmith-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-observability-tracing-langsmith-2026/</guid>
      <description>How to observe production LLM apps in 2026 — LangSmith, Langfuse, Helicone, OpenTelemetry GenAI semantic conventions, and the metrics that matter.</description>
    </item>
    <item>
      <title>Rerankers in RAG — The Underrated Quality Multiplier in 2026</title>
      <link>https://blog.rajpoot.dev/posts/ai/rerankers-rag-quality-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/rerankers-rag-quality-2026/</guid>
      <description>Why rerankers are the highest-ROI upgrade to a RAG system in 2026 — Cohere Rerank, BGE-Reranker, JinaAI, cross-encoders, and how to wire one into a production pipeline.</description>
    </item>
    <item>
      <title>Embedding Models in 2026 — OpenAI, Voyage, Cohere, BGE, and How to Pick</title>
      <link>https://blog.rajpoot.dev/posts/ai/embeddings-models-comparison-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/embeddings-models-comparison-2026/</guid>
      <description>How to pick an embedding model in 2026 — OpenAI text-embedding-3, Voyage, Cohere, BGE, and the open-source landscape. Quality, cost, dimensions, multilingual support.</description>
    </item>
    <item>
      <title>Voice Agents and Realtime LLM APIs in 2026 — How They Actually Work</title>
      <link>https://blog.rajpoot.dev/posts/ai/voice-agents-realtime-llm-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/voice-agents-realtime-llm-2026/</guid>
      <description>How voice agents work in 2026 — Realtime APIs from OpenAI / Anthropic / Google, latency budgets, ASR, TTS, interruption handling, and production architecture.</description>
    </item>
    <item>
      <title>LLM Cost Optimization in 2026 — Tactics That Cut Bills 50–90%</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-cost-optimization-tactics-2026/</link>
      <pubDate>Thu, 30 Apr 2026 12:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-cost-optimization-tactics-2026/</guid>
      <description>Concrete LLM cost optimization tactics that cut your Anthropic / OpenAI / Gemini bill by 50–90% — caching, model routing, batching, fine-tuning, and the patterns that compound.</description>
    </item>
    <item>
      <title>Building an MCP Server for Your SaaS — A 2026 Distribution Strategy</title>
      <link>https://blog.rajpoot.dev/posts/ai/build-mcp-server-saas-2026/</link>
      <pubDate>Thu, 30 Apr 2026 09:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/build-mcp-server-saas-2026/</guid>
      <description>Why every SaaS needs an MCP server in 2026 — the distribution play, what to expose as tools, OAuth patterns, and a working TypeScript example.</description>
    </item>
    <item>
      <title>Structured Output for LLMs in 2026 — Pydantic AI, Instructor, and the End of JSON Parsing</title>
      <link>https://blog.rajpoot.dev/posts/ai/structured-output-pydantic-ai-instructor-2026/</link>
      <pubDate>Thu, 30 Apr 2026 09:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/structured-output-pydantic-ai-instructor-2026/</guid>
      <description>How to get structured, validated output from LLMs in 2026 — Pydantic AI, Instructor, native tool-calling, OpenAI&#39;s structured outputs API, and the patterns that make extraction reliable.</description>
    </item>
    <item>
      <title>AI Gateways in 2026 — LiteLLM, Portkey, Helicone, and the OpenAI Façade</title>
      <link>https://blog.rajpoot.dev/posts/ai/ai-gateways-litellm-portkey-helicone-2026/</link>
      <pubDate>Thu, 30 Apr 2026 09:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/ai-gateways-litellm-portkey-helicone-2026/</guid>
      <description>AI gateways explained — why every serious LLM app needs one in 2026, comparison of LiteLLM, Portkey, Helicone, and OpenRouter, and how to add one without rewriting your code.</description>
    </item>
    <item>
      <title>Multi-Agent Systems in 2026 — Production Patterns That Work</title>
      <link>https://blog.rajpoot.dev/posts/ai/multi-agent-systems-production-patterns-2026/</link>
      <pubDate>Thu, 30 Apr 2026 08:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/multi-agent-systems-production-patterns-2026/</guid>
      <description>Multi-agent systems explained — supervisor / worker, writer / reviewer, hierarchical and swarm patterns, and the production gotchas in 2026.</description>
    </item>
    <item>
      <title>Cursor vs Windsurf vs Claude Code in 2026 — An Honest Comparison</title>
      <link>https://blog.rajpoot.dev/posts/ai/cursor-vs-windsurf-vs-claude-code-2026/</link>
      <pubDate>Thu, 30 Apr 2026 08:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/cursor-vs-windsurf-vs-claude-code-2026/</guid>
      <description>Cursor vs Windsurf vs Claude Code in 2026 — pricing, agentic features, context windows, multi-file editing, and which tool fits which workflow.</description>
    </item>
    <item>
      <title>Fine-Tuning vs RAG vs Prompting in 2026 — How to Pick the Right Approach</title>
      <link>https://blog.rajpoot.dev/posts/ai/fine-tuning-vs-rag-vs-prompting-2026/</link>
      <pubDate>Wed, 29 Apr 2026 10:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/fine-tuning-vs-rag-vs-prompting-2026/</guid>
      <description>When to fine-tune, when to RAG, and when to just prompt — a practical 2026 decision guide for LLM applications, with cost, quality, and ops tradeoffs.</description>
    </item>
    <item>
      <title>Claude Code Skills and Agentic Coding Patterns in 2026</title>
      <link>https://blog.rajpoot.dev/posts/ai/claude-code-skills-agentic-coding-2026/</link>
      <pubDate>Wed, 29 Apr 2026 09:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/claude-code-skills-agentic-coding-2026/</guid>
      <description>Claude Code Skills explained — what they are, when to use them, how to write a SKILL.md, and the multi-session and writer/reviewer patterns that reshape coding workflows in 2026.</description>
    </item>
    <item>
      <title>Model Context Protocol (MCP) Explained — The USB-C of AI Tools</title>
      <link>https://blog.rajpoot.dev/posts/ai/model-context-protocol-mcp-explained/</link>
      <pubDate>Tue, 28 Apr 2026 21:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/model-context-protocol-mcp-explained/</guid>
      <description>Model Context Protocol (MCP) explained from first principles — what it is, how it works, why it matters, and how to build an MCP server for your own tools and data.</description>
    </item>
    <item>
      <title>Self-Hosted LLMs in 2026 — Ollama, vLLM, and When to Skip the API</title>
      <link>https://blog.rajpoot.dev/posts/ai/self-hosted-llms-vllm-ollama-2026/</link>
      <pubDate>Tue, 28 Apr 2026 20:50:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/self-hosted-llms-vllm-ollama-2026/</guid>
      <description>When to self-host LLMs in 2026 — Ollama for dev, vLLM and SGLang for production, model choice, hardware sizing, and the latency/cost tradeoffs vs hosted APIs.</description>
    </item>
    <item>
      <title>LLM Evaluations — How to Test Prompts and Agents Like a Pro</title>
      <link>https://blog.rajpoot.dev/posts/ai/llm-evaluations-test-prompts-agents/</link>
      <pubDate>Tue, 28 Apr 2026 16:40:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/llm-evaluations-test-prompts-agents/</guid>
      <description>A practical, no-fluff guide to evaluating LLM applications — what to measure, how to build a starter eval set, LLM-as-judge done right, and how to wire evals into CI.</description>
    </item>
    <item>
      <title>Prompt Engineering Patterns That Survive Production</title>
      <link>https://blog.rajpoot.dev/posts/ai/prompt-engineering-production-patterns/</link>
      <pubDate>Tue, 28 Apr 2026 16:30:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/prompt-engineering-production-patterns/</guid>
      <description>Prompt engineering patterns that hold up in production — system prompts, structured outputs, few-shot, reasoning steps, role separation, and the anti-patterns that look clever but quietly fail.</description>
    </item>
    <item>
      <title>Anthropic Claude API &#43; Tool Use — A Practical Guide for 2026</title>
      <link>https://blog.rajpoot.dev/posts/ai/anthropic-claude-api-tool-use-guide/</link>
      <pubDate>Tue, 28 Apr 2026 16:20:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/anthropic-claude-api-tool-use-guide/</guid>
      <description>A no-fluff guide to the Anthropic Claude API in 2026 — messages, tool use, prompt caching, structured outputs, streaming, and the patterns that ship.</description>
    </item>
    <item>
      <title>AI Agents with LangGraph in 2026 — A Practical Tutorial</title>
      <link>https://blog.rajpoot.dev/posts/ai/ai-agents-with-langgraph-tutorial/</link>
      <pubDate>Tue, 28 Apr 2026 16:10:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/ai-agents-with-langgraph-tutorial/</guid>
      <description>Build a real AI agent with LangGraph — tools, state, memory, conditional routing, and the production patterns that separate working agents from demoware.</description>
    </item>
    <item>
      <title>Build a Production RAG App with pgvector and FastAPI in 2026</title>
      <link>https://blog.rajpoot.dev/posts/ai/build-rag-app-pgvector-fastapi/</link>
      <pubDate>Tue, 28 Apr 2026 16:00:00 +0530</pubDate>
      <guid>https://blog.rajpoot.dev/posts/ai/build-rag-app-pgvector-fastapi/</guid>
      <description>A complete, copy-paste guide to building a Retrieval-Augmented Generation (RAG) backend with PostgreSQL &#43; pgvector and FastAPI — chunking, embeddings, hybrid search, and the parts most tutorials skip.</description>
    </item>
  </channel>
</rss>
