AI/LLM Cheatsheet 11 — Cost Optimization

Cheatsheet: prompt caching, batching, model selection, output limits.

May 26, 2026 · 3 min · 427 words · Manvendra Rajpoot

AI/LLM Cheatsheet 12 — Local LLMs (Ollama, vLLM)

Cheatsheet: Ollama, vLLM, llama.cpp, when to self-host.

May 26, 2026 · 3 min · 506 words · Manvendra Rajpoot

AI/LLM Cheatsheet 13 — Fine-tuning

Cheatsheet: when to fine-tune, LoRA, QLoRA, OpenAI fine-tuning.

May 26, 2026 · 2 min · 385 words · Manvendra Rajpoot

AI/LLM Cheatsheet 14 — Multimodal LLMs

Cheatsheet: vision LLMs, image inputs, audio, video.

May 26, 2026 · 3 min · 452 words · Manvendra Rajpoot

SQLAlchemy + Postgres Cheatsheet 14 — pgvector for Embeddings

Cheatsheet: Vector column, cosine_distance / l2_distance / max_inner_product, HNSW index, hybrid filter + ANN.

May 13, 2026 · 3 min · 476 words · Manvendra Rajpoot

AI/LLM Cheatsheet 15 — Security and Prompt Injection

Cheatsheet: prompt injection, defenses, PII, jailbreaks.

May 26, 2026 · 3 min · 556 words · Manvendra Rajpoot

AI/LLM Cheatsheet 16 — Vector DBs Deep Dive

Cheatsheet: vector DBs, HNSW, hybrid search, sharding.

May 26, 2026 · 3 min · 553 words · Manvendra Rajpoot

AI/LLM Cheatsheet 17 — Observability for LLMs

Cheatsheet: logging, traces, metrics, evals in prod.

May 26, 2026 · 3 min · 485 words · Manvendra Rajpoot

AI/LLM Cheatsheet 18 — LLM Application Patterns

Cheatsheet: classification, extraction, summarization, routing, decomposition.

May 26, 2026 · 3 min · 525 words · Manvendra Rajpoot

FastAPI Cheatsheet 18 — Streaming and LLM Integration

Cheatsheet: streaming Claude / GPT / vLLM tokens via SSE, tool-call loops, cancellation, prompt caching.

May 11, 2026 · 3 min · 470 words · Manvendra Rajpoot