AI/LLM Cheatsheet 14 — Multimodal LLMs

Cheatsheet: vision LLMs, image inputs, audio, video.

May 26, 2026 · 3 min · 452 words · Manvendra Rajpoot

Multimodal LLMs in 2026 — Vision, Audio, and What's Actually Useful

Practical multimodal LLM use: vision-aware document understanding, audio transcription and reasoning, text-to-image generation, video understanding, and where multimodal pays off.

May 2, 2026 · 4 min · 797 words · Manvendra Rajpoot