By 2026, agentic coding has moved from novelty to default. AI coding agents (Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity) ship production code under human direction. The unit of leverage is no longer the prompt — it’s the Skill.
This post is the working guide to Claude Code Skills: what they are, the SKILL.md format, the multi-session patterns that produce reliable code, and the agentic patterns I keep reaching for.
What a Skill is
A Skill is a folder with a SKILL.md file and optional supporting assets (scripts, templates, reference docs). When Claude Code encounters a task that matches the Skill’s description, it loads the Skill and follows the playbook.
.claude/skills/release-notes/
├── SKILL.md
├── template.md
└── scripts/
    └── git-log-since-tag.sh
A SKILL.md:
---
name: release-notes
description: Generate release notes from git history since the last tag.
---
# Release Notes
When asked to generate release notes for a release:
1. Run `scripts/git-log-since-tag.sh` to get the commit log.
2. Group commits by type using Conventional Commits (`feat`, `fix`, `chore`, ...).
3. Use `template.md` as the markdown structure.
4. Highlight breaking changes in a **Breaking Changes** section.
5. Include a **Migration** section if any breaking changes exist.
6. Save to `CHANGELOG.md` and open a PR titled `chore: release notes for vX.Y.Z`.
## Examples
For a typical patch release, the output should look like:
\`\`\`
## v1.4.2 — 2026-04-29
### Fixes
- Auth: handle empty `Authorization` header safely (#412)
\`\`\`
## Notes
- Use the project's existing changelog conventions if `CHANGELOG.md` already exists.
- Don't include merge commits.
That’s a complete Skill. Frontmatter (name, description), playbook in markdown, optional scripts and templates. The description is what Claude pattern-matches against the user’s request.
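To make the format concrete, here is a minimal Python sketch that splits a SKILL.md into its frontmatter and its playbook. This is my own illustration, not how Claude Code parses Skills internally: `parse_skill` is a hypothetical helper, and it assumes the frontmatter holds only flat `key: value` lines like the example above.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, markdown body).

    Assumption: frontmatter is flat `key: value` lines only. Real YAML
    (lists, nesting, quoting) would need a proper YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    try:
        # Index of the closing '---' delimiter.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()

    # The two fields the example Skill relies on.
    for required in ("name", "description"):
        if not fields.get(required):
            raise ValueError(f"frontmatter is missing '{required}'")

    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


example = """---
name: release-notes
description: Generate release notes from git history since the last tag.
---
# Release Notes
When asked to generate release notes for a release:
"""

fields, body = parse_skill(example)
print(fields["name"])        # release-notes
print(body.splitlines()[0])  # # Release Notes
```

A check like this is handy in CI: a Skill with a missing or empty description fails the build instead of silently never matching.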
Why Skills, not just better prompts
Three reasons Skills outperform “write a better system prompt”:
- Composability. A coding agent can have dozens of skills loaded. Each is small and focused. A monolithic system prompt becomes hard to maintain.
- Versioning. Skills are files in a repo. They diff. They review. They roll forward and back like code.
- Scope. Skills load only when the task matches. Token budget stays small. The model sees one playbook at a time.
If you’ve been stuffing instructions into your CLAUDE.md, a skill is probably the cleaner answer.
Anatomy of a good Skill
A Skill that earns its place has:
- A description that matches user phrasings. “Generate release notes” matches more queries than “produce a CHANGELOG.md from git log.”
- An explicit step list. Numbered. Verifiable.
- Examples of the output. Models follow examples better than they follow rules.
- Notes on edge cases. “If CHANGELOG.md doesn’t exist, create it. If it has a different format, follow that format.”
- References to scripts/templates where useful. Don’t make Claude reinvent the wheel.
Three rules I follow:
- Skills should be specific, not generic. “Refactor code” is too broad. “Migrate a Django ModelForm to a Pydantic Schema” is right.
- Skills should be self-contained. Don’t reference files outside the skill directory unless you have to.
- Skills should fail loudly. If a precondition isn’t met (no CHANGELOG.md template, dirty git tree), the skill says so instead of guessing.
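One way to get the "fail loudly" behavior is to ship a small precondition script with the skill and have the playbook run it first. The sketch below is an assumption about how you might do that: `find_problems` and `check_repo` are hypothetical names, and the template is assumed to live at `template.md` inside the skill folder.

```python
import subprocess
from pathlib import Path


def find_problems(template_exists: bool, porcelain_output: str) -> list[str]:
    """Return one human-readable problem per unmet precondition."""
    problems = []
    if not template_exists:
        problems.append("changelog template not found")
    # `git status --porcelain` prints one line per changed path; empty means clean.
    dirty = [line for line in porcelain_output.splitlines() if line.strip()]
    if dirty:
        problems.append(f"git tree is dirty ({len(dirty)} changed path(s))")
    return problems


def check_repo(repo: Path, skill_dir: Path) -> list[str]:
    """Collect the real inputs from a checkout and evaluate them."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # Assumed layout: the template sits next to SKILL.md.
    return find_problems((skill_dir / "template.md").exists(), status)


print(find_problems(True, ""))                     # []
print(find_problems(False, " M app.py\n?? new.py\n"))
```

Keeping the decision logic in a pure function makes it easy to test without a real repository. The script can exit non-zero whenever the list is non-empty, which gives the agent an unambiguous stop signal.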
Skills vs MCP servers — the divide
I see these conflated. They’re complementary:
| | Skill | MCP server |
|---|---|---|
| What it is | Instructions + assets | Code that exposes tools/resources |
| When loaded | When the description matches | Always available |
| Where it runs | In Claude’s context | As a separate process |
| Best for | Workflows, playbooks, repeatable tasks | Integrating external systems (DB, APIs, files) |
A typical setup uses both. Skills describe the playbook (“how do I generate release notes?”). MCP servers integrate the systems (“read my GitHub PRs”, “query my Postgres”). See Model Context Protocol Explained.
Skills you should have on day one
These are the skills I install in every new repo:
1. code-review
---
name: code-review
description: Review pull requests for correctness, style, security, and tests.
---
Open the PR. For each changed file, evaluate: (a) does it solve the problem
described in the PR; (b) any obvious bugs; (c) tests cover the change; (d)
naming and structure consistent with the rest of the repo. Output a markdown
review with **Must change** and **Nice to have** sections.
2. git-commit
---
name: git-commit
description: Stage and commit changes with a Conventional Commits message.
---
Run `git status` and `git diff` to see changes. Group by logical scope. For
each group, stage the files and commit with `<type>(<scope>): <message>` where
`<type>` is one of feat/fix/chore/docs/refactor/test/perf.
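If you want to verify the messages this skill produces, a commit-msg hook or CI step can check the header format. Here is a minimal sketch, assuming the type list from the skill above; `check_header` is a hypothetical helper, and treating the scope as optional is my assumption (the skill's template always includes one).

```python
import re

# Types taken from the git-commit skill above.
ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "refactor", "test", "perf")

HEADER = re.compile(
    r"^(?P<type>[a-z]+)"            # commit type, e.g. "fix"
    r"(?:\((?P<scope>[^()]+)\))?"   # optional "(scope)"
    r": (?P<message>\S.*)$"         # ": " followed by a non-empty message
)


def check_header(header: str) -> dict:
    """Parse a commit header or raise ValueError explaining what is wrong."""
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"not in '<type>(<scope>): <message>' form: {header!r}")
    if match["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown type {match['type']!r} in {header!r}")
    return match.groupdict()


parsed = check_header("fix(auth): handle empty Authorization header")
print(parsed["type"], parsed["scope"])  # fix auth
```

This sketch does not handle the `!` breaking-change marker from the Conventional Commits spec; add it to the pattern if your repo uses it.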
3. repo-bootstrap
A skill that knows your team’s standard project layout, picks the right template, scaffolds the project, and opens an editor. See Modern Python Tooling 2026 for a Python equivalent.
4. db-migration
---
name: db-migration
description: Generate and apply a database migration safely.
---
1. Read schema changes from the diff or user description.
2. Generate a migration file with the project's tool (alembic, drizzle-kit, etc.).
3. Review for backwards-compatibility (no DROP COLUMN without a deprecation phase).
4. Apply to a local DB and run tests.
5. Open a PR with the migration and a deployment note.
In my experience, these four skills cover most of the recurring tasks in a typical backend repo.
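Step 3 of the db-migration skill (the backwards-compatibility review) is easier to trust when a script backs it up. Below is a rough sketch of such a check, assuming plain SQL migration files. `destructive_statements` is a hypothetical helper, and the `DROP TABLE` and `RENAME COLUMN` patterns are my additions beyond the `DROP COLUMN` rule in the skill.

```python
import re

# Statements that break code still running against the old schema.
DESTRUCTIVE = {
    "DROP COLUMN": re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE),
    "DROP TABLE": re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
    "RENAME COLUMN": re.compile(r"\bRENAME\s+COLUMN\b", re.IGNORECASE),
}


def destructive_statements(sql: str) -> list[tuple[int, str]]:
    """Return (line number, label) for each destructive statement found."""
    findings = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        # Naive comment stripping: ignores anything after "--" on a line.
        code = line.split("--", 1)[0]
        for label, pattern in DESTRUCTIVE.items():
            if pattern.search(code):
                findings.append((lineno, label))
    return findings


migration = """ALTER TABLE users ADD COLUMN nickname text;
ALTER TABLE users DROP COLUMN legacy_name;
-- DROP TABLE old_users;  (deferred to the next release)
"""
print(destructive_statements(migration))  # [(2, 'DROP COLUMN')]
```

A line-based regex scan will miss statements split across lines and will misread `--` inside string literals, so treat it as a tripwire that prompts a human look, not as proof of safety.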
Multi-session patterns
The 2026 unlock isn’t bigger context windows — it’s multiple coordinated sessions.
Writer / Reviewer
Session A (writer): implements the change
Session B (reviewer): fresh context, reviews A's diff
The reviewer doesn’t see how the code was written; only what the diff says. This kills the “I just wrote it, it must be right” bias and catches more issues than self-review.
Run pattern in Claude Code:
- Session A: “implement the change.”
- Save the branch.
- Session B (fresh): “review branch feature/x for correctness, style, and tests.”
- Apply review feedback in session A or a new session C.
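The handoff to the reviewer can be scripted so session B only ever receives the diff. The sketch below is one possible shape, with several assumptions: the helper names are hypothetical, the base branch is assumed to be `main`, and the command assumes the `claude` CLI's non-interactive print flag (`-p`), so check it against your installed version before relying on it.

```python
import subprocess


def branch_diff(branch: str, base: str = "main") -> str:
    """Return what the branch changed relative to its merge base with `base`."""
    return subprocess.run(
        ["git", "diff", f"{base}...{branch}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout


def reviewer_prompt(branch: str, diff: str) -> str:
    """Build a prompt that carries the diff and nothing from the writer session."""
    return (
        f"Review branch {branch} for correctness, style, and tests.\n"
        "You have not seen how this code was written; judge only the diff.\n\n"
        f"{diff}"
    )


def reviewer_command(branch: str, diff: str) -> list[str]:
    """argv for a fresh, one-shot reviewer session (assumed CLI invocation)."""
    return ["claude", "-p", reviewer_prompt(branch, diff)]


command = reviewer_command("feature/x", "diff --git a/app.py b/app.py")
print(command[:2])  # ['claude', '-p']
```

For large diffs, passing the prompt as an argument can hit OS argument-length limits; writing the diff to a file and pointing the reviewer at it is the safer variant.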
Spec / Implement
Session A (spec): writes a Markdown spec from the user's requirements
Session B (implement): reads the spec, implements
Session C (verify): reads the spec and the diff, confirms requirements met
Useful for non-trivial features. The spec becomes a permanent artifact in the PR.
Triage / Fix
Session A (triage): reads a bug report, finds the root cause, writes a plan
Session B (fix): applies the plan
Used heavily by Anthropic’s internal SRE team. Triage requires breadth (find the cause); fix requires focus (apply the change). Different sessions, different system prompts.
How to keep agents productive
A few patterns I keep reaching for:
1. Tight loops
Let the agent iterate fast. Run tests on every change. Compile on every change. The fewer round-trips of “agent writes code, you run tests, you copy back errors,” the better.
In Claude Code, configure the test runner so the agent can invoke it directly. Auto-run mode handles this.
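What the agent needs from each loop is a pass/fail signal plus the last few lines of output, not the full log. A minimal wrapper, assuming nothing about your test runner (`run_check` is a hypothetical helper; the command below is a stand-in for something like your real test command):

```python
import subprocess
import sys


def run_check(command: list[str], tail_lines: int = 40) -> tuple[bool, str]:
    """Run one check; return (passed, last `tail_lines` lines of its output)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so errors keep their place
        text=True,
    )
    tail = "\n".join(result.stdout.splitlines()[-tail_lines:])
    return result.returncode == 0, tail


# Stand-in command so the sketch runs anywhere Python does.
passed, tail = run_check([sys.executable, "-c", "print('2 passed')"])
print(passed, tail)  # True 2 passed
```

Trimming to the tail keeps a failing run from flooding the context with thousands of lines while still showing the error that matters, which is usually at the end.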
2. Small commits
Commit frequently. Each commit is a “save point” you can roll back to. When the agent goes off the rails (it will), git reset --hard is your friend.
3. Explicit approval gates
For risky operations (destructive deletes, force-push, schema migrations on production), require explicit confirmation. Configure a Skill that pauses before such actions and waits for approval.
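A gate like this can be as simple as a script the skill calls before running a command. The following is a sketch under my own assumptions: the pattern list is illustrative and far from complete, and `needs_approval` and `gate` are hypothetical names.

```python
import re

# (pattern, label) pairs for commands that should never run unattended.
RISKY = [
    (re.compile(r"\bgit\s+push\b.*\s(--force\S*|-f)\b"), "force-push"),
    (re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+"), "recursive delete"),
    (re.compile(r"\b(?:DROP|TRUNCATE)\s+TABLE\b", re.IGNORECASE), "destructive SQL"),
]


def needs_approval(command: str) -> list[str]:
    """Return the reasons a command is considered risky (empty if none)."""
    return [label for pattern, label in RISKY if pattern.search(command)]


def gate(command: str, approve=input) -> bool:
    """Return True if the command may run; ask a human when it looks risky."""
    reasons = needs_approval(command)
    if not reasons:
        return True
    answer = approve(
        f"{command!r} looks risky ({', '.join(reasons)}). Type 'yes' to run: "
    )
    return answer.strip().lower() == "yes"


print(needs_approval("git push --force origin main"))  # ['force-push']
print(needs_approval("git status"))                    # []
```

Pattern matching is a denylist, so it only catches what you thought to list. For production credentials, pair it with real permission boundaries rather than relying on the gate alone.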
4. Persistent rules in CLAUDE.md
Project-wide constants — naming conventions, where things live, what tools to use — go in CLAUDE.md at the repo root. Per-task playbooks go in Skills. Per-session context goes in the chat. Three layers, three lifetimes.
5. Don’t argue, redirect
If the agent goes the wrong direction, don’t reason it back. Stop, give a one-line course correction, restart from a clean state. Saves tokens and time.
Skills marketplace
By 2026 there are thousands of community skills:
- Anthropic’s official skills repo (the canonical examples).
- Skill registries like skill-libraries and various community indexes.
- Vendor-shipped skills (Drizzle, Prisma, AWS CDK, Terraform, etc.).
Pick conservatively. A skill in your loaded set costs context. Curate, don’t hoard.
Production-grade agentic coding
The pattern I see at companies actually shipping AI-written production code in 2026:
- Strict scope: agents handle “small but real” changes — bug fixes, refactor passes, dependency updates, test additions. Not green-field architecture.
- Always reviewed. Either by another agent (fresh context) or a human. No direct-to-main from an agent.
- CI as the safety net. Tests, type checks, linters, security scanners all run on agent PRs.
- Skills for repeatable patterns. Each common task is a skill, not a re-derived plan.
- Observability. Track which PRs an agent opened, time-to-merge, defect rate. Treat agents like a new junior dev — measure, coach, iterate.
The reason this works: small changes, thorough review, and automation together catch errors before they ship. The agent is fast and tireless; the system catches its mistakes.
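The observability point needs very little machinery to start. Here is a sketch of the two numbers mentioned above, time-to-merge and defect rate, computed from a list of PR records. The `AgentPR` shape and the `caused_defect` flag are hypothetical; in practice you would fill them from your Git host's API and your incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class AgentPR:
    opened: datetime
    merged: Optional[datetime]   # None while open, or if closed unmerged
    caused_defect: bool = False  # hypothetical: a later bug was traced to this PR


def summarize(prs: list[AgentPR]) -> dict[str, float]:
    """Median hours to merge and defect rate over merged agent PRs."""
    merged = [pr for pr in prs if pr.merged is not None]
    if not merged:
        return {"merged": 0, "median_hours_to_merge": 0.0, "defect_rate": 0.0}
    hours = [(pr.merged - pr.opened).total_seconds() / 3600 for pr in merged]
    defects = sum(pr.caused_defect for pr in merged)
    return {
        "merged": len(merged),
        "median_hours_to_merge": median(hours),
        "defect_rate": defects / len(merged),
    }


t0 = datetime(2026, 4, 1, 9, 0)
stats = summarize([
    AgentPR(opened=t0, merged=t0 + timedelta(hours=2)),
    AgentPR(opened=t0, merged=t0 + timedelta(hours=6), caused_defect=True),
    AgentPR(opened=t0, merged=None),
])
print(stats)  # {'merged': 2, 'median_hours_to_merge': 4.0, 'defect_rate': 0.5}
```

Computing the same numbers for human-authored PRs gives you the baseline that makes the agent's figures meaningful.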
Common failure modes
1. Skills too vague
A skill named “fix bugs” with description “fix the bug” is just a system prompt. Skills earn their cost when they encode specific knowledge.
2. Too many skills
Loading 50 skills bloats context, slows responses, confuses the model. Curate to ~10–20 active skills per repo.
3. Skills that depend on undeclared context
A skill that says “use the standard release format” without saying what the standard is will guess. Embed the format or reference a template.
4. No verification step
A skill that ends “and that’s done” without a test/verify step often produces something that compiles but doesn’t work. Always include “run the tests” or “verify by …”.
5. Single-session for everything
Long-running sessions accumulate context drift. The agent forgets what was decided 80 messages ago. Use multiple sessions for distinct phases.
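Several of these failure modes are mechanical enough to lint for. The sketch below checks a single skill for a vague description, a missing step list, and a missing verification step. The thresholds and keywords are my assumptions, not established rules, and `lint_skill` is a hypothetical helper.

```python
import re


def lint_skill(description: str, body: str, min_words: int = 5) -> list[str]:
    """Return warnings for the failure modes that can be detected from text."""
    warnings = []
    # Failure mode 1: a description too short to encode specific knowledge.
    if len(description.split()) < min_words:
        warnings.append("description is too vague")
    # No explicit, numbered step list ("1. ...", "2. ...").
    if not re.search(r"^\s*\d+\.\s", body, re.MULTILINE):
        warnings.append("no numbered steps")
    # Failure mode 4: nothing tells the agent to test or verify the result.
    if not re.search(r"\b(?:tests?|verify)\b", body, re.IGNORECASE):
        warnings.append("no test/verify step")
    return warnings


vague = lint_skill("fix the bug", "Fix it. And that's done.")
solid = lint_skill(
    "Generate and apply a database migration safely.",
    "1. Generate a migration file.\n2. Apply to a local DB and run tests.",
)
print(vague)  # ['description is too vague', 'no numbered steps', 'no test/verify step']
print(solid)  # []
```

Word counts and keyword checks are crude proxies, so use the output as a review prompt. A short description can still be precise, and a skill can mention "tests" without actually running them.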
What’s coming
- Multi-agent orchestration built into IDEs (writer/reviewer/critic patterns as a first-class feature).
- Skill discovery — Claude Code suggesting skills you might want based on your repo.
- Sandboxed skills — running skill-provided scripts in a hermetic environment.
- Org-level skills — companies publishing internal skill catalogs.
The trajectory is clear: skills become the unit of organizational knowledge for AI workflows.
Read this next
- Anthropic Claude API + Tool Use Guide — the API foundation.
- Model Context Protocol (MCP) Explained — how skills compose with MCP.
- AI Agents with LangGraph in 2026 — A Practical Tutorial — agent shapes for non-coding tasks.
- Prompt Engineering Patterns That Survive Production — patterns that complement skills.
If you want a starter Claude Code Skills repo with the four core skills above wired up for a Python+TypeScript monorepo, it’s at rajpoot.dev.
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.