In 2026 the question for SaaS companies isn’t “should we build an API?” It’s “do we have an MCP server?” Millions of users now interact with their data through AI clients that speak the Model Context Protocol — Claude, Cursor, Zed, Windsurf, internal LLM platforms. A SaaS without an MCP server is invisible to all of them.
This post is the practical guide to shipping one.
If MCP is new to you, start with Model Context Protocol (MCP) Explained.
Why this is a distribution play
When a customer asks Claude “find my recent invoices in Stripe and email a summary,” what happens depends on whether Stripe has an MCP server.
- No MCP server: Claude can’t help. The user does it manually.
- Has MCP server: Claude calls Stripe’s MCP server, lists invoices, summarizes. The user is delighted.
The products winning AI-mediated distribution in 2026 (GitHub, Linear, Notion, Stripe, Postgres, Slack) all have MCP servers. The pattern is settling: the MCP server is the AI-era equivalent of a REST API.
If you sell a SaaS product, your MCP server is now part of your distribution surface. Treat it that way.
What to expose
Rule one: expose the high-frequency user surface, not every API method.
For a typical CRUD SaaS, the 5–10 tools are often:
- list_<thing>: paginated list with filters.
- get_<thing>(id): fetch by ID.
- search_<thing>(query): semantic / keyword search.
- create_<thing>(...): primary creation flow.
- update_<thing>(id, ...): common updates.
- archive/delete_<thing>(id): destructive (require confirmation).
- list_<related>(parent_id): relations.
- A handful of domain-specific actions (assign, mark_done, send).
Skip:
- Admin-only operations.
- Dangerous bulk ops without explicit confirmation.
- Anything where the user wouldn’t say “Claude, do X.”
Add resources for listable, addressable things:
- “List of recent invoices” → resource list.
- “This specific invoice” → resource by URI.
A good MCP server has 5–15 tools and a few resource types. Not 100 tools.
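To make the list-versus-single distinction concrete, here is a minimal sketch of how a server might tell the two kinds of resource apart by URI. The `billing://` scheme, the `parseResourceUri` helper, and the allowed ID characters are assumptions made up for illustration; they are not defined by MCP or by any SDK.

```typescript
// Hypothetical URI layout (an assumption for this sketch):
//   billing://invoices        -> the list of recent invoices
//   billing://invoices/{id}   -> one specific invoice
type ResourceRef =
  | { kind: "invoice_list" }
  | { kind: "invoice"; id: string };

const INVOICE_URI = /^billing:\/\/invoices(?:\/([A-Za-z0-9_-]+))?$/;

// Returns null for anything that is not a recognised resource URI.
function parseResourceUri(uri: string): ResourceRef | null {
  const match = INVOICE_URI.exec(uri);
  if (!match) return null;
  const id = match[1];
  return id === undefined ? { kind: "invoice_list" } : { kind: "invoice", id };
}
```

Keeping the URI grammar this small is deliberate: a resource handler only has to decide between "the collection" and "one item", and everything else is rejected.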
OAuth for hosted MCP
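The 2026 pattern: your MCP server is an HTTP service speaking MCP over a streaming HTTP transport. The example below uses SSE; note that newer revisions of the MCP spec favor the Streamable HTTP transport and deprecate the standalone SSE one, so check which transports your target clients support. Users authenticate via your existing OAuth layer.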
User opens Claude → adds your MCP server URL
↓
Claude tries to call list_invoices
↓
Server responds 401 + WWW-Authenticate header pointing to /authorize
↓
Claude opens browser to your /authorize endpoint
↓
User logs in, approves scopes, redirected back
↓
Claude has bearer token, calls list_invoices
↓
Server validates token, scopes, tenant — returns data
This is OAuth 2.1 with PKCE. Standard stuff. See Authentication in 2026 for the OAuth fundamentals.
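For readers who have not implemented PKCE before, the core of it is small. The sketch below shows the S256 method from RFC 7636 using Node's built-in crypto module (also available on Bun). The function names are my own; in practice the MCP client generates the verifier and challenge, and your authorization server only performs the final comparison.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Client side: a high-entropy random verifier. 32 random bytes encode to
// 43 base64url characters, the minimum length RFC 7636 allows.
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// Client side: the challenge sent with the /authorize request
// (code_challenge_method=S256).
function codeChallengeS256(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Server side, at the token endpoint: recompute the challenge from the
// presented verifier and compare it with the one stored for this auth code.
function verifyPkce(verifier: string, storedChallenge: string): boolean {
  return codeChallengeS256(verifier) === storedChallenge;
}
```

A stolen authorization code is useless without the verifier, which never leaves the client until the token exchange.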
The MCP-specific bits:
- Use an mcp scope or a scoped subset of your OAuth scopes.
- Bind the token to the user + tenant context.
- Audit every tool call as (user, tenant, tool, args, result).
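The three points above can be sketched as one small gate that runs before any tool handler. Everything named here is hypothetical: the `TokenClaims` shape, the `mcp:read` / `mcp:write` scope names, and the in-memory audit array stand in for whatever your OAuth layer and log store actually provide.

```typescript
// Claims we assume the access token yields after validation (hypothetical shape).
interface TokenClaims {
  userId: number;
  tenantId: number;
  scopes: string[];
}

// Hypothetical scope names mapped to tools; unknown tools are denied.
const REQUIRED_SCOPE: Record<string, string> = {
  list_invoices: "mcp:read",
  get_invoice: "mcp:read",
  delete_invoice: "mcp:write",
};

// One record per tool call: (user, tenant, tool, args, result).
interface AuditRecord {
  user: number;
  tenant: number;
  tool: string;
  args: unknown;
  result: "ok" | "denied";
  at: string;
}

// Checks the scope for a tool and records the decision, allowed or not.
function authorizeToolCall(
  claims: TokenClaims,
  tool: string,
  args: unknown,
  audit: AuditRecord[],
): boolean {
  const needed = REQUIRED_SCOPE[tool];
  const allowed = needed !== undefined && claims.scopes.includes(needed);
  audit.push({
    user: claims.userId,
    tenant: claims.tenantId,
    tool,
    args,
    result: allowed ? "ok" : "denied",
    at: new Date().toISOString(),
  });
  return allowed;
}
```

Denied calls are audited too, since a burst of them is often the first sign of a misconfigured client or a probing agent.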
A working TypeScript MCP server
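The cleanest 2026 stack: Hono on Bun + the official MCP SDK + your Postgres. You can deploy this anywhere that runs a container: Fly, Render, your own Kubernetes. Hono also runs on Cloudflare Workers, though there it runs on the Workers runtime rather than on Bun.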
// server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import { Hono } from "hono";
import { z } from "zod";
import { db } from "./db";
import { invoices } from "./schema";
import { eq, ilike, desc, and } from "drizzle-orm";
// Assumed helper (not shown): validates the bearer token and returns
// { userId, tenantId }, or null if the token is missing or invalid.
import { verifyBearer } from "./auth";

const app = new Hono();

// Mint a server per session so we can bind it to a user
function buildServer(userId: number, tenantId: number): McpServer {
  const server = new McpServer({ name: "billing-mcp", version: "1.0.0" });

  server.tool(
    "list_invoices",
    "List recent invoices for the current tenant. Returns 25 by default, up to 100.",
    {
      query: z.string().optional().describe("Optional search term, matches invoice number"),
      limit: z.number().int().min(1).max(100).default(25),
    },
    async ({ query, limit }) => {
      // Always filter by tenant; and() ignores the undefined branch when no query is given.
      const where = and(
        eq(invoices.tenantId, tenantId),
        query ? ilike(invoices.number, `%${query}%`) : undefined,
      );
      const rows = await db.query.invoices.findMany({
        where,
        orderBy: desc(invoices.createdAt),
        limit,
      });
      // One invoice per line, formatted for the model to read.
      const text = rows
        .map((r) => `${r.number} ${(r.amountCents / 100).toFixed(2)} ${r.currency} ${r.status} ${r.createdAt.toISOString()}`)
        .join("\n");
      return { content: [{ type: "text", text: text || "No invoices found." }] };
    }
  );

  server.tool(
    "get_invoice",
    "Fetch a single invoice by ID.",
    { id: z.string() },
    async ({ id }) => {
      // Scope the lookup to the tenant in the query itself, so another
      // tenant's row is never loaded.
      const inv = await db.query.invoices.findFirst({
        where: and(eq(invoices.id, id), eq(invoices.tenantId, tenantId)),
      });
      if (!inv) {
        return { content: [{ type: "text", text: "Not found." }], isError: true };
      }
      return { content: [{ type: "text", text: JSON.stringify(inv, null, 2) }] };
    }
  );

  return server;
}

app.get("/sse", async (c) => {
  const auth = await verifyBearer(c.req.header("authorization"));
  if (!auth) return c.text("unauthorized", 401);
  const server = buildServer(auth.userId, auth.tenantId);
  // Caveat: SSEServerTransport takes (messageEndpointPath, res), where res is a
  // Node.js http.ServerResponse. Hono's c.res is a Fetch Response, so this wiring
  // needs a Node-style response or an adapter, plus a POST route for the message
  // endpoint, before it runs end to end. Treat it as the shape, not the final code.
  const transport = new SSEServerTransport("/sse", c.res);
  await server.connect(transport);
});

export default { port: 8080, fetch: app.fetch };
The key shape:
- One McpServer per session, bound to the authenticated user/tenant.
- Tools take Zod schemas; parameters arrive validated and typed.
- Each tool enforces tenant authorization before reading.
- Output is plain text formatted for the LLM to read (not JSON for code).
For the surrounding stack see Modern TypeScript Backend with Hono on Bun and Drizzle ORM Deep Dive.
How users install it
Two flows:
1. Per-user (Claude Code, Cursor)
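The user adds an entry like this to their client's MCP config. The file path and key names below are illustrative and vary by client, so check your target client's docs:
// ~/.claude/mcp_settings.json
{
  "mcpServers": {
    "billing": {
      "url": "https://api.example.com/mcp/sse",
      "auth": { "type": "oauth", "client_id": "..." }
    }
  }
}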
The client handles the OAuth flow on first use. Subsequent calls reuse the stored access token, and the client uses the refresh token to get a new one when it expires.
2. In-product
If your SaaS has its own AI assistant, your MCP server is just one of many tools that assistant can call. The assistant authenticates as a service account; the user’s identity is passed through context.
Both flows are common. Support both if you want maximum reach.
Observability
Every tool call should log:
- Timestamp + duration.
- Tool name + arg fingerprint (sha256 of args; don’t log PII).
- User + tenant.
- Success / error.
- Tokens estimated (for billing).
// Registered inside buildServer, so userId and tenantId come from its closure.
// actuallyList and log are placeholders for your own query and logging functions.
server.tool("list_invoices", "...", { ... }, async (args) => {
  const start = performance.now();
  try {
    const result = await actuallyList(args);
    log({ tool: "list_invoices", user: userId, tenant: tenantId, ms: performance.now() - start, ok: true });
    return result;
  } catch (e) {
    log({ tool: "list_invoices", user: userId, tenant: tenantId, ms: performance.now() - start, ok: false, error: String(e) });
    throw e; // rethrow so the failure still reaches the client
  }
});
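The arg fingerprint from the list above can be computed along these lines. This is a sketch under my own assumptions: `stableStringify` and `argFingerprint` are hypothetical helper names, and sorting object keys is one way (not the only way) to make equal arguments hash equally regardless of key order.

```typescript
import { createHash } from "node:crypto";

// Serialise with object keys sorted, so {a:1,b:2} and {b:2,a:1} produce
// the same string. Undefined properties are skipped, as JSON.stringify does.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value) ?? "null";
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .filter((k) => obj[k] !== undefined)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
  return `{${entries.join(",")}}`;
}

// SHA-256 of the canonical form, hex-encoded (64 characters).
function argFingerprint(args: unknown): string {
  return createHash("sha256").update(stableStringify(args)).digest("hex");
}
```

The fingerprint lets you group identical calls in logs without storing the arguments. One caveat: a plain hash of low-entropy values (a short ID, an email address) can be guessed by brute force, so a keyed HMAC may be a better fit if the arguments themselves are sensitive.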
Pair with OpenTelemetry End-to-End so MCP traces correlate with the rest of your service.
Rate limiting and cost
MCP traffic looks different from REST API traffic. An LLM might call list_invoices 10× in a row exploring options. Rate-limit accordingly:
- Per-user: 100/min for read tools; 10/min for write tools.
- Per-tenant: a higher ceiling.
- Per-tool: tighter on expensive operations.
See Design a Rate Limiter for the algorithm patterns and the Redis-backed implementation.
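As a starting point, the per-user limits above fit in a few lines. This is a deliberately simple in-memory fixed-window sketch with hypothetical names (`FixedWindowLimiter`, `ToolKind`); it only works within a single process, so a production deployment would more likely keep the counters in a shared store such as Redis and might prefer a sliding window or token bucket.

```typescript
type ToolKind = "read" | "write";

// Per-user limits from the text: 100/min for read tools, 10/min for write tools.
const LIMIT_PER_MINUTE: Record<ToolKind, number> = { read: 100, write: 10 };
const WINDOW_MS = 60_000;

interface WindowState {
  windowStart: number;
  count: number;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();

  // Returns true and counts the call if it is within the limit.
  // `now` is injectable so the behaviour can be tested without waiting.
  allow(userId: number, kind: ToolKind, now: number = Date.now()): boolean {
    const key = `${userId}:${kind}`;
    const state = this.windows.get(key);
    // No window yet, or the previous one has expired: start a fresh window.
    if (!state || now - state.windowStart >= WINDOW_MS) {
      this.windows.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= LIMIT_PER_MINUTE[kind]) return false;
    state.count += 1;
    return true;
  }
}
```

Keying on user and tool kind means an agent looping on list calls cannot exhaust the same user's write budget, and vice versa.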
Safety: destructive operations
For tools that modify or delete:
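// Registered inside buildServer, so tenantId comes from its closure.
server.tool(
  "delete_invoice",
  "Permanently delete an invoice. This cannot be undone.",
  { id: z.string(), confirm: z.literal("yes-i-am-sure") },
  async ({ id }) => {
    // Confirmation gate baked into the schema: the call fails validation
    // unless confirm is exactly "yes-i-am-sure".
    // Scope the delete to the caller's tenant so one tenant can never
    // delete another tenant's invoice by guessing an ID.
    const deleted = await db
      .delete(invoices)
      .where(and(eq(invoices.id, id), eq(invoices.tenantId, tenantId)))
      .returning({ id: invoices.id });
    if (deleted.length === 0) {
      return { content: [{ type: "text", text: "Not found." }], isError: true };
    }
    return { content: [{ type: "text", text: `Deleted ${id}` }] };
  }
);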
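Most MCP clients show the pending tool call and ask the user to approve it before execution. The confirm literal is belt-and-suspenders: the model has to supply it deliberately, so an accidental or under-specified call fails schema validation. It is not a substitute for client-side approval, since the model can still fill it in on its own.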
For very risky operations, require step-up auth (recent re-authentication) at the OAuth layer.
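A step-up check can be as small as comparing the time of the last interactive login against a freshness window. In this sketch the `authTime` field (modelled on the OIDC `auth_time` claim), the five-minute default, and the `needsStepUp` name are all assumptions; your OAuth layer decides what is actually available on the token.

```typescript
// Hypothetical claim: Unix time in seconds of the user's last interactive login.
interface SessionClaims {
  authTime: number;
}

// Assumed freshness window for risky operations: 5 minutes.
const STEP_UP_MAX_AGE_SECONDS = 5 * 60;

// True when the last login is older than the window, meaning the server
// should refuse the call and ask the client to re-authenticate the user.
function needsStepUp(
  claims: SessionClaims,
  nowSeconds: number,
  maxAgeSeconds: number = STEP_UP_MAX_AGE_SECONDS,
): boolean {
  return nowSeconds - claims.authTime > maxAgeSeconds;
}
```

A destructive tool would call this first and return an error asking for re-authentication instead of proceeding when it returns true.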
The MCP server as marketing
A polished MCP server is a marketing surface. Tweet about it. Add a “Connect to Claude” button on your docs. List your server in community catalogs and the various MCP registries.
The companies investing here in 2026 are the companies showing up when users say “Claude, do X with my Y.” That visibility compounds.
Checklist
For a production MCP server:
- OAuth 2.1 + PKCE auth flow.
- Per-tenant authorization enforced in every tool.
- 5–15 well-described tools (not 100).
- Resources for listable, addressable data.
- Rate limiting per user / per tool.
- Audit log of every tool call.
- Confirmation for destructive operations.
- OpenTelemetry instrumented.
- Documented for users (how to add the server, scopes, what each tool does).
- Listed in MCP catalogs.
Read this next
- Model Context Protocol (MCP) Explained
- Anthropic Claude API + Tool Use Guide
- Authentication in 2026 — Passkeys, OAuth 2.1, OIDC
- Modern TypeScript Backend with Hono on Bun
If you want a complete TypeScript MCP server starter (Hono + Drizzle + Postgres + OAuth + OTel + rate limiting), it’s at rajpoot.dev.