Chapter 8: JSON Schema. Pydantic models → JSON Schema → OpenAPI / docs / SDK generation / LLM tool calling.

model_json_schema

from pydantic import BaseModel

class User(BaseModel):
    id: int
    email: str
    name: str | None = None  # optional; defaults to null

User.model_json_schema()

Returns a JSON Schema dict.

{
  "title": "User",
  "type": "object",
  "properties": {
    "id": {"type": "integer", "title": "Id"},
    "email": {"type": "string", "title": "Email"},
    "name": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, "title": "Name"}
  },
  "required": ["id", "email"]
}

FastAPI uses this schema to generate its OpenAPI document. The same dict also works as an LLM tool-calling schema.
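To see how defaults and optional types map onto the schema, here is a small runnable sketch of the model above. It uses Optional[str] instead of str | None so it also runs on older Python versions; the two spellings are equivalent.

```python
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int
    email: str
    name: Optional[str] = None  # same meaning as `str | None = None`


schema = User.model_json_schema()

# Fields without a default end up in "required".
print(schema["required"])

# An optional field becomes an anyOf of the inner type and null.
print(schema["properties"]["name"]["anyOf"])
```

The field with a default is left out of "required", which is what tells clients they may omit it.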

Mode

User.model_json_schema(mode="validation")     # default; for input
User.model_json_schema(mode="serialization")  # for output

The two modes can produce different shapes, for example when a model has computed fields or excluded fields.
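A minimal sketch of the difference, using a hypothetical Rect model with a computed field. The computed field is produced on output only, so it shows up in the serialization schema and not in the validation schema.

```python
from pydantic import BaseModel, computed_field


class Rect(BaseModel):  # hypothetical model, for illustration only
    width: int
    height: int = 1

    @computed_field
    @property
    def area(self) -> int:
        return self.width * self.height


validation = Rect.model_json_schema(mode="validation")
serialization = Rect.model_json_schema(mode="serialization")

# Input schema: only the fields a caller can send.
print(sorted(validation["properties"]))

# Output schema: also includes the computed field.
print(sorted(serialization["properties"]))
```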

json_schema_extra

class User(BaseModel):
    email: str
    
    model_config = {
        "json_schema_extra": {
            "examples": [
                {"email": "user@example.com"},
            ]
        }
    }

json_schema_extra is merged into the generated schema. The examples then surface in Swagger UI.

Field-level

from pydantic import BaseModel, Field

class User(BaseModel):
    email: str = Field(..., description="User's email address", examples=["user@example.com"])

Field() sets the per-field title, description, and examples.
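As a quick check of what Field() metadata does to the output, this sketch prints the schema for a single property. The title text and the example address are placeholders, not values from any real system.

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    email: str = Field(
        title="Email address",               # placeholder title
        description="User's email address",
        examples=["user@example.com"],       # placeholder example value
    )


email_schema = User.model_json_schema()["properties"]["email"]

# The metadata is copied into the property's schema next to its type.
print(email_schema)
```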

Custom schema for type

class MyType:
    @classmethod
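    def __get_pydantic_json_schema__(cls, schema, handler):
        # Let Pydantic build the default JSON Schema first...
        result = handler(schema)
        # ...then layer documentation metadata on top of it.
        result["title"] = "MyType"
        result["description"] = "Custom type for X"
        result["examples"] = ["example1"]
        return result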

Customize how a type appears in JSON Schema.
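A runnable variant of the hook, sketched on a hypothetical Money model. It follows the pattern of calling the handler and then handler.resolve_ref_schema(), so the metadata lands on the real definition even when the handler returns a $ref.

```python
from pydantic import BaseModel, GetJsonSchemaHandler
from pydantic.json_schema import JsonSchemaValue
from pydantic_core import core_schema


class Money(BaseModel):  # hypothetical model, for illustration only
    amount: int
    currency: str

    @classmethod
    def __get_pydantic_json_schema__(
        cls, schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler
    ) -> JsonSchemaValue:
        json_schema = handler(schema)
        # If the handler returned a {"$ref": ...}, get the referenced definition.
        json_schema = handler.resolve_ref_schema(json_schema)
        json_schema["description"] = "An amount in minor units plus an ISO currency code"
        json_schema["examples"] = [{"amount": 1999, "currency": "EUR"}]
        return json_schema


money_schema = Money.model_json_schema()
print(money_schema["description"])
```

The hook only changes the generated schema; validation of Money instances is untouched.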

OpenAPI in FastAPI

FastAPI generates response-model schemas in serialization mode and request-model schemas in validation mode. Models used in more than one place are deduplicated through $ref.

To inspect:

import json

schema = app.openapi()
print(json.dumps(schema, indent=2))

Refs / definitions

For nested models, Pydantic generates $ref:

{
  "type": "object",
  "properties": {
    "user": {"$ref": "#/$defs/User"}
  },
  "$defs": {
    "User": {...}
  }
}

OpenAPI uses #/components/schemas/User. FastAPI translates between formats.
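The ref_template argument of model_json_schema() controls where references point. A small sketch with hypothetical User and Order models shows the default $defs references and an OpenAPI-style prefix side by side.

```python
from pydantic import BaseModel


class User(BaseModel):
    id: int


class Order(BaseModel):  # hypothetical model that nests User
    user: User


default_schema = Order.model_json_schema()
# Nested models are referenced, not inlined.
print(default_schema["properties"]["user"])

openapi_style = Order.model_json_schema(
    ref_template="#/components/schemas/{model}"
)
# Same structure, but references use the OpenAPI location.
print(openapi_style["properties"]["user"])
```

Only the reference strings change; moving the definitions themselves under components/schemas is left to the caller, which is the part FastAPI handles for you.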

Discriminated union schema

A discriminated union generates oneOf plus a discriminator object:

{
  "oneOf": [
    {"$ref": "#/$defs/Cat"},
    {"$ref": "#/$defs/Dog"}
  ],
  "discriminator": {"propertyName": "kind", "mapping": {...}}
}

Code generators and docs tools can use the discriminator to produce cleaner SDKs and documentation.
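Here is a sketch of models that would produce a schema of that shape. Cat, Dog, and Owner are illustrative names; the discriminator is declared with Field(discriminator="kind").

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class Cat(BaseModel):
    kind: Literal["cat"]
    lives: int


class Dog(BaseModel):
    kind: Literal["dog"]
    breed: str


class Owner(BaseModel):
    # The "kind" field decides which model validates the payload.
    pet: Union[Cat, Dog] = Field(discriminator="kind")


pet_schema = Owner.model_json_schema()["properties"]["pet"]
print(pet_schema["oneOf"])
print(pet_schema["discriminator"])
```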

For LLM tool calling

from typing import Literal
from pydantic import BaseModel, Field

class WeatherInput(BaseModel):
    city: str = Field(..., description="City name")
    units: Literal["c", "f"] = "c"

tools = [{
    "name": "get_weather",
    "description": "Get current weather",
    "input_schema": WeatherInput.model_json_schema(),
}]

LLM providers consume the JSON Schema. See LLM Tool Use Patterns.

Examples in OpenAPI

class UserCreate(BaseModel):
    email: str
    age: int
    
    model_config = {
        "json_schema_extra": {
            "examples": [
                {"email": "alice@example.com", "age": 30},
                {"email": "bob@example.com", "age": 25},
            ]
        }
    }

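Multiple examples are allowed. Depending on the Swagger UI version, only one of these JSON Schema examples may be rendered; for a selectable dropdown, FastAPI offers the openapi_examples parameter on Body() and similar functions.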

Nullable types

class M(BaseModel):
    x: int | None

JSON Schema:

{"x": {"anyOf": [{"type": "integer"}, {"type": "null"}]}}

OpenAPI 3.0 used "nullable": true; 3.1 uses anyOf with null. Recent FastAPI versions emit 3.1 by default. Note that x has no default here, so it is nullable but still required.
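Nullable and optional are separate properties, which is easy to miss. This sketch contrasts a required nullable field with an optional one; Optional[int] is used as the portable spelling of int | None.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class M(BaseModel):
    x: Optional[int]          # required, but the value may be null
    y: Optional[int] = None   # may be omitted entirely


schema = M.model_json_schema()
print(schema["required"])  # only x must be present

# Passing null explicitly is fine...
ok = M(x=None)

# ...but leaving x out is a validation error.
try:
    M()
    missing_x_rejected = False
except ValidationError:
    missing_x_rejected = True
print(missing_x_rejected)
```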

Strict types in schema

from pydantic import BaseModel, StrictInt

class M(BaseModel):
    age: StrictInt

The schema is the same as for a plain int. Strictness is enforced only at runtime, so JSON Schema consumers can't tell the difference.
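To make the point concrete, this sketch compares a lax and a strict model (both names are illustrative). The generated property schemas match, while runtime behavior differs for the string "5".

```python
from pydantic import BaseModel, StrictInt, ValidationError


class Lax(BaseModel):
    age: int


class Strict(BaseModel):
    age: StrictInt


lax_schema = Lax.model_json_schema()["properties"]["age"]
strict_schema = Strict.model_json_schema()["properties"]["age"]
print(lax_schema == strict_schema)

# Lax mode coerces a numeric string to int.
coerced = Lax(age="5").age

# Strict mode rejects the same input.
try:
    Strict(age="5")
    strict_rejected = False
except ValidationError:
    strict_rejected = True
print(coerced, strict_rejected)
```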

Generating SDKs

OpenAPI Generator, openapi-typescript-codegen, and fastapi-codegen all consume the OpenAPI schema. For example:

npx openapi-typescript-codegen --input openapi.json --output sdk

This produces a typed TypeScript client with per-endpoint methods and request / response types derived from your Pydantic models.

Schema for documentation

For non-OpenAPI use (e.g., a README JSON Schema):

import json
print(json.dumps(User.model_json_schema(), indent=2))

Drop into docs.

Ref naming

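By default, Pydantic names each definition after the class and falls back to a module-qualified name only when two models share a class name. To change where references point, pass ref_template:

User.model_json_schema(ref_template="#/components/schemas/{model}")

Note that the json_schema_serialization_defaults_required config setting does not affect ref names: it marks fields with defaults as required in the serialization-mode schema.

For full control over naming, subclass GenerateJsonSchema and pass it as schema_generator. To adjust the schema of a specific type, override __get_pydantic_json_schema__.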

Common mistakes

1. Using stringified types in schema

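some_field: "User" (forward ref) without model_rebuild() → the reference can stay unresolved. In Pydantic v2 this typically surfaces as a "not fully defined" error rather than a usable schema.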
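A sketch of the fix, with hypothetical Team and Member models: define the referenced model, then call model_rebuild() on the model that holds the forward reference before generating the schema.

```python
from typing import List

from pydantic import BaseModel


class Team(BaseModel):
    # Forward reference: Member is not defined yet at this point.
    members: List["Member"]


class Member(BaseModel):
    name: str


# Resolve the forward reference now that Member exists.
Team.model_rebuild()

team_schema = Team.model_json_schema()
print(team_schema["properties"]["members"]["items"])
```

Pydantic often retries resolution on its own at first use, but an explicit model_rebuild() makes the ordering dependency visible.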

2. Wrong mode

mode="validation" shows input schema. mode="serialization" shows output. They differ for models with computed fields, aliases, exclude.

3. Massive json_schema_extra

Bloats the schema; harder to read. Examples + description; not novels.

4. Custom schema breaking validation

__get_pydantic_json_schema__ changes only the generated schema, not runtime validation, so the two can easily drift apart.
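The same drift happens with any schema-only metadata. In this sketch (model names are illustrative), one model advertises a minimum through json_schema_extra while the other enforces it with ge=0. The schemas agree, but only the second rejects a negative value.

```python
from pydantic import BaseModel, Field, ValidationError


class Drifted(BaseModel):
    # Schema-only: documents a minimum, but nothing enforces it.
    age: int = Field(json_schema_extra={"minimum": 0})


class Enforced(BaseModel):
    # A real constraint: validated at runtime and reflected in the schema.
    age: int = Field(ge=0)


drifted_schema = Drifted.model_json_schema()["properties"]["age"]
enforced_schema = Enforced.model_json_schema()["properties"]["age"]
print(drifted_schema["minimum"], enforced_schema["minimum"])

# The schema-only model accepts a value its own schema forbids.
accepted = Drifted(age=-5).age

try:
    Enforced(age=-5)
    enforced_rejected = False
except ValidationError:
    enforced_rejected = True
print(accepted, enforced_rejected)
```

Where a real constraint exists, prefer it over hand-written schema keys so the schema and the validator cannot disagree.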

5. Trusting LLMs to produce schema-correct output

Even with structured output: validate at receipt. Don’t trust schema alone.
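Validating at receipt can be as small as this sketch. parse_tool_input is a hypothetical helper, and the raw JSON strings stand in for whatever arguments a provider returns for a tool call.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class WeatherInput(BaseModel):
    city: str = Field(description="City name")
    units: Literal["c", "f"] = "c"


def parse_tool_input(raw_json: str) -> Optional[WeatherInput]:
    """Hypothetical helper: return validated input, or None if it is invalid."""
    try:
        return WeatherInput.model_validate_json(raw_json)
    except ValidationError as exc:
        # In a real loop the error text could be sent back to the model for a retry.
        print(f"rejected tool input: {exc.error_count()} error(s)")
        return None


good = parse_tool_input('{"city": "Paris"}')
bad = parse_tool_input('{"city": "Paris", "units": "kelvin"}')
print(good, bad)
```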

What’s next

Chapter 9: Settings.


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.