Cheatsheet for Pydantic perf.

Schemas compile once

A model defined at module scope compiles its core schema once, at class definition. Every later validation reuses that compiled schema.

TypeAdapter at module scope

from pydantic import TypeAdapter

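# Module scope (good): the schema for list[User] is built once, at import.
# User is assumed to be a BaseModel defined elsewhere.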
USERS_ADAPTER = TypeAdapter(list[User])

def parse(data):
    return USERS_ADAPTER.validate_python(data)

Don’t recreate it on every call:

# BAD
def parse(data):
    adapter = TypeAdapter(list[User])   # rebuilds schema per call
    return adapter.validate_python(data)

model_validate_json over json.loads + model_validate

# Slower
data = json.loads(raw)
user = User.model_validate(data)

# Faster (Rust JSON path)
user = User.model_validate_json(raw)

Use model_validate_json in hot paths.
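To check the difference on your own payloads, a small timeit comparison is usually enough. The sketch below assumes a hypothetical two-field User model and a tiny payload; the absolute numbers depend on your machine and data, so treat it as a measuring template rather than a benchmark result.

```python
import json
import timeit

from pydantic import BaseModel


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


raw = json.dumps({"id": 1, "email": "a@example.com"})


def two_step() -> User:
    # Parse JSON in Python, then validate the resulting dict.
    return User.model_validate(json.loads(raw))


def one_step() -> User:
    # Parse and validate in a single call.
    return User.model_validate_json(raw)


n = 5_000
t_two = timeit.timeit(two_step, number=n)
t_one = timeit.timeit(one_step, number=n)
print(f"json.loads + model_validate: {t_two:.4f}s for {n} calls")
print(f"model_validate_json:         {t_one:.4f}s for {n} calls")
```

Run it with a payload that looks like your real traffic; small and large documents can behave differently.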

model_dump_json over json.dumps + model_dump

# Slower
json.dumps(user.model_dump())

# Faster (Rust serializer)
user.model_dump_json()

defer_build

class M(BaseModel):
    model_config = {"defer_build": True}

The schema compiles on first use instead of at class definition. This speeds up startup when there are many models, at the price of a slower first validation.

Frozen for hashability

class Coord(BaseModel):
    model_config = {"frozen": True}
    x: float
    y: float

# Now usable as dict key / in sets

Frozen models are immutable and hashable, so instances can serve as cache keys. A micro-benefit, but a real one in hot paths.
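A short usage sketch of the Coord model above, showing what frozen buys you: two instances with equal fields hash the same, and assignment is rejected. The variable names are illustrative only.

```python
from pydantic import BaseModel, ValidationError


class Coord(BaseModel):
    model_config = {"frozen": True}
    x: float
    y: float


a = Coord(x=1.0, y=2.0)
b = Coord(x=1.0, y=2.0)

# Equal field values hash the same, so b finds the entry stored under a.
labels = {a: "first point"}
found = labels[b]
unique = {a, b}

# Assigning to a field of a frozen model raises a ValidationError.
try:
    a.x = 5.0
    mutated = True
except ValidationError:
    mutated = False
```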

Discriminated unions over plain unions

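Untagged Cat | Dog: Pydantic tries each member, and a failure reports errors for every one of them. Tagged with Field(discriminator="kind"): Pydantic reads the kind field and validates only against the matching member, which gives O(1) dispatch.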
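A minimal sketch of a tagged union. The Cat and Dog fields (lives, good_boy) and the PET_ADAPTER name are made up for illustration; the only requirement is that every member declares the discriminator field as a Literal.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class Cat(BaseModel):
    kind: Literal["cat"]
    lives: int  # hypothetical field


class Dog(BaseModel):
    kind: Literal["dog"]
    good_boy: bool  # hypothetical field


# The discriminator tells Pydantic which member to validate against.
Pet = Annotated[Union[Cat, Dog], Field(discriminator="kind")]
PET_ADAPTER = TypeAdapter(Pet)  # module scope, built once

pet = PET_ADAPTER.validate_python({"kind": "dog", "good_boy": True})

# An unknown tag fails fast with one error instead of one per member.
try:
    PET_ADAPTER.validate_python({"kind": "fish"})
    error_count = 0
except ValidationError as exc:
    error_count = len(exc.errors())
```

Besides the faster dispatch, error messages get shorter: a bad Dog payload reports only Dog errors.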

Pydantic v2 vs msgspec

                     Pydantic v2                                   msgspec
Engine               Rust                                          C
Validation speed     Fast                                          Faster (esp. JSON)
Features             Rich (validators, serializers, JSON Schema)   Minimal
Ecosystem            Massive                                       Smaller
FastAPI integration  Native                                        Manual

For most apps: Pydantic. For 100k+ msg/s decode: msgspec.

msgspec example

import msgspec

class User(msgspec.Struct):
    id: int
    email: str

user = msgspec.json.decode(raw_bytes, type=User)

Roughly 5-10× faster than Pydantic for narrow, decode-heavy work; measure on your own payloads.

Avoid validation overhead in trusted code

# Slow: validation every time
user = User.model_validate(data)

# Faster: model_construct (skips validation; you swear data is correct)
user = User.model_construct(**data)

Don’t trust input from outside the system. Reserve model_construct for internal conversions of data you already trust: it skips type coercion and validators as well as type checks.
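The trade-off is easy to see with deliberately bad data. This sketch assumes a hypothetical two-field User model; model_construct accepts the wrong types silently, while model_validate rejects them.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


# Trusted, already-correct data: model_construct just assigns the fields.
fast = User.model_construct(id=1, email="a@example.com")

# Wrong types pass through untouched, because nothing is checked.
bad = User.model_construct(id="not-an-int", email=None)

# The validating path refuses the same data.
try:
    User.model_validate({"id": "not-an-int", "email": None})
    rejected = False
except ValidationError:
    rejected = True
```

An instance like bad will only fail later, far from where the data entered, which is why this belongs strictly on internal paths.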

Schema caching

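Schema caching is automatic: each model class keeps its compiled schema and validator, so there is nothing to switch on (unlike structlog's cache_logger_on_first_use). Set model_config["arbitrary_types_allowed"] only when a field type genuinely needs it; such fields get just an isinstance check, and it can slow validation.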

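Avoid large mutable defaults

# Avoid: Pydantic copies a mutable default for every new instance
class M(BaseModel):
    items: list[Item] = []

Use Field(default_factory=list) instead. Unlike a plain Python class, Pydantic does not share the default between instances, but it pays for a copy each time, and the factory makes the intent explicit.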
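A minimal sketch of the default_factory form, assuming hypothetical Item and Cart models. Each instance gets its own fresh list from the factory.

```python
from pydantic import BaseModel, Field


class Item(BaseModel):  # hypothetical model, for illustration only
    name: str


class Cart(BaseModel):  # hypothetical model, for illustration only
    # The factory runs once per instance, so no default is copied.
    items: list[Item] = Field(default_factory=list)


a = Cart()
b = Cart()
a.items.append(Item(name="pen"))

# b is unaffected by the append on a.
a_count = len(a.items)
b_count = len(b.items)
```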

Streaming validation for huge inputs

For very large lists:

adapter = TypeAdapter(User)

def stream_parse(records):
    for r in records:
        yield adapter.validate_python(r)

Don’t accumulate in memory.
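When the input is newline-delimited JSON, the same generator idea works with validate_json, so each line is parsed and validated in one step. This is a sketch under assumptions: the User model, the stream_parse_lines name, and the in-memory source are all illustrative; in practice the lines would come from a file or socket.

```python
import io
from typing import Iterable, Iterator

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


USER_ADAPTER = TypeAdapter(User)  # module scope, built once


def stream_parse_lines(lines: Iterable[str]) -> Iterator[User]:
    """Yield one validated User per non-empty line of JSON."""
    for line in lines:
        stripped = line.strip()
        if stripped:
            yield USER_ADAPTER.validate_json(stripped)


# Stand-in for a large file; only one record is held at a time.
source = io.StringIO(
    '{"id": 1, "email": "a@example.com"}\n'
    '{"id": 2, "email": "b@example.com"}\n'
)
total = sum(user.id for user in stream_parse_lines(source))
```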

Concurrent validation

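Validation is CPU-bound, and threads share the GIL (mostly), so offloading to threads mainly keeps the event loop responsive:

import asyncio

async def validate_many(items):
    # Each item is validated in a worker thread; results keep input order.
    return await asyncio.gather(*[
        asyncio.to_thread(User.model_validate, it) for it in items
    ])

Expect a marginal benefit at best: any gain depends on how much of the work Pydantic’s Rust core does with the GIL released, and the thread hand-off adds overhead per item. Profile.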

Profiling

python -m cProfile -o profile.out script.py
snakeviz profile.out

Find where validation dominates.
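The same measurement can be taken in-process with the standard library's cProfile and pstats, which is handy when only one code path is of interest. The workload function and User model below are placeholders for your own hot path.

```python
import cProfile
import io
import pstats

from pydantic import BaseModel


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


def workload() -> int:
    # Placeholder hot path: validate a batch of dicts.
    rows = [{"id": i, "email": f"u{i}@example.com"} for i in range(5_000)]
    return len([User.model_validate(r) for r in rows])


profiler = cProfile.Profile()
profiler.enable()
count = workload()
profiler.disable()

# Print the ten entries with the highest cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If model_validate sits near the top of the cumulative column, validation is worth optimizing; if not, look elsewhere first.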

When perf matters

  • API endpoints serving > 5k req/s.
  • Streaming pipelines processing > 100k msg/s.
  • Long batch jobs with a tight cost budget.

Most apps: Pydantic isn’t the bottleneck.

Common mistakes

  • Recreating TypeAdapter / schema per call.
  • json.dumps(model.model_dump()) in hot paths.
  • Wide unions without discriminator.
  • Enabling strict mode globally: it adds checks and sometimes slows things down.

Read this next

If you want my pydantic-vs-msgspec benchmark, it’s at rajpoot.dev.

