Cheatsheet for Pydantic perf.
Schemas compile once
Module-scope models compile their core schema once, at class definition. Subsequent validations reuse it.
TypeAdapter at module scope
from pydantic import TypeAdapter
# Module-scope (good): the schema is built once, at import time
USERS_ADAPTER = TypeAdapter(list[User])
def parse(data):
    return USERS_ADAPTER.validate_python(data)
Don’t recreate per-call:
# BAD
def parse(data):
    adapter = TypeAdapter(list[User])  # rebuilds schema per call
    return adapter.validate_python(data)
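A minimal, self-contained sketch of the two patterns above so you can time them yourself. The `User` fields and the `parse_rebuilding` helper are assumptions made up for illustration, and the timings depend on your machine and model complexity.

```python
import timeit

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


# Built once at import time; every call to parse() reuses the compiled schema.
USERS_ADAPTER = TypeAdapter(list[User])


def parse(data: list[dict]) -> list[User]:
    return USERS_ADAPTER.validate_python(data)


def parse_rebuilding(data: list[dict]) -> list[User]:
    # Anti-pattern: constructs a new adapter (and its schema) on every call.
    return TypeAdapter(list[User]).validate_python(data)


sample = [{"id": "1", "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
users = parse(sample)

# Rough comparison only; run it on your own workload before drawing conclusions.
print("reused adapter :", timeit.timeit(lambda: parse(sample), number=200))
print("rebuilt adapter:", timeit.timeit(lambda: parse_rebuilding(sample), number=200))
```

Both functions return the same result; only the per-call setup cost differs.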
model_validate_json over json.loads + model_validate
import json
# Slower: parse into Python objects first, then validate them
data = json.loads(raw)
user = User.model_validate(data)
# Faster (Rust JSON path): parse and validate in one step
user = User.model_validate_json(raw)
Use model_validate_json in hot paths.
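A runnable sketch of both paths side by side, assuming a small hypothetical `User` model and a hand-written `raw` payload. For plain fields like these the two paths produce equal models; only the work done to get there differs.

```python
import json

from pydantic import BaseModel


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


raw = b'{"id": 1, "email": "a@example.com"}'

# Two steps: build a Python dict, then validate it.
via_python = User.model_validate(json.loads(raw))

# One step: str or bytes go straight into the JSON validator.
via_json = User.model_validate_json(raw)
```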
model_dump_json over json.dumps + model_dump
# Slower (and raises TypeError on non-JSON types such as datetime unless you dump with mode="json")
json.dumps(user.model_dump())
# Faster (Rust serializer)
user.model_dump_json()
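A small sketch comparing the two serialization routes on a model with a `datetime` field. The `Event` model is an assumption for illustration. The slow route needs `mode="json"` so that `json.dumps` receives only JSON-compatible values.

```python
import json
from datetime import datetime, timezone

from pydantic import BaseModel


class Event(BaseModel):  # hypothetical model, for illustration only
    name: str
    at: datetime


event = Event(name="deploy", at=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))

# One pass through Pydantic's own serializer.
fast = event.model_dump_json()

# Two passes: dump to JSON-compatible Python values, then encode with the stdlib.
slow = json.dumps(event.model_dump(mode="json"))
```

The two strings can differ in whitespace, so compare them after decoding rather than byte for byte.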
defer_build
from pydantic import BaseModel
class M(BaseModel):
    model_config = {"defer_build": True}
The schema compiles on first use instead of at class definition, which speeds up startup in codebases with many models. The cost moves to the first validation.
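A sketch of deferring the build and then optionally paying for it at a moment you choose. The `Settings` model is hypothetical. The explicit `model_rebuild(force=True)` warm-up is one possible approach, not something Pydantic requires.

```python
from pydantic import BaseModel, ConfigDict


class Settings(BaseModel):  # hypothetical model, for illustration only
    model_config = ConfigDict(defer_build=True)

    name: str
    retries: int = 3


# Nothing has been compiled yet; the first use triggers the schema build.
settings = Settings(name="worker")

# Optional: force the build at a known point (for example at the end of startup)
# so the first real request does not absorb the compile cost.
Settings.model_rebuild(force=True)
warmed = Settings(name="api", retries=5)
```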
Frozen for hashability
class Coord(BaseModel):
    model_config = {"frozen": True}
    x: float
    y: float
# Now usable as dict key / in sets
A micro-benefit, but a real one in hot paths that hash or deduplicate instances.
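A short sketch of what frozen buys you: instances hash by field values, so they deduplicate in sets and work as dict keys, and assignment is rejected. The `visited` and `labels` names are made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class Coord(BaseModel):
    model_config = {"frozen": True}

    x: float
    y: float


# Equal field values hash equally, so the set keeps a single entry.
visited = {Coord(x=1.0, y=2.0), Coord(x=1.0, y=2.0)}

# Frozen instances can be used as dictionary keys.
labels = {Coord(x=0.0, y=0.0): "origin"}

# Assigning to a field of a frozen model raises a ValidationError.
try:
    Coord(x=0.0, y=0.0).x = 5.0
    mutated = True
except ValidationError:
    mutated = False
```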
Discriminated unions over plain unions
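With an untagged Cat | Dog, Pydantic may try each member in turn. Tagged with Field(discriminator="kind"), it reads the tag and goes straight to the matching model: O(1) dispatch, plus clearer errors when the tag is wrong.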
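A minimal sketch of a tagged union. The `Cat` and `Dog` fields and the `Pet` alias are assumptions for illustration. The important part is that each member declares `kind` as a `Literal`.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class Cat(BaseModel):  # hypothetical model, for illustration only
    kind: Literal["cat"]
    lives: int


class Dog(BaseModel):  # hypothetical model, for illustration only
    kind: Literal["dog"]
    name: str


# The discriminator tells Pydantic which field selects the union member.
Pet = Annotated[Union[Cat, Dog], Field(discriminator="kind")]
PET_ADAPTER = TypeAdapter(Pet)

pet = PET_ADAPTER.validate_python({"kind": "dog", "name": "Rex"})
```

Only the `Dog` schema runs here; `Cat` is never attempted.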
Pydantic v2 vs msgspec
| | Pydantic v2 | msgspec |
|---|---|---|
| Engine | Rust | C |
| Validation speed | Fast | Faster (esp. JSON) |
| Features | Rich (validators, serializers, JSON Schema) | Minimal |
| Ecosystem | Massive | Smaller |
| FastAPI integration | Native | Manual |
For most apps: Pydantic. For 100k+ msg/s decode: msgspec.
msgspec example
import msgspec
class User(msgspec.Struct):
    id: int
    email: str
user = msgspec.json.decode(raw_bytes, type=User)
msgspec can be 5-10× faster than Pydantic for narrow, decode-heavy work. Benchmark on your own payloads.
Avoid validation overhead in trusted code
# Slow: validation every time
user = User.model_validate(data)
# Faster: model_construct (skips validation; you swear data is correct)
user = User.model_construct(**data)
Never use model_construct on input from outside the system. Reserve it for internal conversions of data that has already been validated.
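A sketch of why the warning matters: `model_construct` stores whatever it is given, so a wrong type passes through silently. The `User` model and the `bad` payload are made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


bad = {"id": "not-an-int", "email": "a@example.com"}

# model_validate checks the data and rejects it.
try:
    User.model_validate(bad)
    rejected = False
except ValidationError:
    rejected = True

# model_construct skips validation entirely: the string is stored as-is.
unchecked = User.model_construct(**bad)
```

After this, `unchecked.id` is still the string, and nothing downstream will be told about it.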
Schema caching
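Schema caching is automatic: each model and TypeAdapter keeps its compiled core schema and reuses it on every call. Set model_config["arbitrary_types_allowed"] only when you need it; arbitrary types get a plain isinstance check rather than real validation.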
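Avoid mutable literal defaults
# AVOID: works, but Pydantic copies the default for every new instance
class M(BaseModel):
    items: list[Item] = []
Pydantic copies mutable defaults, so instances do not share the list the way plain class attributes would. Field(default_factory=list) still states the intent clearly and avoids the per-instance copy.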
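A small sketch of the `default_factory` form, showing that each instance gets its own list. The `Cart` model is hypothetical.

```python
from pydantic import BaseModel, Field


class Cart(BaseModel):  # hypothetical model, for illustration only
    # The factory is called once per instance, so nothing is shared or copied.
    items: list[str] = Field(default_factory=list)


a, b = Cart(), Cart()
a.items.append("apple")
```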
Streaming validation for huge inputs
For very large lists:
adapter = TypeAdapter(User)
def stream_parse(records):
    for r in records:
        yield adapter.validate_python(r)
Don’t accumulate in memory.
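The same idea applied to newline-delimited JSON, as a sketch: validate one line at a time and aggregate as you go. The `stream_parse_jsonl` helper and the in-memory `source` are assumptions for illustration. In practice `source` would be an open file or a socket.

```python
import io
from collections.abc import Iterable, Iterator

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


USER_ADAPTER = TypeAdapter(User)


def stream_parse_jsonl(lines: Iterable[str]) -> Iterator[User]:
    """Yield one validated User per non-empty line."""
    for line in lines:
        if line.strip():
            yield USER_ADAPTER.validate_json(line)


source = io.StringIO(
    '{"id": 1, "email": "a@example.com"}\n'
    "\n"
    '{"id": 2, "email": "b@example.com"}\n'
)

# Only one User is alive at a time; the full list is never built.
total = sum(user.id for user in stream_parse_jsonl(source))
```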
Concurrent validation
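Validation is CPU-bound, and Python threads contend for the GIL:
import asyncio
async def validate_many(items):
    return await asyncio.gather(*[
        asyncio.to_thread(User.model_validate, it) for it in items
    ])
Expect a marginal benefit at best: the Rust core works on Python objects and, for the most part, holds the GIL while validating, so threads take turns rather than run in parallel. What this does buy you is an event loop that stays responsive. Profile.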
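If you do offload validation to threads, one task per item spends much of its time on scheduling. A sketch that hands each thread a chunk instead; the `validate_chunked` helper and the `chunk_size` default are assumptions for illustration, not tuned values, and this is not a guaranteed speedup.

```python
import asyncio

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


USERS_ADAPTER = TypeAdapter(list[User])


async def validate_chunked(items: list[dict], chunk_size: int = 1000) -> list[User]:
    # Split the input into chunks and validate each chunk in a worker thread.
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    results = await asyncio.gather(
        *(asyncio.to_thread(USERS_ADAPTER.validate_python, chunk) for chunk in chunks)
    )
    # gather() preserves input order, so flattening keeps the original order.
    return [user for chunk in results for user in chunk]


items = [{"id": i, "email": f"user{i}@example.com"} for i in range(2500)]
users = asyncio.run(validate_chunked(items))
```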
Profiling
python -m cProfile -o profile.out script.py
snakeviz profile.out
Find where validation dominates.
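To profile one suspect call instead of a whole script, you can drive cProfile from inside Python. A sketch, assuming a hypothetical `User` model and a synthetic `payload`; swap in your real hot path.

```python
import cProfile
import io
import pstats

from pydantic import BaseModel, TypeAdapter


class User(BaseModel):  # hypothetical model, for illustration only
    id: int
    email: str


USERS_ADAPTER = TypeAdapter(list[User])
payload = [{"id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

# Profile only the validation call.
profiler = cProfile.Profile()
profiler.enable()
USERS_ADAPTER.validate_python(payload)
profiler.disable()

# Print the ten most expensive entries by cumulative time.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report).sort_stats("cumulative")
stats.print_stats(10)
print(report.getvalue())
```

Most of the time is spent inside the Rust core, which cProfile reports as a single opaque built-in call.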
When perf matters
- API endpoints serving > 5k req/s.
- Streaming pipelines processing > 100k msg/s.
- Long batch jobs with tight cost budget.
Most apps: Pydantic isn’t the bottleneck.
Common mistakes
- Recreating TypeAdapter / schema per call.
- json.dumps(model.model_dump()) in hot paths.
- Wide unions without discriminator.
- Strict mode globally: it changes which inputs are accepted, and its effect on speed varies, so measure before enabling it everywhere.
Read this next
If you want my pydantic-vs-msgspec benchmark, it’s at rajpoot.dev.