Pydantic v2 is the typed-data layer of modern Python. FastAPI, Pydantic AI, Instructor, every modern config library, half the LangChain ecosystem — all built on it. This post is the deep dive: model design, validators, serializers, settings, performance, and the patterns that pay off in production.
If you’re still on v1 in 2026, this is also a migration nudge. Most concepts carry over, though many methods and decorators were renamed, and the performance is dramatically better.
Why v2 changed everything
Pydantic v2 ships a Rust core (pydantic-core). v1 was pure Python. The rewrite produced:
- 5–50× faster validation.
- Half the memory.
- Strict mode (no implicit coercions).
- Proper discriminated unions.
- JSON Schema 2020-12 conformance.
For an API that processes thousands of requests per second, v2 lifts a tax that v1 quietly imposed.
Model basics
from datetime import datetime, timezone
from decimal import Decimal

from pydantic import BaseModel, ConfigDict, EmailStr, Field


class Order(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_assignment=True,  # re-validate on attribute set
        extra="forbid",            # raise on unexpected fields
    )

    id: int
    customer_email: EmailStr  # requires the email-validator package
    total: Decimal = Field(..., gt=0, decimal_places=2)
    currency: str = Field(..., pattern=r"^[A-Z]{3}$")
    notes: str = ""
    # datetime.utcnow is deprecated since Python 3.12; use an aware UTC timestamp
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
Things that just work:
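- EmailStr validates email addresses (it needs the email-validator package installed).
- Decimal with decimal_places=2 rejects values with more than two decimal places, so no fractional cents.
- pattern=r"..." applies a regex to the string.
- default_factory handles non-constant defaults.
- extra="forbid" rejects unknown fields, which matters for API safety.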
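To see those constraints fire, here is a minimal sketch. Payment is a hypothetical, trimmed-down cousin of Order (it drops EmailStr so the snippet runs without the email-validator package), and failed_fields is an illustrative helper, not a Pydantic API.

```python
from decimal import Decimal

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Payment(BaseModel):
    """Hypothetical model used only for this illustration."""

    model_config = ConfigDict(str_strip_whitespace=True, extra="forbid")

    total: Decimal = Field(gt=0, decimal_places=2)
    currency: str = Field(pattern=r"^[A-Z]{3}$")
    notes: str = ""


def failed_fields(data: dict) -> set[str]:
    """Return the names of the fields that failed validation (empty set if none)."""
    try:
        Payment.model_validate(data)
    except ValidationError as exc:
        return {str(err["loc"][0]) for err in exc.errors()}
    return set()


# Valid input: the string total is coerced to Decimal, notes is stripped.
ok = Payment.model_validate({"total": "19.99", "currency": "USD", "notes": "  rush  "})
print(ok.total, repr(ok.notes))

# Three problems at once: too many decimals, lowercase currency, unknown key.
print(failed_fields({"total": "19.999", "currency": "usd", "coupon": "X"}))
```

Pydantic collects every failure in a single ValidationError, so a client sees all three problems in one response rather than fixing them one at a time.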
Validators
Field validators
from pydantic import BaseModel, field_validator


class User(BaseModel):
    email: str
    full_name: str

    @field_validator("email", mode="before")
    @classmethod
    def lowercase_email(cls, v: object) -> object:
        # "before" validators see raw input, so guard against non-strings
        if isinstance(v, str):
            return v.lower().strip()
        return v

    @field_validator("full_name")
    @classmethod
    def name_must_not_be_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("full_name required")
        return v.strip()
mode="before" runs before the type coercion. mode="after" (the default) runs after. Use before for normalization (lowercase, strip), after for invariants.
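The ordering matters most when the raw input has a different shape from the declared type. A small sketch, assuming a hypothetical CsvNumbers model that accepts either a list or a comma-separated string:

```python
from pydantic import BaseModel, ValidationError, field_validator


class CsvNumbers(BaseModel):
    """Hypothetical model: accepts [1, 2, 3] or the string "1, 2, 3"."""

    values: list[int]

    @field_validator("values", mode="before")
    @classmethod
    def split_csv(cls, v: object) -> object:
        # Runs on raw input, before Pydantic tries to build list[int].
        if isinstance(v, str):
            return [part.strip() for part in v.split(",")]
        return v

    @field_validator("values")
    @classmethod
    def must_be_unique(cls, v: list[int]) -> list[int]:
        # Runs after coercion, so v is already a list of ints here.
        if len(set(v)) != len(v):
            raise ValueError("values must be unique")
        return v


parsed = CsvNumbers(values="1, 2,3")
print(parsed.values)

try:
    CsvNumbers(values="1,1")
    duplicate_rejected = False
except ValidationError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The before validator only reshapes the input; the type coercion that follows still turns each piece into an int, and the after validator can then rely on that.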
Model validators
from datetime import datetime
from typing import Self  # Python 3.11+; use typing_extensions.Self on older versions

from pydantic import BaseModel, model_validator


class DateRange(BaseModel):
    start: datetime
    end: datetime

    @model_validator(mode="after")
    def end_after_start(self) -> Self:
        if self.end <= self.start:
            raise ValueError("end must be after start")
        return self
Cross-field rules go here. Always return self from mode="after".
Strict mode — the speedup
By default, Pydantic coerces: "1" becomes 1, "true" becomes True. Convenient but sometimes wrong. Strict mode disables coercion:
class Config(BaseModel):
    model_config = ConfigDict(strict=True)

    port: int
    debug: bool


# This now raises:
Config(port="8080", debug="true")  # ⛔ ValidationError
Config(port=8080, debug=True)      # ✅
Strict mode skips coercion attempts, which can save a little work, and it is safer because there is no surprise type promotion. Reach for it when input shapes are stable. For HTTP/JSON APIs you usually want lax mode, since query strings and form fields arrive as strings; for config and internal types, use strict.
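Strictness does not have to be a model-wide setting. As a sketch, model_validate and model_validate_json both take a strict argument, so the same model can be lax at one boundary and strict at another. ServerConfig is a hypothetical model for illustration.

```python
from pydantic import BaseModel, ValidationError


class ServerConfig(BaseModel):
    """Hypothetical model; lax by default."""

    port: int
    debug: bool


raw = {"port": "8080", "debug": "true"}

# Lax (default): strings are coerced.
lax = ServerConfig.model_validate(raw)

# Strict for this call only: the same strings are rejected.
try:
    ServerConfig.model_validate(raw, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True

# Strict JSON validation still works when the JSON carries real numbers and booleans.
from_json = ServerConfig.model_validate_json('{"port": 8080, "debug": true}', strict=True)

print(lax.port, lax.debug, strict_rejected, from_json.port)
```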
Serialization
from datetime import datetime

from pydantic import BaseModel, Field


class User(BaseModel):
    id: int
    email: str
    password_hash: str = Field(..., exclude=True)  # never serialized
    created_at: datetime


# "user@example.com" is a placeholder address
user = User(id=1, email="user@example.com", password_hash="...", created_at=datetime.now())

user.model_dump()                        # → dict
user.model_dump_json()                   # → JSON string
user.model_dump(exclude={"created_at"})
user.model_dump(exclude_unset=True)      # only fields that were explicitly set
user.model_dump(by_alias=True)           # use field aliases
Field aliases
class APIPayload(BaseModel):
    model_config = ConfigDict(populate_by_name=True)  # also accept snake_case

    full_name: str = Field(..., alias="fullName")  # accept camelCase from JS
Useful at API boundaries where your Python is snake_case but JSON clients are camelCase.
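When every field needs the same treatment, declaring aliases one by one gets tedious. A sketch of the alternative, using the to_camel helper from pydantic.alias_generators; CamelPayload is a hypothetical model name.

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class CamelPayload(BaseModel):
    """Hypothetical model: snake_case in Python, camelCase on the wire."""

    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)

    full_name: str
    is_active: bool = True


# Accepts camelCase input (the alias)...
from_client = CamelPayload.model_validate({"fullName": "Ada", "isActive": False})
# ...and snake_case keyword arguments, thanks to populate_by_name.
from_python = CamelPayload(full_name="Ada")

print(from_client.model_dump())               # Python-side names
print(from_python.model_dump(by_alias=True))  # wire-side names
```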
Custom serializers
from datetime import datetime

from pydantic import BaseModel, field_serializer


class Event(BaseModel):
    ts: datetime

    @field_serializer("ts")
    def serialize_ts(self, ts: datetime) -> str:
        return ts.isoformat()
Use a field serializer when you need full control over how a field renders.
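A serializer can also be limited to JSON output, leaving model_dump() untouched. A sketch using the when_used argument of field_serializer; AuditEvent and the epoch-seconds format are illustrative choices, not something the examples above prescribe.

```python
import json
from datetime import datetime, timezone

from pydantic import BaseModel, field_serializer


class AuditEvent(BaseModel):
    """Hypothetical model: datetime in Python, epoch seconds in JSON."""

    ts: datetime

    @field_serializer("ts", when_used="json")
    def ts_as_epoch(self, ts: datetime) -> int:
        return int(ts.timestamp())


event = AuditEvent(ts=datetime(2026, 1, 1, tzinfo=timezone.utc))

python_side = event.model_dump()               # ts stays a datetime
json_side = json.loads(event.model_dump_json())  # ts becomes an int

print(type(python_side["ts"]).__name__, json_side["ts"])
```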
Settings
Pydantic-settings turns env vars into typed config:
from pydantic import PostgresDsn, RedisDsn, SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_prefix="APP_",
        env_nested_delimiter="__",
    )

    env: str = "dev"
    database_url: PostgresDsn
    redis_url: RedisDsn | None = None
    jwt_secret: SecretStr
    log_level: str = "INFO"
    feature_flags: dict[str, bool] = {}
export APP_DATABASE_URL=postgresql://user:pass@host/db
export APP_JWT_SECRET=...
export APP_FEATURE_FLAGS__NEW_UI=true # nested via __
SecretStr redacts the value in logs ('**********') but exposes .get_secret_value() when you actually need it. Worth using for any sensitive value.
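A quick sketch of that redaction behaviour, using a plain BaseModel so it runs without pydantic-settings. Credentials and the secret value are made up for the example.

```python
from pydantic import BaseModel, SecretStr


class Credentials(BaseModel):
    """Hypothetical model holding one sensitive value."""

    jwt_secret: SecretStr


creds = Credentials(jwt_secret="s3cr3t-value")

shown_in_repr = repr(creds)            # safe to log
shown_in_json = creds.model_dump_json()  # also redacted
real_value = creds.jwt_secret.get_secret_value()  # explicit opt-in

print(shown_in_repr)
print(shown_in_json)
```

Only the explicit get_secret_value() call hands back the real string, which makes accidental exposure easy to grep for.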
For the broader app skeleton, see FastAPI + Pydantic v2 + SQLAlchemy 2.0.
Discriminated unions
When a field can be one of several types:
from typing import Annotated, Literal

from pydantic import BaseModel, Field


class TextEvent(BaseModel):
    type: Literal["text"]
    content: str


class ClickEvent(BaseModel):
    type: Literal["click"]
    x: int
    y: int


Event = Annotated[
    TextEvent | ClickEvent,
    Field(discriminator="type"),
]


class Stream(BaseModel):
    events: list[Event]
Pydantic uses the type field to pick the right model. Faster than trying every variant; gives clear errors.
RootModel
For when the value isn’t a dict:
from pydantic import RootModel


class UserList(RootModel[list[User]]):
    pass


UserList.model_validate_json('[{"id":1,...}]')
Useful for top-level arrays in JSON APIs.
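A self-contained sketch with a simpler payload. Scores is a hypothetical root model over a list of integers; the validated value lives on the .root attribute.

```python
from pydantic import RootModel, ValidationError


class Scores(RootModel[list[int]]):
    """Hypothetical root model wrapping a top-level JSON array."""


scores = Scores.model_validate_json("[90, 85, 77]")

print(scores.root)          # the underlying list
print(scores.model_dump())  # dumps back to a plain list, not a dict

try:
    Scores.model_validate_json('[90, "not a number"]')
    bad_item_rejected = False
except ValidationError:
    bad_item_rejected = True
```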
Performance tips
For hot paths:
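Use model_validate, not parse_obj (the v1 name)
parse_obj still exists for compatibility, but it is deprecated, emits a deprecation warning, and only forwards to model_validate. Call model_validate(data) and model_validate_json(json_str) directly. When the input is a JSON string, model_validate_json parses and validates in one pass, which beats json.loads followed by model_validate.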
Avoid Any
metadata: dict                    # bare dict means Any keys and values: nothing inside is validated
metadata: dict[str, str | int]    # typed: keys and values are checked and appear in the schema
Avoid arbitrary_types_allowed=True
For types it does not know, Pydantic falls back to a bare isinstance check. That saves you from defining a wrapper, but you give up real validation and schema generation for those fields. Only when you must.
Consider model_config = ConfigDict(strict=True)
Coercion isn’t free.
TypeAdapter for one-off types
from pydantic import TypeAdapter
UserList = TypeAdapter(list[User])
parsed = UserList.validate_python(raw)
Use a TypeAdapter when you don’t want a full BaseModel for a one-off shape. ~2× faster than wrapping in a model. Build the adapter once, at module level, and reuse it; constructing it is the expensive part.
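A sketch of that reuse pattern. Point, POINTS, and parse_points are hypothetical names; the adapter is created once at import time and shared by every call.

```python
from pydantic import BaseModel, TypeAdapter


class Point(BaseModel):
    x: int
    y: int


# Built once; building a TypeAdapter compiles the validator, so avoid
# doing it inside a request handler or a loop.
POINTS = TypeAdapter(list[Point])


def parse_points(payload: str) -> list[Point]:
    """Validate a JSON array of points straight from the raw string."""
    return POINTS.validate_json(payload)


points = parse_points('[{"x": 1, "y": 2}, {"x": 3, "y": 4}]')
as_dicts = POINTS.dump_python(points)

print(points[1].x, as_dicts)
```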
SQLAlchemy and Pydantic together
The 2026 pattern: SQLAlchemy 2.0 ORM models for the DB, Pydantic models for API shapes. Convert at the boundary:
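from pydantic import BaseModel, ConfigDict
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class UserDB(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    full_name: Mapped[str]
    is_active: Mapped[bool] = mapped_column(default=True)


class UserOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    email: str
    full_name: str
    is_active: bool


# Convert (user_db_row is a UserDB instance loaded from a session)
api_user = UserOut.model_validate(user_db_row)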
from_attributes=True (formerly orm_mode=True in v1) reads attributes off the SQLAlchemy row. No manual mapping.
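from_attributes is not tied to SQLAlchemy; it reads attributes off any object. The sketch below uses a plain dataclass, UserRow, as a stand-in for an ORM row so it runs without a database. The password_hash attribute is an invented extra that the API model deliberately does not declare.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ConfigDict


@dataclass
class UserRow:
    """Stand-in for an ORM row (hypothetical; not a SQLAlchemy class)."""

    id: int
    email: str
    full_name: str
    is_active: bool
    password_hash: str


class UserOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    email: str
    full_name: str
    is_active: bool


row = UserRow(1, "ada@example.com", "Ada Lovelace", True, "not-a-real-hash")
api_user = UserOut.model_validate(row)

# Only the fields declared on UserOut are read; password_hash never leaves the row.
print(api_user.model_dump())
```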
Computed fields
from decimal import Decimal

from pydantic import BaseModel, computed_field


class Order(BaseModel):
    subtotal: Decimal
    tax_rate: Decimal

    @computed_field
    @property
    def total(self) -> Decimal:
        return (self.subtotal * (1 + self.tax_rate)).quantize(Decimal("0.01"))
computed_field makes the property show up in model_dump() and in the serialization-mode JSON Schema. Don’t use a regular @property for this; serialization ignores it.
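A sketch contrasting the two. Invoice is a hypothetical model with one computed field and one plain property; only the first appears in the dump.

```python
from decimal import Decimal

from pydantic import BaseModel, computed_field


class Invoice(BaseModel):
    """Hypothetical model for illustration."""

    subtotal: Decimal
    tax_rate: Decimal

    @computed_field
    @property
    def total(self) -> Decimal:
        return (self.subtotal * (1 + self.tax_rate)).quantize(Decimal("0.01"))

    @property
    def tax_amount(self) -> Decimal:
        # Plain property: usable in Python, absent from model_dump().
        return self.total - self.subtotal


invoice = Invoice(subtotal=Decimal("100.00"), tax_rate=Decimal("0.0825"))
dumped = invoice.model_dump()

print(sorted(dumped))
print(dumped["total"], invoice.tax_amount)
```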
JSON Schema
schema = User.model_json_schema()
# → fully-formed JSON Schema 2020-12
OpenAPI specs, JSON-LD, AI tool definitions, JSON Schema validators — all consume this. Use it.
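For a sense of what comes back, here is a sketch with a hypothetical Product model. Field constraints and defaults are carried into the schema.

```python
from pydantic import BaseModel, Field


class Product(BaseModel):
    """Hypothetical model for illustration."""

    id: int
    name: str
    price: float = Field(gt=0)
    tags: list[str] = []


schema = Product.model_json_schema()

print(schema["title"])
print(sorted(schema["required"]))       # fields without defaults
print(schema["properties"]["price"])    # gt=0 becomes exclusiveMinimum
print(schema["properties"]["tags"])     # list[str] becomes an array of strings
```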
Common mistakes
1. Mixing v1 and v2 idioms
@validator (v1) vs @field_validator (v2). parse_obj vs model_validate. dict() vs model_dump(). The compat shims work but produce deprecation warnings; clean up.
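2. Leaving extra at its default
class Order(BaseModel):
    id: int


# Order(id=1, hidden_field="...")  # accepted; hidden_field is silently dropped
The default is extra="ignore", so unknown fields are discarded without an error. For APIs, set extra="forbid" so that typos and unexpected fields fail loudly instead of vanishing.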
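A quick check of what Pydantic v2 does with an unexpected field under the default setting (extra="ignore") versus extra="forbid". LooseOrder and TightOrder are hypothetical names.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LooseOrder(BaseModel):
    """Default config: unknown fields are ignored."""

    id: int


class TightOrder(BaseModel):
    """extra="forbid": unknown fields are an error."""

    model_config = ConfigDict(extra="forbid")

    id: int


loose = LooseOrder(id=1, hidden_field="...")
print(loose.model_dump())  # the extra key is gone, with no warning

try:
    TightOrder(id=1, hidden_field="...")
    extra_rejected = False
except ValidationError:
    extra_rejected = True
print(extra_rejected)
```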
3. Validating in __init__
class User(BaseModel):
    name: str

    def __init__(self, **kw):
        super().__init__(**kw)
        if not self.name.startswith("user-"):
            raise ValueError(...)  # ⛔ surfaces as a bare ValueError, not a ValidationError
Use validators. They produce proper ValidationError instances with field info.
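The same rule as a field validator, sketched with a hypothetical PrefixedUser model. The failure now arrives as a ValidationError that names the offending field.

```python
from pydantic import BaseModel, ValidationError, field_validator


class PrefixedUser(BaseModel):
    """Hypothetical model: names must start with "user-"."""

    name: str

    @field_validator("name")
    @classmethod
    def must_have_prefix(cls, v: str) -> str:
        if not v.startswith("user-"):
            raise ValueError('name must start with "user-"')
        return v


try:
    PrefixedUser(name="bob")
    first_error = None
except ValidationError as exc:
    first_error = exc.errors()[0]

print(first_error["loc"], first_error["type"])
```

Frameworks built on Pydantic, FastAPI among them, turn these structured errors into per-field responses, which a ValueError raised from __init__ cannot provide.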
4. Returning unvalidated dicts from APIs
return {"users": [{"id": u.id, "email": u.email} for u in users]}
The shape isn’t enforced. Use response_model=UserListOut (FastAPI) or wrap in a Pydantic model. Otherwise the response shape can drift silently.
5. SecretStr leaks through .get_secret_value()
If you log a Pydantic object that contains a SecretStr field, __repr__ redacts it. But if you call .get_secret_value() and log the result, the secret goes out in plain text. Audit your logging carefully.
Read this next
- FastAPI + Pydantic v2 + SQLAlchemy 2.0 Production Patterns
- Modern Python Tooling 2026
- Structured Output for LLMs — Pydantic at the AI boundary.
- Modern AsyncIO Patterns
If you want a Pydantic v2 cheat sheet (with all of the above plus less-known patterns), it’s at rajpoot.dev.