Should I use Pydantic v2 over dataclasses?

Pydantic for anything crossing a system boundary — API I/O, config, data files. Dataclasses for purely internal types where you control all inputs. Pydantic costs a little more memory; gives you validation, serialization, JSON Schema, and runtime type enforcement in return.

Is Pydantic v2 fast enough for hot paths?

Yes. v2's Rust core is 5–50× faster than v1. For most API workloads, validation cost is dwarfed by the database query. Use strict mode and avoid arbitrary-type fields for max speed.

Should I use field_validator or model_validator?

field_validator for per-field rules; model_validator for cross-field constraints (e.g., 'end_date must be after start_date'). Use mode='before' for input coercion, mode='after' for invariants on the parsed value.

Pydantic v2 Deep Dive — The Patterns Every Backend Python Developer Needs

Pydantic v2 is the typed-data layer of modern Python. FastAPI, Pydantic AI, Instructor, every modern config library, half the LangChain ecosystem — all built on it. This post is the deep dive: model design, validators, serializers, settings, performance, and the patterns that pay off in production.

If you’re still on v1 in 2026, this is also a migration nudge. Most v1 method names still work as deprecated aliases, and the performance is dramatically better.

Why v2 changed everything

Pydantic v2 ships a Rust core (pydantic-core). v1 was written in Python, with optional Cython compilation. The rewrite produced:

5–50× faster validation.
Half the memory.
Strict mode (no implicit coercions).
Proper discriminated unions.
JSON Schema 2020-12 conformance.

For an API that processes thousands of requests per second, v2 lifts a tax that v1 quietly imposed.

Model basics

from datetime import datetime, timezone
from decimal import Decimal

from pydantic import BaseModel, ConfigDict, EmailStr, Field


class Order(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_assignment=True,        # re-validate on attribute set
        extra="forbid",                  # raise on unexpected fields
    )

    id: int
    customer_email: EmailStr             # needs the optional email-validator package
    total: Decimal = Field(..., gt=0, decimal_places=2)
    currency: str = Field(..., pattern=r"^[A-Z]{3}$")
    notes: str = ""
    # datetime.utcnow is deprecated since Python 3.12; use an aware UTC timestamp
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

Things that just work:

EmailStr validates email syntax (requires the optional email-validator package: pip install "pydantic[email]").
Decimal with decimal_places=2 rejects fractional cents.
pattern=r"..." enforces a regex on the string.
default_factory supplies non-constant defaults.
extra="forbid" rejects unknown fields — important for API safety.
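To see these constraints fire, here is a minimal sketch on a trimmed-down model. OrderIn and error_locations are illustrative names that do not appear above, and EmailStr is left out so the snippet runs without the optional email-validator package.

```python
from decimal import Decimal

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class OrderIn(BaseModel):
    """Hypothetical, reduced version of the Order model above."""

    model_config = ConfigDict(extra="forbid")

    id: int
    total: Decimal = Field(gt=0, decimal_places=2)
    currency: str = Field(pattern=r"^[A-Z]{3}$")


def error_locations(payload: dict) -> list[tuple]:
    """Return the sorted field locations that failed validation."""
    try:
        OrderIn.model_validate(payload)
    except ValidationError as exc:
        return sorted(err["loc"] for err in exc.errors())
    return []


# A well-formed payload: the string "19.99" is coerced to Decimal in lax mode.
good = OrderIn.model_validate({"id": 1, "total": "19.99", "currency": "USD"})

# Three problems at once: too many decimals, bad currency, unknown field.
bad = error_locations(
    {"id": 1, "total": "19.999", "currency": "usd", "coupon": "X"}
)
```

Pydantic collects every failure in one ValidationError rather than stopping at the first, which is what makes the error list useful as an API response.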

Validators

Field validators

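from typing import Any

from pydantic import BaseModel, field_validator


class User(BaseModel):
    email: str
    full_name: str

    @field_validator("email", mode="before")
    @classmethod
    def lowercase_email(cls, v: Any) -> Any:
        # "before" validators receive raw input, so guard against non-strings
        # and let the str check that follows report the type error
        return v.lower().strip() if isinstance(v, str) else v

    @field_validator("full_name")
    @classmethod
    def name_must_not_be_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("full_name required")
        return v.strip()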

mode="before" runs before the type coercion, so the validator sees raw input that may not yet be the annotated type. mode="after" (the default) runs after. Use before for normalization (lowercase, strip), after for invariants.
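The ordering is easiest to see with both modes on one model. This is a small sketch; Signup and its fields are made up for illustration.

```python
from typing import Any

from pydantic import BaseModel, ValidationError, field_validator


class Signup(BaseModel):
    email: str
    age: int

    @field_validator("email", mode="before")
    @classmethod
    def normalize_email(cls, v: Any) -> Any:
        # Runs on the raw input, which may be any type.
        return v.strip().lower() if isinstance(v, str) else v

    @field_validator("age", mode="after")
    @classmethod
    def must_be_adult(cls, v: int) -> int:
        # Runs after coercion, so v is guaranteed to be an int here.
        if v < 18:
            raise ValueError("must be 18 or older")
        return v


# The "before" validator cleans the email; lax mode coerces "42" to 42
# before the "after" validator sees it.
signup = Signup(email="  Ada@Example.COM ", age="42")

try:
    Signup(email="ada@example.com", age=12)
    age_error = ""
except ValidationError as exc:
    age_error = exc.errors()[0]["msg"]
```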

Model validators

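from datetime import datetime
from typing import Self  # Python 3.11+; on older versions use typing_extensions.Self

from pydantic import BaseModel, model_validator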

class DateRange(BaseModel):
    start: datetime
    end: datetime

    @model_validator(mode="after")
    def end_after_start(self) -> Self:
        if self.end <= self.start:
            raise ValueError("end must be after start")
        return self

Cross-field rules go here. Always return self from mode="after".
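Model validators also come in a mode="before" flavor, which receives the raw input before any field is parsed. The sketch below assumes a hypothetical legacy payload that uses "from" and "to" keys; Booking and accept_legacy_keys are illustrative names.

```python
from datetime import date
from typing import Any

from pydantic import BaseModel, ValidationError, model_validator


class Booking(BaseModel):
    start: date
    end: date

    @model_validator(mode="before")
    @classmethod
    def accept_legacy_keys(cls, data: Any) -> Any:
        # Raw input: reshape the (hypothetical) legacy payload, pass
        # everything else through untouched.
        if isinstance(data, dict) and "from" in data and "to" in data:
            return {"start": data["from"], "end": data["to"]}
        return data

    @model_validator(mode="after")
    def end_after_start(self) -> "Booking":
        # Parsed instance: both fields are real date objects here.
        if self.end <= self.start:
            raise ValueError("end must be after start")
        return self


ok = Booking.model_validate({"from": "2026-01-01", "to": "2026-01-05"})

try:
    Booking(start="2026-01-05", end="2026-01-01")
    errors = []
except ValidationError as exc:
    errors = exc.errors()
```

A failure raised in a model validator is reported with an empty location, since it belongs to the model as a whole rather than to one field.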

Strict mode — the speedup

By default, Pydantic coerces: "1" becomes 1, "true" becomes True. Convenient but sometimes wrong. Strict mode disables coercion:

class Config(BaseModel):
    model_config = ConfigDict(strict=True)

    port: int
    debug: bool

# This now raises:
Config(port="8080", debug="true")    # ⛔ ValidationError
Config(port=8080, debug=True)        # ✅

Strict mode is safer (no surprise type promotion) and skips coercion attempts, which can shave a little time off validation. Reach for it when input shapes are stable. For HTTP APIs you usually want lax mode, since query strings and form fields arrive as strings; for config and internal types, strict.
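Strictness does not have to be all-or-nothing. As a sketch (Port is a made-up model), model_validate accepts a strict flag per call, so the same model can stay lax at the HTTP edge and be strict for internal callers.

```python
from pydantic import BaseModel, ValidationError


class Port(BaseModel):
    port: int


# Lax (default): the string is coerced to an int.
lax = Port.model_validate({"port": "8080"})

# Strict for this call only: the same input is rejected.
try:
    Port.model_validate({"port": "8080"}, strict=True)
    strict_error = None
except ValidationError as exc:
    strict_error = exc.errors()[0]["type"]
```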

Serialization

class User(BaseModel):
    id: int
    email: str
    password_hash: str = Field(..., exclude=True)     # never serialized
    created_at: datetime


user = User(id=1, email="user@example.com", password_hash="...", created_at=datetime.now())

user.model_dump()                 # → dict
user.model_dump_json()            # → JSON string
user.model_dump(exclude={"created_at"})
user.model_dump(exclude_unset=True)   # only fields the user explicitly set
user.model_dump(by_alias=True)        # use field aliases

Field aliases

class APIPayload(BaseModel):
    full_name: str = Field(..., alias="fullName")     # accept camelCase from JS
    model_config = ConfigDict(populate_by_name=True)  # also accept snake_case

Useful at API boundaries where your Python is snake_case but JSON clients are camelCase.
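When every field needs a camelCase alias, writing them by hand gets tedious. One option, sketched here with a hypothetical Payload model, is the to_camel helper from pydantic.alias_generators.

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class Payload(BaseModel):
    model_config = ConfigDict(
        alias_generator=to_camel,   # full_name -> fullName
        populate_by_name=True,      # still accept snake_case input
    )

    full_name: str
    signup_source: str


from_js = Payload.model_validate({"fullName": "Ada", "signupSource": "web"})
from_py = Payload(full_name="Ada", signup_source="web")

snake = from_js.model_dump()               # field names
camel = from_js.model_dump(by_alias=True)  # aliases, for JSON clients
```

Note that model_dump() uses field names unless by_alias=True is passed, so the outgoing side needs the flag even when the incoming side does not.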

Custom serializers

from pydantic import field_serializer
from datetime import datetime

class Event(BaseModel):
    ts: datetime

    @field_serializer("ts")
    def serialize_ts(self, ts: datetime) -> str:
        return ts.isoformat()

For full control over how a field renders.

Settings

Pydantic-settings turns env vars into typed config:

from pydantic import PostgresDsn, RedisDsn, SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_prefix="APP_",
        env_nested_delimiter="__",
    )

    env: str = "dev"
    database_url: PostgresDsn
    redis_url: RedisDsn | None = None
    jwt_secret: SecretStr
    log_level: str = "INFO"
    feature_flags: dict[str, bool] = {}

export APP_DATABASE_URL=postgresql://user:pass@host/db
export APP_JWT_SECRET=...
export APP_FEATURE_FLAGS__NEW_UI=true        # nested via __

SecretStr redacts the value in logs ('**********') but exposes .get_secret_value() when you actually need it. Worth using for any sensitive value.

For the broader app skeleton, see FastAPI + Pydantic v2 + SQLAlchemy 2.0.

Discriminated unions

When a field can be one of several types:

from typing import Annotated, Literal

from pydantic import BaseModel, Field

class TextEvent(BaseModel):
    type: Literal["text"]
    content: str

class ClickEvent(BaseModel):
    type: Literal["click"]
    x: int
    y: int

Event = Annotated[
    TextEvent | ClickEvent,
    Field(discriminator="type"),
]


class Stream(BaseModel):
    events: list[Event]

Pydantic uses the type field to pick the right model. Faster than trying every variant; gives clear errors.

RootModel

For when the value isn’t a dict:

from pydantic import RootModel

class UserList(RootModel[list[User]]):
    pass

UserList.model_validate_json('[{"id":1,...}]')

Useful for top-level arrays in JSON APIs.
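A runnable sketch with a simpler element type. Tags is a made-up model, and the __iter__ and __len__ helpers are optional conveniences; the parsed value always lives on .root.

```python
from pydantic import RootModel


class Tags(RootModel[list[str]]):
    # Optional sugar so the model behaves like the list it wraps.
    def __iter__(self):
        return iter(self.root)

    def __len__(self) -> int:
        return len(self.root)


tags = Tags.model_validate_json('["python", "pydantic"]')

as_list = tags.root               # the validated list itself
dumped = tags.model_dump()        # a plain list, not a dict
as_json = tags.model_dump_json()  # a top-level JSON array
```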

Performance tips

For hot paths:

Use `model_validate` not `parse_obj` (v1 alias)

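parse_obj survives in v2 only as a deprecated alias: it emits a DeprecationWarning and forwards to model_validate. Call model_validate(data) directly, and prefer model_validate_json(json_str) for raw JSON, which parses and validates in one pass.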
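The two routes give the same result; the difference is where the JSON parsing happens. A small sketch with a made-up Point model:

```python
import json

from pydantic import BaseModel


class Point(BaseModel):
    x: int
    y: int


raw = '{"x": 1, "y": 2}'

# Two steps: parse in Python, then validate the resulting dict.
two_step = Point.model_validate(json.loads(raw))

# One step: pydantic-core parses and validates together.
one_step = Point.model_validate_json(raw)
```

Pydantic's own performance guidance recommends the one-step form, since it avoids building an intermediate Python dict.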

Avoid `Any`

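metadata: dict                      # implicit Any keys and values; contents are not validated
metadata: dict[str, str | int]      # typed; every key and value is checked

Any tells Pydantic to pass the value through untouched. That is cheap, but it gives up the validation you are paying for everywhere else, so keep it for data you truly do not need to inspect.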

Avoid `arbitrary_types_allowed=True`

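With this flag Pydantic only runs an isinstance check on types it does not know: no coercion and no validation of the object's contents. It saves you from defining a wrapper but gives up most of what Pydantic offers. Use it only when you must.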

Consider `model_config = ConfigDict(strict=True)`

Coercion isn’t free, but the saving is usually small. Measure before enabling strict mode purely for speed.

TypeAdapter for one-off types

from pydantic import TypeAdapter

UserList = TypeAdapter(list[User])
parsed = UserList.validate_python(raw)

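Use a TypeAdapter when you don’t want a full BaseModel for a one-off shape. Build it once at module level and reuse it, since constructing an adapter builds a new validator each time. It can be around 2× faster than wrapping the value in a model, though the gain depends on the shape; measure on your own data.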
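A TypeAdapter works for any type Pydantic understands, not only lists of models. The sketch below uses an illustrative dict shape and a module-level adapter named SCORES.

```python
from pydantic import TypeAdapter

# Built once at import time, then reused for every call.
SCORES = TypeAdapter(dict[str, list[int]])

# Lax mode coerces the numeric string while validating.
from_python = SCORES.validate_python({"ada": ["1", 2]})

# JSON input is parsed and validated in one pass.
from_json = SCORES.validate_json('{"ada": [1, 2]}')

# Adapters serialize too; dump_json returns bytes.
encoded = SCORES.dump_json(from_python)
```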

SQLAlchemy and Pydantic together

The 2026 pattern: SQLAlchemy 2.0 ORM models for the DB, Pydantic models for API shapes. Convert at the boundary:

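from pydantic import BaseModel, ConfigDict
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class UserDB(Base):
    __tablename__ = "users"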
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str]
    full_name: Mapped[str]
    is_active: Mapped[bool] = mapped_column(default=True)


class UserOut(BaseModel):
    id: int
    email: str
    full_name: str
    is_active: bool
    model_config = ConfigDict(from_attributes=True)


# Convert (user_db_row is a UserDB instance loaded from a session)
api_user = UserOut.model_validate(user_db_row)

from_attributes=True (formerly orm_mode=True in v1) reads attributes off the SQLAlchemy row. No manual mapping.
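from_attributes is not tied to SQLAlchemy: it reads attributes off any object. The sketch below uses a plain class, FakeRow, as a hypothetical stand-in for an ORM row so it runs without a database.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class FakeRow:
    """Stand-in for an ORM row: just an object with attributes."""

    def __init__(self, id: int, email: str, is_active: bool = True) -> None:
        self.id = id
        self.email = email
        self.is_active = is_active


class UserOut(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    email: str
    is_active: bool


class PlainOut(BaseModel):
    # Same fields, but without from_attributes.
    id: int
    email: str
    is_active: bool


row = FakeRow(1, "ada@example.com")
api_user = UserOut.model_validate(row)

# Without the flag, an arbitrary object is not accepted as input.
try:
    PlainOut.model_validate(row)
    rejected = False
except ValidationError:
    rejected = True
```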

Computed fields

from decimal import Decimal

from pydantic import BaseModel, computed_field


class Order(BaseModel):
    subtotal: Decimal
    tax_rate: Decimal

    @computed_field
    @property
    def total(self) -> Decimal:
        return (self.subtotal * (1 + self.tax_rate)).quantize(Decimal("0.01"))

computed_field makes the property show up in model_dump(), model_dump_json(), and the serialization-mode JSON Schema (model_json_schema(mode="serialization")). A regular @property is invisible to Pydantic's serializers.
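A sketch that contrasts the two kinds of property. Invoice and the label property are illustrative names.

```python
from decimal import Decimal

from pydantic import BaseModel, computed_field


class Invoice(BaseModel):
    subtotal: Decimal
    tax_rate: Decimal

    @computed_field
    @property
    def total(self) -> Decimal:
        return (self.subtotal * (1 + self.tax_rate)).quantize(Decimal("0.01"))

    @property
    def label(self) -> str:
        # Plain property: usable in Python, absent from dumps and schemas.
        return f"Invoice for {self.total}"


inv = Invoice(subtotal=Decimal("100"), tax_rate=Decimal("0.2"))
dumped = inv.model_dump()

# Computed fields are output-only, so they appear in the schema that
# describes serialized data.
out_schema = Invoice.model_json_schema(mode="serialization")
```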

JSON Schema

schema = User.model_json_schema()
# → fully-formed JSON Schema 2020-12

OpenAPI specs, AI tool definitions, JSON Schema validators: all consume this. Use it.
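For a sense of what comes out, here is a sketch with a tiny illustrative model, Item. Field constraints and defaults are carried into the schema.

```python
from pydantic import BaseModel, Field


class Item(BaseModel):
    name: str
    qty: int = Field(1, ge=1)


schema = Item.model_json_schema()

# Fields without a default are listed as required.
required = schema["required"]

# Constraints and defaults become JSON Schema keywords.
qty_schema = schema["properties"]["qty"]
```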

Common mistakes

1. Mixing v1 and v2 idioms

@validator (v1) vs @field_validator (v2). parse_obj vs model_validate. dict() vs model_dump(). The compat shims work but produce deprecation warnings; clean up.

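2. Leaving `extra` at its default

class Order(BaseModel):
    id: int
# Order(id=1, hidden_field="...")    # accepted; hidden_field is silently dropped

The default is extra="ignore": unknown fields are discarded without an error. For APIs, set extra="forbid" so typos and unexpected fields fail loudly instead of vanishing.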
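The difference between the default and extra="forbid" is easy to check. In this sketch, Loose and Tight are illustrative models and "typo" stands in for a misspelled field name.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Loose(BaseModel):
    # No config: Pydantic's default is extra="ignore".
    id: int


class Tight(BaseModel):
    model_config = ConfigDict(extra="forbid")

    id: int


# The unknown field is dropped without any warning.
loose_dump = Loose(id=1, typo=2).model_dump()

# The same input is rejected, and the error names the offending key.
try:
    Tight(id=1, typo=2)
    forbid_error = None
except ValidationError as exc:
    forbid_error = exc.errors()[0]
```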

3. Validating in `__init__`

class User(BaseModel):
    name: str
    def __init__(self, **kw):
        super().__init__(**kw)
        if not self.name.startswith("user-"):
            raise ValueError(...)        # ⛔ raises a bare ValueError, not a ValidationError

Use validators. They produce proper ValidationError instances with field info.

4. Returning unvalidated dicts from APIs

return {"users": [{"id": u.id, "email": u.email} for u in users]}

The shape isn’t enforced. Use response_model=UserListOut (FastAPI) or wrap in a Pydantic model. Otherwise the response shape can drift silently.

5. SecretStr leaks via `__repr__` accidentally

If you log a Pydantic object that contains a SecretStr field, __repr__ redacts it. But once you call .get_secret_value() and log the result, the secret is in your logs in plain text. Audit every call site of get_secret_value().
