Django ORM Deep Dive: QuerySets, N+1, and Making the Database Behave

The Django ORM is one of those things that makes you feel productive on day one and then humbles you in production six months later. Most people learn enough of it to define models, run migrations, and write .objects.filter(...). But there’s a deep, well-designed system underneath, and once you know how it actually works, you’ll write queries that are 10× faster without writing more code.

This post is the deep dive I wish I’d read earlier. We’ll cover what a queryset really is, why your code is making 200 queries when it should make 2, and the patterns that keep production Django apps fast.

What a queryset actually is

When you write:

posts = Post.objects.filter(author=user)

no SQL is executed. That line just creates a QuerySet object. The query runs only when you do something that needs the data:

Iterate (for post in posts:)
Index it (posts[0])
Slice it (posts[:10]) — though slicing returns another queryset
Convert to a list (list(posts))
Call .count(), .exists(), .first(), .get(), .aggregate()
Use it in a template

This is laziness. Querysets defer execution as long as possible, which lets you compose them:

posts = Post.objects.filter(published=True)
posts = posts.filter(author=user)
posts = posts.order_by("-created_at")
posts = posts[:10]
# Still no SQL has run.

for p in posts:  # ← this runs ONE query
    print(p.title)

Each filter/order/slice returns a new queryset — none of them touch the database. Internally Django builds up the SQL and fires it once when you finally consume the result.

The N+1 problem

Now the most common Django performance bug:

posts = Post.objects.all()                # 1 query
for post in posts:                        # iterate
    print(post.author.name)               # ← 1 query EACH TIME

If you have 100 posts, that’s 101 queries — 1 to get the posts, then 1 per post to fetch each author. This is N+1.

The same code with select_related:

posts = Post.objects.select_related("author")   # 1 query, with JOIN
for post in posts:
    print(post.author.name)                     # no extra query

Two queries total become one. When you’re iterating thousands of records (typical for batch jobs, exports, dashboards), this difference can be the gap between “fast enough” and “the request times out.”

The trick is recognizing the pattern: any time you access a foreign key inside a loop, you have an N+1 problem unless you’ve prefetched.

`select_related` vs `prefetch_related`

The two main tools:

`select_related` — for one-to-one and many-to-one (`ForeignKey`)

Generates a SQL JOIN. Brings the related object back in the same query.

# 1 query with INNER JOIN
posts = Post.objects.select_related("author", "category")

Use it for: ForeignKey relations (Post.author, Comment.post), OneToOneField.

`prefetch_related` — for many-to-many and reverse foreign keys

Issues a second query, then matches the results in Python.

# 2 queries: one for posts, one for ALL related comments,
# then Django joins them in memory.
posts = Post.objects.prefetch_related("comments")
for post in posts:
    for comment in post.comments.all():       # no extra queries
        print(comment.body)

Use it for: ManyToManyField, reverse ForeignKey (Post.comments if Comment has post = ForeignKey(Post)), reverse OneToOneField.

Combine them freely:

posts = (
    Post.objects
        .select_related("author", "category")
        .prefetch_related("comments", "tags")
)

That’s 3 queries total: posts + comments + tags. Without it, you’d be in N+1 hell.

Detecting N+1 in development

Don’t eyeball it — measure it. The two best tools:

`django-debug-toolbar`

pip install django-debug-toolbar

Adds a panel to every page showing every query the request made. If you see “Posts: 100 queries”, you have N+1.

`django-silk`

Heavier but better for APIs and async views. Shows query counts per view and lets you replay slow queries.

A quick assertion in tests

from django.test.utils import CaptureQueriesContext
from django.db import connection

def test_post_list_is_efficient():
    with CaptureQueriesContext(connection) as ctx:
        response = client.get("/posts/")
    assert len(ctx) <= 5, f"Too many queries: {len(ctx)}"

Lock query counts in tests and your CI catches regressions automatically.

QuerySet chaining: the tools you’ll use most

# Filtering
Post.objects.filter(published=True)
Post.objects.exclude(status="draft")

# Field lookups
Post.objects.filter(title__icontains="django")          # case-insensitive contains
Post.objects.filter(created_at__gte=last_week)          # >= last_week
Post.objects.filter(author__email__endswith="@x.com")   # follow FK with __

# Q objects for OR / complex logic
from django.db.models import Q
Post.objects.filter(Q(title__icontains="django") | Q(title__icontains="python"))

# Ordering
Post.objects.order_by("-created_at", "title")

# Limiting
Post.objects.all()[:10]                                 # LIMIT 10
Post.objects.all()[10:20]                               # LIMIT 10 OFFSET 10

The double-underscore (__) lookup syntax is the most powerful part of the ORM. It lets you traverse relationships and use any of dozens of built-in lookups (__exact, __iexact, __in, __range, __date__year, __regex, etc.).

`only`, `defer`, and `values` — controlling columns

By default, .filter() selects every column. For wide tables, that’s wasteful.

# Only fetch these columns from the DB
Post.objects.only("id", "title")

# The opposite: fetch everything EXCEPT these
Post.objects.defer("body", "html")

# Skip the ORM and get dicts (super fast for read-only data)
Post.objects.values("id", "title")

# Or get tuples
Post.objects.values_list("id", "title")

values() is much faster than full ORM queries when you only need a couple of fields, because Django doesn’t instantiate model objects.

Aggregations

from django.db.models import Count, Avg, Sum, Max

# How many posts does each user have?
User.objects.annotate(post_count=Count("posts"))

# What's the average rating per category?
Category.objects.annotate(avg_rating=Avg("posts__rating"))

# Site-wide stats
Post.objects.aggregate(
    total=Count("id"),
    avg_views=Avg("views"),
    most_views=Max("views"),
)

annotate() adds a calculated column to each row in the queryset; aggregate() collapses everything to a single dict. Both push the work to the database, where it belongs.

Using `Q` and `F` for advanced queries

Q is for OR / NOT logic. F is for “use the value of another column in this row”:

from django.db.models import F

# Posts where view_count > like_count
Post.objects.filter(view_count__gt=F("like_count"))

# Increment a counter atomically — no race condition
Post.objects.filter(id=post_id).update(view_count=F("view_count") + 1)

That last line is important: doing post.view_count += 1; post.save() has a race condition (two requests can both read 5, both write 6, and you’ve lost an increment). The F() version pushes the math to SQL where it’s atomic.

Bulk operations

Loops of obj.save() are slow. Use bulk operations:

# Insert thousands of rows in one query
Post.objects.bulk_create([Post(title=t) for t in titles])

# Update many rows in one query
Post.objects.filter(status="draft").update(status="archived")

# Update specific objects (Django 4+)
posts = list(Post.objects.filter(...))
for post in posts:
    post.status = compute_status(post)
Post.objects.bulk_update(posts, ["status"])

Going from 10,000 individual saves to one bulk insert can take a job from 30 seconds to 30 milliseconds.

Transactions

Wrap multi-step operations in transactions so they succeed or fail together:

from django.db import transaction

with transaction.atomic():
    order = Order.objects.create(user=user, total=100)
    Payment.objects.create(order=order, amount=100)
    user.balance -= 100
    user.save()

If any of those raise, the whole block rolls back. For methods that should always be transactional, use the decorator:

@transaction.atomic
def create_order(user, items):
    ...

Pair this with select_for_update() to lock specific rows when you read them:

with transaction.atomic():
    user = User.objects.select_for_update().get(id=user_id)
    user.balance -= amount
    user.save()

Raw SQL escape hatch

When the ORM gets in your way, drop down to SQL:

posts = Post.objects.raw(
    "SELECT * FROM blog_post WHERE created_at > %s ORDER BY views DESC LIMIT 10",
    [last_week],
)

Or fully bypass the ORM:

from django.db import connection

with connection.cursor() as cur:
    cur.execute("SELECT count(*) FROM blog_post WHERE published = true")
    (count,) = cur.fetchone()

Don’t be afraid of raw SQL for analytics queries, complex window functions, or anything Postgres-specific. The ORM is great; raw SQL is sometimes greater.

Production tips

Index your foreign keys. Django does this automatically for ForeignKey, but if you query by other fields a lot (status, created_at), add db_index=True or a composite Meta.indexes.
Use connection.queries in dev to see exactly what SQL Django generated.
Cache .count() results when possible — counting all rows in a big table is surprisingly expensive.
Use iterator() for huge result sets to avoid loading everything into memory: for p in Post.objects.iterator(chunk_size=2000): ....
Use EXPLAIN ANALYZE on slow queries — see PostgreSQL Fundamentals for a primer.

Conclusion

The Django ORM rewards the time you spend understanding it. The “secret” to fast Django code isn’t writing less ORM — it’s writing the right ORM. Lazy querysets, select_related/prefetch_related, F expressions, bulk operations, transactions — these are the tools that separate “works” from “scales.”

If you’re using PostgreSQL with Django, also check out:

Happy querying!

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .

What a queryset actually is#

The N+1 problem#

select_related vs prefetch_related#

select_related — for one-to-one and many-to-one (ForeignKey)#

prefetch_related — for many-to-many and reverse foreign keys#

Detecting N+1 in development#

django-debug-toolbar#

django-silk#

A quick assertion in tests#

QuerySet chaining: the tools you’ll use most#

only, defer, and values — controlling columns#

Aggregations#

Using Q and F for advanced queries#

Bulk operations#

Transactions#

Raw SQL escape hatch#

Production tips#

Conclusion#

What a queryset actually is

The N+1 problem

`select_related` vs `prefetch_related`

`select_related` — for one-to-one and many-to-one (`ForeignKey`)

`prefetch_related` — for many-to-many and reverse foreign keys

Detecting N+1 in development

`django-debug-toolbar`

`django-silk`

A quick assertion in tests

QuerySet chaining: the tools you’ll use most

`only`, `defer`, and `values` — controlling columns

Aggregations

Using `Q` and `F` for advanced queries

Bulk operations

Transactions

Raw SQL escape hatch

Production tips

Conclusion