Aggregation pipeline.
Core stages
db.orders.aggregate([
{ $match: { status: "completed", created: { $gte: ISODate("2026-01-01") } } },
{ $group: { _id: "$user_id", total: { $sum: "$amount" }, count: { $sum: 1 } } },
{ $sort: { total: -1 } },
{ $limit: 100 },
{ $project: { user_id: "$_id", total: 1, count: 1, avg: { $divide: ["$total", "$count"] }, _id: 0 } },
])
$match early
Push $match first; uses indexes. Same for $sort (with index).
$group operators
$sum $avg $min $max
$first $last
$push (array)
$addToSet
$count
$lookup (join)
{ $lookup: {
from: "users",
localField: "user_id",
foreignField: "_id",
as: "user",
} },
{ $unwind: "$user" },
Slower than relational JOIN. Use when needed; index foreignField.
$lookup with pipeline
{ $lookup: {
from: "posts",
let: { uid: "$_id" },
pipeline: [
{ $match: { $expr: { $eq: ["$author_id", "$$uid"] } } },
{ $sort: { created: -1 } },
{ $limit: 5 },
],
as: "recent_posts",
} }
$facet (multiple pipelines)
{ $facet: {
counts: [{ $group: { _id: "$status", count: { $sum: 1 } } }],
top: [{ $sort: { score: -1 } }, { $limit: 10 }],
avg: [{ $group: { _id: null, avg: { $avg: "$score" } } }],
} }
Runs sub-pipelines on same input.
$unwind
{ $unwind: "$tags" } // explode array
{ $unwind: { path: "$tags", preserveNullAndEmptyArrays: true } }
$addFields / $set
{ $addFields: { full_name: { $concat: ["$first", " ", "$last"] } } }
$project
{ $project: { name: 1, email: 1, _id: 0 } }
Window functions ($setWindowFields)
{ $setWindowFields: {
partitionBy: "$category",
sortBy: { created: 1 },
output: {
cumTotal: { $sum: "$amount", window: { documents: ["unbounded", "current"] } },
rank: { $rank: {} },
},
} }
$merge / $out
{ $merge: "reports.daily" } // upsert
{ $out: "reports.daily" } // replace collection
Date operators
$year $month $dayOfMonth $hour
$dateToString $dateFromString
$dateAdd $dateSubtract $dateDiff
$dateTrunc
{ $project: { day: { $dateToString: { format: "%Y-%m-%d", date: "$created" } } } }
String operators
$concat $substr $toLower $toUpper
$split $trim $regex
$strLenCP $indexOfCP
Conditional
$cond $ifNull $switch
{ $project: { status: { $cond: [{ $gt: ["$age", 18] }, "adult", "minor"] } } }
Array operators
$size $arrayElemAt $slice $map $filter $reduce
$in $allElementsTrue $anyElementTrue
{ $project: { active_tags: { $filter: { input: "$tags", cond: { $eq: ["$$this.active", true] } } } } }
Performance
- Index for
$matchand$sortat start. - Avoid
$lookupat scale; denormalize. allowDiskUse: truefor big aggregations.- Limit early.
explain
db.coll.explain("executionStats").aggregate([...])
Common mistakes
- $match after $group → can’t use index.
- $sort without index → in-memory sort (32MB limit).
- $lookup on unindexed field → full scan per doc.
- Forgetting allowDiskUse on big runs.
- Trying to do everything in DB; sometimes app-side is clearer.
Read this next
If you want my aggregation cookbook, it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .