MongoDB sharding.

Architecture

[clients] → [mongos] → [shard1 (RS)] [shard2 (RS)] [shard3 (RS)]
                       [config servers (RS)]

Enable

sh.enableSharding("myapp")
sh.shardCollection("myapp.events", { user_id: "hashed" })

Shard key choice

Good shard key:

  • High cardinality: many unique values.
  • Even distribution: avoids hot shards.
  • Matches common queries: queries can be targeted.
  • Monotonically NOT increasing: avoid _id: 1 (all writes to one shard).

Examples:

  • { user_id: "hashed" }: even distribution; bad for range queries.
  • { tenant_id: 1, _id: 1 }: tenant-isolated; queries within tenant fast.
  • { created_date: 1, user_id: 1 }: time-based + secondary.

Hashed shard key

sh.shardCollection("myapp.events", { user_id: "hashed" })

Auto-creates hashed index. Writes distributed.

Targeted vs scatter-gather

Query with shard key → routed to one shard (fast).
Query without → all shards queried (slow).

Try to include shard key in queries.

Balancer

sh.getBalancerState()
sh.startBalancer()
sh.stopBalancer()

Background process redistributes chunks. Run during off-hours if needed:

sh.setBalancerState(true)
db.settings.updateOne(
    { _id: "balancer" },
    { $set: { activeWindow: { start: "23:00", stop: "06:00" } } }
)

Chunks

sh.status()
db.adminCommand({ getShardDistribution: 1 })

Default chunk: 128MB. Splits + migrates as it grows.

Jumbo chunks

Chunk exceeds size and can’t split (bad shard key choice). Manual fix:

sh.splitFind("myapp.events", { user_id: "x" })
sh.moveChunk("myapp.events", { user_id: "x" }, "shard02")

Zones

sh.addShardToZone("shard1", "us")
sh.updateZoneKeyRange("myapp.users", { region: "us", _id: MinKey }, { region: "us", _id: MaxKey }, "us")

Geo / tenant routing.

Refining shard key

sh.reshardCollection("myapp.events", { user_id: 1, created: 1 })

Available in newer versions; expensive operation.

Connection

mongodb://mongos1,mongos2:27017/myapp

Connect to mongos, not shards.

When to shard

  • single-node RAM for working set.

  • Write throughput exceeding single node.
  • Geographic distribution.

If unsure: scale vertically first; shard last.

Common mistakes

  • Sharding too early.
  • _id: 1 shard key (all writes to one shard).
  • No shard key in queries → scatter-gather.
  • Imbalanced data (one tenant huge).
  • Forgetting hashed index for hashed shard key.

Read this next

If you want my sharding playbook, it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .