What's the biggest K8s cost lever?

Right-sizing requests. Most clusters waste 50–70% of capacity because requests are sized for peak load rather than realistic usage. VPA recommendations plus a tight review process typically reclaim 30–50% of compute spend.

Should I use Karpenter or Cluster Autoscaler?

Karpenter for new clusters in 2026: faster scaling, better bin packing, native spot support. Cluster Autoscaler still works, but Karpenter is the modern default on AWS; providers for other clouds exist at varying levels of maturity, so check yours before committing.

Kubernetes Cost Engineering in 2026 — Where the Money Actually Goes

A typical Kubernetes cluster wastes 50%+ of capacity. The cost is real, the fixes are boring. This post is the practical playbook.

Right-size requests

Every pod has CPU/memory requests, and the scheduler reserves that capacity whether or not the pod uses it. Inflated requests mean the cluster provisions more nodes than the workload actually needs.

resources:
  requests:
    cpu: "100m"        # actual usage in your monitoring
    memory: "256Mi"
  limits:
    cpu: "500m"        # higher than request to allow bursts
    memory: "512Mi"

Use VPA recommendations:

# -A 15 is enough to show the lower bound, target, and upper bound for one container
kubectl describe vpa my-app | grep -A 15 "Recommendation:"

Apply recommendations; recheck monthly.
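
A VPA object has to exist before there is anything to describe. A minimal sketch in recommendation-only mode, assuming the VPA components are installed in the cluster and that my-app is a Deployment (the kind is an assumption; the name matches the command above):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment        # assumption: the workload is a Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"       # compute recommendations only; never evict or resize pods
```

With updateMode "Off" the recommendations only appear in the object's status, and you copy them into the manifest yourself. That fits a review-then-apply loop better than letting VPA change pods on its own.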

Karpenter for node provisioning

Cluster Autoscaler can only scale predefined node groups; Karpenter provisions nodes directly from the requirements of pending pods:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata: { name: default }
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: [amd64, arm64]
        - key: karpenter.sh/capacity-type
          operator: In
          values: [spot, on-demand]
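      nodeClassRef:
        group: karpenter.k8s.aws   # assumes the AWS provider; v1 requires group and kind
        kind: EC2NodeClass
        name: default
  limits: { cpu: 1000 }
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # v1 name; WhenUnderutilized was v1beta1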

Bin-packs better. Spins up exact-size nodes. Consolidates underutilized nodes. Typical savings: 20–40% on node spend.

Spot capacity

For interruptible workloads (CI, background jobs, batch ML), use spot:

Up to 90% off on EC2.
Workloads must handle SIGTERM (graceful shutdown).
Karpenter handles spot pool fallback automatically.

For stateful or user-facing services, stick with on-demand. Mixed pool gets you most of the savings.
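
One way to pin an interruptible workload to spot, sketched under the assumption that nodes are provisioned by Karpenter, which sets the karpenter.sh/capacity-type node label. The workload name, image, and grace period are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker                      # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels: { app: batch-worker }
  template:
    metadata:
      labels: { app: batch-worker }
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: spot  # only schedule onto spot nodes
      terminationGracePeriodSeconds: 90   # time to finish in-flight work after SIGTERM
      containers:
        - name: worker
          image: registry.example.com/batch-worker:latest   # placeholder image
          resources:
            requests: { cpu: "250m", memory: "256Mi" }
```

On EC2 a spot interruption comes with a two-minute warning, so the grace period should fit comfortably inside that window.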

HPA + VPA

HPA: scales replicas based on CPU / memory / custom metric.
VPA: adjusts requests for existing pods.

Run both:

VPA recommends realistic requests.
HPA scales replicas based on load.

The conflict: if HPA and VPA both act on CPU, they fight over the same signal. Avoid it by giving them different metrics: let VPA size requests from CPU and memory usage, and drive HPA from RPS or queue depth via KEDA.
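
A sketch of the HPA side driven by request rate through KEDA. It assumes KEDA is installed, that a Prometheus server is reachable at the address shown, and that the app exports a counter named http_requests_total; all three are assumptions, not details from this post:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-rps
spec:
  scaleTargetRef:
    name: my-app                  # Deployment to scale, in the same namespace
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        # assumed Prometheus address and metric name; replace with your own
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{app="my-app"}[2m]))
        threshold: "50"           # aim for roughly 50 requests/second per replica
```

KEDA creates and manages the HPA for the target, so don't define a separate HPA for the same Deployment.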

Topology spread

Don’t over-spread:

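topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:              # without a selector the constraint counts no pods
      matchLabels:
        app: my-app             # hypothetical label; use your pod template's labels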

ScheduleAnyway treats the spread as a preference, so the scheduler can still bin-pack. DoNotSchedule enforces strict spread, which can force extra nodes into under-used zones and gets expensive.

Cluster sizing

Single huge cluster vs. many small:

Huge: better bin-packing, lower per-cluster overhead, cheaper.
Small: blast radius bounded, easier RBAC, easier upgrade.

For most teams under 500 nodes: one cluster per environment.

Observability of cost

Per-namespace, per-team cost visibility:

OpenCost: open-source, works with Prometheus.
Kubecost: commercial, polished.

Show teams their own costs. The visibility alone tends to change behavior.

For broader observability, see Observability 2.0.

Common mistakes

1. Setting requests = limits

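Wastes capacity. If requests equal limits, the request has to be sized for peak, so the scheduler reserves burst headroom that sits idle most of the time. Set limits higher than requests and let pods burst into them.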

2. Aggressive CPU limits

CPU limits cause throttling, sometimes unnecessarily. For most workloads, set high limits or no CPU limit at all (let memory be the constraint).
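
A resources block following that advice might look like the fragment below. The values are illustrative, not recommendations:

```yaml
resources:
  requests:
    cpu: "250m"        # the scheduler still reserves this much CPU
    memory: "512Mi"
  limits:
    memory: "512Mi"    # memory stays capped; exceeding it gets the container OOM-killed
    # no cpu limit: the container can use idle CPU on the node without being throttled
```

If the namespace has a LimitRange with a default CPU limit, that default is applied to containers that omit one, so check for it before relying on this.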

3. Idle dev clusters running 24/7

Scale them down on nights and weekends. Easy 30%+ savings on non-prod.
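
One low-tech way to do that is a CronJob that scales every Deployment in a dev namespace to zero each evening. This is a sketch: the namespace, service account, and image are hypothetical, and the service account needs RBAC permission to scale Deployments.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev
  namespace: dev
spec:
  schedule: "0 20 * * 1-5"                 # 20:00 Monday to Friday
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: dev-scaler   # hypothetical; needs RBAC to scale deployments
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: registry.example.com/kubectl:latest   # placeholder: any image that ships kubectl
              command: ["kubectl", "scale", "deployment", "--all", "--replicas=0", "-n", "dev"]
```

The savings only materialize if the node autoscaler then removes the emptied nodes. A matching morning job has to restore the replica counts, which this sketch does not record.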

4. Orphaned PVCs

Pods get deleted, their PVCs linger, and you keep paying for the volumes. Sweep monthly for claims that no pod mounts.
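
For StatefulSets, recent Kubernetes versions can clean up claims automatically through a retention policy. The fragment below shows only the relevant field and is not a complete manifest; the name is hypothetical, and you should confirm your cluster version supports the field:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-db                         # hypothetical name
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete               # remove PVCs when the StatefulSet is deleted
    whenScaled: Retain                # keep PVCs on scale-down so data survives a scale-up
  # serviceName, selector, template, and volumeClaimTemplates omitted for brevity
```

Both settings default to Retain, which is why claims pile up unless someone changes the policy or sweeps by hand.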

5. NAT Gateway egress

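In AWS, NAT Gateway egress is expensive: you pay a per-GB processing charge on top of the hourly rate. Use VPC endpoints for traffic to AWS services such as S3 so it bypasses the NAT.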
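
As an illustration, an S3 gateway endpoint can be declared in a few lines of CloudFormation. This is a sketch that assumes you manage the VPC with CloudFormation; the parameter names are made up for the example:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  PrivateRouteTableId:
    Type: String            # route table used by the cluster's private subnets
Resources:
  S3GatewayEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcId: !Ref VpcId
      ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
      VpcEndpointType: Gateway
      RouteTableIds:
        - !Ref PrivateRouteTableId
```

Gateway endpoints (S3 and DynamoDB) carry no additional charge. Interface endpoints for other services are billed hourly and per GB, so compare them against your NAT traffic before adding one for every service.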

What I’d ship today

For a 2026 Kubernetes cluster:

Karpenter for node provisioning.
VPA recommendations reviewed monthly and applied where they drift.
HPA on real load metrics (KEDA-driven).
Spot pool for interruptibles.
OpenCost dashboards visible to teams.
Quarterly cost review as a calendar event.

Boring habits. 30–50% lower bills than the typical cluster.

Read this next

If you want my Karpenter + VPA + OpenCost setup, it’s at rajpoot.dev.

