In 2026, eBPF is the cloud-native networking layer. Cilium ships as the default or a supported CNI option across GKE, EKS, and AKS. The “service mesh” conversation has moved from sidecars to per-node eBPF programs. The CPU overhead is typically under 1%; the operational overhead has nearly vanished.

This post is the working knowledge: why Cilium won, what’s in the modern eBPF observability stack, when sidecars still win, and how to migrate.

What changed

Three years ago, eBPF was novel. The skepticism was reasonable: it’s literally a virtual machine inside the kernel. Today:

  • Cilium is built into or supported by every major managed Kubernetes (GKE Dataplane V2, EKS with Cilium CNI, AKS Advanced Networking).
  • Sidecarless service meshes (Cilium Service Mesh, Istio’s ambient mode) are mainstream.
  • 67% of Kubernetes-at-scale teams in CNCF surveys use at least one eBPF observability tool.

The “scary kernel-level magic” reputation is gone. eBPF programs are verified and sandboxed; the kernel verifier rejects unsafe ones at load time.

The mental model

A sidecar mesh injects a proxy (Envoy) into every pod. Two containers, double the network hops, double the memory:

[ pod A: app + envoy ] ←→ [ pod B: envoy + app ]

A Cilium mesh runs eBPF programs on each node that handle routing, policy, mTLS, and observability for every pod on that node. No sidecar:

[ pod A: app ] ── eBPF (kernel) ── eBPF (kernel) ── [ pod B: app ]

For L7 features (HTTP routing, gRPC), Cilium uses a per-node Envoy (one shared instance) instead of one per pod. The cost goes from N sidecars to 1 daemon.

Result: under 1% CPU overhead per node, no pod restarts to enable, no sidecar memory tax.

What Cilium does

CNI (the table stakes)

Pod-to-pod networking, IPAM, NodePort, LoadBalancer, ExternalIPs. Faster than kube-proxy because eBPF replaces iptables for service routing — important once your cluster has 1000+ services.
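Why the service count matters can be shown with a toy model. The sketch below is not Cilium or kube-proxy code; it assumes a hypothetical table of 1000 services and compares an ordered rule list scanned top to bottom (roughly how iptables-mode service routing behaves) with a hash-table lookup (roughly how an eBPF map behaves). All addresses are made up for illustration.

```python
"""Toy comparison: linear rule scan vs. hash lookup for service routing."""
import timeit

NUM_SERVICES = 1000  # hypothetical cluster size

# Hypothetical service table: (cluster IP, port) -> backend address.
services = {
    (f"10.96.{i // 256}.{i % 256}", 80): f"10.0.{i // 256}.{i % 256}"
    for i in range(NUM_SERVICES)
}
rules = list(services.items())  # ordered rule list, scanned top to bottom


def lookup_linear(key):
    """Walk the rule list until one matches: O(n) in the number of services."""
    for match, backend in rules:
        if match == key:
            return backend
    return None


def lookup_hash(key):
    """Single hash-table lookup: O(1) on average, whatever the table size."""
    return services.get(key)


if __name__ == "__main__":
    worst_case = rules[-1][0]  # the last rule is the slowest to reach linearly
    linear = timeit.timeit(lambda: lookup_linear(worst_case), number=2000)
    hashed = timeit.timeit(lambda: lookup_hash(worst_case), number=2000)
    print(f"linear scan: {linear:.4f}s  hash lookup: {hashed:.4f}s")
```

The linear cost grows with every service you add; the hash lookup does not, which is why the gap only becomes visible on large clusters.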

Network Policy

Beyond Kubernetes’ built-in NetworkPolicy, Cilium supports L7 policies:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata: { name: api-allow-frontend }
spec:
  endpointSelector:
    matchLabels: { app: api }
  ingress:
    - fromEndpoints:
        - matchLabels: { app: frontend }
      toPorts:
        - ports: [{ port: "8080", protocol: TCP }]
          rules:
            http:
              - method: "GET"
                path: "/users(/|$)(.*)"
              - method: "POST"
                path: "/users$"

The frontend can GET /users and any path beneath it, or POST to exactly /users, but it cannot DELETE anything. Encoded in YAML, enforced at line rate.
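To sanity-check what those two rules admit, here is a small sketch that replays them in Python. It assumes the path regex must match the whole path and, as a simplification, compares the method by plain equality; `is_allowed` is a hypothetical helper, not part of Cilium.

```python
import re

# (method, path regex) pairs copied from the policy's http rules.
ALLOWED = [
    ("GET", r"/users(/|$)(.*)"),
    ("POST", r"/users$"),
]


def is_allowed(method: str, path: str) -> bool:
    """Return True if any rule matches both the method and the whole path."""
    return any(
        method == rule_method and re.fullmatch(rule_path, path) is not None
        for rule_method, rule_path in ALLOWED
    )


if __name__ == "__main__":
    for method, path in [
        ("GET", "/users"),
        ("GET", "/users/42"),
        ("POST", "/users"),
        ("POST", "/users/42"),
        ("DELETE", "/users/42"),
    ]:
        verdict = "allow" if is_allowed(method, path) else "deny"
        print(f"{method:6} {path:12} {verdict}")
```

Running a table like this against your own regexes before applying a policy is a cheap way to catch a pattern that is broader or narrower than you meant.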

Service mesh

mTLS between pods, traffic shifting, retries, circuit breaking, header rewrites — same features as Istio, no sidecar:

apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata: { name: api-canary }
spec:
  services:
    - { name: api, namespace: default }
  resources:
    - "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
      virtualHosts:
        - name: api
          domains: ["*"]
          routes:
            - match: { prefix: "/" }
              route:
                weightedClusters:
                  clusters:
                    - { name: api-v1, weight: 90 }
                    - { name: api-v2, weight: 10 }

90/10 traffic split for canary. Lives in Cilium’s CRDs, not bolted on.
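If weighted routing is new to you, the behavior is easy to model. This is a rough simulation, not Envoy's implementation: it assumes each request independently picks a cluster with probability proportional to its weight, using the weights from the config above. `simulate` and `pick_cluster` are hypothetical names.

```python
import random
from collections import Counter

# Weights copied from the weightedClusters block.
WEIGHTS = {"api-v1": 90, "api-v2": 10}


def pick_cluster(rng: random.Random) -> str:
    """Pick one cluster name with probability proportional to its weight."""
    names = list(WEIGHTS)
    return rng.choices(names, weights=[WEIGHTS[n] for n in names], k=1)[0]


def simulate(requests: int, seed: int = 0) -> Counter:
    """Route `requests` simulated requests and count hits per cluster."""
    rng = random.Random(seed)
    return Counter(pick_cluster(rng) for _ in range(requests))


if __name__ == "__main__":
    counts = simulate(10_000)
    for name, hits in sorted(counts.items()):
        print(f"{name}: {hits} requests ({hits / 10_000:.1%})")
```

The practical takeaway: a 10% canary only sees meaningful traffic if the service has meaningful traffic. At low request rates, the canary share over a short window can drift well away from 10%.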

Observability — Hubble

$ hubble observe --namespace default
Apr 29 09:42:01.234 default/frontend-7c4fb7  → default/api-6cd45b  HTTP 200 GET /users
Apr 29 09:42:01.245 default/api-6cd45b       → default/postgres    TCP  to:5432

Hubble is Cilium’s flow-level observability. Every connection is captured by the eBPF program — no app instrumentation. Pair with Hubble UI for a service map that auto-builds from real traffic.
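The service map idea is simple enough to sketch by hand. The snippet below parses the two sample lines shown above into a caller-to-callee map. It assumes that illustrative text layout (source, arrow, destination); for anything real, Hubble's structured JSON output would be a sturdier input than scraped text. `build_service_map` and `callers_of` are hypothetical helpers.

```python
from collections import defaultdict

# The two illustrative flow lines from the hubble observe output.
SAMPLE = """\
Apr 29 09:42:01.234 default/frontend-7c4fb7  → default/api-6cd45b  HTTP 200 GET /users
Apr 29 09:42:01.245 default/api-6cd45b       → default/postgres    TCP  to:5432
"""


def build_service_map(text: str) -> dict:
    """Map each source endpoint to the set of destinations it talked to."""
    edges = defaultdict(set)
    for line in text.splitlines():
        if "→" not in line:
            continue  # skip anything that is not a flow line
        left, right = line.split("→", 1)
        source = left.split()[-1]        # last token before the arrow
        destination = right.split()[0]   # first token after the arrow
        edges[source].add(destination)
    return dict(edges)


def callers_of(service_map: dict, target: str) -> list:
    """Return every source that opened a flow to `target`, sorted."""
    return sorted(src for src, dsts in service_map.items() if target in dsts)


if __name__ == "__main__":
    service_map = build_service_map(SAMPLE)
    print(callers_of(service_map, "default/postgres"))
```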

Runtime security — Tetragon

Tetragon hooks into the kernel for security events:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata: { name: detect-pod-shell }
spec:
  kprobes:
    - call: "sys_execve"
      syscall: true
      args:
        - { index: 0, type: "string" }
      selectors:
        - matchArgs:
            - { index: 0, operator: "Equal", values: ["/bin/sh", "/bin/bash"] }

This detects shell execution inside pods. Pair it with rules that alert on (or kill) the offending process. It covers the ground Falco used to, faster and more deterministically.
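One thing worth internalizing about the selector: `Equal` is an exact match on the argument. The sketch below models that with a hypothetical `ExecEvent` type (not Tetragon's real event schema) to show which executions the policy would and would not flag.

```python
from dataclasses import dataclass

# Values copied from the matchArgs selector in the policy.
SHELLS = {"/bin/sh", "/bin/bash"}


@dataclass(frozen=True)
class ExecEvent:
    """Hypothetical, simplified exec event; not Tetragon's event format."""
    pod: str
    binary: str  # argument 0 of execve: the path being executed


def is_shell_exec(event: ExecEvent) -> bool:
    """Exact-match check, mirroring the Equal operator."""
    return event.binary in SHELLS


def alerts(events) -> list:
    """Return one alert line per event that trips the shell check."""
    return [f"shell exec in {e.pod}: {e.binary}" for e in events if is_shell_exec(e)]


if __name__ == "__main__":
    sample = [
        ExecEvent("default/api-6cd45b", "/bin/bash"),
        ExecEvent("default/api-6cd45b", "/usr/bin/python3"),
        ExecEvent("default/frontend-7c4fb7", "/usr/bin/bash"),  # different path: not matched
    ]
    for line in alerts(sample):
        print(line)
```

Note the third event: a shell at a different path slips past an exact-match list, so a production policy probably wants a broader set of paths than the two shown here.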

The 2026 stack

For a production Kubernetes cluster in 2026, the eBPF stack:

| Layer | Tool |
|---|---|
| CNI + service mesh | Cilium |
| Network observability | Hubble |
| App-level observability (auto) | Pixie |
| Runtime security | Tetragon |
| Custom traces | OpenTelemetry (see OpenTelemetry End-to-End) |

Pixie is worth special mention: it auto-instruments HTTP, gRPC, MySQL, Postgres traffic from kernel observation. No app code change. You get a service map and per-request traces “for free.”

Cilium vs Istio in 2026

| | Cilium | Istio (sidecar) | Istio Ambient (sidecarless) |
|---|---|---|---|
| Per-pod overhead | None | ~50 MB + 2% CPU | ~5 MB + small CPU |
| L7 features | Yes (per-node Envoy) | Yes (per-pod) | Yes (waypoint proxies) |
| Operational simplicity | Highest | Lowest | Mid |
| Multi-cluster mesh | Cilium Cluster Mesh | Yes | Yes |
| Maturity (2026) | Production-grade | Battle-tested | Stable, gaining ground |
| Default in managed K8s | Default or supported option (GKE, EKS, AKS) | Possible | Possible |

For new clusters in 2026: Cilium. For Istio shops happy with sidecars: stay where you are; the migration cost is real. For Istio shops struggling with sidecar overhead: ambient mode or Cilium.

Migration from Istio — high level

If you’re moving from Istio sidecar to Cilium, the rough sequence:

  1. Install Cilium as the CNI (or migrate from existing CNI). This is the heaviest step; managed clusters often need recreation.
  2. Run both in parallel for a phase. Cilium handles networking; Istio sidecars stay for mesh features.
  3. Migrate policies namespace by namespace from Istio AuthorizationPolicy to CiliumNetworkPolicy.
  4. Migrate mTLS — enable Cilium mutual auth; remove Istio mTLS.
  5. Migrate traffic management — VirtualService → CiliumEnvoyConfig. The hardest step; complex VirtualServices don’t all translate cleanly.
  6. Remove sidecars namespace by namespace.
  7. Migrate observability to Hubble + Tetragon.

Plan a quarter. Don’t try to do it in a sprint.

Network policy that’s not painful

NetworkPolicy gets a bad reputation because the default-deny model is unforgiving. The 2026 approach:

# Default deny in every workload namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny, namespace: api }
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]

Then explicit allows per service. Cilium’s L7 policies make this practical without exploding the policy count.

For day-to-day developer ergonomics, ship a default policy generator with each service template (Platform Engineering and IDPs). Developers don’t write policy by hand; they declare app→app dependencies and the platform generates the policy.
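A generator like that does not need to be elaborate. Here is a minimal sketch, assuming dependencies are declared as (caller, callee) pairs, that every service is labeled `app: <name>`, and that every service listens on one hypothetical default port. It emits CiliumNetworkPolicy objects as JSON, which `kubectl apply` accepts alongside YAML. The `generate_policies` function and the `-allow-ingress` naming scheme are illustrative, not an existing tool.

```python
import json
from collections import defaultdict


def generate_policies(dependencies, port="8080"):
    """Turn (caller, callee) pairs into one ingress-allow policy per callee.

    `port` is a hypothetical default; a real generator would read it from
    each service's own declaration.
    """
    callers_by_target = defaultdict(set)
    for caller, target in dependencies:
        callers_by_target[target].add(caller)

    policies = []
    for target in sorted(callers_by_target):
        policies.append({
            "apiVersion": "cilium.io/v2",
            "kind": "CiliumNetworkPolicy",
            "metadata": {"name": f"{target}-allow-ingress"},
            "spec": {
                "endpointSelector": {"matchLabels": {"app": target}},
                "ingress": [{
                    "fromEndpoints": [
                        {"matchLabels": {"app": caller}}
                        for caller in sorted(callers_by_target[target])
                    ],
                    "toPorts": [{"ports": [{"port": port, "protocol": "TCP"}]}],
                }],
            },
        })
    return policies


if __name__ == "__main__":
    declared = [("frontend", "api"), ("worker", "api"), ("api", "postgres")]
    for policy in generate_policies(declared):
        print(json.dumps(policy, indent=2))
```

Sorting the targets and callers keeps the output stable between runs, which matters once the generated policies are checked into Git and diffed in review.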

mTLS without certs

Cilium mTLS uses SPIFFE identities (workload identities, not user certs). Each workload gets a SPIFFE ID tied to its Cilium security identity. Cilium handles cert issuance and rotation. Application code does not change.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata: { name: api-mtls }
spec:
  endpointSelector: { matchLabels: { app: api } }
  ingress:
    - authentication:
        mode: required           # mTLS required
      fromEndpoints:
        - matchLabels: { app: frontend }

Done. mTLS is enforced at the kernel; no app changes; no per-pod overhead.

Multi-cluster

Cilium Cluster Mesh federates services across clusters. A pod in cluster A can call api.default.svc.cluster.local and reach an instance in cluster B transparently:

apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: default
  annotations:
    service.cilium.io/global: "true"     # makes it cluster-mesh aware
spec:
  selector: { app: api }
  ports: [{ port: 8080 }]

Failover, cross-cluster load balancing, and policy-aware routing all just work. Compare that with multi-cluster Istio meshes, which take real engineering effort.

eBPF observability: the part that surprises

The single biggest unlock from eBPF isn’t networking. It’s observability that needs no code change:

  • Pixie auto-traces HTTP, gRPC, Kafka, MySQL, Postgres. Gives you a service map, per-request flame graphs, query latency breakdowns.
  • Hubble captures every flow. You can ask “what services talked to Postgres in the last hour?” and the kernel knows.
  • Tetragon captures every exec, file open, network connect. Forensics, compliance, intrusion detection.

For a team that hasn’t built application-level instrumentation, this is a lot of value for almost no work.

The 1% myth (and reality)

Cilium has measurable overhead. A few production data points:

  • CNI vs kube-proxy: Cilium is faster on connection establishment (eBPF replaces iptables); slightly heavier on long-running connections.
  • L7 policy: ~50 microseconds per packet for HTTP parsing. Negligible for HTTP traffic; can matter for very high RPS gRPC.
  • Hubble flow capture: <1% CPU per node when sampled; ~3% when capturing full flow logs.

For 99% of clusters, this is below the noise floor. Compare to sidecar overhead: 50–100 MB per pod, 5–10% CPU per pod.
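To put the sidecar side of that comparison in fleet terms, here is the arithmetic as a sketch. The 50–100 MB per-pod range is the figure quoted above; the cluster shape (20 nodes, 30 pods per node) is a hypothetical example, so substitute your own numbers.

```python
# Per-pod sidecar memory range in MB, as quoted in the text.
SIDECAR_MB_PER_POD = (50, 100)


def sidecar_memory_mb(pods: int) -> tuple:
    """Return the (low, high) total sidecar memory in MB for `pods` pods."""
    low, high = SIDECAR_MB_PER_POD
    return pods * low, pods * high


def proxy_instances(nodes: int, pods_per_node: int) -> dict:
    """Count proxies to operate: one per pod (sidecar) vs. one per node."""
    return {"sidecar": nodes * pods_per_node, "per_node": nodes}


if __name__ == "__main__":
    nodes, pods_per_node = 20, 30  # hypothetical cluster shape
    low, high = sidecar_memory_mb(nodes * pods_per_node)
    counts = proxy_instances(nodes, pods_per_node)
    print(f"sidecar memory: {low / 1000:.0f} to {high / 1000:.0f} GB across the cluster")
    print(f"proxies to run: {counts['sidecar']} sidecars vs {counts['per_node']} per-node")
```

For that example cluster the sidecar memory alone lands at 30 to 60 GB, and you are upgrading 600 proxies instead of 20.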

When sidecars still win

A few honest cases where Istio sidecars (or per-pod Envoy) earn their cost:

  • Heavy custom Envoy filter chains. If you’ve built domain-specific WASM filters and need them per-pod, sidecars deliver.
  • Strict per-pod resource isolation. Per-node shared Envoy means one pod’s traffic affects another’s.
  • Specific Istio features (e.g., very fine-grained AuthorizationPolicy) you don’t want to translate.

For everyone else, Cilium is the path of least resistance.

A pragmatic adoption path

  1. Start with CNI. Move to Cilium on a new cluster (managed K8s makes this easy).
  2. Add Hubble for visibility. Free win even before you adopt mesh features.
  3. Adopt L4 NetworkPolicy with default-deny. The platform team writes the templates.
  4. Add L7 policy for high-value services (auth, billing, admin).
  5. Enable mTLS per namespace. Audit, then enforce.
  6. Add Tetragon for runtime security signals.
  7. Add Pixie for app-level traces if your apps aren’t already OTel-instrumented.

Each phase ships value on its own.

Read this next

If you want a Cilium + Hubble + Tetragon Helm chart with sane defaults and a migration runbook from Istio, it’s at rajpoot.dev.

