Service Mesh — Network Intelligence Without Code Changes

Offload cross-cutting network concerns (mTLS, retries, circuit breaking, observability) to a sidecar proxy — no app code changes needed.

When to use

5+ services needing consistent security and observability policies
Polyglot environments where library-based approaches don't scale
Zero-trust networking inside the cluster
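
As a rough illustration of the zero-trust case on Istio, a namespace-wide PeerAuthentication policy can require mTLS for every workload in that namespace. This is a minimal sketch, not a complete security setup: the `default` namespace is an assumption, and sidecar injection is assumed to be enabled already.

```yaml
# Sketch: require mTLS for all workloads in one namespace.
# Assumption: the workloads live in a namespace named "default".
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    mode: STRICT   # reject plaintext traffic to sidecar-injected workloads
```

Rolling this out with mode PERMISSIVE first is a common way to avoid breaking clients that do not have a sidecar yet.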

Tradeoffs

Latency overhead per hop (sidecar proxy in path)
Operational complexity: control plane becomes critical infrastructure

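YAML (Istio)

# VirtualService: retry policy
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: payments-vs
spec:
  hosts: ["payments"]
  http:
    # An HTTP rule needs a route (or redirect/direct response) to be valid
    - route:
        - destination:
            host: payments
      retries:
        attempts: 3              # retries allowed per request, after the first try
        perTryTimeout: 25ms      # timeout applied to each try
        retryOn: gateway-error,connect-failure
---
# DestinationRule: circuit breaker
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: payments-dr
spec:
  host: payments
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5    # eject a host after 5 consecutive 5xx responses
      interval: 10s              # how often hosts are evaluated
      baseEjectionTime: 30s      # minimum time an ejected host stays out

YAML (Linkerd)
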
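# ServiceProfile: per-route retries and timeout, service-wide retry budget
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: payments.default.svc.cluster.local
  namespace: default
spec:
  routes:
    - name: POST /charge
      condition:
        method: POST
        pathRegex: /charge
      isRetryable: true          # routes are not retried unless marked retryable
      timeout: 25ms
  # retryBudget belongs under spec and applies to the whole service, not to one route
  retryBudget:
    retryRatio: 0.2              # retries may add at most 20% extra requests
    minRetriesPerSecond: 10      # retries always allowed on top of the ratio
    ttl: 10s                     # window over which the ratio is calculated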

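Gotcha: A service mesh is infrastructure, not application code. If you implement retries in both the application and the mesh, the attempts multiply (an app that tries a call 3 times behind a mesh that retries each try 3 times can send up to 12 requests upstream), and it becomes unclear which layer owns the policy.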