
Load Balancer Fundamentals

A load balancer sits in front of service instances and routes traffic so no single backend becomes a bottleneck or single point of failure.

[Figure: Load balancer request flow]

  • Scalability: add/remove instances without changing client behavior.
  • Availability: route away from unhealthy nodes.
  • Performance: distribute load based on policy and health.
  • Safe deployments: rolling, blue/green, canary traffic splits.

Without a load balancer, a single server outage or traffic spike can take down the whole system.

Load balancing algorithms

| Algorithm | How it picks a backend | Best when | Risk |
| --- | --- | --- | --- |
| Round robin | Rotate through servers in order | Similar instances | Ignores live load |
| Weighted round robin | Rotate with capacity weights | Mixed-size instances | Weights can become stale |
| Least connections | Fewest active connections | Long-lived requests/websockets | Connection count != true load |
| Least response time | Fastest current responder | Variable latency workloads | Needs continuous telemetry |
| IP hash / consistent hash | Same client -> same server | Affinity/state locality | Uneven load possible |
| Random (power of two choices) | Sample two backends at random, pick the less loaded one | Very large pools | Less predictable fairness |
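
As a rough illustration of three rows in the table, the following minimal sketch implements round robin, least connections, and power of two choices over an in-memory pool. The `Backend` class and its `active_connections` field are hypothetical stand-ins for whatever load signal a real balancer tracks.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Backend:
    # Hypothetical backend record; a real balancer tracks this per connection.
    name: str
    active_connections: int = 0


def round_robin(backends):
    """Return an iterator that cycles through the backends in order."""
    return itertools.cycle(backends)


def least_connections(backends):
    """Pick the backend with the fewest active connections."""
    return min(backends, key=lambda b: b.active_connections)


def power_of_two_choices(backends, rng=random):
    """Sample two distinct backends at random and keep the less loaded one."""
    first, second = rng.sample(backends, 2)
    if first.active_connections <= second.active_connections:
        return first
    return second


pool = [Backend("a", 5), Backend("b", 1), Backend("c", 3)]
rr = round_robin(pool)
order = [next(rr).name for _ in range(4)]  # cycles a, b, c, then a again
```

Power of two choices never returns the most loaded backend in the pool, because that backend loses every pairwise comparison. It also avoids scanning the whole pool on each request.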

Rule of thumb:

  • Start with weighted round robin plus health checks.
  • Move to least connections or least response time when the workload is uneven.
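
The starting point above could look roughly like this sketch, which uses the "smooth" variant of weighted round robin and skips backends marked unhealthy. The `Backend` fields, the `pick` function, and the weights are assumptions for illustration. The sketch also assumes the `healthy` flag is maintained elsewhere by a health checker.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int          # relative capacity (assumed to be configured by hand)
    healthy: bool = True  # assumed to be updated by a separate health checker
    current: int = 0      # running score used by smooth weighted round robin


def pick(backends):
    """Smooth weighted round robin over healthy backends only."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backends")
    total = sum(b.weight for b in candidates)
    # Every candidate gains its weight; the leader is chosen and pays the total.
    for b in candidates:
        b.current += b.weight
    chosen = max(candidates, key=lambda b: b.current)
    chosen.current -= total
    return chosen


pool = [Backend("big", 3), Backend("small", 1)]
sequence = [pick(pool).name for _ in range(4)]
```

With weights 3 and 1, `big` receives three of every four picks, interleaved rather than sent in a burst. Once `big` is marked unhealthy, all traffic goes to `small`.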

Production balancers should:

  • use active health checks (an HTTP /health endpoint, TCP connect checks),
  • eject failed backends quickly,
  • retry with a bounded budget,
  • avoid retry storms with backoff and circuit breaking,
  • support connection draining during deploys.
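
A minimal sketch of how bounded retries, backoff, and ejection can fit together is shown below. It uses passive ejection (counting consecutive failures on real traffic) rather than the active probes listed above. `Backend`, `send_with_retries`, `eject_after`, and the default limits are hypothetical names and values, not any particular balancer's API.

```python
import random
import time


class Backend:
    def __init__(self, name, eject_after=3):
        self.name = name
        self.eject_after = eject_after  # consecutive failures before ejection
        self.consecutive_failures = 0

    @property
    def ejected(self):
        return self.consecutive_failures >= self.eject_after


def send_with_retries(request, backends, send, max_attempts=3,
                      base_delay=0.05, sleep=time.sleep):
    """Try at most max_attempts times, backing off between attempts."""
    last_error = None
    for attempt in range(max_attempts):
        healthy = [b for b in backends if not b.ejected]
        if not healthy:
            break  # nothing left to try; fail fast instead of hammering
        backend = random.choice(healthy)
        try:
            response = send(backend, request)
            backend.consecutive_failures = 0  # success resets the counter
            return response
        except ConnectionError as exc:
            backend.consecutive_failures += 1
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff with jitter so clients do not retry in sync.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("request failed after bounded retries") from last_error
```

The sketch never re-admits an ejected backend. A production setup would typically restore it after a cool-down period or a passing health probe, and would drain in-flight connections before removing it during a deploy.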

Interview line:

“I treat load balancing and health checks as one system; routing without good health signals causes cascading failures.”

Stateful vs stateless balancing

| Mode | How it works | Pros | Cons |
| --- | --- | --- | --- |
| Stateful affinity | Same user/session tends to hit same backend | Session locality, cache hits | Harder failover and rebalance |
| Stateless routing | Every request routed independently | Easy scaling and recovery | Needs shared session/state store |
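
One common way to get affinity without the balancer storing a per-session table is consistent hashing on a client key. The sketch below is a simplified hash ring with virtual nodes. `HashRing`, `lookup`, the `vnodes` count, and the use of SHA-256 are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, backends, vnodes=100):
        # Each backend owns several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in backends
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, client_key):
        """Return the backend owning the first point clockwise from the key."""
        index = bisect.bisect_right(self._points, _hash(client_key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["a", "b", "c"])
owner = ring.lookup("203.0.113.7")  # the same key always maps to the same backend
```

When a backend is removed, only the keys that mapped to it move. Keys owned by the surviving backends stay where they were, which softens the failover cost noted in the table.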

Preferred pattern:

  • Keep the balancer stateless.
  • Externalize session/state (Redis/DB) when possible.
  • Use sticky sessions only when unavoidable.
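
The preferred pattern can be sketched as follows: application instances hold no session data themselves and read and write it through a shared store, so any instance can serve any request. `SessionStore` is a plain dict standing in for Redis or a database, and `AppInstance` and `handle_add_to_cart` are hypothetical names.

```python
class SessionStore:
    """Stand-in for an external store such as Redis; a dict for illustration."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class AppInstance:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared store, so no per-instance session state

    def handle_add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        self.store.put(session_id, {**session, "cart": cart})
        return cart


store = SessionStore()
instance_a = AppInstance("a", store)
instance_b = AppInstance("b", store)

# Two requests from one session land on different instances.
instance_a.handle_add_to_cart("s1", "book")
cart = instance_b.handle_add_to_cart("s1", "pen")
```

Because the second instance sees the item added through the first, the balancer is free to route each request independently.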

Layer 4 vs Layer 7 balancing

| Layer | Routes by | Strength | Limitation | Typical use |
| --- | --- | --- | --- | --- |
| L4 (transport) | IP + port + connection metadata | High throughput, low overhead | No HTTP-aware routing | TCP/UDP services, simple pass-through |
| L7 (application) | URL, host, header, cookie, method | Smart routing, auth/policy aware | More CPU/latency overhead | API gateways, multi-tenant HTTP apps |

Examples:

  • At L7, route /api/* to the API service and /static/* to a CDN or static backend.
  • Use L4 for raw TCP services where deep inspection is unnecessary.
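
The L7 example can be sketched as a prefix-based routing table. The pool names (`api-service`, `static-backend`, `web-frontend`) and the `route` function are made up for illustration. A real L7 balancer would also match on host, headers, or cookies.

```python
# Ordered list of (path prefix, backend pool); the first match wins.
ROUTES = [
    ("/api/", "api-service"),
    ("/static/", "static-backend"),
]
DEFAULT_POOL = "web-frontend"  # hypothetical catch-all pool


def route(path):
    """Return the backend pool for an HTTP request path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


api_pool = route("/api/users/42")
static_pool = route("/static/app.css")
fallback_pool = route("/")
```

An L4 balancer could not make this decision, since it only sees addresses, ports, and connection metadata, not the request path.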