p:: System Design

A single point of failure (SPOF) is a component whose failure brings down the entire system. High-availability architectures remove SPOFs by adding redundancy, so that another component can take over when one fails.
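To make the effect of redundancy concrete, here is a minimal sketch of the usual availability arithmetic. It assumes each copy is up with the same probability and that copies fail independently. Real failures are often correlated (shared power, network, or software bugs), so treat the result as an upper bound. The function name `parallel_availability` is made up for this illustration.

```python
def parallel_availability(a: float, n: int) -> float:
    """Availability of n redundant copies of a component.

    Assumes each copy is up with probability `a` and that copies fail
    independently, so the group is down only when all n copies are down.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - (1.0 - a) ** n


if __name__ == "__main__":
    # Compare a lone server (a SPOF) with two and three redundant copies.
    for copies in (1, 2, 3):
        print(copies, parallel_availability(0.99, copies))
```

Under these assumptions, a component that is available 99% of the time reaches roughly 99.99% when a second independent copy is added.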

Examples

  • A single database server with no replicas
  • A single cache server (no fallback)
  • A single vertically scaled machine (adding power to one machine hits hardware limits and leaves nothing to fail over to)

Mitigation Strategies

  • Replication — multiple database replicas (see Database)
  • Multiple cache servers across data centers (see System Design)
  • Load balancers — distribute traffic across redundant servers
  • Horizontal scaling — add more servers rather than one big server
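
The load-balancing and horizontal-scaling strategies above can be sketched as a round-robin picker that skips unhealthy servers. This is an illustrative toy, not a production balancer: the class name `RoundRobinBalancer`, the `AllBackendsDown` exception, the server names, and the `is_healthy` callback are all hypothetical, and a real deployment would rely on a dedicated load balancer with active health checks.

```python
from typing import Callable, Sequence


class AllBackendsDown(Exception):
    """Raised when no redundant backend passes its health check."""


class RoundRobinBalancer:
    """Rotate across redundant backends, skipping any that are unhealthy."""

    def __init__(self, backends: Sequence[str],
                 is_healthy: Callable[[str], bool]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._is_healthy = is_healthy  # hypothetical health-check hook
        self._next = 0                 # index where the next search starts

    def pick(self) -> str:
        """Return the next healthy backend in rotation order."""
        count = len(self._backends)
        for offset in range(count):
            index = (self._next + offset) % count
            candidate = self._backends[index]
            if self._is_healthy(candidate):
                # Resume after the chosen backend on the next call.
                self._next = (index + 1) % count
                return candidate
        raise AllBackendsDown("no healthy backend available")


if __name__ == "__main__":
    down = {"app-2"}  # simulate one failed server
    balancer = RoundRobinBalancer(
        ["app-1", "app-2", "app-3"],
        is_healthy=lambda name: name not in down,
    )
    print([balancer.pick() for _ in range(4)])
```

With one of three servers down, traffic keeps flowing to the remaining two. Note that a single balancer instance would itself be a SPOF, so in practice the balancer is usually deployed redundantly as well.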