p:: System Design
A Single Point of Failure (SPOF) is a component of a system whose failure causes the entire system to fail. High-availability architectures eliminate SPOFs through redundancy.
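To make the effect of redundancy concrete, here is a rough sketch of the usual availability arithmetic. It assumes that replicas fail independently of each other, which real systems only approximate (shared power, network, or software bugs can take out several copies at once). The function name `combined_availability` and the 99% figure are illustrative assumptions, not taken from any particular system.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies, assuming independent failures.

    The group is unavailable only when every copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas


# Hypothetical component that is up 99% of the time.
for n in (1, 2, 3):
    print(n, round(combined_availability(0.99, n), 6))
```

Under these assumptions, one copy is available 99% of the time, while two redundant copies are available roughly 99.99% of the time.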
Examples
- A single database server with no replicas
- A single cache server (no fallback)
- A single vertically scaled server (adding power to one machine hits hardware limits, and that one machine remains a SPOF)
Mitigation Strategies
- Replication — multiple database replicas (see Database)
- Multiple cache servers across data centers (see System Design)
- Load balancers — distribute traffic across redundant servers
- Horizontal scaling — add more servers rather than one big server
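The load-balancing and horizontal-scaling ideas above can be sketched as a small round-robin dispatcher that skips a failed server and falls back to the next one. This is a minimal illustration only: `FailoverBalancer`, `NoHealthyBackend`, and the stand-in backends are hypothetical names, backends are modelled as plain functions, and a failure is assumed to surface as a `ConnectionError`.

```python
from typing import Callable, Iterable


class NoHealthyBackend(Exception):
    """Raised when every redundant backend has failed."""


class FailoverBalancer:
    """Round-robin over redundant backends, skipping ones that fail."""

    def __init__(self, backends: Iterable[Callable[[str], str]]) -> None:
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._next = 0  # index of the backend to try first on the next request

    def handle(self, request: str) -> str:
        # Try each backend at most once, starting from the round-robin position.
        for attempt in range(len(self._backends)):
            index = (self._next + attempt) % len(self._backends)
            try:
                response = self._backends[index](request)
            except ConnectionError:
                continue  # this backend is down; try the next one
            self._next = (index + 1) % len(self._backends)
            return response
        raise NoHealthyBackend("all backends failed")


def healthy(name: str) -> Callable[[str], str]:
    """Build a stand-in backend that always succeeds."""
    def backend(request: str) -> str:
        return f"{name} handled {request}"
    return backend


def broken(request: str) -> str:
    """Stand-in backend that is always unreachable."""
    raise ConnectionError("server unreachable")


balancer = FailoverBalancer([broken, healthy("server-b"), healthy("server-c")])
print(balancer.handle("GET /"))  # server-a is down, so server-b answers
print(balancer.handle("GET /"))  # round-robin moves on to server-c
```

With three backends, the loss of one is invisible to the caller. Note that a balancer running as a single process would itself be a SPOF, so in practice it also needs a redundant peer.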