p:: Computer Science

Database

Scaling

Vertical

  • Scale Up
  • Add more power (CPU, RAM, …)
  • Has a hard limit
  • No failover or redundancy

Horizontal

  • Scale Out
  • Add more servers

Load Balancer

  • Users connect to the public IP of the load balancer; the load balancer forwards traffic to the web servers over their private IPs
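A minimal round-robin sketch of this flow, assuming three web servers behind the balancer; the private IPs and the forward_request helper are placeholders for illustration:

```python
from itertools import cycle

# Private IPs of the web servers behind the load balancer (placeholder values).
backend_ips = cycle(["10.0.0.1", "10.0.0.2", "10.0.0.3"])

def forward_request(server_ip: str, request: str) -> str:
    # Placeholder for the actual proxying logic (e.g. an HTTP call to the backend).
    return f"{server_ip} handled {request}"

def handle_request(request: str) -> str:
    """Round robin: each incoming request goes to the next server in the pool."""
    return forward_request(next(backend_ips), request)

for i in range(4):
    print(handle_request(f"GET /page/{i}"))
```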

Cache

Redis, Memcached

Considerations

  • Decide when to use
    • data is read frequently but modified infrequently
  • Expiration policy
    • should be neither too long nor too short (see the cache-aside sketch after this list)
  • Consistency
    • Keep DB and cache in sync
  • Mitigating failures
    • Avoid Single Point of Failure (SPOF) using multiple cache servers across different data centers
    • Overprovision the required memory by a certain percentage
  • Eviction Policy
    • once the cache is full, existing items must be removed to make room for new ones
    • Cache Eviction Policies
      • Least Recently Used (LRU)
        • most popular
      • Least Frequently Used (LFU)
      • First In First Out (FIFO)
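A minimal cache-aside sketch covering the read path, expiration, and DB/cache consistency points above, assuming a Redis cache accessed through the redis-py client; query_database, save_to_database, the key names, and the 300-second TTL are illustrative assumptions:

```python
import json

import redis  # assumes the redis-py client

cache = redis.Redis(host="localhost", port=6379)
CACHE_TTL_SECONDS = 300  # expiration: neither too long nor too short

def query_database(user_id: str) -> dict:
    # Placeholder for the real database read.
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:              # cache hit: serve from Redis
        return json.loads(cached)
    user = query_database(user_id)      # cache miss: read from the database
    cache.set(key, json.dumps(user), ex=CACHE_TTL_SECONDS)
    return user

def update_user(user: dict) -> None:
    # Keep DB and cache in sync: write to the database, then drop the cache entry.
    # save_to_database(user)            # placeholder for the real database write
    cache.delete(f"user:{user['id']}")
```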

Content Delivery Network (CDN)

  • network of geographically dispersed servers used to deliver static content

  • images, videos, CSS, JavaScript files, …

  • Dynamic content caching (HTML pages cached based on request path, query strings, cookies, and request headers) is also possible

  • client requests file from CDN

  • if not in CDN, CDN requests file from origin

  • origin returns file with optional Time-To-Live (TTL) header

  • file remains cached in CDN until TTL expires
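A minimal sketch of this pull flow, simulating a single CDN edge server with an in-memory dict; fetch_from_origin and the 60-second TTL stand in for the real origin request and its TTL header:

```python
import time

ORIGIN_TTL_SECONDS = 60   # stands in for the TTL header the origin returns
edge_cache = {}           # path -> (content, expires_at)

def fetch_from_origin(path: str) -> str:
    # Placeholder for the request the CDN makes to the origin server.
    return f"<contents of {path}>"

def serve_from_cdn(path: str) -> str:
    entry = edge_cache.get(path)
    if entry and entry[1] > time.time():      # cached and TTL not yet expired
        return entry[0]
    content = fetch_from_origin(path)         # miss or expired: fetch from origin
    edge_cache[path] = (content, time.time() + ORIGIN_TTL_SECONDS)
    return content

print(serve_from_cdn("/images/logo.png"))  # first request goes to the origin
print(serve_from_cdn("/images/logo.png"))  # served from the edge until the TTL expires
```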

Considerations

  • Cost
    • charged for data transfers in and out of the CDN
  • Cache Expiry time
    • should be neither too long nor too short
  • CDN fallback
    • clients should be able to detect CDN outage and request resources from origin
  • Invalidating files
    • Can remove a file before it expires
      • Using provided API
      • Use object versioning to serve a different version
        • image.png?v=2
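A small sketch of the object-versioning approach, where bumping a version number makes clients request a new URL instead of the stale cached copy (asset names, versions, and the CDN domain are made up):

```python
ASSET_VERSIONS = {"image.png": 2, "style.css": 7}  # bump a version to force a new URL

def asset_url(name: str) -> str:
    return f"https://cdn.example.com/{name}?v={ASSET_VERSIONS[name]}"

print(asset_url("image.png"))  # https://cdn.example.com/image.png?v=2
```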

Stateless web tier

  • to scale horizontally, we need to move state out of the web tier
  • for example, store user session data in a shared data store such as a relational database or Redis (see the sketch below)
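A minimal sketch of externalized session state, assuming sessions live in a shared Redis instance keyed by session ID so that any web server can serve any request (a relational or NoSQL store works the same way; host, key names, and TTL are illustrative):

```python
import json
import uuid
from typing import Optional

import redis  # shared session store reachable from every web server

sessions = redis.Redis(host="session-store.internal", port=6379)
SESSION_TTL_SECONDS = 3600

def create_session(user_id: str) -> str:
    session_id = str(uuid.uuid4())
    payload = json.dumps({"user_id": user_id})
    sessions.set(f"session:{session_id}", payload, ex=SESSION_TTL_SECONDS)
    return session_id  # handed back to the client, e.g. in a cookie

def load_session(session_id: str) -> Optional[dict]:
    data = sessions.get(f"session:{session_id}")
    return json.loads(data) if data else None  # works the same on any web server
```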

Stateful architecture

  • server remembers client data (state) from one request to the next
  • every request from the same client must be routed to the same server
  • can be done with sticky sessions in most load balancers but adds overhead
  • challenging to handle server failures

Data centers

  • users are geoDNS-routed, also known as geo-routed, to the closest data center
  • geoDNS is a DNS service that allows domain names to be resolved to IP addresses based on the location of a user

Challenges

  • Traffic redirection
    • GeoDNS can be used to direct traffic to the nearest data center depending on where a user is located
  • Data synchronization
    • Users from different regions could use different local databases or caches. In failover cases, traffic might be routed to a data center where data is unavailable.
  • Test and deployment
    • Automated deployment tools are vital to keep services consistent through all the data centers
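A minimal sketch of geo-routing with failover, assuming two hypothetical data centers and a stubbed health check; in practice this decision is made by the geoDNS provider rather than application code:

```python
# Hypothetical data centers and a "nearest first" preference per user country.
DATA_CENTERS = {"us-east": "us-east.example.com", "eu-west": "eu-west.example.com"}
PREFERENCE = {"US": ["us-east", "eu-west"], "DE": ["eu-west", "us-east"]}

def is_healthy(dc: str) -> bool:
    return True  # placeholder for a real health check

def route(user_country: str) -> str:
    for dc in PREFERENCE.get(user_country, ["us-east", "eu-west"]):
        if is_healthy(dc):            # failover: skip data centers that are down
            return DATA_CENTERS[dc]
    raise RuntimeError("no healthy data center available")

print(route("DE"))  # eu-west.example.com while the EU data center is healthy
```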

Message queue

  • durable component, stored in memory, that supports asynchronous communication
  • Basic architecture
    • Input services, called producers/publishers, create messages, and publish them to a message queue
    • Other services or servers, called consumers/subscribers, connect to the queue, and perform actions defined by the messages
  • Decoupling
    • producer can post a message to the queue when the consumer is unavailable to process it
    • consumer can read messages from the queue even when the producer is unavailable
    • producer and consumer can be scaled independently
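A minimal in-process sketch of this producer/consumer pattern, using Python's built-in queue.Queue in place of a real durable broker such as RabbitMQ or Kafka; the message contents are made up:

```python
import queue
import threading

message_queue = queue.Queue()  # stands in for a durable message broker

def producer() -> None:
    # The producer publishes messages and moves on; it does not wait for processing.
    for photo_id in range(5):
        message_queue.put({"job": "resize_photo", "photo_id": photo_id})

def consumer() -> None:
    while True:
        message = message_queue.get()   # blocks until a message is available
        print("processing", message)    # perform the action described by the message
        message_queue.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()
message_queue.join()  # wait until every published message has been processed
```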

Logging, metrics, automation

  • Logging
    • monitor error logs at the per-server level, or use tools to aggregate them into a centralized service for easy search and viewing
  • Metrics
    • Host level metrics: CPU, Memory, disk I/O, etc.
    • Aggregated level metrics: the performance of the entire database tier, cache tier, etc.
    • Key business metrics: daily active users, retention, revenue, etc.
  • Automation
    • continuous integration
    • improve dev productivity

Millions of users and beyond

  • Keep web tier stateless
  • Build redundancy at every tier
  • Cache data as much as you can
  • Support multiple data centers
  • Host static assets in CDN
  • Scale your data tier by sharding (see the sketch after this list)
  • Split tiers into individual services
  • Monitor your system and use automation tools
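A minimal sketch of hash-based sharding for the data tier, assuming four shards keyed by user_id; the shard names are placeholders:

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]  # placeholder hosts

def shard_for(user_id: str) -> str:
    """Hash the sharding key and map it onto one of the shards."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user-12345"))  # every server computes the same shard for this user
```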

A Framework for System Design Interviews

Step 1 - Understand the problem and establish design scope

3 - 10 minutes

Ask questions to understand the exact requirements.

  • What specific features are we going to build?
  • How many users does the product have?
  • How fast does the company anticipate scaling up? What are the anticipated scales in 3 months, 6 months, and a year?
  • What is the company’s technology stack? What existing services might you leverage to simplify the design?

Step 2 - Propose high-level design and get buy-in

10 - 15 minutes

  • Come up with an initial blueprint for the design
  • Draw box diagrams with key components
  • Do back-of-the-envelope calculations
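A small worked example of a back-of-the-envelope estimate; the inputs (10 million DAU, 10 requests per user per day, a 2x peak factor) are assumptions chosen for illustration:

```python
daily_active_users = 10_000_000      # assumed
requests_per_user_per_day = 10       # assumed
seconds_per_day = 24 * 3600          # 86,400

average_qps = daily_active_users * requests_per_user_per_day / seconds_per_day
peak_qps = 2 * average_qps           # assumed peak-to-average ratio

print(round(average_qps), round(peak_qps))  # ~1157 QPS average, ~2315 QPS peak
```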

Step 3 - Design deep dive

10 - 25 minutes

Step 4 - Wrap up

3 - 5 minutes

The interviewer might ask you a few follow-up questions or give you the freedom to discuss additional points.

Design a rate limiter

  • Step 1 - Understand the problem and establish design scope
    • Requirements
      • Accurately limit excessive requests
      • Low latency
      • Use as little memory as possible
      • Distributed rate limiting
      • Exception handling
      • High fault tolerance
  • Step 2 - Propose high-level design and get buy-in
    • Algorithms (a token bucket sketch follows this list)
      • Token bucket
      • Leaking bucket
      • Fixed window counter
      • Sliding window log
      • Sliding window counter
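A minimal token bucket sketch, assuming a fixed capacity and refill rate per client; the parameters are illustrative, and a distributed limiter would typically keep these counters in a shared store such as Redis:

```python
import time

class TokenBucket:
    """Allow a request if a token is available; tokens refill at a fixed rate."""

    def __init__(self, capacity: int = 10, refill_rate: float = 5.0):
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow_request(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on the time elapsed since the last check.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1              # consume one token for this request
            return True
        return False                      # bucket empty: rate limit the request

limiter = TokenBucket(capacity=4, refill_rate=2.0)
print([limiter.allow_request() for _ in range(6)])  # first 4 True, then False
```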