System Design Interview
System Design Interviews (SDIs) are unstructured, open-ended design problems usually centered around a large scale system.
1. Clarify Requirements
Ask questions about the exact scope of the problem we are solving. Clarify ambiguities, define end goals, and understand which parts of the system we're focusing on.
2. Define System API
Define what APIs are expected from the system.
3. Perform a Back-Of-The-Envelope Estimation
- What is the scale expected from the system (e.g. number of tweets, number of timeline generations per second, etc)?
- Will the system be read or write heavy?
- How much storage will we need?
- What network bandwidth usage are we expecting?
4. Define a Data Model
Identify the entities of the system, how they will interact, and different aspects of data management like storage, transportation, encryption, etc.
Which database system should we use? Relational, document, or graph?
5. Design at a High-Level
Draw a block diagram with 5-6 boxes representing the core components of our system that address the problem end-to-end.
6. Perform Detailed Design
Dig deeper into two or three components; interviewer's feedback should always guide us on what parts of the system need further discussion. Compare and contrast different approaches.
- Could we partition data into multiple databases?
- Can we optimize for heavily used data?
- How much and at which layer should we introduce caching?
- What components need better load balancing?
7. Identify and Resolve Bottlenecks
Try to discuss as many bottlenecks as possible and different approaches to mitigate them.
- Is there any single point of failure in our system?
- Do we have enough replicas (redundancy) of the data as that if we lose a few servers we can still serve our users?
- How are we monitoring the performance of our service?
Consider both functional and non-functional requirements.
- Use REST.
- Some APIs should take an API developer key.
- Mention pagination if necessary.
- Compare number of reads to number of writes.
- Estimate traffic in Queries Per Second (QPS).
- Estimate total storage from given data.
- Note that we might need to chunk data if it's too large to send.
- The average server can take 10K requests.
- Create a schema with entities and relationships.
- Choose a database type (relational, document, graph).
- Consider object storage.
- Consider a search index.
- Consider a message queuing service.
- If you need sequenced events on a distributed system, consider keeping a sequence number with every message for each client. This sequence number will determine the exact ordering of messages for EACH user.
- Consider load balancing, caching, partitioning, replication.
- If keys need to be used in many locations consider a key generation service.
- Go through scalability, reliability, and maintainability.
- Mention monitoring.
- Mention redundancy.
- Notification service.
- User service.
- Aggregation service.
- Key generation service.