System design interviews are a crucial piece of every software engineer's career trajectory. Success in the SDI is a huge piece of what paves the way to more senior and challenging roles. These interviews are also highly valued in the FAANG/MAANG interview process and play a key role in determining the candidate’s starting level.

In system design interviews, a candidate is asked to design a large-scale distributed system, requiring diversified knowledge from different domains such as computer networks, distributed systems and architecture, operating systems, etc.

Here are a few commonly asked design problems you might encounter in an interview:

Design a YouTube system
Design a Netflix system
Design an Amazon Prime system
Design a WhatsApp system
Design a Facebook Messenger system
Design an Uber system
Design a Lyft system
Design an Instagram system
Design a TikTok system
Design a Facebook Reels system
Design a Dropbox system
Design a Google Drive system

These are obviously different systems that require different solutions. However, you might be surprised to learn that this long list of questions can be broken down into just a few categories, or types of system design problems. For instance, YouTube, Netflix, and Amazon Prime are all video streaming systems with backend services that largely resemble each other. These systems require streaming servers to stream videos, encoding/transcoding servers to divide videos into different segments and resolutions, blob storage for storing thumbnails, etc.

In fact, the majority of system design problems you will encounter can be divided into five specific categories (detailed in the next section). This categorization can be a helpful tool for simplifying the system design interview prep process, allowing you to group similar problems and focus on specific design aspects and requirements, while applying concepts across different scenarios.

Here's the real secret: Having an understanding of each main category will prepare you for unexpected questions in the system design interview. That way when you encounter a new problem, you can quickly assess which category the problem falls into, and prepare your solution accordingly.

Tip: If you are new to the system design domain and want to understand the process of system design interviews, read A beginner’s guide to system design interviews at FAANG/MAANG.

If you are ready to take your learning to the next level, check out the following popular course:

Grokking Modern System Design Interview for Engineers & Managers

Now let's dive into the five categories of system design problems, and learn some best practices for approaching each.

The 5 categories of system design problems

The similarity of problems in each category is primarily based on their functional requirements; therefore, this blog presents the requirements and the proposed generalized high-level design for each category.

The five different categories are mentioned below:

Video streaming systems (e.g. YouTube, Netflix)
Real-time communication systems (e.g. WhatsApp, Facebook Messenger)
Ride-hailing systems (e.g. Uber, Lyft)
Feed-based social network systems (e.g. Instagram, TikTok)
Cloud-based collaborative editing systems (e.g. Dropbox, Google Drive)

Let’s start with our first category of video streaming systems below:

1) Video streaming systems

Video streaming systems include services such as YouTube, Netflix, and Amazon Prime Video. What distinguishes these systems is the need to deliver uninterrupted, high-quality video. Using adaptive bitrate streaming, they deliver the best possible experience to users based on their internet speed and device capabilities.
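To make adaptive bitrate streaming more concrete, here is a minimal sketch of how a player might choose a rendition from its measured bandwidth. The ladder values, the `pick_rendition` name, and the safety factor are all illustrative assumptions, not figures from any real service.

```python
# Hypothetical bitrate ladder: (label, bitrate needed in kbit/s).
# These numbers are assumptions chosen for illustration only.
LADDER = [
    ("240p", 400),
    ("480p", 1_000),
    ("720p", 2_500),
    ("1080p", 5_000),
]


def pick_rendition(measured_kbps: float, safety_factor: float = 0.8) -> str:
    """Return the highest rendition whose bitrate fits the usable bandwidth."""
    # Keep some headroom so a small dip in bandwidth does not stall playback.
    usable_kbps = measured_kbps * safety_factor
    chosen = LADDER[0][0]  # always fall back to the lowest rendition
    for label, required_kbps in LADDER:
        if required_kbps <= usable_kbps:
            chosen = label
    return chosen
```

A real player would re-run a decision like this for every segment it downloads, which is why the encoders produce each video in several resolutions.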

Let’s consider the following functional and nonfunctional requirements for a streaming service:

Functional requirements:

Stream a video: The system should stream videos to users upon their request.
Upload a video: The users should be able to upload a video. Some services don’t provide this functionality, such as Netflix; for such systems, we can skip this functionality.
Search for a video: The users should be able to search for a video.
Like or dislike a video: Users can like or dislike a video.
Provide comments on a video: If a system provides this functionality, users should be able to comment on videos.

Nonfunctional requirements:

High availability: The system should provide a good percentage of uptime, e.g., above 99 percent.
Low latency: The system should provide a smooth streaming experience.
Scalability: The system should be scalable enough to handle many users.
Reliability: Videos uploaded to the system should be stored persistently and not damaged.
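An availability target is easier to reason about when it is converted into allowed downtime. The short sketch below does that arithmetic, assuming a 365-day year; the function name is a hypothetical helper.

```python
def max_downtime_hours_per_year(availability_percent: float) -> float:
    """Convert an availability target into the yearly downtime it permits."""
    hours_per_year = 365 * 24  # assumes a non-leap year
    return hours_per_year * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {max_downtime_hours_per_year(target):.2f} hours/year")
```

Under this assumption, 99 percent availability still permits about 87.6 hours (roughly 3.65 days) of downtime per year, which is why interviewers often ask how many "nines" a design targets.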

Based on the above requirements, the following is the high-level design of a streaming service.

A high-level design of a video streaming service

In the high-level design of a video streaming service, clients upload videos via application servers. The application servers assign the uploaded videos to encoders, which compress and transcode videos into different formats and resolutions and store them in blob storage. The clients’ and videos’ metadata is stored in databases. The CDN serves video content from locations close to clients, providing a smooth and continuous streaming experience.

Note: You can further explore the detailed design of a video streaming system like YouTube on Educative.

2) Real-time communication systems

Real-time communication (RTC) systems enable users to communicate instantly or with minimal delay. Such systems include WhatsApp, Facebook Messenger, and Twitch chat, all of which provide bidirectional communication. The main design goal of an RTC system is fast, bidirectional, real-time message delivery. For this, let’s consider the following requirements:

Functional requirements:

Real-time communication: Users should be able to communicate instantly with a minimum possible delay.
Sharing of media content: Users should be able to share media files such as videos, images, and audio.
Chat storage: The system should provide a facility to persistently store messages. Some systems, such as WhatsApp, store messages only temporarily, until they are delivered to the intended users.
Notifications: The system should notify users if they have a new message available for them.

Nonfunctional requirements:

Low latency: The system should deliver messages to users with low latency.
Scalability: The system should scale to support a growing number of users.
Availability: The system should be highly available for users to experience minimum downtime.
Consistency: The messages should be delivered in the order they were sent, and users should see the same chat history on all their devices.
Security: The communication channel should be secure and the messages should be end-to-end encrypted.

Following is the high-level design of an RTC system based on the above requirements:

A high-level design of a real-time communication system

In a real-time communication system, senders and receivers are connected to chat servers. Chat servers deliver messages from sender to receiver via a messaging queue. Various protocols, such as WebSocket, WebRTC, and the Real-time Transport Protocol (RTP), can be used for real-time communication. Protocol managers establish the real-time connections between clients and chat servers; for instance, a WebSocket manager establishes WebSocket connections between users and different chat servers. The messages can also be persistently stored in the database.
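The delivery path described above can be sketched with an in-memory stand-in for the chat server and its messaging queue. This is only an illustration under simplifying assumptions: there is no network or WebSocket layer, and the `ChatServer` and `Message` names are hypothetical. It shows messages being queued per receiver, delivered in send order, and held until the receiver connects.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Message:
    seq: int  # server-assigned sequence number, used to preserve ordering
    sender: str
    receiver: str
    text: str


class ChatServer:
    """In-memory stand-in for a chat server plus its messaging queue."""

    def __init__(self):
        self._seq = count(1)
        self._online = set()
        self._pending = defaultdict(deque)  # receiver -> queued messages

    def connect(self, user):
        """Mark a user online and deliver everything queued for them."""
        self._online.add(user)
        return self._flush(user)

    def disconnect(self, user):
        self._online.discard(user)

    def send(self, sender, receiver, text):
        """Queue a message; deliver it right away if the receiver is online."""
        message = Message(next(self._seq), sender, receiver, text)
        self._pending[receiver].append(message)
        if receiver in self._online:
            return self._flush(receiver)
        return []  # stays queued until the receiver connects

    def _flush(self, user):
        delivered = []
        queue = self._pending[user]
        while queue:
            delivered.append(queue.popleft())  # FIFO keeps send order
        return delivered
```

In a production design, the queue and the connection state would live in separate, replicated services, but the ordering and store-until-delivered behavior would follow the same idea.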

Note: You can further explore the detailed design of a real-time communication system like WhatsApp on Educative.

3) Ride-hailing systems

Ride-hailing systems provide transportation services to passengers by connecting them with drivers via mobile applications. These systems allow users to book on-demand rides, similar to taxis. The key aspects of such systems are matching algorithms, dynamic pricing, cashless payment integration, and route optimization algorithms. Some of the well-known ride-hailing systems are Uber, Lyft, and Curb.

To design a ride-hailing system, let’s consider the following requirements:

Functional requirements:

Locate nearby drivers: The passenger should be able to see nearby available drivers.
Ride request: The users should be able to place a ride request.
Driver location live update: The system should update the driver’s location at regular intervals, and the passenger should see the driver’s most recent location.
Payment management: The system should manage and provide a receipt of trip charges to the passenger and driver.
Driver arrival time: The passenger should be able to see the driver’s estimated arrival time.
Pickup confirmation: Drivers should be able to confirm a ride, and the system should notify the passenger about the ride confirmation.
Trip updates: Once a ride is confirmed, the driver and passenger should see trip updates, such as the driver’s location and arrival time.
End the trip: The driver should be able to end the trip upon reaching the destination.

Nonfunctional requirements:

Scalability: The system should have the capability to support a large number of users (passengers and drivers) over time.
Availability: The system should be highly available to provide uninterrupted services to users.
Reliability: The system should ensure error-free services such as correct location updates, arrival time, travel time, etc.
Consistency: The system should be consistent in providing the same information and view to both drivers and riders.

Based on the above requirements, the following is the high-level design:

A high-level design of a ride-hailing system

The high-level design of a ride-hailing system consists of various services. Initially, the passengers place a ride request. The driver finder and tracker service finds a nearby driver. While the driver moves toward the passenger, the driver’s location is continuously updated by the location manager and shared with the passenger. The trip manager creates a trip and stores the relevant information about it in the database. The trip manager also provides the trip information to the payment service. The payment service is responsible for collecting payment from passengers and depositing it in the driver’s account. The QuadTree map service calculates the shortest route from origin to destination and computes the travel time. The WebSocket manager enables live location-based updates between passengers and drivers.
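A quadtree is a spatial index, so one natural use in this design is looking up drivers near a passenger. The sketch below covers only that nearby-driver lookup, not route calculation. It assumes flat x/y coordinates instead of latitude and longitude, and the `Box` and `QuadTree` classes, the capacity, and the depth limit are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

MAX_DEPTH = 8  # assumed limit so co-located points cannot split forever


@dataclass
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

    def intersects(self, other):
        return not (other.x_min > self.x_max or other.x_max < self.x_min
                    or other.y_min > self.y_max or other.y_max < self.y_min)


class QuadTree:
    def __init__(self, box, capacity=4, depth=0):
        self.box, self.capacity, self.depth = box, capacity, depth
        self.points = []    # (driver_id, x, y) tuples stored in this node
        self.children = []  # stays empty until the node splits

    def insert(self, driver_id, x, y):
        if not self.box.contains(x, y):
            return False
        if not self.children:
            if len(self.points) < self.capacity or self.depth >= MAX_DEPTH:
                self.points.append((driver_id, x, y))
                return True
            self._split()
        # any() stops at the first child that accepts the point.
        return any(c.insert(driver_id, x, y) for c in self.children)

    def _split(self):
        b = self.box
        mx, my = (b.x_min + b.x_max) / 2, (b.y_min + b.y_max) / 2
        quadrants = [Box(b.x_min, b.y_min, mx, my), Box(mx, b.y_min, b.x_max, my),
                     Box(b.x_min, my, mx, b.y_max), Box(mx, my, b.x_max, b.y_max)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1)
                         for q in quadrants]
        old_points, self.points = self.points, []
        for point in old_points:  # push existing points down one level
            any(c.insert(*point) for c in self.children)

    def query(self, area):
        """Return the ids of all drivers located inside the given area."""
        if not self.box.intersects(area):
            return []  # prune whole subtrees that cannot match
        found = [p[0] for p in self.points if area.contains(p[1], p[2])]
        for child in self.children:
            found.extend(child.query(area))
        return found
```

The benefit over scanning every driver is the pruning step in `query`: regions of the map that do not overlap the search area are skipped entirely.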

Note: You can further explore the detailed design of a ride-hailing system like Uber.

Question: You might have noticed that real-time communication and ride-hailing systems use WebSocket managers. What distinguishes their designs?

Answer: Both designs use WebSocket managers, but for different purposes. The ride-hailing system uses WebSocket connections to push live location updates between passengers and drivers. The real-time communication system, in contrast, carries a constant stream of messages, including text, images, audio, and videos, and must provide uninterrupted service. As an analogy, the WebSocket server acts like a walkie-talkie in the ride-hailing system, while in the real-time communication system it acts like a continuously open line for live video or audio calls.

4) Feed-based social networks

Feed-based social networks structure user experience around continuously generated content, including text, images, and videos, collectively called a feed. Some well-known feed-based social networks include Facebook, Instagram, and Twitter. These systems allow users to post text, photos, and videos. In turn, these systems use the posted content to generate feeds for each user’s friends and followers. The generated feeds depend on several factors, such as a user’s social connections, browsing and search history, and demographic information.

Let’s design a feed-based social network just like Instagram, using the following requirements:

Functional requirements:

Create a post: The users should be able to create a post that may include text, photos, and videos.
Delete a post: The users should be able to delete their posts.
Edit a post: The users should be able to edit their posts.
Share a post: The users should be able to share a public post.
Follow and unfollow users: The system should allow users to follow or unfollow others.
Search content: The system should enable users to search content based on the content’s captions.
View the system’s generated feed: The users should be able to see the feed that the system generates from the posts of their friends and the users they follow.
Like and dislike content: The users should also be able to like or dislike content from their friends and the users they follow.

Nonfunctional requirements:

Scalability: The system should be horizontally scalable to support many users, e.g., a million.
Availability: The system should be highly available to provide a good user experience.
Low latency: The system should be fast enough to generate a newsfeed seamlessly.
Durability: The content users upload should be stored permanently and never get lost.
Fault tolerance: The system should function correctly despite any errors or failures in any hardware or software components.

Based on the above requirements, let’s create a high-level design of a feed-based social system like Instagram.

A high-level design of a feed-based social network

The high-level design of a feed-based social network includes a post service, a feed generation service, a feed publishing service, and a feed ranking and recommendation engine. The post service handles clients’ posts and publishes each post on the client’s wall (page). The feed generation service generates feeds for the client’s friends and followers. It relies on the feed ranking and recommendation engine, which ranks and recommends the top N posts to followers based on their interests, searches, and watch history. The generated feed is stored in the database on demand, and the feed publishing service is responsible for publishing and showing the generated feeds to followers. Since the feed could contain videos, the CDN is responsible for delivering the videos to followers with low latency.
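The feed generation step can be illustrated with a small sketch that collects posts from followed accounts and keeps the top N by score. The scoring formula here (likes discounted by age) is a toy assumption standing in for a real ranking and recommendation engine, and the `Post`, `score`, and `generate_feed` names are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    author: str
    created_at: float  # seconds since the epoch
    likes: int


def score(post, now):
    """Toy ranking: more likes raise the score, age lowers it."""
    age_hours = max(now - post.created_at, 0.0) / 3600
    return post.likes / (1.0 + age_hours)


def generate_feed(user, follows, posts, now, top_n=10):
    """Return the ids of the top_n highest-scoring posts from followed authors."""
    followed = follows.get(user, set())
    candidates = [p for p in posts if p.author in followed]
    # nlargest avoids sorting the full candidate list when top_n is small.
    ranked = heapq.nlargest(top_n, candidates, key=lambda p: score(p, now))
    return [p.post_id for p in ranked]
```

A real system would typically precompute feeds for most users instead of ranking at read time, but the select-then-rank structure is the same.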

Note: You can further explore the detailed design of a feed-based social network like Instagram on Educative.

5) Cloud-based collaborative editing systems

A cloud-based collaborative editing system allows users to review and comment on a document while it’s being edited. The system keeps a history of previous versions, which users can restore anytime. It enables users to work from anywhere they can access the internet. A unique aspect of such a system is concurrency management while editing a document. Some well-known cloud-based collaborative editing systems include Google Drive, Dropbox, and Microsoft 365.

Let’s consider the following requirements for the system:

Functional requirements:

Upload documents: The users should be able to upload documents to the storage servers.
Edit documents: The users should be able to edit documents to which they have edit access.
Download documents: The system should allow users to download documents from the storage servers.
Share documents: The system should allow users to share documents with other users with different access rights.
Create and delete directories: Users should also be able to create and delete directories from the storage servers.
Documents synchronization: The system should synchronize documents across different devices and users.

Nonfunctional requirements:

Consistency: The changes made to a document should be visible to all collaborators, and every collaborator should see the same content.
Durability and reliability: The system should ensure that users never lose their data and should perform the intended function correctly.
Availability: Users should be able to access their data anytime, anywhere they want, provided they have a good internet connection.
Scalability: The system should support unlimited storage and a large number of users’ read and write requests.
Security: The system should prevent unauthorized access to the files apart from those with access rights.

A high-level design of the cloud-based collaborative editing systems

A high-level design of a cloud-based collaborative editing system consists of load balancers, application servers, chunk servers, cloud storage, metadata servers, messaging queues, and sync servers. The files are uploaded to or downloaded from the cloud storage via the chunk servers. The metadata associated with each user and file is handled by the metadata servers and is stored in the metadata database. Similarly, the sync servers are responsible for syncing files across multiple devices. These services notify users of changes made to files in the cloud storage or to the metadata.
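One reason such designs route files through chunk servers is that only the changed parts of a file need to be transferred during sync. The sketch below illustrates that idea with fixed-size chunks and SHA-256 hashes. The tiny chunk size and the function names are assumptions made for readability; real systems use far larger chunks and often content-defined boundaries.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Fingerprint each chunk so versions can be compared without the bytes."""
    return [hashlib.sha256(chunk).hexdigest()
            for chunk in split_into_chunks(data, chunk_size)]


def chunks_to_upload(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return indices of chunks in `new` that differ from `old` at that position."""
    old_hashes = chunk_hashes(old, chunk_size)
    new_hashes = chunk_hashes(new, chunk_size)
    return [i for i, digest in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != digest]
```

With this approach, editing one paragraph of a large document results in uploading a handful of chunks instead of the whole file.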

Note: You can further explore the detailed design of a cloud-based collaborative editing system like Google Docs.

Conclusion

This blog discusses the five generalized categories of commonly asked design problems in system design interviews. The categories are devised based on each system's requirements and common aspects.

To recap, the categories devised in this blog include:

Video streaming systems
Real-time communication systems
Ride-hailing systems
Feed-based social network systems
Cloud-based collaborative editing systems

These generalized categories help readers understand how different systems work and what makes each design problem unique. Understanding these 5 generic design problems and practicing at least 3 specific design problems from each category will enable you to understand dozens of the industry's most popular design problems. For example, if you understand the detailed design of how streaming systems work, you will find it much easier to understand YouTube, Netflix, Amazon Prime Video, and other streaming systems. Furthermore, these design problems enable you to brush up on system design concepts just a few hours before your interview.

What’s next?
The following courses on the Educative platform will help you expand your knowledge of system design concepts and prepare for a challenging interview:

Hack the System Design Interview: 5 Core Problem Types to Master
