CAP Theorem

Aniket Patel - Sep 3 - - Dev Community

Image description

It is impossible for a distributed data store to provide all three guarantees simultaneously.

1. Consistency (C): Every read receives the most recent write or an error.

  • This means that all working nodes in a distributed system will return the same data at any given time.
  • Consistency is crucial for applications where having the most up-to-date data is critical: Banking System

2. Availability (A): It guarantees that every request (read/write) receives a response, without ensuring that it contains the more recent writes.

  • This ensures that the system remains operational and responsive, even if responses from some of the nodes don't reflect the most up-to-date.
  • Availability is crucial for applications requiring continuous operation, such as online retail systems.

3. Partition Tolerance (P): It means that the system can continue to function even if some of the nodes in the network are unable to communicate with each other due to network partitions.

A network partition occurs when a network failure causes a distributed system to split into 2 or more groups of nodes that cannot communicate with each other.

  • When there is a network partition, the system must choose between Consistency and Availability.

The CAP Trade-off: Choosing 2 out of 3

1. CP (Consistency and Partition Tolerance): Traditional relational databases, such as MySQL and PostgreSQL, when configured for strong consistency, prioritize consistency over availability during network partitions.

  • When a system is being partitioned, it might reject certain requests to preserve consistency.
  • Example: Banking Systems typically prioritize consistency and accuracy of data over availability. Consider an ATM network for a bank. When you withdraw money, the system must ensure that your balance is updated accurately across all nodes (consistency) to prevent overdrafts or other errors.

2. AP (Availability and Partition Tolerance): NoSQL databases such as Cassandra and DynamoDB are designed to prioritize high availability and partition tolerance, even at the expense of strong consistency.

  • Example: Amazon's shopping cart is designed to always accept items which is like prioritizing the availability, even during the pick high traffic periods like Black Friday.

3. CA (Consistency and Availability): In the absence of network partitions, a system can be consistent and available. However, network partitions are inevitable in distributed systems, making this combination impractical.

  • Example: In the world of distributed systems, the quest for a database that offers both consistency and availability is a challenging one. Single-node databases excel in providing these qualities but falter when it comes to tolerating partitions. This creates a theoretical impasse in a distributed environment.
.
Terabox Video Player