Nowadays, most applications(even rudimentary ones) are distributed applications—they invoke external APIs and services. Hence, understanding and applying the CAP theorem to application design is crucial for building robust applications.
CAP theorem says that you have to choose either consistency or availability in the face of network failures; you cannot have both.
CAP theorem is the programmer’s version of you cannot have your cake and eat it too.
Let us take a hypothetical application that exposes an API to register a user. For storing the user data, the application uses two datastores—MongoDB and MySQL. The datastores and the application backend are on different servers. When the client invokes the user registration API, the backend makes a network call and stores the data in MySQL and MongoDB.
For the sake of this post, I will define application availability as the backend returning a success response. I will define application consistency as the user data being available in MySQL and MongoDB on a success API response from the backend.
The above definitions are not strictly in line with the definitions of availability and consistency, as defined by the CAP theorem. I am taking liberties to convey the idea.
Designing the application using the CAP theorem
The backend and the datastores are on different servers. Since network failures are a given, the datastores might become temporarily unavailable to the backend. How should the application behave during network glitches?
Let us work with a specific scenario.
The client invokes the API; the payload reaches the backend, and the backend is trying to store the data in MySQL and MongoDB. The backend successfully stores the data in MySQL, but the backend cannot access MongoDB due to a temporary network failure.
You can design your application in two ways to handle this scenario.
Design option one
The backend stores the data in MySQL and returns a success response to the client. A periodic reconciliation job runs that queries MySQL, compares the data with MongoDB, and inserts the missing records into MongoDB.
With the above design, your application favors availability over consistency. When there are network failures, the backend is available(does not throw an error), but the data is inconsistent(present in MySQL but not in MongoDB). With this design, the application is eventually consistent—once the reconciliation job runs, the application becomes consistent.
Design option two
The backend returns an error response to the client when there are network glitches(when backend cannot communicate with the datastores). You put the onus on the client to retry under such circumstances.
With this design, your application favors consistency over availability. The application does not return a success response when it cannot store the data in both MongoDB and MySQL. You do not need reconciliation jobs in this design.
Every application these days is a distributed application. Hence, you need to understand the CAP theorem and use the CAP theorem’s essence while designing applications and thinking about the tradeoffs.
PS—To simplify the explanation and make it accessible, I have washed down the CAP theorem and the application failure modes. I will write a follow-up post, expanding the idea, and make it rigorous.
Diagram created using the amazing https://www.diagrams.net/.