COSMOS DB is a globally distributed database that can be accessed from anywhere at any time and tailored to one’s own needs. When we use COSMOS DB for transactional workloads and the data is replicated across multiple regions, we should set specific rules for the consistency and availability of the data that we query. The right consistency and availability settings depend on the type of application. Let us first understand what these terms mean, then explore the options that COSMOS DB provides and how to choose among them.
To understand consistency and availability, we should first discuss data storage regions. When creating a COSMOS DB, we can choose multiple regions in which to store data, which reduces retrieval latency. If a person P reads from the primary region, the data will most likely be up to date. If P reads from a secondary region instead, recent updates may not yet have been replicated there. For such situations, we set rules that suit the application. One kind of rule returns data immediately after a query, irrespective of any transactions in the past few seconds, so the result may be slightly stale. Another returns only the most up-to-date data, at the cost of some delay in generating the results. The range of options between these two extremes is called consistency levels.
Now that we understand consistency and availability, let us see how and where to choose a consistency level in COSMOS DB.
Once you have created a COSMOS DB, open its settings and select the consistency level under the Default consistency option.
There are five distinct levels of consistency to choose from: Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual. Moving from Strong towards Eventual, the consistency of the data decreases and availability increases.
We choose Strong when all the data centres should reflect the same data at any point in time. In exchange, we agree to wait a little longer so that this can happen.
With Bounded Staleness, we tolerate stale data up to a certain amount of time or a certain number of operations, whichever limit is reached first. The screenshot below shows that the time or number of operations can be chosen when Bounded Staleness is selected.
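The "whichever is reached first" rule can be sketched in a few lines of Python. This is only an illustrative model, not COSMOS DB's implementation: the class name, field names, and the numeric limits are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StalenessBound:
    """Hypothetical model of a Bounded Staleness window (names are assumptions)."""
    max_lag_operations: int  # how many writes a read is allowed to lag behind
    max_lag_seconds: float   # how long a read is allowed to lag behind


def within_bound(bound: StalenessBound, writes_behind: int, seconds_behind: float) -> bool:
    """Return True while the replica's lag is inside the window.

    The window is exceeded as soon as EITHER limit is passed,
    which is what "whichever is reached first" means.
    """
    return (
        writes_behind <= bound.max_lag_operations
        and seconds_behind <= bound.max_lag_seconds
    )


# Illustrative limits only; pick values that suit the application.
bound = StalenessBound(max_lag_operations=100_000, max_lag_seconds=300)
fresh_enough = within_bound(bound, writes_behind=50, seconds_behind=10)
too_many_writes = within_bound(bound, writes_behind=100_001, seconds_behind=10)
too_old = within_bound(bound, writes_behind=50, seconds_behind=301)
```

In this sketch a replica that is 50 writes and 10 seconds behind is still acceptable, while passing either limit alone is enough to fall outside the window.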
At the Session level, consistency is scoped to a client session. Within a session, reads always see that session's own writes in the order they were made, whereas clients outside the session see the updates with consistent prefix or eventual consistency.
When we choose Consistent Prefix, it is still not guaranteed that all the data centres hold the latest data. Instead, the results returned always reflect the order in which the updates occurred.
For instance, if the database writes are performed in the order A, B, C, a query may return "A", or "A, B", or "A, B, C", but never an out-of-order or gapped result such as "B", "B, A", "C, A", or "A, C".
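The A, B, C example amounts to a simple prefix check, which the following Python sketch makes explicit. The function name is hypothetical and the code only models the guarantee; it does not call COSMOS DB.

```python
from typing import Sequence


def is_consistent_prefix(write_order: Sequence[str], read_result: Sequence[str]) -> bool:
    """Return True if read_result is a leading slice of write_order.

    A valid read may be missing the newest writes, but it must not
    reorder writes or skip one in the middle.
    """
    return list(read_result) == list(write_order[: len(read_result)])


writes = ["A", "B", "C"]

# Results that a Consistent Prefix read is allowed to return.
allowed = [["A"], ["A", "B"], ["A", "B", "C"]]

# Results that violate the guarantee (out of order, or with a gap).
forbidden = [["B"], ["B", "A"], ["C", "A"], ["A", "C"]]

allowed_ok = all(is_consistent_prefix(writes, r) for r in allowed)
forbidden_ok = not any(is_consistent_prefix(writes, r) for r in forbidden)
```

Both `allowed_ok` and `forbidden_ok` evaluate to `True`, matching the cases listed in the example above.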
Eventual is the exact opposite of the Strong level. We are not willing to wait even a few seconds after running a query. We need some data to be returned and are not concerned whether it is the latest.
It should be noted that we can manually switch to a weaker consistency level when querying data from the database. If Session is chosen as the default consistency level, it can be overridden to Consistent Prefix or Eventual.
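The override rule can be modelled as a comparison of positions in the Strong-to-Eventual ordering. The sketch below is an assumption-laden illustration: the list and function names are hypothetical, and it treats a request for the same level as the default as allowed.

```python
# Levels ordered from strongest to weakest, as described in this article.
# The spellings here are illustrative labels, not SDK constants.
LEVELS = ["Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"]


def can_override(default_level: str, requested_level: str) -> bool:
    """Return True if a request may use requested_level under default_level.

    A request may keep the default or relax it to a weaker level,
    but it may not ask for something stronger than the default.
    """
    return LEVELS.index(requested_level) >= LEVELS.index(default_level)


session_to_prefix = can_override("Session", "ConsistentPrefix")
session_to_eventual = can_override("Session", "Eventual")
session_to_strong = can_override("Session", "Strong")
```

With Session as the default, the first two checks succeed and the request for Strong is rejected, which mirrors the rule stated above.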