Most non-relational systems maintain multiple copies of the data for availability and scalability. These databases can impose different guarantees on the consistency of the data across copies. Non-relational databases tend to be categorized as either consistent or eventually consistent.
With a consistent system, writes by the application are immediately visible in subsequent queries. With an eventually consistent system, writes may not be immediately visible. As an example, when reflecting inventory levels for products in a product catalog, with a consistent system each query will see the current inventory as inventory levels are updated by the application, whereas with an eventually consistent system the inventory levels may not be accurate for a query at a given time, but will eventually become accurate. For this reason, application code tends to be somewhat different for eventually consistent systems: rather than updating the inventory by taking the current inventory and subtracting one, for example, developers are encouraged to issue idempotent operations that explicitly set the inventory level.
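The difference between the two update styles can be seen in a small sketch. It assumes a toy model in which the same write may be delivered more than once (for example, a client retry after a timeout); the function names and the `widget` product are hypothetical and do not correspond to any particular database API.

```python
# Toy model: a write may be delivered more than once, e.g. a client retry
# after a timeout. All names here are illustrative assumptions.

def apply_decrement(inventory, sku):
    """Non-idempotent: every delivery changes the result."""
    inventory[sku] -= 1


def apply_set(inventory, sku, level):
    """Idempotent: repeated delivery leaves the same final value."""
    inventory[sku] = level


def deliver(operation, times):
    """Apply the same operation `times` times, as a retry would."""
    for _ in range(times):
        operation()


decremented = {"widget": 10}
deliver(lambda: apply_decrement(decremented, "widget"), times=2)

assigned = {"widget": 10}
deliver(lambda: apply_set(assigned, "widget", 9), times=2)

print(decremented["widget"])  # 8: the duplicate delivery removed an extra unit
print(assigned["widget"])     # 9: the duplicate delivery had no further effect
```

Setting the level explicitly keeps the final value stable under duplicate delivery, which is why it is the safer pattern when writes can be replayed.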
Each application has different requirements for data consistency. For many applications, it is imperative that the data be consistent at all times. As development teams have worked under a model of consistency with relational databases for decades, this approach is more natural and familiar. In other cases, eventual consistency is an acceptable trade-off for the flexibility it allows in the system’s availability.
Document databases and graph databases can be consistent or eventually consistent. MongoDB provides tunable consistency. By default, data is consistent: all writes and reads access the primary copy of the data. As an option, read queries can be issued against secondary copies, where data may be eventually consistent if the write operation has not yet been synchronized with the secondary copy; the consistency choice is made at the query level.
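A per-query consistency choice of this kind can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyReplicaSet`, its methods, and the `from_secondary` flag are hypothetical names invented for illustration and are not MongoDB's driver API.

```python
class ToyReplicaSet:
    """Hypothetical in-memory model of one primary and one secondary copy."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        # All writes go to the primary copy.
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Copy outstanding writes to the secondary, in order.
        for key, value in self._pending:
            self.secondary[key] = value
        self._pending.clear()

    def read(self, key, from_secondary=False):
        # The consistency choice is made per query.
        source = self.secondary if from_secondary else self.primary
        return source.get(key)


rs = ToyReplicaSet()
rs.write("widget", 10)
rs.replicate()
rs.write("widget", 9)  # not yet synchronized with the secondary

fresh = rs.read("widget")                           # primary read
stale = rs.read("widget", from_secondary=True)      # secondary read before sync
rs.replicate()
caught_up = rs.read("widget", from_secondary=True)  # secondary read after sync

print(fresh, stale, caught_up)  # 9 10 9
```

The primary read always sees the latest write, while the secondary read returns the older value until replication catches up.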
Eventually Consistent Systems
With eventually consistent systems, there is a period of time in which not all copies of the data are synchronized. This may be acceptable for read-only applications and data stores that do not change often, like historical archives. By the same token, it may also be appropriate for write-intensive use cases in which the database is capturing information like logs, which will only be read at a later point in time. Key-value and wide column stores are typically eventually consistent.
Eventually consistent systems must be able to accommodate conflicting updates to individual records. Because writes can be applied to any copy of the data, it is possible, and not uncommon, for writes to conflict with one another. Some systems like Riak use vector clocks to determine the ordering of events and to ensure that the most recent operation wins in the case of a conflict. Other systems like CouchDB retain all conflicting values and push the responsibility for resolving the conflict back to the user. Another approach, followed by Cassandra, is simply to assume the latest value is the correct one. For these reasons, inserts tend to perform well in eventually consistent systems, but updates and deletes can involve trade-offs that complicate the application significantly.
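The three conflict-handling strategies above can be sketched in a few lines. This is a minimal illustration, not how Riak, CouchDB, or Cassandra implement them: the function names are hypothetical, vector clocks are modeled as plain dictionaries mapping a node name to a counter, and versions are modeled as `(timestamp, value)` pairs.

```python
def compare(a, b):
    """Compare two vector clocks (dicts of node -> counter).

    Returns "before", "after", "equal", or "concurrent".
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither clock dominates: a genuine conflict


def last_write_wins(versions):
    """Pick the value from the (timestamp, value) pair with the latest timestamp."""
    return max(versions, key=lambda v: v[0])[1]


def keep_siblings(versions):
    """Retain every conflicting value so the caller can resolve the conflict."""
    return sorted(value for _, value in versions)


ordered = compare({"n1": 1}, {"n1": 2})
conflict = compare({"n1": 2, "n2": 0}, {"n1": 1, "n2": 1})

versions = [(100, 7), (105, 9)]  # two writes to the same record
winner = last_write_wins(versions)
siblings = keep_siblings(versions)

print(ordered, conflict, winner, siblings)  # before concurrent 9 [7, 9]
```

Vector clocks can order two updates only when one causally follows the other; when the result is "concurrent", the system still needs a policy such as keeping both values or picking the latest timestamp.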
- Most applications and development teams expect consistent systems.
- Different consistency models pose different trade-offs for applications in the areas of consistency and availability.
- MongoDB provides tunable consistency, defined at the query level.
- Eventually consistent systems provide some advantages for inserts at the cost of making reads, updates and deletes more complex, while incurring performance overhead through read repairs and compactions.