Top 5 Considerations When Evaluating NoSQL Databases: #1 the Data Model

nosql-namesData Model

The primary way in which non-relational databases differ from relational databases is the data model. Although there are dozens of non-relational databases, they primarily fall into one of the following three categories:

Document Model

Whereas relational databases store data in rows and columns, document databases store data in documents. These documents typically use a structure that is like JSON (JavaScript Object Notation), a format popular among developers. Documents provide an intuitive and natural way to model data that is closely aligned with object-oriented programming – each document is effectively an object. Documents contain one or more fields, where each field contains a typed value, such as a string, date, binary, or array. Rather than spreading out a record across multiple  columns and tables connected with foreign keys, each record and its associated (i.e., related) data are typically stored together in a single document. This simplifies data access and, in many cases, eliminates the need for expensive JOIN operations and complex, multi-record  transactions. In a document database, the notion of a schema is dynamic: each document can contain different fields. This flexibility can be particularly helpful for modeling unstructured and polymorphic data. It also makes it easier to evolve an application during development, such as adding new fields. Additionally, document databases generally provide the query robustness that developers have come to expect from relational databases. In particular, data can be queried based on any combination of fields in a document.

Applications: Document databases are general purpose, useful for a wide variety of applications due to the flexibility of the data model, the ability to query on any field and the natural mapping of the document data model to objects in modern programming languages.

Examples: MongoDB and CouchDB.

Graph Model

Graph databases use graph structures with nodes, edges and properties to represent data. In essence, data is modeled as a network of relationships between specific elements. While the graph model may be counter-intuitive and takes some time to understand, it can be useful for a specific class of queries. Its main appeal is that it makes it easier to model and navigate relationships between entities in an application.

Applications: Graph databases are useful in cases where traversing relationships are core to the application, like navigating social network connections, network topologies or supply chains.

Examples: Neo4j and Giraph.

Key-Value and Wide Column Models

From a data model perspective, key-value stores are the most basic type of non-relational database. Every item in the database is stored as an attribute name, or key, together with its value. The value, however, is entirely opaque to the system; data can only be queried by the key. This  model can be useful for representing polymorphic and unstructured data, as the database does not enforce a set schema across key-value pairs. Wide column stores, or column family stores, use a sparse, distributed multi-dimensional sorted map to store data. Each record can vary in  the number of columns that are stored. Columns can be grouped together for access in column families, or columns can be spread across multiple column families. Data is retrieved by primary key per column family.

Applications: Key value stores and wide column stores are useful for a narrow set of applications that only query data by a single key value. The appeal of these systems is their performance and scalability, which can be highly optimized due to the simplicity of the data access patterns and opacity of the data itself.

Examples: Riak and Redis (Key-Value); HBase and Cassandra (Wide Column).


TAKEAWAYS:

  • All of these data models provide schema flexibility.
  • The key-value and wide-column data model is opaque in the system – only the primary key can be queried.
  • The document data model has the broadest applicability.
  • The document data model is the most natural and most productive because it maps directly to objects in modern object-oriented languages.
  • The wide column model provides more granular access to data than the key value model, but less flexibility than the document data model.