Couchbase NoSQL Database gets the SQL Religion

Couchbase 4.0 elegantly integrates SQL and JSON, maintaining schema-independence while adding support for joins and a range of BI tools.
Written by Andrew Brust, Contributor

What does it mean for a NoSQL database to offer a SQL query language? Ask the folks at Couchbase, because they announced just such a facility at their Couchbase Connect event yesterday in Santa Clara, CA. The company that created a scale-out key-value store NoSQL database, and later added JSON document store capabilities, has now added support for a dialect of SQL.

The new query language, dubbed "N1QL" (and pronounced "nickel") is part of Couchbase 4.0, which entered Beta yesterday. N1QL, designed specifically for JSON-based document data, does just what you would think: it allows users and, perhaps more important, developers, to query Couchbase "buckets" (tables) using SQL, and it extends the SQL syntax to negotiate the nested nature of the JSON document data those buckets contain.

Joins and structure
Not only does a parent.child.element syntax exist to address child data, but the UNNEST keyword allows that data to be flattened out. N1QL also allows joins, both between buckets and even between parent and child data in the same document. So the introduction of N1QL goes well beyond an alternative syntax; it enables familiar relational concepts and features as well.

N1QL's versatility doesn't end with its accommodations for nested data. It is also fully tolerant of unstructured data. So, for example, if you select a set of items/columns from several documents in a bucket and some of those items do not exist in some of those documents, they simply won't be listed. If a certain document matches the criteria in your WHERE clause but contains none of the items in the SELECT clause, it will simply be represented as an empty document ("{}").

If you'd like to give N1QL a spin without installing Couchbase, the company has created a combination tutorial/sample database/interactive query tool that you can use to get hands-on very quickly.

ORMs be gone
As I alluded to already, the addition of N1QL enables developers to use Couchbase in their applications. They get to use the familiar SQL syntax, and yet the data is returned to them as JavaScript objects (that's what the "JSO" in "JSON" stands for, after all).

So, for developers, "there is no longer an impedance mismatch and no need for a complex translation layer" (to quote Couchbase's press release) since they don't need to convert table structures to programming objects. An entire class of programming tools, called ORM (object/relational mapping) frameworks, exists to address this so-called impedance mismatch and Couchbase is clearly speaking to ORM-using developers.

BI, come hither
N1QL also opens up Couchbase as a queryable data source, compatible with a range of business intelligence (BI) tools. Couchbase's announcement of N1QL involved partnerships and well wishes from the likes of Informatica, Simba (which is supplying official ODBC and JDBC drivers for N1QL/Couchbase) and Tableau.

And in so doing, N1QL is availing Couchbase of the same opportunity that Apache Drill has enabled for the data sources it supports, which include files in HDFS, tables in HBase and documents in MongoDB. And don't forget another major NoSQL database, Apache Cassandra, has its own "CQL" (Cassandra Query Language) SQL dialect.

Distinction without difference?
All of this underscores an important point: the database industry is somewhat artificially siloed, and the siloes will continue to dissipate as categories converge. Couchbase, Cassandra and MongoDB - via Apache Drill - all support SQL dialects for query services. Most relational databases (including the next version of Microsoft SQL Server) offer support for JSON as a data type.

That doesn't mean that NoSQL and relational databases are now the same; they most certainly are not. But the hard distinctions are softening. Similar things are happening between Hadoop and Data Warehouse platforms. Witness technology like Cloudera's Impala, Pivotal's HAWQ and HP's Vertica for SQL on Hadoop, which are essentially MPP data warehouse engines running over the Hadoop Distributed File System.

Separate categories have their place, because they force innovation. Eventually, technologies that have a mutual affinity coalesce. Will it be long before we see a NoSQL database that supports schema-based tables alongside unstructured ones? I hope not.

Editorial standards