Because it began life as a developer-friendly database, MongoDB has always been guilty until proven innocent in the eyes of enterprise IT. With a successful IPO behind it, MongoDB is getting harder for enterprise IT to ignore. The latest release, MongoDB 3.6, is designed to narrow the gap further.
Today, MongoDB is announcing the makeup of the release, which will be out sometime in December. There is little surprise at the general tenor of the release, which was outlined at the company's annual user conference last June and has been in preview for months. We briefly spoke about features such as a revamped BI connector, stronger JSON document validation, and the closing of an embarrassing back door that left MongoDB instances open to the cold, cruel Internet.
But now we can fill in the details.
MongoDB has used the term "speed" to characterize its new release, as in "speed to develop," "speed to scale," and "speed to insight." By that, it means that richer features for developers and administrators, and an enhanced BI connector for analysts, will help each group get its job done faster. Admittedly, that metaphor may be a bit forced, as it could describe any new productivity enhancement. A more apropos theme is that the new release is another step in filling the gaps for features that would be expected in any enterprise database, like automatic retries for failed writes.
At the business user level, the BI Connector makes queries more efficient by pushing more of the work, such as joins, from the connector tier down into the database. Before this, you could only run left outer joins inside the database; in the new release, the connector takes advantage of enhancements in the aggregation pipeline to perform a wider variety of join operations. And the new release takes the next step in supporting data scientists: MongoDB already had a Python driver, and it now covers R as well.
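To give a feel for what an in-database join looks like, here is a minimal sketch of an aggregation pipeline using the more expressive form of `$lookup` that arrived in 3.6. We do not know what pipelines the BI Connector actually generates; the collection and field names (`customers`, `customer_id`, `total`) and the helper `build_join_pipeline` are hypothetical, chosen only for illustration.

```python
def build_join_pipeline(min_total):
    """Return an aggregation pipeline that joins orders to customers
    inside the database (hypothetical collections and fields)."""
    return [
        # Filter orders first so the join touches fewer documents.
        {"$match": {"total": {"$gte": min_total}}},
        {
            "$lookup": {
                "from": "customers",
                # Expose the order's customer_id to the sub-pipeline.
                "let": {"cust_id": "$customer_id"},
                "pipeline": [
                    {"$match": {"$expr": {"$eq": ["$_id", "$$cust_id"]}}},
                    {"$project": {"_id": 0, "name": 1, "region": 1}},
                ],
                "as": "customer",
            }
        },
        # $lookup on its own behaves like a left outer join; unwinding
        # drops orders with no matching customer, approximating an
        # inner join.
        {"$unwind": "$customer"},
    ]
```

With PyMongo, a pipeline like this would be passed to `db.orders.aggregate(build_join_pipeline(100))`, so the join runs in the database rather than in the client tier.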
While the new business user-oriented features might draw the spotlight, we find the developer and administrator features to be more significant in this release.
On the developer end, there is a new change-data-capture (CDC)-like feature called Change Streams that captures and streams changes recorded in MongoDB's operations log (the oplog). Previously, developers had to write their own code to tail that log for real-time updates. Now, those changes are available through an API that could stream real-time updates to gaming applications, dashboards, or IoT applications. It could also enable retraining of machine learning models. While Change Streams could feed a message queuing engine, this release does not (yet) support integration with Kafka (although that could be hand-coded).
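As a rough illustration of the developer experience, the sketch below consumes a change stream through PyMongo's `Collection.watch()` (available from PyMongo 3.6) and forwards newly inserted documents to a callback. The helper name `stream_inserts`, the `handle` callback, and the `limit` parameter are our own assumptions for the example, and the deployment is assumed to be a replica set, which change streams require.

```python
def stream_inserts(collection, handle, limit=None):
    """Forward newly inserted documents from a change stream to `handle`.

    `collection` is assumed to be a PyMongo Collection on a replica set.
    `limit` is a hypothetical knob that stops after that many events.
    """
    # Ask the server to send only insert events.
    pipeline = [{"$match": {"operationType": "insert"}}]
    seen = 0
    # watch() returns a ChangeStream, which is iterable and can be
    # used as a context manager so it is closed on exit.
    with collection.watch(pipeline) as stream:
        for change in stream:
            handle(change["fullDocument"])
            seen += 1
            if limit is not None and seen >= limit:
                break
    return seen
```

A dashboard or a hand-coded Kafka bridge would supply its own `handle`, for example a function that publishes each document to a topic.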
The 3.6 release adds another feature that is expected of an enterprise-grade database: automatic retries of failed writes. The new Retryable Writes feature eliminates the need for DBAs or application developers to write code to redo failed writes. When used in conjunction with the self-healing functions supported by MongoDB's replica set feature, this could provide near-always-on support for write operations. This does not mean that MongoDB has become an ACID database, but it will make the database more reliable. A related feature, causal consistency, ensures that users can read their own writes; until now, users couldn't count on that given MongoDB's distributed architecture.
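In practice, both features are switched on from the client side. The sketch below, which assumes PyMongo 3.6 or later, builds a connection string with the `retryWrites=true` option and then performs a write followed by a read inside a causally consistent session. The host names, the replica set name, the `app.profiles` namespace, and both helper functions are hypothetical.

```python
from urllib.parse import urlencode


def build_uri(hosts, replica_set):
    """Build a connection string that opts in to retryable writes."""
    options = urlencode(
        {"replicaSet": replica_set, "retryWrites": "true", "w": "majority"}
    )
    return "mongodb://" + ",".join(hosts) + "/?" + options


def write_then_read(client, doc):
    """Insert a document and read it back in one causally consistent
    session. `client` is assumed to be a pymongo.MongoClient."""
    coll = client["app"]["profiles"]  # hypothetical namespace
    with client.start_session(causal_consistency=True) as session:
        # Passing the same session to both operations is what lets the
        # read observe the preceding write.
        coll.insert_one(doc, session=session)
        return coll.find_one({"_id": doc["_id"]}, session=session)
```

A client would then be created with something like `MongoClient(build_uri(["db1:27017", "db2:27017"], "rs0"))`; the driver, rather than application code, handles the retry when a write fails.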
There's another interesting feature that extends data validation -- the assurance that each record has a consistent structure. A capability enshrined in traditional SQL databases, it has never been a strong point for JSON-based document data stores. Ironically, it's not that JSON is unstructured data -- it's quite the opposite. If anything, the structure of JSON documents is more complex than that of SQL tables. But for the use cases associated with JSON, like IoT data and user profiles, consistent structure has never been much in demand.
As far back as MongoDB 3.2, you could validate documents within a collection, using rules written in MongoDB's own query syntax. The new 3.6 version adopts the IETF draft JSON Schema specification to describe and enforce the structure of the documents in a collection, and to make the controls tunable by use case. By comparison, Couchbase, a rival NoSQL document database, can infer the structure of documents in a bucket (its equivalent of a MongoDB collection) and output the results in a JSON schema format. We don't expect that all JSON database implementations are going to become strictly structured. But as some get deployed for more enterprise-critical use cases that may require some degree of audit for completeness (e.g., electronic health records), such features could find some MongoDB implementations treated as systems of record.
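To make the validation feature concrete, here is a minimal sketch of a `$jsonSchema` validator attached when a collection is created, with the strictness controlled by the `validationLevel` and `validationAction` options. The user-profile fields, the `profiles` collection name, and the two helper functions are hypothetical; `db` is assumed to be a PyMongo `Database` connected to a 3.6 server.

```python
def profile_validator():
    """Return a $jsonSchema validator for a hypothetical user profile."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["email", "created_at"],
            "properties": {
                "email": {"bsonType": "string", "pattern": "^.+@.+$"},
                "created_at": {"bsonType": "date"},
                "age": {"bsonType": "int", "minimum": 0},
            },
        }
    }


def create_validated_collection(db, name="profiles", strict=True):
    """Create a collection whose writes are checked against the schema.

    strict=True rejects any non-conforming insert or update; strict=False
    only logs a warning and skips updates to already-invalid documents.
    """
    return db.create_collection(
        name,
        validator=profile_validator(),
        validationLevel="strict" if strict else "moderate",
        validationAction="error" if strict else "warn",
    )
```

A system of record such as a health records store would presumably run with the strict settings, while a collection of loosely structured IoT readings might start in the warn-only mode.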
For DBAs and admins, 3.6 provides some new goodies. Ops Manager, the management pane for MongoDB, has borrowed some features already developed for Atlas (MongoDB's managed cloud service) and Compass, the visual DBA tool. There is the Data Explorer from Atlas that exposes the schema of the database. A new real-time performance advisor flags bottlenecks and makes indexing recommendations (and with a click, those new indexes can be autogenerated). Backups can now be queried, which will be useful if a replica has lost some data: rather than restoring all of the backups, you can query by point in time and then restore selectively.
A bigger question is whether this cross-fertilization of features will be the first step towards unifying MongoDB's admin and configuration tools for on-premises and cloud deployments, where the back-end engine is the same but different skins are exposed to DBAs, developers, and operations admins. Having a common administrative and management experience across on-premises and cloud deployments is potentially a key competitive differentiator for databases born in the data center (like MongoDB) vs. cloud-native offerings like DynamoDB, Cosmos DB, and Google Cloud Datastore.
As a whole, the enhancements in 3.6 are critical for MongoDB to up its enterprise game. What's interesting, however, is that, compared to other databases that have branched out to add features such as SQL query support, MongoDB has stuck to its roots. You may see a more robust BI connector, but you're never going to mistake MongoDB for Redshift or SQL Server.