MongoDB 3.0 gets ready to roll with WiredTiger engine onboard

With general availability for version 3.0 looming, MongoDB's CTO explains why he thinks it will be so significant for businesses.
Written by Toby Wolpe, Contributor

After seven release candidates so far, MongoDB 3.0 is now slated for general availability next month, sporting its new WiredTiger storage engine and Ops Manager toolset.

The open-source NoSQL document database has jumped from version 2.8 to 3.0 because MongoDB says the new release represents more than an incremental change and contains many of the features, including faster storage and better compression, originally planned for 3.0.

"3.0 really is in some ways a whole new Mongo. It has a new storage engine and Ops Manager, and those two things change a huge number of the aspects of how people interact, what kinds of applications you can use with Mongo and how it fits into enterprises," MongoDB co-founder and CTO Eliot Horowitz said.

According to MongoDB, WiredTiger, which it acquired in January, is seven to 10 times faster than the old engine for write-intensive applications, with an industry-standard 80 percent compression rate and improved memory-management features.

"It's a much more advanced storage engine that really changes the kinds of applications that use MongoDB. In the most write-intensive applications, the most mission-critical situations, you'll have the performance and the predictability," Horowitz said.

The company lists possible new classes of applications in internet of things, time-series data analysis, messaging and fraud detection.

"It's anything where you've got clients that are sensitive to time. In finance there are plenty of use cases where time matters and you don't want latency, or for telcos where you're dealing with phone calls and you need to be able to make changes in real time because they route things in real time," he said.

"Frankly, even with any website, if you've got to do a bunch of database queries to serve a page to the user, you don't want users to get a three-second pause once in a while because of some background operation. You want it to be very consistent, very predictable, very smooth with no glitches."

Ops Manager is the on-premises version of the MongoDB Management Service cloud automation tools launched in October.

"With a distributed database, you're not deploying a few servers; you're deploying hundreds and thousands of MongoDB servers. You really need the operational tooling to help you scale that out easily, to be able to manage very effectively at very large scale," Horowitz said.

"With Ops Manager you also have a one-stop shop for deploying, monitoring and backing up your systems, which will take care of everything for you.

"People who've been using it say doing a rolling upgrade of a big cluster is 20 times faster, from having to write scripts, manually log into hundreds of servers, to clicking a few buttons and drinking your coffee while you watch things go."

The WiredTiger storage engine, from the architects who originally developed the Berkeley DB open-source library, now owned by Oracle, offers document-level locking, a feature previously absent from MongoDB.

In addition to compression and document-level locking, WiredTiger also gives MongoDB multi-version concurrency control (MVCC), multi-document transactions, and support for log-structured merge-trees, or LSM trees, for very high insert workloads.

"It's not the default engine just because we're trying not to shock people too much because it's brand new. But it is fully supported and it's going to be recommended for a lot of use cases, if not almost all of them," Horowitz said.

"When you start it up, you're asked which storage engine you want. If you do it through Ops Manager, you'll have a dropdown for which storage engine you want. If you do it from the command line, you'll have a config file, which you pick.

"Both storage engines will be in the same binaries, the same builds. You will download one build, it will have both. You can mix and match or you can have a deployment where some nodes are one and some nodes are the other to test out and explore it as you're migrating."
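As a rough illustration of the command-line route Horowitz describes, the sketch below assembles a `mongod` invocation that selects the engine. The `--storageEngine` and `--dbpath` options exist in MongoDB 3.0, with `wiredTiger` and `mmapv1` as the engine names. The helper name `mongod_command` and the data path are made up for this example, and nothing is actually launched.

```python
# Engine names accepted by mongod 3.0's --storageEngine option.
KNOWN_ENGINES = {"mmapv1", "wiredTiger"}


def mongod_command(dbpath, engine="mmapv1"):
    """Return the argv list for starting mongod with the chosen engine.

    Hypothetical helper for illustration; it only builds the command.
    """
    if engine not in KNOWN_ENGINES:
        raise ValueError("unknown storage engine: %s" % engine)
    return ["mongod", "--dbpath", dbpath, "--storageEngine", engine]


# The data directory below is a placeholder path.
print(" ".join(mongod_command("/data/wt-node", engine="wiredTiger")))
```

The same choice can be made in a YAML config file through the `storage.engine` setting, which is presumably the config-file option Horowitz refers to.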

Horowitz said doing a rolling migration from one engine to the other is relatively easy using replica sets.

"Let's say you have a three-node replica set. What you can do is add a fourth node. That fourth node is a WiredTiger node. Now when that one is ready, you take out one of the other ones. You go to three temporarily, then you bring up another WiredTiger node until you have three WiredTiger nodes and then you're done," he said.

"You can leave it in the middle state for a while. Let's say you want to bring up a WiredTiger node and leave it for a month just to test it out and see how it works or you can bring one up and leave your other ones on the old storage engine indefinitely, just so you can play with it. When you're comfortable you can get rid of your old ones."
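The add-one, remove-one sequence Horowitz outlines can be modelled in a few lines. This is a toy simulation only: it tracks which engine each member runs and ignores data sync, elections and real MongoDB commands. The function name `rolling_migration` is invented for the sketch.

```python
def rolling_migration(members, new_engine="wiredTiger"):
    """Yield the member list after each add or remove step.

    `members` is a list of engine names, one per replica-set node.
    Toy model: it tracks set membership only, not data sync.
    """
    members = list(members)
    while any(engine != new_engine for engine in members):
        # Add a node on the new engine first...
        members.append(new_engine)
        yield list(members)
        # ...then retire one node still on the old engine.
        old = next(e for e in members if e != new_engine)
        members.remove(old)
        yield list(members)


start = ["mmapv1", "mmapv1", "mmapv1"]
for step in rolling_migration(start):
    print(len(step), step)
```

Because a new node is always added before an old one is removed, the set in this model never drops below its original three members, and the loop can be paused at any intermediate mixed state.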

Many users remain satisfied with the old storage engine, which has also been improved in version 3.0, he added.

The new release also increases the maximum number of nodes in a replica set, and is designed to make replica-set elections for high availability faster.

"You've always been able to have a very large number of replica sets, and we've increased the number of nodes in a replica set from 12 to 50 for people who want a lot of copies of data and with a very high replication factor for distribution," Horowitz said.

MongoDB 3.0 also features security improvements, with more advanced password encryption and the ability to audit all CRUD operations, extending auditing to everything in the system.
