With its new release, MariaDB is bringing its separate analytic and transaction offerings into a single platform. It’s a good first move that could benefit in the future from more intelligent automation.
MariaDB is announcing general availability of a new offering, MariaDB Platform X3, which combines its former AX analytics and TX transaction platforms into a single SKU. The new offering still keeps transaction and analytics on separate nodes but unifies them under the same database engine and management umbrella.
The keys to bringing the AX and TX platforms together are two new capabilities: a query router that directs queries to the right target, and a change data capture (CDC) facility that replicates data from transactional nodes with row storage to analytical nodes with columnar storage.
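As a rough illustration of the change data capture idea, the sketch below replays a log of row-level change events into a simple in-memory column layout. It is a toy model, not MariaDB's implementation: the `ColumnStore` class, the `replicate` function, and the event format (`op`, `key`, `row`) are all hypothetical names chosen for this example.

```python
class ColumnStore:
    """Toy columnar table: one Python list per column (hypothetical, for illustration)."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.position = {}   # primary key -> index into each column list
        self.row_count = 0

    def apply(self, event):
        """Apply one row-level change event captured from the row store."""
        op, key = event["op"], event["key"]
        if op == "insert":
            self.position[key] = self.row_count
            for name, values in self.columns.items():
                values.append(event["row"].get(name))
            self.row_count += 1
        elif op == "update":
            idx = self.position[key]
            for name, value in event["row"].items():
                self.columns[name][idx] = value
        elif op == "delete":
            # Mark the slot as empty rather than shifting every column.
            idx = self.position.pop(key)
            for values in self.columns.values():
                values[idx] = None
        else:
            raise ValueError(f"unknown operation: {op}")


def replicate(change_log, store):
    """Replay captured changes, in order, into the columnar copy."""
    for event in change_log:
        store.apply(event)


# Assumed event format: a list of dicts emitted by the transactional side.
change_log = [
    {"op": "insert", "key": 1, "row": {"id": 1, "amount": 10}},
    {"op": "insert", "key": 2, "row": {"id": 2, "amount": 5}},
    {"op": "update", "key": 1, "row": {"amount": 12}},
    {"op": "delete", "key": 2},
]
store = ColumnStore(["id", "amount"])
replicate(change_log, store)
```

The point of the column layout is that an aggregate such as a sum over `amount` reads one contiguous list instead of touching every row; a real columnar engine would add compression, batching, and durable storage on top of this.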
Until now, AX and TX were offered as separate SKUs that were updated on different schedules, with the row-based offering being the original platform. Over time, MariaDB branched out from its initial InnoDB engine (part of its MySQL heritage) with MyRocks, a storage engine for web-scale applications based on technology first developed at Facebook. Both of these were conventional row-based stores. The original platform eventually morphed into MariaDB TX, which bundled additional capabilities beyond the original database, such as a database firewall, automatic failover, and dynamic data masking.
MariaDB later bootstrapped development of a columnar storage engine for analytics, which became MariaDB AX. It was sold separately from the TX offering.
For customers, the advantage is that there's now only one product to buy. You can apportion nodes in the cluster to transactions and analytics as you wish; in fact, if you wanted to go pure row store or pure columnar, you could do that as well. As MariaDB prices per node, not by the storage engine, customers do not have to pay for two data stores if they only want one. Hold that thought for a moment.
MariaDB's combination of row and column stores in the same product is hardly unique. Oracle, IBM, and Microsoft also offer hybrid row/column stores where the customer delineates which data sits in row tables and which is put in columns. In most cases, the row stores are either on disk or SSD, while column stores are in-memory. The move toward hybrid transaction and analytic processing is also a common theme among data platforms such as Splice Machine, and it is the motivation behind the Spark connectors for operational platforms like DynamoDB, Azure SQL Database, MongoDB, and others.
The Query Router acts as the brains, either by directing queries based on schema, table, or regular expressions (i.e., syntax), or by applying "hints" based on preset rules. We believe that in the future, it would make sense to automate this further as another example of how machine learning could optimize databases. Data is ingested either into the transaction store and then trickle-updated to the column store via the CDC utility, or loaded directly into the column store for sources such as clickstreams that would normally not populate a transaction database.
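To make the rule-based routing concrete, here is a minimal sketch of a router that sends a query to a transactional or analytical target by matching on schema, table, or a regular expression over the SQL text. It does not reflect MariaDB's actual router configuration or API: the `Rule` class, the `route` function, the target labels, and the sample rules are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """One preset routing rule (hypothetical structure)."""
    target: str                    # "transactional" or "analytical"
    schema: Optional[str] = None   # match on schema name, if set
    table: Optional[str] = None    # match on table name, if set
    pattern: Optional[str] = None  # regex matched against the SQL text, if set

    def matches(self, sql: str, schema: str, table: str) -> bool:
        if self.schema is not None and self.schema != schema:
            return False
        if self.table is not None and self.table != table:
            return False
        if self.pattern is not None and not re.search(self.pattern, sql, re.IGNORECASE):
            return False
        return True


def route(sql: str, schema: str, table: str, rules: List[Rule],
          default: str = "transactional") -> str:
    """Return the target of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(sql, schema, table):
            return rule.target
    return default


# Illustrative rules only: the schema and table names are made up.
RULES = [
    Rule(target="analytical", schema="reporting"),
    Rule(target="analytical", table="clickstream"),
    Rule(target="analytical", pattern=r"\bGROUP\s+BY\b"),
]

lookup = route("SELECT * FROM orders WHERE id = 1", "shop", "orders", RULES)
rollup = route("SELECT region, SUM(total) FROM orders GROUP BY region",
               "shop", "orders", RULES)
```

Here `lookup` falls through to the transactional default, while `rollup` is sent to the analytical target by the regular-expression rule. Static rules like these are exactly what a learned model could eventually replace, for example by routing on observed query cost rather than on syntax.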
This is a good beginning for MariaDB in consolidating its transaction and analytic databases. As we noted above, customers can configure their deployments flexibly, and therefore are not locked into buying two separate data stores if they don't wish.
And in fact, for transaction applications that embed some analytics, it may not make sense to have a separate column store anyway. Down the road, we'd like MariaDB to make that a more attractive option by adding capabilities to its row store, such as automated tiering of data between media such as memory, NVRAM, or SSD (which is fast becoming the default storage for transaction systems). For operational applications embedding analytics, there's precedent for bypassing columnar storage; for instance, SQL Server 2019 lets you perform big data analytics by embedding the database engine on a Hadoop compute node along with Spark.
In conjunction with the MariaDB Platform X3 rollout, the company is also inching its way into its first managed cloud service. This is an area where AWS and Azure already offer managed services that are provisioned on a self-service basis and managed in a lights-out manner.
MariaDB will differentiate its offering as a high-touch, white-glove service that is, in essence, outsourcing your DBAs and architects for running MariaDB in the cloud. In so doing, it will not compete with Amazon RDS or Azure Database for MariaDB on price. It's a sensible first step for MariaDB for a couple of reasons. First, it is not going to easily underprice AWS or Azure, which already have economies of scale. Second, as a smaller player, it will take time to scale the engineering to automate its DBaaS. However, taking a page from Oracle, when MariaDB does automate the DBaaS, it would make sense for it to introduce machine learning to make the database self-running.