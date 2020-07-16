Beyond its origins as a fork of MySQL for true believers, MariaDB has been building a varied portfolio, differentiating itself through a choice of storage engines. The latest addition to the roster, Xpand, arrives as part of the new X5 release, which also adds optimizations for on-the-fly configuration changes to the existing platform.

All this is very consistent with a core theme underlying MariaDB's commercial product portfolio that we noted a few months back with its customizable cloud Power Tier: it's a platform designed so customers can have it their way.

The new Xpand engine is the result of the Clustrix acquisition from early 2018. The deal brought a distributed SQL platform designed for transaction processing based on a sharded architecture. Until now, Clustrix was a standalone architecture requiring a separate install. Xpand transforms Clustrix into a native MariaDB engine on par with the others in the portfolio.

Xpand has some resemblance to platforms like MongoDB or Oracle RAC that also shard data. The guiding notion is enabling the database to scale out by distributing portions of the data across different nodes based on traffic patterns, typically keeping related data on the same shard. In such a setup, queries, writes, or joins involving data on multiple shards are routed automatically to the right nodes. Xpand is a superset of the functionality of the Spider storage engine, which added sharding and parallel query support to the open source MariaDB 10.4 release.
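To make the sharding idea concrete, here is a minimal Python sketch of a key-based shard router. It is an illustration only, not how Xpand works internally: the class `ToyShardedStore`, the function `shard_for`, and the four-shard cluster size are all hypothetical, and a plain stable hash stands in for the traffic-aware placement described above. The point it shows is that rows sharing a shard key (here, a customer ID) land on the same shard, so a join on that key touches a single node, while a query over all data has to fan out to every shard.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical cluster size, chosen for illustration


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ToyShardedStore:
    """In-memory stand-in for a sharded database (not a real engine)."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        # Each shard holds its own mapping of table name -> list of rows.
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def insert(self, table: str, customer_id: str, row: dict) -> int:
        """Route a row to the shard that owns its customer_id."""
        shard_no = shard_for(customer_id, self.num_shards)
        self.shards[shard_no][table].append({"customer_id": customer_id, **row})
        return shard_no

    def orders_with_customer(self, customer_id: str) -> list:
        """Join customers and orders for one customer on a single shard."""
        shard = self.shards[shard_for(customer_id, self.num_shards)]
        customers = [c for c in shard["customers"] if c["customer_id"] == customer_id]
        orders = [o for o in shard["orders"] if o["customer_id"] == customer_id]
        return [{**c, **o} for c in customers for o in orders]

    def count_all(self, table: str) -> int:
        """Scatter-gather: ask every shard, then combine the results."""
        return sum(len(shard[table]) for shard in self.shards)
```

In this sketch, inserting a customer and that customer's orders with the same `customer_id` always returns the same shard number, which is what keeps the join local.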

Xpand is not the only distributed storage engine in the MariaDB portfolio. Galera Cluster is an open source project supported on the MariaDB platform that keeps complete replicas on multiple nodes. It's described as a multi-master cluster, but that's a bit of an exaggeration: it is a solution that allows promotion of different clusters, primarily for high-availability (HA) failover scenarios rather than transaction writing. Besides HA, Galera is best suited for distributed database deployments where there is relatively little chance of write conflicts, such as global applications where transactions in North America, Europe, and Asia are likely to involve only data pertaining to the local region.
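The point about write conflicts can be illustrated with a small Python sketch. It is a deliberately simplified first-committer-wins check over the keys each transaction writes, not Galera's actual replication code; the names `Txn` and `certify` and the sample keys are hypothetical. It shows why region-local workloads fit well: when each region writes only its own keys, nothing overlaps and every transaction is accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    node: str          # node (region) where the transaction originated
    keys: frozenset    # keys of the rows the transaction writes


def certify(concurrent_txns):
    """Accept transactions in order; reject any whose keys overlap
    with those of a transaction that was already accepted."""
    committed_keys = set()
    accepted, rejected = [], []
    for txn in concurrent_txns:
        if committed_keys & txn.keys:
            rejected.append(txn)      # write conflict: must be retried
        else:
            committed_keys |= txn.keys
            accepted.append(txn)
    return accepted, rejected


# Region-local writes: no shared keys, so no conflicts.
regional = [
    Txn("north-america", frozenset({"na:order:1"})),
    Txn("europe", frozenset({"eu:order:7"})),
    Txn("asia", frozenset({"asia:order:3"})),
]

# Two regions updating the same row at the same time: one loses.
contended = [
    Txn("north-america", frozenset({"global:inventory:42"})),
    Txn("europe", frozenset({"global:inventory:42"})),
]
```

Passing `regional` to `certify` accepts all three transactions, while `contended` yields one accepted and one rejected transaction that the application would have to retry.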

Also arriving with the X5 release are some new optimizations, such as the ability to automatically tweak some InnoDB storage engine parameters (for example, the size of redo logs) on the fly without having to take the database offline. Additionally, the X5 release replaces cluster configuration scripts for the column store with more manageable APIs.

Given that MariaDB acquired Clustrix over two years ago, it's reasonable to ask what took so long to get it incorporated into the mother ship. Incorporating it required changes to data search and indexing functions, along with changes to load balancing and the MariaDB MaxScale proxy that decouples operational configurations (e.g., high availability, scalability, security) from the core database. Additionally, the original Clustrix platform did not support all of the SQL syntax and user-defined function capabilities of the mother ship.

If you don't count Galera, MariaDB now supports five storage engines as plug-ins. Besides Xpand, they include the column store for analytics, which was added as a plug-in back in January. The others are InnoDB, the original engine, designed for common, single-node transaction processing scenarios; MyRocks, aimed at more write-intensive scenarios such as IoT; and Spider, for sharded, parallel query.

A single deployment can contain one or more of these engines, with built-in capabilities for replicating data between the engines without requiring separate mechanisms such as change-data-capture (CDC). Data can be joined between engines, such as between column and row stores, which in some cases can provide an alternative to the denormalization that would otherwise be required for performance with some analytic queries.

For now, Xpand only supports transaction processing, but given MariaDB's capabilities to run queries across multiple storage engines, we wouldn't be surprised if Xpand also expands support for analytics sometime down the road.