The elevator pitch for this post is that history travels in circles, especially when it comes to laying out how you deploy enterprise applications and databases. Splice Machine's latest move to differentiate itself in a world crowded with new databases is recasting itself as a predictive application platform. For the company founder, this represents traveling full circle back to his roots in supply chain optimization. And the same goes for the role that databases play with applications. Hold those thoughts.
Splice Machine has differentiated itself as a fully ACID-compliant SQL relational database that runs atop HBase, offering transaction processing and, thanks to Spark support, analytics. As my big data bro Andrew Brust pointed out last year, Splice Machine blazes a path that has the Apache Phoenix project nipping at its heels on ACID by integrating with the incubating Apache Tephra project. But neither project yet delivers the level of concurrency that Splice Machine does. Nonetheless, Splice Machine doesn't want to get bogged down in HBase database battles. Instead, it's pivoting to aim higher in the food chain: enabling data scientists to build predictive applications that run in the database.
Splice Machine has been steadily assembling the technology stack to make this pivot happen. The key is integration of Spark analytics and Zeppelin notebooks that help data scientists author, debug, and deploy their code. With the new AWS Marketplace release, Splice Machine is introducing jumpstart templates, starting with one for building IoT-driven predictive supply chain optimization. This is where the traveling-full-circle metaphor kicks in: for company founder Monte Zweben, it's a chance to revisit his roots in supply chain software.
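The contents of the supply chain template aren't public, so the specifics below are assumptions. But the kind of predictive replenishment logic a notebook in such a template might encode can be sketched with a classic reorder-point calculation:

```python
import math

# Toy sketch of reorder-point logic a supply chain notebook might encode.
# All numbers and function names are illustrative, not Splice Machine's API.

def reorder_point(daily_demand: float, lead_time_days: float,
                  demand_std_dev: float, z_score: float = 1.65) -> float:
    """Expected demand over the lead time, plus safety stock.

    z_score = 1.65 corresponds to roughly a 95% service level.
    """
    safety_stock = z_score * demand_std_dev * math.sqrt(lead_time_days)
    return daily_demand * lead_time_days + safety_stock

def should_reorder(on_hand: float, on_order: float, rop: float) -> bool:
    """Trigger replenishment when inventory position falls below the ROP."""
    return (on_hand + on_order) < rop

rop = reorder_point(daily_demand=120, lead_time_days=7, demand_std_dev=30)
print(round(rop, 1))
print(should_reorder(on_hand=700, on_order=100, rop=rop))
```

In a notebook-driven predictive app, the demand forecast feeding `daily_demand` would come from a model trained on historical and IoT sensor data rather than a fixed constant.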
Given that Splice Machine customers have already built applications for fraud detection, digital marketing next-best offers, customer profitability analysis, and credit dispute resolution, we wouldn't be surprised if the next batch of templates or stubs from Splice Machine and/or its integrator partners fell in similar domains.
The full circle metaphor kicks in here once again, as building predictive apps via notebooks that run inside the database represents a throwback to how legacy applications were constructed in the decades predating Y2K -- but with a modern (and more flexible) twist. In the bad old days, homegrown applications hard-wired code with the data, which was typically stored in file systems or non-relational databases. In the run-up to Y2K, prevailing wisdom was to future-proof applications by separating the application/business logic from the database tiers, so that theoretically, your database could remain intact regardless of what applications ran on it.
Reality proved much messier, as it was often far more expedient to run the business logic for transforming or manipulating data as stored procedures inside the database anyway. And when data got really big, it grew more expedient to push data processing down into the storage tier itself -- that's how Hadoop clusters and many MPP analytic databases were built. But then the cloud came along, offering the flexibility of elasticity, so database engines and applications got separated from the data again.
In a Splice Machine cloud demo, we saw how a predictive IoT supply chain application could be ginned up through the convenience of a managed cloud service in the AWS marketplace. The customer specifies the volume of data, the number of compute units to launch, and desired performance and concurrency levels. The Splice Machine service responds with estimates of monthly costs for transactional and analytic performance based on TPC-C and TPC-H benchmarks, respectively; the customer can adjust them up or down to fit their budgets.
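Splice Machine's actual pricing model isn't public, so the rates and formula below are invented for illustration. But the estimator the demo describes -- inputs of data volume, compute units, and desired performance mapping to a monthly cost the customer can dial up or down -- might work along these lines:

```python
# Hypothetical sketch of a marketplace cost estimator like the one described.
# The rates and the formula are assumptions, not Splice Machine's pricing.

STORAGE_RATE_PER_TB = 25.0     # $/TB-month (assumed)
COMPUTE_RATE_PER_UNIT = 180.0  # $/compute-unit-month (assumed)

def estimate_monthly_cost(data_tb: float, compute_units: int,
                          concurrency_factor: float = 1.0) -> float:
    """Rough monthly cost: storage plus compute, scaled by concurrency."""
    storage = data_tb * STORAGE_RATE_PER_TB
    compute = compute_units * COMPUTE_RATE_PER_UNIT * concurrency_factor
    return storage + compute

# A customer over budget could dial compute_units down and re-estimate:
print(estimate_monthly_cost(data_tb=10, compute_units=8))
print(estimate_monthly_cost(data_tb=10, compute_units=4))
```

In the real service, the transactional and analytic performance attached to each configuration would be estimated from TPC-C and TPC-H benchmark results rather than guessed.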
Under the hood, Splice Machine generates Docker containers with automatically configured images for HBase, Spark, log search, Zeppelin notebooks, and Kafka, all running in a self-healing Mesos architecture. Splice Machine estimates the process should take no more than 15 minutes to get up and running.
The notebook strategy provides a low-risk way for Splice Machine to avoid getting lost in the noise as just another innovative database in a sea of them. It does so while appealing to a new audience (data scientists) via a tool that many are embracing (the Zeppelin notebook), without getting caught in the crosshairs of the Oracles or SAPs of the world. The challenge for Splice Machine is that its strategy involves putting together lots of moving parts, and requires getting its partner ecosystem energized to help the company move up the value chain from big data database to solution platform. A partnership with Intrigo, a consulting firm with SAP supply chain experience, is a good start.