SAP Vora is the German enterprise software company's Apache Spark-based front end for dimensional querying of data stored in HANA, Hadoop-based data lakes and other data repositories, including Amazon S3. It allows users to define a dimensional model over their data and then leverages a push-down architecture to query the data sources without requiring the data to be moved or replicated.
Vora was introduced in preview form a little over two years ago and made generally available in March 2016 under the name HANA Vora. Today, SAP is announcing a refreshed version of the product: Vora 2.0.
Containers and Spark, two
This new version of Vora uses a container architecture and leverages the Kubernetes platform for deployment and cluster management. This makes the product more cloud-ready and, indeed, more hybrid-ready too, as container-based deployment is more dynamic and configurable.
The 2.0 version of Vora is now compatible with 2.x versions of Spark and the new features they offer. Support for data stored in Microsoft's Azure Data Lake Store is planned for Vora, while tighter integration with SAP HANA is here now. The latter is accomplished through Vora's use of HANA's wire protocol for connectivity, rather than its generic Spark controller. Ken Tsai, Global VP, Head of Cloud Platform & Data Management, Product Marketing at SAP, explained to me that this newer integration avoids unnecessary transformations being applied to the data. Ostensibly, this is due to more efficient processing of the data on the HANA platform before the data is passed off to Vora.
Tsai also reiterated to me that SAP's new Data Hub product, which I covered a couple of weeks ago, utilizes Vora as the engine on which Data Hub executes its data pipelines, and offers its own integrations with HANA. That means Data Hub will benefit from the new capabilities in Vora 2.0. Furthermore, SAP Data Network, the company's data marketplace platform, also uses Vora and will benefit as well.
Customers always right
SAP mentioned that Houston-based CenterPoint Energy is one of Vora's now 100 or so customers, and that it uses Vora for smart meter analytics. CenterPoint can use this capability for better customer service and, by using the data to forecast demand, can take advantage of energy spot markets more efficiently.
Meanwhile, on the Data Network side, SAP names Schindler Elevator as a customer, explaining that the company mashes up sensor data from its elevators with Enterprise Resource Planning (ERP), weather and project data, then makes the mashup data sets available for a fee. SAP Data Network offers a fully managed cloud environment, which can be a great distribution platform for a company looking to monetize its own data.
Hadoop's not dead, and neither is Spark. Instead, these technologies live on, in the background, embedded in other products. That's where they belong. That's where they do the most good, even for an enterprise applications company with its own data platform.