The future of SAP HANA

The message that came out of SAPPHIRE last week is that SAP remains all in on HANA, and that the future is making it cloud-native.

Hasso Plattner speaking at SAPPHIRE 2019

Credit: SAP

While the recently closed acquisition of Qualtrics provided many of the headlines at SAPPHIRE last week, the fact that SAP remains fully committed to HANA had the most immediate significance. The future of SAP HANA was very much in the spotlight during SAP chairman Hasso Plattner's day-two keynote.

Normally, a speech about HANA would not make headlines. But in the wake of a major staff restructuring last winter that led to considerable sturm und drang about the future of HANA, it was important for SAP to reaffirm that it was staying the course.

As Larry Dignan reported last week, HANA is the underpinning of the bevy of new services that SAP is releasing on its cloud platform. And SAP, like most of its enterprise technology brethren, is now adopting a cloud-first game plan for HANA development – where new features get introduced first to the cloud version before they trickle down to the on-premise editions. And it is backing it up with a new Embrace program encompassing reference architectures and jumpstart roadmaps and services offered by SAP in conjunction with its public cloud partners.

SAP has been professing its allegiance to the cloud for the last three to four years, but the changes beneath the hood that were announced this year make the strategy sustainable, both technically and economically.

It starts with the most important underlying change: refactoring the HANA database to separate storage from compute. That's the first step that any database platform must take to exploit the horizontal scale and elasticity of the cloud. That allows more flexibility, both in the way applications are deployed and how access is priced.

While the benefits of elasticity are more obvious for applications or use cases with volatile traffic patterns (think online gaming), enterprise transaction systems can have their own unique peaks and valleys. Just visit any accounting department when it's busy churning out end-of-period reporting, or any retailer on Black Friday. And as next-generation ERP systems embed operational analytics to support real-time decisions, or integrate with IoT to support those decisions, resource usage will become less predictable.
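To make the economics of those peaks and valleys concrete, here is a minimal sketch comparing a cluster sized once for the busiest hour with one that resizes compute every hour. The workload figures, the per-node capacity, and the function names are all hypothetical, chosen only for illustration; they are not SAP pricing or sizing data.

```python
import math


def node_hours_fixed(demand, capacity_per_node):
    """Node-hours consumed when the cluster is sized once for the busiest hour."""
    peak_nodes = math.ceil(max(demand) / capacity_per_node)
    return peak_nodes * len(demand)


def node_hours_elastic(demand, capacity_per_node):
    """Node-hours consumed when compute is resized every hour to match demand."""
    # Keep at least one node running even when demand is zero.
    return sum(max(1, math.ceil(d / capacity_per_node)) for d in demand)


# Hypothetical day: 20 quiet hours followed by a 4-hour period-end spike.
demand = [100] * 20 + [900] * 4  # requests per hour (made-up units)

fixed = node_hours_fixed(demand, capacity_per_node=100)
elastic = node_hours_elastic(demand, capacity_per_node=100)
print(f"fixed: {fixed} node-hours, elastic: {elastic} node-hours")
```

Under these assumed numbers, the fixed cluster pays for peak capacity around the clock, while the elastic one pays for it only during the spike. The gap narrows as traffic becomes flatter, which is why the benefit depends on the workload.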

The next step is containerization, which allows the cloud provider to manage and direct resources far more economically. It also opens the way to operating in a private or hybrid cloud, so that SAP clients bound by policies or regulations to keep data on premises can still take advantage of modern deployment practices.

The backdrop to HANA's cloud-first metamorphosis is the sunsetting of SAP's legacy ECC enterprise suite, for which development (but, obviously, not support) will end in 2025. And while the next generation of SAP's enterprise suite, headlined by S/4HANA for ERP and C/4HANA for CRM, will also be available in on-premise editions, refactoring HANA for cloud-native deployment makes it better suited not only for running the flagship products, but also for enabling partners to develop apps that could benefit from elasticity and containerization. The first product to come out of the new cloud-native architecture will be the SAP HANA Data Warehouse Cloud.

Among enterprise application providers, Oracle has already thrown down the gauntlet with the release of its Autonomous Database service in its public cloud. Besides automating most routine DBA functions, the Oracle public cloud database service (on which Oracle applications are based) also supports elasticity.

SAP HANA is further exploiting the economics of cloud storage with new support for data tiering. With an in-memory database, one could hit the wall on cost because DRAM is the most expensive form of storage. Admittedly, those limits were mitigated by the high compression factor for data stored on HANA and the fact that most transaction databases do not reach the multiple terabytes (or petabytes) of the largest analytic systems.

But, as SAP is positioning HANA as the core pillar of its future transaction, analytic, and machine learning platforms, a tiering strategy for data is the clear answer. Previously, SAP HANA users did have an option for accessing data on disk through Smart Data Access, originally developed by Sybase. But that was more of a data virtualization strategy; it would allow queries on HANA to be redirected to data sitting on disk as extended tables.

Data tiering allows data to be automatically directed to the most economical storage media based on the frequency of access, which is often expressed as "temperature." Tiering traditionally encompassed fast disk, dense disk, and tape; the plunging costs of storage media in recent years have added DRAM and Flash to the list of options and, at the other end of the spectrum, cloud object storage. And here, SAP is making a bet on Optane, a new form of persistent memory developed by Intel based on 3D XPoint technology, which is supposed to provide almost the read performance of DRAM, but at price levels closer to those of Flash.
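The idea of routing data by temperature can be sketched in a few lines. This is only an illustration of the general technique, not SAP's tiering logic: the tier names, the access-frequency thresholds, and the `choose_tier` function are assumptions made up for the example.

```python
# Hypothetical tiers, ordered hottest to coldest. Each entry pairs a tier name
# with the minimum number of accesses per day needed to qualify for it.
TIERS = [
    ("dram", 1000),
    ("persistent_memory", 100),
    ("disk", 1),
    ("object_storage", 0),
]


def choose_tier(accesses_per_day):
    """Return the hottest tier whose threshold the access frequency meets."""
    for name, min_accesses in TIERS:
        if accesses_per_day >= min_accesses:
            return name
    return TIERS[-1][0]  # fall back to the coldest tier


# Made-up tables with made-up access counts.
for table, hits in [("open_orders", 5000), ("last_quarter", 250), ("archive_2012", 0)]:
    print(table, "->", choose_tier(hits))
```

A real tiering engine would weigh more than raw access counts (data age, table size, service-level targets), but the core decision is the same: match how often data is touched to what the storage costs.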

So SAP is going all-in on data tiering for HANA, and it is the cornerstone for SAP's embrace of Optane, which is intended to greatly expand the volume and range of data that gets the fast treatment.

Optane is based on a new, three-dimensional persistent memory architecture called 3D XPoint, on which Intel and Micron Technology originally partnered. Optane has had a slow start because most machines have not implemented it optimally. But where there's smoke, there's fire. Micron remained bullish enough on the underlying 3D XPoint technology that it recently bought out Intel's share of their joint development and manufacturing partnership to develop its own implementation. New benchmarks are beginning to show Optane's promise.

For SAP HANA, Optane won't replace memory, but will surround it with more massive fast storage. Memory will still be kept for writes and extremely volatile data, with Optane reserved for reads, where its strength lies. That's possible out of the box with HANA because it already partitions volatile (changeable) data from nonvolatile data. In this case, memory can be swapped out for Optane for the nonvolatile data.
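The split between changeable and stable data can be pictured with a toy column that keeps recent writes in one structure and settled, read-only data in another. This is a simplified sketch of the general pattern, not HANA's implementation; the `ToyColumn` class and its method names are hypothetical.

```python
class ToyColumn:
    """Toy column split into a stable read-only part and a small write buffer."""

    def __init__(self):
        # Stable, read-optimized data: rewritten only during a merge, so it is
        # the natural candidate for persistent memory in this sketch.
        self.main = ()
        # Recent writes: changes constantly, so it stays in DRAM in this sketch.
        self.delta = []

    def insert(self, value):
        """Writes touch only the volatile buffer."""
        self.delta.append(value)

    def merge(self):
        """Fold buffered writes into a new sorted, immutable main part."""
        self.main = tuple(sorted(self.main + tuple(self.delta)))
        self.delta = []

    def scan(self):
        """Reads see the stable part plus anything not yet merged."""
        return list(self.main) + list(self.delta)


col = ToyColumn()
col.insert(3)
col.insert(1)
col.merge()    # the stable part now holds (1, 3)
col.insert(2)  # the new write lands in the volatile buffer only
print(col.scan())
```

Because the stable part is rewritten only at merge time, it tolerates a medium that reads quickly but writes more slowly, which is the role the article describes for Optane.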

But as we noted above, the trick to getting performance out of Optane is using it properly. Optane has an App Direct mode that maximizes read performance, but to take advantage of it, software providers like SAP must rewrite their applications (SAP is the first database provider to do so). That explains why initial take-up of Optane has not exactly been thundering. For SAP's Optane bet to pay off, it needs to prod its partners to do likewise.

As to the rest of SAP's data tiering strategy for HANA, it is looking to its IQ data warehouse, the platform that came through the Sybase acquisition and was the first major commercial columnar data store, as the disk-based data lake for relational data. Beyond that, it is supporting options for storing data in Hadoop HDFS and cloud storage for virtual access. SAP includes a data lifecycle management tool at the table level that works for SAP HANA-native applications.

Although HANA is a relational database, it has supported storing of nonrelational data such as spatial, document, and graph. These nonrelational datatypes have typically been stored as BLOBs that are accessed relationally. But SAP's intention is to make HANA more extensible by making it into a multi-model database. That is a work in progress; currently, it takes a common approach among relational platforms by making non-relational data types accessible through SQL data calls.

To become fully multi-model, SAP HANA needs APIs that access data types such as graph, spatial, or document in their native form. It has ready access to relevant technology: As part of the CallidusCloud acquisition, it gained the OrientDB open source database that supports graph, key-value, document, and object-oriented data models under a common API. That provides a clear path; however, for documents, we hope that SAP will also add a MongoDB-compatible API – not to replace or compete with MongoDB, but to provide access to a broader developer base.

Unlike Oracle, SAP is courting public cloud providers to host its managed HANA database as a service (and managed S/4HANA and C/4HANA SaaS services). For cloud providers, the draw is SAP's direct entrée to the crème of the enterprise computing base. Each of the cloud providers is upping the ante with support for new instances. For instance, Microsoft Azure will be making available instances supporting up to 6 TB on a single VM, while AWS is leveraging its Nitro hypervisor to enable bare metal-like performance for platforms like HANA utilizing NVMe storage such as Optane. The Embrace program, mentioned above, is SAP's vehicle for mobilizing customers for cloud deployment.

While SAP will continue to support on-premise deployment of HANA, its future is clearly in the cloud.