When Tom Cruise's character in Top Gun exclaimed "I feel the need, the need for speed", you'd be forgiven for mistaking it for a soundbite from a CIO discussing their transactional databases.
Whether it's a financial organisation predicting share prices, a bank knowing whether it can approve a loan or a marketing organisation targeting consumers with a new promotional offer, the need to access, store, process and analyse data as quickly as possible is an imperative for any business looking to gain a competitive edge.
The birth and emergence of big data
Back in the days of the mainframe, you'd find the application and the transactional data behind its reporting databases physically stored in the same system. Applications, operating systems and databases were designed to maximise their hardware resources, which meant you couldn't process transactions and reports simultaneously. The bottleneck here was cost: if you wanted to scale, you needed another mainframe.
With the advent of client-server computing, where applications could run against a centralised database server from multiple cost-effective servers, scalability was achieved by simply adding application servers.
Even so, a new bottleneck was quickly established: systems relied on a single database server, and requests from an ever-increasing number of application servers ended up causing I/O stagnation. The problem was exacerbated with OLTP (online transaction processing), where report creation required the system to read multiple tables in the database concurrently. Alongside this, servers and processors kept getting faster, while disks (despite the emergence of SSDs) were quickly becoming the bottleneck for automated processes that produced large amounts of data, which in turn resulted in more report requests.
The net effect was a downward spiral: more users requiring more reports meant ever larger amounts of data being requested from disks that simply weren't up to the job.
Factor in the data proliferation from external users caused by the internet and pressure-inducing laws such as Sarbanes-Oxley, and the demand to analyse more data more quickly reached a critical point.
With data and user volumes increasing by a factor of thousands compared to the I/O capability of databases, the transaction-based industry faced a challenge that required a dramatic shift. Cue the emergence of SAP's HANA.
HANA and the need for speed
In 2011, SAP announced its new in-memory platform HANA for enterprise applications, talking up the potential of real-time analytics. SAP claimed HANA would not only make databases dramatically faster, like traditional business warehouse accelerator systems, but also speed up the front end, enabling companies to run arbitrary, complex queries on billions of records in a matter of seconds as opposed to hours. The vendors of traditional databases were facing a major challenge, most notably the king of them all, Oracle.
One of the major advantages of SAP HANA's ability to run in real time is that it removes the need for data redundancy, as it's built to run as a single database.
With clusters of affordable and scalable servers, transactional and analytical data run on the same database, eliminating the need for different types of databases for different application needs. Oracle, on the other hand, has built an empire on exactly the opposite.
Oracle has thrived on a model where generally companies start with a simple database that's utilised for checking sales orders and ensuring product delivery to customers, but as the business grows they need more databases with different and more demanding functions.
Functions such as managing customer relationships, complex reporting and analysis drive a need for new databases that are separate from the actual business, requiring data to be moved from one system to another.
Eventually, companies have a sprawl of databases as existing ones are unable to handle the workloads, making it almost impossible to track data movements, let alone attain real-time updates. So, while the Oracle marketing machine is also pitching the benefits of in-memory via its Exalytics appliance and in-memory database TimesTen, Oracle is certainly in no rush to break this traditional model of database sprawl and the money-spinning licences that come with it.
Looking closely at the Oracle Exalytics/TimesTen package, despite the hype, it is merely an add-on product, meaning that an end user will still need a licence for the transactional database, another licence for the data warehouse database and yet another licence for TimesTen for Oracle Exalytics.
Moreover, the Oracle bolt-on approach serves to sell more commodity hardware and justify the acquisition of Sun, all at a cost to the customer.
Because the Exalytics approach continues the traditional requirement for transactional data to be duplicated from the application to the warehouse and once again to Exalytics, the end user not only ends up with three copies of the data but also needs three tiers of storage and servers.
In contrast, SAP's HANA is designed to be a single database that runs both transactional applications and Business Warehouse deployments. Not only does SAP HANA's one copy of data replace the two or three required for Oracle, it also eliminates the need for materialised views, redundant aggregates and indexes, leaving a significantly reduced data footprint.
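To make the single-copy idea concrete, here is a minimal Python sketch. It is not how HANA is implemented; the orders table, its columns and the function names are all hypothetical. It only illustrates the principle of writing transactions to one in-memory copy and computing report totals on the fly, rather than maintaining a separate aggregate or materialised view.

```python
from collections import defaultdict

# Hypothetical in-memory table held as columns: the one and only copy of the data.
orders = {
    "region": ["EMEA", "APAC", "EMEA", "AMER", "APAC"],
    "amount": [120.0, 80.0, 200.0, 50.0, 70.0],
}


def insert_order(table, region, amount):
    """Transactional write: append a row to the single copy."""
    table["region"].append(region)
    table["amount"].append(amount)


def revenue_by_region(table):
    """Analytical read: scan the columns and aggregate at query time.

    No precomputed aggregate is stored, so there is nothing to keep in sync.
    """
    totals = defaultdict(float)
    for region, amount in zip(table["region"], table["amount"]):
        totals[region] += amount
    return dict(totals)


before = revenue_by_region(orders)   # report on the current data
insert_order(orders, "AMER", 25.0)   # a new transaction arrives
after = revenue_by_region(orders)    # the next report sees it immediately
```

Because the report is computed from the same data the transaction wrote to, the new order shows up in the next query without any copy step in between.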
HANA vs TimesTen and Exalytics
As expected, Oracle has already unleashed its marketing teams to make various claims about HANA and to push a like-for-like comparison with TimesTen.
Where this is hugely flawed is that Oracle fails to acknowledge that SAP HANA is a completely new design as opposed to a bolt-on approach. With SAP HANA, data is managed and accessed entirely in RAM, doing away with the requirement for MOLAP, multiple indexes and other tuning features that Oracle prides itself on.
Furthermore, despite what Oracle may claim, SAP HANA does indeed handle both unstructured and structured data, as well as utilise parallel queries for scaling out across server nodes. Oracle presumably is trying to prevent the market from realising that the TimesTen with Exalytics package still can't scale out beyond the 1TB RAM limit, unlike SAP HANA where each container can store up to 500TB of data all executable at high speed.
With an aggressive TCO and ROI model compared to a traditional Oracle deployment, SAP HANA also proves a lot more cost effective. With pricing based on increments of 64GB of RAM and the total amount of data held in memory, licences are fully inclusive of production and test/development requirements as well as the necessary tools.
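As a rough illustration of what pricing in 64GB units means, the sketch below counts the units needed for a given amount of in-memory data. Only the 64GB unit size comes from the text above; rounding up to whole units is an assumption, and the function names and the per-unit price passed in are hypothetical, not SAP's actual price list.

```python
import math

UNIT_GB = 64  # unit size quoted in the article


def licence_units(in_memory_gb, unit_gb=UNIT_GB):
    """Number of whole units needed, assuming partial units round up."""
    if in_memory_gb <= 0:
        raise ValueError("in_memory_gb must be positive")
    return math.ceil(in_memory_gb / unit_gb)


def licence_cost(in_memory_gb, price_per_unit):
    """Total cost for a hypothetical per-unit price supplied by the caller."""
    return licence_units(in_memory_gb) * price_per_unit
```

Under these assumptions, 500GB of in-memory data would need 8 units, since 500 / 64 is just under 8.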
SAP embraces VMware
The recent announcement that SAP HANA supports VMware vSphere will provide the company with a competitive advantage, as it will enable customers to provision instances of SAP HANA in minutes as VM templates, as well as gain benefits such as Distributed Resource Scheduler (DRS) and vSphere vMotion.
By virtualising SAP HANA with VMware, end users can quickly have several smaller HANA instances all sharing a single physical server, leading to better utilisation of existing resources. With the promise of certified, pre-configured and optimised converged infrastructures such as the Vblock around the corner, SAP HANA appliances could be shipped with vSphere 5 and SAP HANA pre-installed within days, enabling rapid deployment for businesses.