Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?
Summary: Want to know how disruptive so-called big data efforts can be to traditional database companies? Try replicating YouTube on Oracle hardware and software.
Increasing data requirements, especially the unstructured information such as video, are going to relegate relational databases to the enterprise scrap heap as an emerging breed of vendors chips away at traditional software powers.
That's the overview from Cowen & Co. analyst Peter Goldmacher. In a 75-page report, Goldmacher walks through the database landscape and concludes that the consensus view that the growth of data will boost traditional database vendors is dead wrong. Goldmacher said:
We believe the vast majority of data growth is coming in the form of data sets that are not well suited for traditional relational database vendors like Oracle. Not only is the data too unstructured and/or too voluminous for a traditional RDBMS, the software and hardware costs required to crunch through these new data sets using traditional RDBMS technology are prohibitive. To capitalize on the Big Data trend, a new breed of Big Data companies has emerged, leveraging commodity hardware, open source and proprietary technology to capture and analyze these new data sets. We believe the incumbent vendors are unlikely to be a major force in the Big Data trend primarily due to pricing issues and not a lack of technical know-how.
Oracle doesn't buy Goldmacher's take. On Oracle's most recent conference call, executives talked up big data and how it will benefit the company.
The crux of Goldmacher's argument that big data will crush traditional database companies revolves around cost. Emerging big data players can price better than large database players like Oracle that have margins to protect. In other words, Oracle would have to charge 9x more than the blended average of big data vendors to solve data conundrums.
Over time, this price differential as well as the growth of corporate unstructured data will mean so-called big data players win. That means big fish like Oracle and IBM and middle-tier players---HP Vertica, EMC Greenplum and Teradata---will have to deal with the likes of Infobright, 1010 Data, Splunk and Cloudera.
To illustrate this point, Goldmacher did an interesting exercise where he outlined how to replicate YouTube on proprietary enterprise systems. Here's what happens to costs when YouTube meets Oracle Exadata machines.
First the assumptions: Goldmacher estimated that YouTube consumption---user uploads of 48 hours of video a minute and 3 billion videos a day along with roughly 45 petabytes of viewed videos a day---would require at least 9 full-rack Exadata machines at $1.5 million each. There would be at least 18 Exadata machines to handle spikes. Added to those machines would be up to 14 Exalogic devices to serve data at $1.1 million per system. The software stack under Oracle would include WebLogic middleware, Oracle databases, Exadata optimized storage and Oracle as the operating system. The open source comparison included JBoss middleware, MySQL, Hadoop and Red Hat Enterprise Linux as the OS.
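As a rough check on the machine counts above, the list prices can be multiplied out. This is only a sketch: it covers just the two named line items (Exadata racks and Exalogic systems), treats 14 Exalogic devices as the upper bound, and leaves out storage, networking and software, so it is not expected to match the report's totals. The function and variable names are illustrative, not from the report.

```python
# List prices in thousands of US dollars, as quoted in the article.
EXADATA_FULL_RACK_K = 1_500  # $1.5 million per full-rack Exadata machine
EXALOGIC_K = 1_100           # $1.1 million per Exalogic system


def machine_subtotal_k(exadata_count: int, exalogic_count: int) -> int:
    """Return the list-price subtotal, in $ thousands, for the named machines only."""
    return exadata_count * EXADATA_FULL_RACK_K + exalogic_count * EXALOGIC_K


# Baseline: the minimum of 9 Exadata racks, with no Exalogic systems counted.
baseline_k = machine_subtotal_k(exadata_count=9, exalogic_count=0)

# Spike capacity: 18 Exadata racks plus the upper bound of 14 Exalogic systems.
peak_k = machine_subtotal_k(exadata_count=18, exalogic_count=14)

print(f"Baseline Exadata racks: ${baseline_k / 1000:.1f}M")
print(f"Spike-sized machines:   ${peak_k / 1000:.1f}M")
```

Working in whole thousands of dollars keeps the arithmetic in integers and avoids floating-point rounding in the subtotals.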
The bottom line looks like this:
In a nutshell, the Oracle Exadata capital expenses for hardware and software total $589.4 million compared to an open source and commodity hardware cost of $104.2 million. Annual expenses (staff and support) are $99 million for Oracle Exadata and $15.1 million for an open source stack. The personnel costs are based on the nine-engineer staff of the original YouTube team.
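To put those four figures side by side, here is a minimal sketch of the cost ratios and a simple multi-year total. It assumes annual expenses stay flat and ignores discounting, and the three-year horizon is a hypothetical choice for illustration, not something the report specifies.

```python
# Figures in millions of US dollars, as quoted in the article.
ORACLE_CAPEX_M = 589.4
OPEN_SOURCE_CAPEX_M = 104.2
ORACLE_ANNUAL_M = 99.0
OPEN_SOURCE_ANNUAL_M = 15.1


def total_cost_m(capex_m: float, annual_m: float, years: int) -> float:
    """Upfront cost plus flat annual expenses over the given number of years."""
    return capex_m + annual_m * years


capex_ratio = ORACLE_CAPEX_M / OPEN_SOURCE_CAPEX_M
annual_ratio = ORACLE_ANNUAL_M / OPEN_SOURCE_ANNUAL_M

YEARS = 3  # hypothetical horizon, chosen only for illustration
oracle_total = total_cost_m(ORACLE_CAPEX_M, ORACLE_ANNUAL_M, YEARS)
open_total = total_cost_m(OPEN_SOURCE_CAPEX_M, OPEN_SOURCE_ANNUAL_M, YEARS)

print(f"Capital expense ratio: {capex_ratio:.2f}x")
print(f"Annual expense ratio:  {annual_ratio:.2f}x")
print(f"{YEARS}-year totals: ${oracle_total:.1f}M vs ${open_total:.1f}M "
      f"({oracle_total / open_total:.2f}x)")
```

Because the annual gap is proportionally wider than the upfront gap, the overall ratio under these assumptions grows slightly as the horizon lengthens.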
Here's a look at the hardware involved:
The open source hardware stack consists of HP server racks and storage with Cisco Nexus switches.
But hardware is fairly simple. The beauty of Oracle's integrated hardware/software stacks---at least for the company---is the licensing and maintenance revenue stream.
Goldmacher noted:
At first glance, total core hardware costs of roughly $155M, just roughly 5% of Google’s current CapEx seem reasonable. This line of thinking lasts until Oracle presents the bill for its software: a not-insignificant $400M for database and Exadata storage licenses alone, bringing the total upfront investment to $570M.
Here's a look at the software costs:
And the open source side.
Now there are a few caveats. Goldmacher didn't create assumptions for in-memory databases like Membase because support pricing wasn't readily available. But overall, you get the picture. Big data may mean some large headaches for established relational database players looking to preserve chunky profit margins.
Talkback
RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?
The article makes a lot of sense. We bailed on Microsoft a few years back, shifting our whole infrastructure to LAMP software. The cost savings were incredible. The time is coming for us to potentially do the same with our enterprise Oracle databases: the technology is mature and we have been able to hire some excellent support staff during the downturn. The next support renewal invoice is going to get sent back to them stamped "Canceled".
LAMP?
All they have to do?
As will support personnel
Why China?
Don't you have India? A bit more secure for business. Still a cheap workforce.
It's not a real comparison
Yeah. RH, IBM and Google don't cash in on FLOSS, they just print their own bucks in pavements.
Think Strategic Open Source
While it is easy to load balance a lot of cheap hardware, and it looks good on paper, the machines are not designed for always-on, high-demand, long-life-span applications. They overheat, they die, and people end up filling landfills with their waste. And don't forget the cost of the electricity (which you left off in the analysis above) for those cheaper machines (often many times more).
However, while I was working with the MySQL and GlassFish teams, we used to talk about "strategic open source" which meant that mission critical systems which take the CIO's main focus, will often stay on proprietary software, but the much larger number of mid-tier and departmental apps that a single enterprise will deploy, should go to a strategic open source player (lower cost, easier to use, more reliable (due to the simpler configuration and administration)). Liferay is reaping the rewards of that fact, but we are also seeing Liferay used for mission critical apps (over 340,000 external websites). So don't count Oracle out, but buy open source for your vast majority of applications and use that position to get your Oracle rep to charge you less.
Why "buy" open source
Incredible unparallels....
2) The statements about cost of some "new" less entrenched players vs. those of the "bigs" are invalid. Our organization has a small Splunk footprint. We have had multiple requests rejected for expansion of use on additional systems. Why? The licensing cost. Yet our use of Oracle is pervasive, despite *its* large licensing costs. It's about the value perception by management, not the actual pricing. And I emphasize "perception", because all of us on the floor know that Splunk saves us countless hours in those limited usages, and could do the same in many others. Uptake of products/vendors often has to do with management savvy (or lack thereof), or their willingness (or not) to listen well enough to their technical people as to what will work or is needed. Organizations whose managers are not technically knowledgeable and/or are not good critical thinkers will see licensing cost become the universal hammer for every nail...i.e., every product licensing decision.
Cost one limiting factor for many enterprises
Costs associated with the initial software/hardware outlay and later operation of these systems in production (where vertical scaling brings hard limits and increased complexity) used to cause early-stage social startups and mature enterprises to defer or limit projects. For some companies Oracle is simply out of the question. For larger, more risk-averse enterprises, expensive data storage just doesn't make sense when the return on investment from a new application is not guaranteed. Even when return/risk is not an issue, higher costs take a bigger portion of the budget, meaning fewer new projects go forward in a given year.
Cost isn't the only issue. We also see a fair number of use cases that were previously considered technically difficult, even downright impossible, using traditional data storage systems.
Whether because of cost savings or new capabilities, the emerging generation of data storage solutions has started to find its way into the enterprise first where pressure to innovate and create new revenue streams is greatest.
Nobody sensible wants to get rid of relational DBMSs
The way forward is to make RDBMSs more scalable by adopting more efficient physical storage methods - abandoning the relational model makes no sense as there is no well proven viable alternative.
In any case haven't I been hearing about the death of the relational DBMS since at least 1992?
The RDBMS will definitely still be here in 20 years' time and that is a very, very good thing.
Apples vs Oranges
Please read https://oracle.sys-con.com/node/2035475 where I explain how they work together, not against each other.
Ranko Mosic, CEO
Lotus CSP - The Oracle Cloud Experts
email: intk@lotus.in.rs
Web site: http://www.lotus.in.rs
phone: +381-60-33-00-464
Thank you.