Between the Lines

Larry Dignan, Andrew Nusca and Rachel King

Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

By Larry Dignan | July 8, 2011, 2:09am PDT

Summary: Want to know how disruptive so-called big data efforts can be to traditional database companies? Try replicating YouTube on Oracle hardware and software.

Increasing data requirements, especially for unstructured information such as video, are going to relegate relational databases to the enterprise scrap heap as an emerging breed of vendors chips away at traditional software powers.

That’s the overview from Cowen & Co. analyst Peter Goldmacher. In a 75-page report, Goldmacher walks through the database landscape and concludes that the consensus view that the growth of data will boost traditional database vendors is dead wrong. Goldmacher said:

We believe the vast majority of data growth is coming in the form of data sets that are not well suited for traditional relational database vendors like Oracle. Not only is the data too unstructured and/or too voluminous for a traditional RDBMS, the software and hardware costs required to crunch through these new data sets using traditional RDBMS technology are prohibitive. To capitalize on the Big Data trend, a new breed of Big Data companies has emerged, leveraging commodity hardware, open source and proprietary technology to capture and analyze these new data sets. We believe the incumbent vendors are unlikely to be a major force in the Big Data trend primarily due to pricing issues and not a lack of technical know-how.

Oracle doesn’t buy Goldmacher’s take. On Oracle’s most recent conference call, executives talked up big data and how it will benefit the company.

The crux of Goldmacher’s argument that big data will crush traditional database companies revolves around cost. Emerging big data players can price better than large database players like Oracle that have margins to protect. In other words, Oracle would have to charge 9x more than the blended average of big data vendors to solve data conundrums.

Over time, this price differential, along with the growth of corporate unstructured data, will mean so-called big data players win. That means big fish like Oracle and IBM and middle-tier players (HP Vertica, EMC Greenplum and Teradata) will have to deal with the likes of Infobright, 1010 Data, Splunk and Cloudera.

To illustrate this point, Goldmacher did an interesting exercise in which he outlined how to replicate YouTube on proprietary enterprise systems. Here’s what happens to costs when YouTube meets Oracle Exadata machines.

First, the assumptions: Goldmacher estimated that YouTube consumption (user uploads of 48 hours of video a minute and 3 billion video views a day, along with roughly 45 petabytes of viewed video a day) would require at least 9 full-rack Exadata machines at $1.5 million each. At least 18 Exadata machines would be needed to handle spikes. On top of those machines would come up to 14 Exalogic devices to serve data, at $1.1 million per system. The software stack under Oracle would include WebLogic middleware, Oracle databases, Exadata-optimized storage and an Oracle operating system. The open source comparison included JBoss middleware, MySQL, Hadoop and Red Hat Enterprise Linux as the OS.
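To put those workload figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above; treating the 3 billion daily videos as views, using decimal petabytes (10^15 bytes) and spreading traffic evenly over the day are simplifying assumptions of this sketch, not part of Goldmacher's model.

```python
# Workload figures quoted in the article.
UPLOAD_HOURS_PER_MINUTE = 48            # hours of video uploaded each minute
VIDEOS_VIEWED_PER_DAY = 3_000_000_000   # assumed to mean daily views
VIEWED_BYTES_PER_DAY = 45 * 10**15      # 45 PB, assuming decimal petabytes

SECONDS_PER_DAY = 24 * 60 * 60

# Hours of new video arriving per day.
upload_hours_per_day = UPLOAD_HOURS_PER_MINUTE * 60 * 24

# Average size of one viewed video, in megabytes.
avg_mb_per_view = VIEWED_BYTES_PER_DAY / VIDEOS_VIEWED_PER_DAY / 10**6

# Average outbound data rate if traffic were perfectly flat (it is not).
avg_gb_per_second = VIEWED_BYTES_PER_DAY / SECONDS_PER_DAY / 10**9

print(f"Uploaded video per day: {upload_hours_per_day:,} hours")
print(f"Average size per view:  {avg_mb_per_view:.1f} MB")
print(f"Average serving rate:   {avg_gb_per_second:.1f} GB/s")
```

Real traffic is bursty, which is why the estimate doubles the Exadata count to absorb spikes; a flat average like this one understates peak demand.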

The bottom line looks like this:

In a nutshell, the Oracle Exadata capital expenses for hardware and software total $589.4 million, compared with an open source and commodity hardware cost of $104.2 million. Annual expenses (staff and support) are $99 million for Oracle Exadata and $15.1 million for an open source stack. The personnel costs are based on the nine-engineer staff of the original YouTube team.
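The gap is easier to see as ratios and as a running total. The Python sketch below uses only the four figures quoted above; the `cumulative_cost` helper and the 1, 3 and 5 year horizons are illustrative assumptions, and the model ignores discounting, growth and hardware refresh.

```python
# Figures quoted in the article, in millions of US dollars.
ORACLE_CAPEX = 589.4
ORACLE_ANNUAL = 99.0
OPEN_SOURCE_CAPEX = 104.2
OPEN_SOURCE_ANNUAL = 15.1


def cumulative_cost(capex: float, annual: float, years: int) -> float:
    """Upfront cost plus flat annual expenses over `years` (hypothetical model)."""
    return capex + annual * years


capex_ratio = ORACLE_CAPEX / OPEN_SOURCE_CAPEX
annual_ratio = ORACLE_ANNUAL / OPEN_SOURCE_ANNUAL

print(f"Capex ratio:  {capex_ratio:.2f}x")
print(f"Annual ratio: {annual_ratio:.2f}x")

# The horizons below are arbitrary choices for illustration.
for years in (1, 3, 5):
    oracle = cumulative_cost(ORACLE_CAPEX, ORACLE_ANNUAL, years)
    open_source = cumulative_cost(OPEN_SOURCE_CAPEX, OPEN_SOURCE_ANNUAL, years)
    print(f"{years} yr: Oracle ${oracle:.1f}M vs open source ${open_source:.1f}M "
          f"({oracle / open_source:.2f}x)")
```

Because the annual ratio is larger than the capex ratio, the overall multiple creeps upward the longer the system runs under this simple model.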

Here’s a look at the hardware involved:

The open source hardware stack consists of HP server racks and storage, with Cisco Nexus switches.

But hardware is fairly simple. The beauty of Oracle’s integrated hardware/software stacks—at least for the company—is the licensing and maintenance revenue stream.

Goldmacher noted:

At first glance, total core hardware costs of roughly $155M, just roughly 5% of Google’s current CapEx seem reasonable. This line of thinking lasts until Oracle presents the bill for its software: a not-insignificant $400M for database and Exadata storage licenses alone, bringing the total upfront investment to $570M.

Here’s a look at the software costs:

And the open source side.

Now there are a few caveats. Goldmacher didn’t create assumptions for in-memory databases like Membase because support pricing wasn’t readily available. But overall, you get the picture. Big data may mean some large headaches for established relational database players looking to preserve chunky profit margins.


Larry Dignan is Editor in Chief of ZDNet and SmartPlanet as well as Editorial Director of ZDNet's sister site TechRepublic.

Larry Dignan has nothing to disclose. He doesn’t hold investments in the technology companies he covers.

Before becoming Editor in Chief, Dignan was Executive Editor of News and Blogs at ZDNet. Prior to that he was executive news editor at eWeek and news editor at Baseline. He also served as the East Coast news editor and finance editor at CNET News.com. Larry has covered the technology and financial services industry since 1995, publishing articles in WallStreetWeek.com, Inter@ctive Week, The New York Times, and Financial Planning magazine. He's a graduate of the Columbia School of Journalism and the University of Delaware.

Comments

So, all Oracle has to do is drop prices & eat off maintenance revenue which it does even today. What changed?
@mm71 Dropping license prices would cut their profit margin to the bone, and their stock price would go in the tank. Larry would go from a gazillionaire back down to a mere multi-billionaire.

The article makes a lot of sense. We bailed on Microsoft a few years back, shifting our whole infrastructure to LAMP software. The cost savings were incredible. The time is coming for us to potentially do the same with our enterprise Oracle databases. The technology is mature, and we have been able to hire some excellent support staff during the downturn. The next support renewal invoice is going to get sent back to them stamped "Canceled".
LAMP?
bmonsterman 12th Mar
LAMP isn't LAMP if you're using Oracle. It's LAOP.
All they have to do?
Economister Updated - 8th Jul
@mm71

Drop prices by 85% or so to be competitive?

All high cost/margin proprietary solutions will eventually succumb to open source software/commodity hardware given sufficient volumes. I think the numbers in this blog highlight that fact very well. Businesses and consumers will not pay more than necessary forever.
As will support personnel
A Gray 8th Jul
@Economister All people who support MS and LAMP will eventually lose their job to cloud-based systems that can be run by people in China. The days of a USA infrastructure IT job are going out just like auto manufacturing did in the 80s. Become an executive or find a different career.
@Economister
Why China?
Don't you have India? It's a bit more secure for business, and still a cheap workforce.
It's not a real comparison
happyharry_z 8th Jul
Structured data is just that, structured. Having a bunch of files that are semi or unstructured does not suit this approach. The points brought up in the article are moot. We all know open source software is free. We also know that the systems in question are highly customized. The support and development costs have to be included for this to be any kind of real comparison.
@happyharry_z
Yeah. RH, IBM and Google don't cash in on FLOSS at all; they just print their own bucks on the pavements.
Think Strategic Open Source
paul.hinz@... 8th Jul
You forget that Oracle owns GlassFish and MySQL too. So while I am very pro Open Source, given Liferay is my bread and butter, I would never discount the need for the ExaLogic machines. They are very well engineered.

While it is easy to load balance a lot of cheap hardware, and it looks good on paper, the machines are not designed for always-on, high-demand, long-life-span applications. They overheat, they die, and people end up filling landfills with their waste. And don't forget the cost of the electricity for those cheaper machines (often many times more), which you left out of the analysis above.

However, while I was working with the MySQL and GlassFish teams, we used to talk about "strategic open source" which meant that mission critical systems which take the CIO's main focus, will often stay on proprietary software, but the much larger number of mid-tier and departmental apps that a single enterprise will deploy, should go to a strategic open source player (lower cost, easier to use, more reliable (due to the simpler configuration and administration)). Liferay is reaping the rewards of that fact, but we are also seeing Liferay used for mission critical apps (over 340,000 external websites). So don't count Oracle out, but buy open source for your vast majority of applications and use that position to get your Oracle rep to charge you less.
Why "buy" open source
A Gray 8th Jul
@paul.hinz@... Thanks to all the open source developers who have great open source software that I use every day. You're not getting a nickel from me, but strangely, I get paid a fortune to use your free stuff. If you're not busy, could you mow my lawn this weekend too? I'm using my cash to go on vacation.
Oracle's market is more the financial/telecom/oil & gas bunch. How much unstructured data is coming from these guys? My humble guess would be not much. Sure, Oracle would probably lose in the content/social market, but they may not even care.
Incredible unparallels....
techboy_z 11th Jul
1) What does this have to do with "big data"? The price ratio is equally (or more) extended when considering small and normalized datasets on Oracle vs. MySQL. There is nothing new about that licensing differential -- and it does not correlate with size or structure of the data.

2) The statements about cost of some "new" less entrenched players vs. those of the "bigs" are invalid. Our organization has a small Splunk footprint. We have had multiple requests rejected for expansion of use on additional systems. Why? The licensing cost. Yet our use of Oracle is pervasive, despite *its* large licensing costs. It's about the value perception by management, not the actual pricing. And I emphasize "perception", because all of us on the floor know that Splunk saves us countless hours in those limited usages, and could do the same in many others. Uptake of products/vendors often has to do with management savvy (or lack thereof), or their willingness (or not) to listen well enough to their technical people as to what will work or is needed. Organizations whose management is not technically knowledgeable and/or not good at critical thinking will see licensing cost become the universal hammer for every nail...i.e., every product licensing decision.
This article captured accurately one of two major hurdles to innovation created by traditional database technologies, as reported by our users (we make an open source data store designed to run on commodity hardware).

Costs associated with the initial software/hardware outlay and later operation of these systems in production (where vertical scaling brings hard limits and increased complexity) - used to cause early stage social startups and mature enterprises to defer/limit projects. For some companies Oracle is simply out of the question. For larger, more risk-averse enterprises, expensive data storage just doesn't make sense when the return on investment from a new application is not guaranteed. Even when return/risk is not an issue, higher costs take a bigger portion of the budget, meaning fewer new projects go forward in a given year.

Cost isn't the only issue. We also see a fair number of use cases that were previously considered technically difficult, even downright impossible, using traditional data storage systems.

Whether because of cost savings or new capabilities, the emerging generation of data storage solutions has started to find its way into the enterprise first where pressure to innovate and create new revenue streams is greatest.
Simple, elegant, highly flexible.

The way forward is to make RDBMSs more scalable by adopting more efficient physical storage methods - abandoning the relational model makes no sense as there is no well proven viable alternative.

In any case haven't I been hearing about the death of the relational DBMS since at least 1992?

The RDBMS will definitely still be here in 20 years time and that is a very, very good thing.
Since when does MySQL mean a non-relational engine? Achille
Apples vs Oranges
ranko.mosic@... 4th Nov
As azrulhasni and happyharry_z said, this is an apples vs. oranges comparison. YouTube is write-once, read-many, unstructured data. Oracle is read/write, mostly online transaction processing. Oracle and Big Data are complementary, and Big Data is mostly read-only data processing for large unstructured data.
Please read https://oracle.sys-con.com/node/2035475 where I explain how they work together, not against each other.

Ranko Mosic, CEO
Lotus CSP - The Oracle Cloud Experts
email: intk@lotus.in.rs
Web site: http://www.lotus.in.rs
phone: +381-60-33-00-464
Thank you.
bmonsterman 12th Mar
NT
