Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

Summary: Want to know how disruptive so-called big data efforts can be to traditional database companies? Try replicating YouTube on Oracle hardware and software.

Increasing data requirements, especially for unstructured information such as video, are going to relegate relational databases to the enterprise scrap heap as an emerging breed of vendors chips away at traditional software powers.

That's the overview from Cowen & Co. analyst Peter Goldmacher. In a 75-page report, Goldmacher walks through the database landscape and concludes that the consensus view that the growth of data will boost traditional database vendors is dead wrong. Goldmacher said:

We believe the vast majority of data growth is coming in the form of data sets that are not well suited for traditional relational database vendors like Oracle. Not only is the data too unstructured and/or too voluminous for a traditional RDBMS, the software and hardware costs required to crunch through these new data sets using traditional RDBMS technology are prohibitive. To capitalize on the Big Data trend, a new breed of Big Data companies has emerged, leveraging commodity hardware, open source and proprietary technology to capture and analyze these new data sets. We believe the incumbent vendors are unlikely to be a major force in the Big Data trend primarily due to pricing issues and not a lack of technical know-how.

Oracle doesn't buy Goldmacher's take. On Oracle's most recent conference call, executives talked up big data and how it will benefit the company.

The crux of Goldmacher's argument that big data will crush traditional database companies revolves around cost. Emerging big data players can price better than large database players like Oracle that have margins to protect. In other words, Oracle would have to charge 9x more than the blended average of big data vendors to solve data conundrums.

Over time, this price differential as well as the growth of corporate unstructured data will mean so-called big data players win. That means big fish like Oracle and IBM and middle-tier players---HP Vertica, EMC Greenplum and Teradata---will have to deal with the likes of Infobright, 1010 Data, Splunk and Cloudera.

To illustrate this point, Goldmacher did an interesting exercise in which he outlined how to replicate YouTube on proprietary enterprise systems. Here's what happens to costs when YouTube meets Oracle Exadata machines.

First the assumptions: Goldmacher estimated that YouTube consumption---user uploads of 48 hours of video a minute and 3 billion videos a day along with roughly 45 petabytes of viewed videos a day---would require at least 9 full-rack Exadata machines at $1.5 million each. There would be at least 18 Exadata machines to handle spikes. Those machines would add up to 14 Exalogic devices to serve data at $1.1 million per system. The software stack under Oracle would include WebLogic middleware, Oracle databases, Exadata optimized storage and Oracle as operating system. The open source comparison included JBoss middleware, MySQL, Hadoop and Red Hat Enterprise Linux as the OS.
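
As a rough sanity check on the assumptions above, the quoted unit prices can be multiplied out. This is a minimal sketch in Python, not Goldmacher's model: the constant and function names are made up for illustration, and only the unit counts and prices come from the article.

```python
# Unit prices and counts as quoted in the article (prices in $ millions).
# All names below are illustrative, not taken from the Cowen report.
EXADATA_UNIT_PRICE = 1.5    # per full-rack Exadata machine
EXALOGIC_UNIT_PRICE = 1.1   # per Exalogic system
EXADATA_BASE_UNITS = 9      # minimum estimated for steady-state load
EXADATA_PEAK_UNITS = 18     # minimum estimated to handle spikes
EXALOGIC_MAX_UNITS = 14     # "up to 14" devices to serve data


def line_cost(units: int, unit_price: float) -> float:
    """List-price cost of one hardware line item, in $ millions."""
    return units * unit_price


exadata_base_cost = line_cost(EXADATA_BASE_UNITS, EXADATA_UNIT_PRICE)
exadata_peak_cost = line_cost(EXADATA_PEAK_UNITS, EXADATA_UNIT_PRICE)
exalogic_cost = line_cost(EXALOGIC_MAX_UNITS, EXALOGIC_UNIT_PRICE)

print(f"Exadata, 9 racks:   ${exadata_base_cost:.1f}M")
print(f"Exadata, 18 racks:  ${exadata_peak_cost:.1f}M")
print(f"Exalogic, 14 units: ${exalogic_cost:.1f}M")
```

These products cover only the Exadata and Exalogic units at the quoted prices; the article does not itemize the rest of the hardware bill.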

The bottom line looks like this:

In a nutshell, the Oracle Exadata capital expenses for hardware and software total $589.4 million compared to an open source and commodity hardware cost of $104.2 million. Annual expenses (staff and support) are $99 million for Oracle Exadata and $15.1 million for an open source stack. The personnel costs are based on the nine-engineer staff of the original YouTube team.
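
To put those totals side by side, here is a small Python sketch that computes the ratios and a cumulative cost over a few years. The four dollar figures are the ones quoted above; the flat year-over-year expense model and all names are assumptions for illustration, not part of the report.

```python
# Totals quoted in the article, in $ millions.
ORACLE_CAPEX = 589.4
OPEN_SOURCE_CAPEX = 104.2
ORACLE_ANNUAL = 99.0
OPEN_SOURCE_ANNUAL = 15.1


def cumulative_cost(capex: float, annual: float, years: int) -> float:
    """Upfront cost plus annual expenses held flat for `years`.

    Holding annual expenses flat is a simplifying assumption.
    """
    return capex + annual * years


capex_ratio = ORACLE_CAPEX / OPEN_SOURCE_CAPEX
annual_ratio = ORACLE_ANNUAL / OPEN_SOURCE_ANNUAL

print(f"Capital expense ratio: {capex_ratio:.1f}x")
print(f"Annual expense ratio:  {annual_ratio:.1f}x")

for years in (1, 3, 5):
    oracle = cumulative_cost(ORACLE_CAPEX, ORACLE_ANNUAL, years)
    open_source = cumulative_cost(OPEN_SOURCE_CAPEX, OPEN_SOURCE_ANNUAL, years)
    print(f"{years} yr: Oracle ${oracle:.1f}M vs. open source ${open_source:.1f}M")
```

On the quoted numbers, the gap is a little under 6x upfront and a little over 6x a year, so under this flat model the difference only widens with time.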

Here's a look at the hardware involved:

The open source hardware stack consists of HP server racks and storage with Cisco Nexus switches.

But hardware is fairly simple. The beauty of Oracle's integrated hardware/software stacks---at least for the company---is the licensing and maintenance revenue stream.

Goldmacher noted:

At first glance, total core hardware costs of roughly $155M, just roughly 5% of Google’s current CapEx seem reasonable. This line of thinking lasts until Oracle presents the bill for its software: a not-insignificant $400M for database and Exadata storage licenses alone, bringing the total upfront investment to $570M.

Here's a look at the software costs:

And the open source side.

Now there are a few caveats. Goldmacher didn't create assumptions for in-memory databases like Membase because support pricing wasn't readily available. But overall, you get the picture. Big data may mean some large headaches for established relational database players looking to preserve chunky profit margins.

Topics: Storage, Data Centers, Data Management, Enterprise Software, Hardware, Open Source, Oracle, Software, Social Enterprise

Talkback

19 comments
  • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

    So, all Oracle has to do is drop prices & eat off maintenance revenue which it does even today. What changed?
    mm71
    • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

      @mm71 Dropping license prices would cut their profit margin to the bone, and their stock price would go in the tank. Larry would go from a gazillionaire back down to a mere multi-billionaire.

      The article makes a lot of sense. We bailed on Microsoft a few years back, shifting our whole infrastructure to LAMP software. The cost savings were incredible. The time is coming for us to potentially do the same with our enterprise Oracle databases: the technology is mature, and we have been able to hire some excellent support staff during the downturn. The next support renewal invoice is going to get sent back to them stamped "Canceled".
      terry flores
      • LAMP?

        LAMP isn't LAMP if you're using Oracle. It's LAOP.
        bmonsterman
    • All they have to do?

      @mm71

      Drop prices by 85% or so to be competitive?

      All high cost/margin proprietary solutions will eventually succumb to open source software/commodity hardware given sufficient volumes. I think the numbers in this blog highlight that fact very well. Businesses and consumers will not pay more than necessary forever.
      Economister
      • As will support personnel

        @Economister All people who support MS and LAMP will eventually lose their job to cloud-based systems that can be run by people in China. The days of a USA infrastructure IT job are going out just like auto manufacturing did in the 80s. Become an executive or find a different career.
        A Gray
      • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

        @Economister
        Why China?
        Don't you have India? A bit more secure for business. Still a cheap workforce.
        przemoli
  • It's not a real comparison

    Structured data is just that, structured. Having a bunch of files that are semi or unstructured does not suit this approach. The points brought up in the article are moot. We all know open source software is free. We also know that the systems in question are highly customized. The support and development costs have to be included for this to be any kind of real comparison.
    happyharry_z
    • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

      @happyharry_z
      Yeah. RH, IBM, Google, none of them cash in on FLOSS, they just print their own bucks in pavements.
      przemoli
  • Think Strategic Open Source

    You forget that Oracle owns GlassFish and MySQL too. So while I am very pro Open Source, given Liferay is my bread and butter, I would never discount the need for the ExaLogic machines. They are very well engineered.

    While it is easy to load balance a lot of cheap hardware, and it looks good on paper, the machines are not designed for always-on, high demand, long-life-span applications. They overheat, they die, and people end up filling landfills with their waste. And don't forget the cost of the electricity (which you left off in the analysis above) for those cheaper machines (often many times more).

    However, while I was working with the MySQL and GlassFish teams, we used to talk about "strategic open source" which meant that mission critical systems which take the CIO's main focus, will often stay on proprietary software, but the much larger number of mid-tier and departmental apps that a single enterprise will deploy, should go to a strategic open source player (lower cost, easier to use, more reliable (due to the simpler configuration and administration)). Liferay is reaping the rewards of that fact, but we are also seeing Liferay used for mission critical apps (over 340,000 external websites). So don't count Oracle out, but buy open source for your vast majority of applications and use that position to get your Oracle rep to charge you less.
    paul.hinz@...
    • Why "buy" open source

      @paul.hinz@... Thanks to all the open source developers who have great open source software that I use every day. You're not getting a nickel from me, but strangely, I get paid a fortune to use your free stuff. If you're not busy, could you mow my lawn this weekend too? I'm using my cash to go on vacation.
      A Gray
  • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

    Oracle's market is more the financial/telecom/oil & gas bunch. How much unstructured data is coming from these guys? My humble guess would be not much. Sure, Oracle would probably lose in the content/social market, but they may not even care.
    azrulhasni
  • Incredible unparallels....

    1) What does this have to do with "big data"? The price ratio is equally (or more) extended when considering small and normalized datasets on Oracle vs. MySQL. There is nothing new about that licensing differential -- and it does not correlate with size or structure of the data.

    2) The statements about cost of some "new" less entrenched players vs. those of the "bigs" are invalid. Our organization has a small Splunk footprint. We have had multiple requests rejected for expansion of use on additional systems. Why? The licensing cost. Yet our use of Oracle is pervasive, despite *its* large licensing costs. It's about the value perception by management, not the actual pricing. And I emphasize "perception", because all of us on the floor know that Splunk saves us countless hours in those limited usages, and could do the same in many others. Uptake of products/vendors often has to do with management savvy (or lack thereof), or their willingness (or not) to listen well enough to their technical people as to what will work or is needed. Organizations whose management is not technically knowledgeable and/or not good at critical thinking will see licensing cost become the universal hammer for every nail...i.e., every product licensing decision.
    Techboy_z
  • Cost one limiting factor for many enterprises

    This article captured accurately one of two major hurdles to innovation created by traditional database technologies, as reported by our users (we make an open source data store designed to run on commodity hardware).

    Costs associated with the initial software/hardware outlay and later operation of these systems in production (where vertical scaling brings hard limits and increased complexity) - used to cause early stage social startups and mature enterprises to defer/limit projects. For some companies Oracle is simply out of the question. For larger, more risk-averse enterprises, expensive data storage just doesn't make sense when the return on investment from a new application is not guaranteed. Even when return/risk is not an issue, higher costs take a bigger portion of the budget, meaning fewer new projects go forward in a given year.

    Cost isn't the only issue. We also see a fair number of use cases that were previously considered technically difficult, even downright impossible, using traditional data storage systems.

    Whether because of cost savings or new capabilities, the emerging generation of data storage solutions has started to find its way into the enterprise first where pressure to innovate and create new revenue streams is greatest.
    Tony Falco
  • Nobody sensible wants to get rid of relational DBMSs

    Simple, elegant, highly flexible.

    The way forward is to make RDBMSs more scalable by adopting more efficient physical storage methods - abandoning the relational model makes no sense as there is no well proven viable alternative.

    In any case haven't I been hearing about the death of the relational DBMS since at least 1992?

    The RDBMS will definitely still be here in 20 years time and that is a very, very good thing.
    jorwell
  • RE: Big data vs. traditional databases: Can you reproduce YouTube on Oracle's Exadata?

    Since when does MySQL mean a non-relational engine? Achille
    info@...
  • Apples vs Oranges

    As azrulhasni and happyharry_z said, this is an apples vs. oranges comparison. YouTube is write-once, read-many-times, unstructured data. Oracle is read/write, mostly online transaction processing. Oracle and Big Data are complementary, and Big Data is mostly read-only data processing for large unstructured data.
    Please read https://oracle.sys-con.com/node/2035475 where I explain how they work together, not against each other.

    Ranko Mosic, CEO
    Lotus CSP - The Oracle Cloud Experts
    email: intk@lotus.in.rs
    Web site: http://www.lotus.in.rs
    phone: +381-60-33-00-464
    ranko.mosic@...
    • Thank you.

      NT
      bmonsterman
  • RDBMS and NoSQL can co-exist for some more time

    I think RDBMS and NoSQL can co-exist for the next 5-10 years, but in the long term RDBMS might become a subset of NoSQL (a merger of both, where a NoSQL Big Data db might use an RDBMS internally).
    blue_sky88