'Big data as a service' is here, but is anybody ready?

Summary: A data expert observes that the pieces are falling into place for BDaaS, but ethical questions arise.

Is "big data as a service" — BDaaS — a real badass idea or what?

(Image: Joe McKendrick/ZDNet)

From a technical standpoint, BDaaS is perfectly doable, wrote Philip Wik, database administrator for Redflex, in Service Technology. Sure, the last thing we need is another "something as a service" term. But the real question is: What would the business do with it?

It's inevitable that big data — defined as volumes of data, in new varieties, moving through organizations at close to real-time speeds — will be driving all technology decisions in the near future, Wik believes. "Big data, in combination with clouds, EDA, and SOA, is defining the future of information technology," he said. "Although third-form normalized operational data stores will retain their value, big data will not only supplement, but will eventually replace, data warehousing star schemas, as infrastructure costs decline and presentation level analytical tools increase in both number and in sophistication."

The ingredients necessary for BDaaS include a high-functioning service-oriented architecture, cloud virtualization capabilities, complex event-driven processing, Hadoop, and business intelligence tools that provide deep analytics. The pieces are already falling into place, Wik observed: "As big data software continues to improve, changes can be made in the user interface, communications, data storage, and task processing layers without having to rebuild the entire architecture. BDaaS can be regarded as an asynchronous SOA, with a complementary relationship between services and events."
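To make the "asynchronous SOA" idea concrete, here is a minimal sketch of services reacting to events through a shared bus. It is an illustration only, not Wik's design: the `EventBus` class, the topic name `reading.received`, and both services are hypothetical, and a real deployment would use a message broker rather than an in-process dictionary.

```python
import asyncio
from collections import defaultdict


class EventBus:
    """Hypothetical in-process event bus; stands in for a real message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        # Services register interest in a topic; the publisher never sees them.
        self._handlers[topic].append(handler)

    async def publish(self, topic, payload):
        # Run every subscribed handler concurrently and wait for all of them.
        await asyncio.gather(*(h(payload) for h in self._handlers[topic]))


async def demo():
    bus = EventBus()
    stored = []              # stands in for the data storage layer
    totals = {"count": 0}    # stands in for an analytics layer

    async def storage_service(event):
        stored.append(event)

    async def analytics_service(event):
        totals["count"] += 1

    bus.subscribe("reading.received", storage_service)
    bus.subscribe("reading.received", analytics_service)

    await bus.publish("reading.received", {"sensor": "s1", "value": 3})
    return stored, totals


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the publisher only knows the topic name, either service could be replaced without touching the other, which is the kind of layer-by-layer change the quote describes.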

Again, the question is, is everybody ready for this? Wik says the immediate, as-a-Service availability of big data analytics creates some troubling questions, not only from a business standpoint, but from an ethics standpoint as well. The power of analysis that big data provides can be abused. "Rarely in the technical literature do the words ethics and big data appear in the same sentence," he stated. "We can define what big data is, but do we understand what big data means? As a matter of commercial self-interest in the context of our universal rights, we must address the ethics of big data."

As Wik puts it:

Vast databases that talk to other vast databases could erode our sphere of privacy to the point that privacy will cease to exist, even for those who believe that they are off the grid. Because of the ubiquity of sensors and cameras, the grid is our existence itself.

The ethics of big data is one concern, and another should be the efficacy of big data. Business leaders may not have grasped its potential yet, and even those working with it have only begun to dip their toes in it. Unfortunately, things being what they are, "big data" is being sold as the next transformative panacea for all business ills. But as with anything technology-related, applying big data analytics will not in and of itself deliver profitability or growth. What is needed is enlightened management that understands its potential.

Indeed, as Renee Boucher Ferguson pointed out in a post at MIT Sloan Management Review, there's even a danger of management becoming blindsided by too much reliance on the promises made around big data. As Richard Haass, president of the Council on Foreign Relations, asked IBM CEO Ginni Rometty at a recent meeting:

You don’t worry at all that there's a danger in data? I was sitting there listening, and I was thinking of what Wayne Gretzky says about, "you don’t skate to where the puck is; you skate to where the puck is going to be." Can't reams of data get in the way? Doesn't data at some point almost force you inside the box and towards averages?

Ferguson also discussed the risk of biases and gaps in the data that organizations are coming to rely on — "getting drawn into particular kinds of algorithmic illusions".

The bottom line is we have the tools to build BDaaS. But what we do with it is another, perplexing question.

Topics: Big Data, Cloud, Data Management, IT Priorities

Talkback

6 comments
  • Which Data?

    The problems that seem to be ignored here are where the data are coming from and who gives whom the right to do what with the data. Are the data obtained from one source for one purpose and then repurposed for something else entirely? There sure as hell are ethical questions! Unfortunately, people seem perfectly willing to define ethics as necessary to fit their own purposes and perceptions. The level of concern for security and maturity of purpose runs from sad to frightening. See examples of this in the post today on my blog:

    www.bigdataandthelaw.com

    (Shameless self promotion? Yes. Spot on? Yes again.)

    One does not have to be Big Business or Big Government to be concerned about this. There are a lot of things that we can do but should not do.
    tkeller@...
    • Big data itself is not ready

      Whenever you want some useful data you need to present it in a data structure first, and there you have a problem with these so-called 'big data' solutions: They all store data in a non-structural form (key-value pairs). Bummer! When you use non-structural data you are not going to mine value out swiftly.

      Any type of non-trivial query could quickly turn into a table scan of Terabytes (or Petabytes) of data. You can just imagine how slow that thing is. All these claims that they are ready are simply disingenuous. There's no magic bullet here. No data structure then no speed. No speed then no use.
      LBiege
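The commenter's point about scans can be illustrated with a toy key-value store. This is a sketch under stated assumptions, not a benchmark of any real product: the `records` data, the field names, and both helper functions are made up for illustration. Filtering on a field inside the values means touching every record, while a secondary index built ahead of time answers the same question with one lookup.

```python
from collections import defaultdict

# Hypothetical key-value store: the key is opaque, the value is a small record.
records = {
    f"user:{i}": {"country": "CA" if i % 100 == 0 else "US", "spend": i}
    for i in range(10_000)
}


def scan_by_country(store, country):
    """Full scan: inspects every value, so cost grows with the size of the store."""
    return [key for key, value in store.items() if value["country"] == country]


def build_country_index(store):
    """One pass up front builds a secondary index from country to keys."""
    index = defaultdict(list)
    for key, value in store.items():
        index[value["country"]].append(key)
    return index


country_index = build_country_index(records)

# Same answer either way; the index answers it with a single dictionary lookup.
assert scan_by_country(records, "CA") == country_index["CA"]
```

The trade-off is that the index must be rebuilt or maintained as the data changes, which is the structure the commenter argues cannot be skipped.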
      • sounds like someone who hasn't tried it ....

        An easy way to avoid that trap is to run the expensive queries (aka joins) at intervals and then pull the results (straight selects) as needed. Incidentally, this is exactly how most DW tools work.

        The other thing is scale: Yahoo has 10K-node Hadoop clusters. If you want to find the answer, Hadoop scales close to linearly.

        Peace,
        Tom
        mobile_manny
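The pattern described in this reply, running the expensive join on a schedule and serving readers with plain selects, can be sketched with Python's built-in `sqlite3` module. The table names, columns, and sample rows are assumptions made up for the example; a Hadoop or data warehouse job would play the role of `refresh_summary` in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'east'), (2, 'west');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def refresh_summary(conn):
    """Run the expensive join once; intended to be called at intervals."""
    conn.executescript("""
    DROP TABLE IF EXISTS region_totals;
    CREATE TABLE region_totals AS
    SELECT c.region AS region, SUM(o.amount) AS total
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region;
    """)


def total_for(conn, region):
    """Cheap read path: a straight select against the precomputed table."""
    row = conn.execute(
        "SELECT total FROM region_totals WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else None


refresh_summary(conn)
```

Reads between refreshes see slightly stale totals, which is the price paid for keeping the join off the read path.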
  • meh

    "It's inevitable that big data -- defined as volumes of data, in new varieties, moving through organizations at close to real-time speeds -- will be driving all technology decisions in the near future, Wik believes."

    Meh. Okay, one person's opinion.

    On a topic so vaguely defined that it's safe to say it's just a marketing buzzword.

    Not to mention it's basically looking for a needle in a haystack when you don't even know if anybody put a needle in the haystack. Unless you're looking for something very specific that you KNOW helps your business, I'm not entirely convinced that "big data" will always have an ROI.

    "Although third-form normalized operational data stores will retain their value, big data will not only supplement, but will eventually replace, data warehousing star schemas, as infrastructure costs decline and presentation level analytical tools increase in both number and in sophistication."

    Groan. All that just to say "we think data mining will replace relational databases."

    ("data mining" is the term academia used before marketing turned it into "big data")

    "data warehousing star schemas" - really? Can't just use "relational database?"

    Sounds like "Wik" is being overly wordy to sound impressive. I'm not impressed.


    In any case, "big data" seems to be so poorly defined that you can even call a large enough relational database "big data" - the term has no real distinction as to what exactly it is, only that it's "big."
    CobraA1
    • a typical definition might be ...

      Volume + Velocity + Variety == Value
      mobile_manny
  • Most orgs have not gotten little data

    Big data is great. It can provide insights that are not obtainable in any other way. Unfortunately, most organizations are not successfully using little data today. Think one Data Warehouse with all corporate data nicely organised, with no duplication so that end users can do real time queries of anything that they care to try. My recommendation is to get to little data nirvana before attempting big data. Why? (1) If you cannot do little data it is highly unlikely that you will succeed with big data. (2) Great little data makes it easier to succeed in big data.
    waltersokyrko@...