Hadoop to complement, not eliminate, relational databases

Traditional relational databases, with their low latency and business intelligence features, remain relevant, but open source-based Hadoop will play an increasingly important role in processing structured and unstructured data, observers say.
Written by Kevin Kwang, Contributor

Hadoop adoption is picking up as more unstructured data flows into the enterprise arena, but its rising importance will not relegate traditional relational database management systems (RDBMS) to the scrapheap. Rather, the open source-based data processing framework complements such systems with additional analytical capabilities, according to industry watchers.

Hari Vasudev, vice president of the cloud platform group at Yahoo, explained that while Hadoop is "extremely efficient" at processing large volumes of structured and unstructured data, it does so with high latencies. It is suitable for supporting ad-hoc queries against a conventional data warehouse, but cannot replace RDBMS deployed for traditional business intelligence (BI) functions, which have low latency requirements, Vasudev added.
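The batch-oriented processing Vasudev describes is typified by the MapReduce pattern Hadoop popularized. As a rough illustration only (the function names and sample input below are hypothetical, not drawn from any system quoted in this article), a word count in the Hadoop Streaming style can be simulated locally in Python: a mapper emits key-value pairs, the framework sorts them by key, and a reducer aggregates each group.

```python
from itertools import groupby

def mapper(lines):
    # Emit (word, 1) for every word, as a Hadoop Streaming mapper
    # would for each line arriving on stdin.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop sorts mapper output by key before the reduce phase;
    # groupby over the sorted pairs then sums the counts per word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    sample = [
        "hadoop complements relational databases",
        "hadoop processes unstructured data in batches",
    ]
    for word, count in reducer(mapper(sample)):
        print(word, count)
```

Because every job pays the cost of scanning and shuffling the full dataset, this style of processing suits large ad-hoc analyses rather than the sub-second lookups a BI dashboard expects from an RDBMS.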

Asked where Hadoop is best employed in today's datacenter environment, he told ZDNet Asia in an e-mail that the open source framework is now commonly used as an analytical tool in a wide array of business situations and industries.

Elaborating, he noted that if an organization has unmet business needs involving a high degree of processing complexity, high volumes of predominantly unstructured data, and analytical techniques that are continually evolving or challenged, it should explore the benefits of adopting big data analytics tools such as Hadoop to "gain a competitive advantage over more risk-averse enterprises".

Moaiyad Taher Hoosenally, industry principal of ICT practice at Frost & Sullivan Asia-Pacific, gave a slightly different perspective. The analyst said Hadoop is expected to have "some level of impact" on the function of conventional RDBMS over time.

Citing the example of Dell, which now sells servers preconfigured with Hadoop, he noted that other IT vendors will begin to offer such big data analytics capability straight out of the box.

Hoosenally added that Hadoop would be best employed initially by the banking and financial services, utilities and telecommunications verticals, but said the technology would over time be accepted by all other industry segments.

Microsoft Asia-Pacific's platform strategy lead, Chris Levanes, agreed that Hadoop and other projects from the open source software community were providing "a compelling big data toolset for early adopters".

Levanes noted that such offerings are still nascent and many vendors, developers and researchers continue to work on leveraging the technology for their respective projects. He added that Microsoft believes big data will enter the mainstream in the future and is investing and participating in many projects in this area.

One such initiative is Project Daytona, which he described as a "simple, easy-to-use interface for developers to write machine-learning and data analytics algorithms without knowing too much about distributed computing or Windows Azure".

ZDNet Asia's sister site CNET reported earlier that the Daytona toolkit frees scientists from having to code their own software tools, letting them focus their energies on analyzing large data collections.

Adoption challenges persist
That said, Vasudev pointed out that Hadoop workloads "vary a lot", noting that the network was the hardest variable to nail down. "The key is to buy enough network capacity to allow all nodes in your cluster to communicate with each other at reasonable speeds, and for reasonable cost," he said.

Hoosenally also pointed out that organizations faced a "steep learning curve" in utilizing Hadoop and this would make integration with legacy IT systems "relatively difficult".

This is further compounded by the fact that there is currently a lack of documentation and information on how to get started and ensure an efficient use of the open source framework, he added.

He also pointed to hiring the right people with domain expertise as another challenge.

IDC Asia-Pacific's associate vice president, Philip Carter, had highlighted the lack of data talent in an earlier report. He noted that companies in Asia, for instance, had a low level of understanding of big data and how IT departments should approach it. Even IT heads with more knowledge on the subject matter were unsure about the types of skills required to harness the information collated, the analyst said.
