1,000-node big data workbench to crunch analytics' toughest problems

Summary: EMC launches Greenplum Analytics Workbench to analyse petabyte-scale datasets and pave the way for future big data platforms

EMC has unveiled a system to analyse petabyte-scale datasets and help develop the next generation of big data analytics platforms.

The Greenplum Analytics Workbench, which was revealed at EMC World 2012 in Las Vegas on Tuesday, will be available free of charge for analysis of extremely large data volumes.

It can process both structured and unstructured data using the open-source data analytics system Hadoop and EMC's Greenplum Database, a heavily customised version of the open-source PostgreSQL database that can carry out massively parallel processing.

Hadoop is suited to analysing petabyte-scale datasets because it spreads the data across the nodes of a cluster, and each node processes its own share in parallel with the others.
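As a rough illustration of that parallel model, the sketch below imitates the map and reduce steps of a word count on a single machine. It is not Hadoop code: the function names, the round-robin partitioning and the use of a thread pool to stand in for cluster nodes are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_nodes):
    """Deal records round-robin into one partition per simulated node."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def map_partition(partition):
    """Map step: count words in one partition, independently of the others."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partial_counts):
    """Reduce step: merge the per-partition counts into one total."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def word_count(records, num_nodes=4):
    """Run the map step on every partition concurrently, then reduce.

    Worker threads stand in for cluster nodes here (an assumption for
    illustration); a real cluster runs each map task on a separate machine.
    """
    partitions = split_into_partitions(records, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce_counts(partials)
```

Because each partition is counted without reference to any other, adding nodes shortens the map step, which is the property that makes the approach attractive at petabyte scale.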

Each node in the 1,000-node workbench cluster has two Intel X5670 processors, 24TB of storage and 48GB of RAM. The workbench uses 10/40GbE and FDR 56Gb/s InfiniBand interconnects provided by Mellanox Technologies, along with Mellanox's Unstructured Data Accelerator software, which shortens Hadoop job run times.
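To put those per-node figures in context, the short calculation below totals them across the cluster. The per-node numbers come from the article; the decimal unit conversion (1PB = 1,000TB) and the replication factor of three, which is the HDFS default but is not confirmed for this system, are assumptions.

```python
def cluster_totals(nodes=1000, storage_tb_per_node=24, ram_gb_per_node=48,
                   cpus_per_node=2, replication=3):
    """Aggregate per-node specifications across the whole cluster.

    `replication` is an assumed HDFS replication factor, not a figure
    from the article; decimal units are used (1PB = 1,000TB).
    """
    raw_storage_pb = nodes * storage_tb_per_node / 1000
    return {
        "cpu_sockets": nodes * cpus_per_node,
        "ram_tb": nodes * ram_gb_per_node / 1000,
        "raw_storage_pb": raw_storage_pb,
        # Capacity left if every block is stored `replication` times.
        "usable_storage_pb": raw_storage_pb / replication,
    }


totals = cluster_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

On the article's figures this gives 2,000 processor sockets, 48TB of RAM and 24PB of raw disk, or about 8PB of unique data if every block is kept in triplicate.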

Scott Yara, senior VP for products at Greenplum, said there is already a long queue of organisations waiting to get time on the machine: "They span healthcare research, manufacturing, drilling, mining, fraud detection in financial services - there's a lot of really advanced use cases."

Results from the workbench analysis will be made available to the Hadoop open-source community, and will be used to inform future development of Hadoop and converged Hadoop/SQL analytics platforms.

"We're really trying to look for use cases that either require access to a large-scale of compute nodes or that pushes the limit in terms of analytic work that's been done before," said Yara.

The Workbench runs on hardware operated by EMC and is accessible via the cloud. The announcement puts an end to years of rumours that EMC was planning to take its Greenplum analytics into the cloud, though the workbench functions more like a rentable supercomputer than a provisionable service.

About

Nick Heath is chief reporter for TechRepublic UK. He writes about the technology that IT decision-makers need to know about, and the latest happenings in the European tech scene.