IBM unwraps storage system for faster analytics

The GPFS-based storage architecture, developed by IBM's labs in Almaden, is designed for large, distributed data applications that run across multiple hardware systems

IBM has revealed details of a new storage architecture that promises to speed up the processing of business analytics in large datacentres and cloud environments.

The company spoke about its General Parallel File System-Shared Nothing Cluster (GPFS-SNC) at the Supercomputing 2010 conference in New Orleans on Friday. The system is twice as fast at processing business analytics as a Hadoop Distributed File System (HDFS) cluster, according to IBM's own internal testing.

"Running analytics applications on extremely large data sets is becoming increasingly important, but organisations can only continue to increase the size of their storage facilities so much... this new architecture will shave hours off of complex computations without requiring heavy infrastructure investment," IBM said in a statement.

GPFS-SNC is an extension of the company's General Parallel File System (GPFS), which sits inside four of its products: IBM High Performance Computing Systems, IBM Information Archive, IBM Smart Business Compute Cloud and IBM Scale-Out Network Attached Storage (Sonas).

GPFS-SNC was developed at IBM Labs in Almaden, California. It uses clustering, dynamic file system management and data replication techniques to boost analytics capabilities. It is a parallel system built on a distributed, shared-nothing computing architecture: tasks are handled by individual 'self-sufficient' nodes, or computers, which allows large tasks to be split into subtasks and run in parallel, IBM said.
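The general idea of splitting one large task into subtasks that self-sufficient nodes run in parallel can be illustrated with a toy sketch. This is not IBM's implementation and says nothing about how GPFS-SNC works internally. The function names (`partition`, `node_subtask`, `run_job`) are hypothetical, and threads in a single process stand in for separate computers purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, node_count):
    """Give each simulated node its own slice of the data; no slice is shared."""
    return [records[i::node_count] for i in range(node_count)]


def node_subtask(local_records):
    """Work one node does using only the data it holds locally."""
    return sum(local_records), len(local_records)


def run_job(records, node_count=4):
    """Split one large task into per-node subtasks, run them in parallel, merge."""
    shards = partition(records, node_count)
    # Assumption: worker threads stand in for independent machines.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_subtask, shards))
    # The coordinator only combines small partial results, never raw data.
    total = sum(subtotal for subtotal, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else 0.0


print(run_job(list(range(1, 101))))  # average of 1..100
```

In this sketch each node computes a partial sum and count over its own shard, and only those small results travel back to be merged, which is the property that lets such a job scale by adding nodes.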

The file system could benefit organisations with large amounts of business intelligence data to churn through, such as financial firms, web-only businesses and companies involved in the display of digital media, according to IBM. This is because "the design provides a common file system and namespace across disparate computing platforms, streamlining the process and reducing disk space", the manufacturer said.

The work done by IBM Labs on GPFS-SNC could "enable future expansion" into big data analytics for IBM's core GPFS-dependent products, such as IBM Smart Business Compute Cloud, the company added.