Yahoo opens distributed computing to academics

Yahoo has announced a program to encourage the development of software around distributed computing, with researchers from Carnegie Mellon University already signed up.
Written by Suzanne Tindal, Contributor

Under the open source program, Yahoo will promote work on Hadoop, an environment that lets its users process massive amounts of data through distributed computing. Distributed computing divides an application into many small fragments of work, which are executed on large numbers of separate computers linked in a network.
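The idea of dividing a job into small fragments can be illustrated with a minimal Python sketch. This is not Hadoop's API: the function names are hypothetical, and a local pool of worker threads stands in for the separate networked computers. It only shows the pattern of splitting the input, processing each fragment independently, and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(lines, fragment_size):
    """Divide the input into small, independent fragments of work."""
    return [lines[i:i + fragment_size]
            for i in range(0, len(lines), fragment_size)]


def count_words(fragment):
    """Work done on one fragment: count the words it contains."""
    counts = Counter()
    for line in fragment:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Combine the per-fragment results into one overall result."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def distributed_word_count(lines, fragment_size=2, workers=4):
    """Hypothetical driver: worker threads stand in for cluster nodes."""
    fragments = split_into_fragments(lines, fragment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, fragments))
    return merge_counts(partial_counts)


if __name__ == "__main__":
    sample = [
        "yahoo opens distributed computing to academics",
        "distributed computing divides work into fragments",
        "fragments run on separate computers",
    ]
    print(distributed_word_count(sample).most_common(3))
```

On a real cluster the fragments would be shipped to different machines and the framework would handle scheduling, data placement and failures; the sketch leaves all of that out.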

Carnegie Mellon University is now involved in the program and will run Hadoop on Yahoo's supercomputing cluster, the M45, which has a peak performance of more than 27 teraflops.

Systems software researchers from the university will study the system's performance, while computer science professors will use the cluster to tackle information retrieval and graph problems. The cluster will also be used to work on large-scale computer graphics, natural language processing and machine translation problems.

"We are excited about collaborating with Yahoo on systems software research ... and jointly contributing back to the open source community," Randal E. Bryant, dean of the School of Computer Science at Carnegie Mellon, said in a statement.

Yahoo is not the only company working on the handling of large data sets.

Google and IBM provided hardware, software and services to academics at multiple universities to improve computer science students' knowledge of the highly parallel computing practices used to work with large data sets. The CSIRO, meanwhile, announced the start of a terabyte science project last week to research methods of dealing with large amounts of data.
