LinkedIn open sources TonY, its framework to run TensorFlow on Hadoop

The core idea is to run TensorFlow jobs as reliably and flexibly as other first-class citizens on Hadoop.

LinkedIn on Thursday will announce a new open source project that aims to provide native support for running TensorFlow jobs on Hadoop.

TensorFlow is the Google-built open-source machine learning library that's become one of the most popular platforms for creating machine learning and deep learning applications. Hadoop is the open-source software framework for storing data and running applications on clusters of hardware. Getting these two systems to function together natively is often an obstacle for data scientists and engineers, according to LinkedIn.

In steps TonY, LinkedIn's framework to natively run TensorFlow on Hadoop. The core idea is to run TensorFlow jobs as reliably and flexibly as other first-class citizens on Hadoop including MapReduce and Spark, LinkedIn said.

"We wanted a flexible and sustainable way to bridge the gap between the analytic powers of distributed TensorFlow and the scaling powers of Hadoop," Jonathan Hung, a senior software engineer on the Hadoop development team at LinkedIn, wrote in a blog post.

"Similar to how MapReduce provides the engine for running Pig/Hive scripts on Hadoop, and Spark provides the engine for running scala code that uses Spark APIs, TonY aims to provide the same first-class support for running TensorFlow jobs on Hadoop by handling tasks such as resource negotiation and container environment setup," Hung explained.
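To make "container environment setup" concrete, here is a minimal sketch of the kind of per-container configuration a distributed TensorFlow job expects. TensorFlow's distributed training reads a `TF_CONFIG` environment variable that describes the cluster and the current task's role. Whether TonY populates containers in exactly this way is an assumption, and the helper names `build_tf_config` and `container_environments`, as well as the host addresses, are hypothetical and used only for illustration.

```python
import json


def build_tf_config(cluster, task_type, task_index):
    """Return the TF_CONFIG JSON string for a single task in the cluster."""
    if task_type not in cluster:
        raise ValueError(f"unknown task type: {task_type}")
    if not 0 <= task_index < len(cluster[task_type]):
        raise ValueError(f"task index out of range: {task_index}")
    return json.dumps(
        {"cluster": cluster, "task": {"type": task_type, "index": task_index}}
    )


def container_environments(cluster):
    """Map each task address to the environment variables its container needs."""
    envs = {}
    for task_type, addresses in cluster.items():
        for index, address in enumerate(addresses):
            envs[address] = {
                "TF_CONFIG": build_tf_config(cluster, task_type, index)
            }
    return envs


# Hypothetical addresses; a real scheduler would learn these only after
# the containers have been allocated on the Hadoop cluster.
cluster = {
    "ps": ["host1:2222"],
    "worker": ["host2:2222", "host3:2222"],
}
envs = container_environments(cluster)
print(envs["host3:2222"]["TF_CONFIG"])
```

Each container receives the same cluster description but a different task entry, which is how every TensorFlow process learns both its peers and its own role.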

In addition to supporting this baseline functionality, TonY also offers features that aim to improve the experience of running large-scale training, including GPU scheduling, fine-grained resource requests, TensorBoard support, and fault tolerance.

TonY is available on GitHub -- which, like LinkedIn, is now owned by Microsoft -- starting today.
