Why Yelp took Mrjob open source

While the package as written saves Yelp money, it would appear to be a gold mine for Amazon
Written by Dana Blankenhorn

There are many reasons why code goes open source, but most of the best come down to money.

Some coders want to make money directly. Others want to make it secondhand, building communities from which businesses might spin out.

Yelp seems to have made Mrjob open source in the hope that outside improvements will help it fend off growing competition from Google, after Yelp turned down a $500 million buyout.

Think of it as the third degree of making money. You give the software away so that improvements to it make your own service better for millions of users.

Mrjob is a Python package for running Hadoop streaming jobs using Amazon's Elastic MapReduce service. It's offered under the Apache 2.0 license.

The software was written to power a feature called "People Who Viewed this Also Viewed." (Regular users will see it in the lower right-hand corner of the screen when they look at popular content.)
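To make the idea concrete, here is a minimal sketch of how an "also viewed" list could be computed in the map-and-reduce style that Hadoop streaming jobs follow. It is not Yelp's code and does not use Mrjob's API: the view log, the function names, and the ranking rule are all assumptions invented for illustration, and the whole thing runs in plain Python on one machine.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical view log of (user_id, business_id) pairs, invented for illustration.
VIEWS = [
    ("u1", "cafe"), ("u1", "bakery"),
    ("u2", "cafe"), ("u2", "bakery"), ("u2", "diner"),
    ("u3", "cafe"), ("u3", "diner"),
]


def map_user_views(businesses):
    """Map step: emit ((a, b), 1) for each pair of businesses one user viewed."""
    for pair in combinations(sorted(set(businesses)), 2):
        yield pair, 1


def reduce_pair_counts(pair, counts):
    """Reduce step: total how many users viewed both businesses in the pair."""
    return pair, sum(counts)


def run_job(views):
    """Simulate group-by-user, map, shuffle, and reduce in a single process."""
    by_user = defaultdict(list)
    for user_id, business_id in views:
        by_user[user_id].append(business_id)

    # Shuffle: collect every value emitted by the map step under its key.
    shuffled = defaultdict(list)
    for businesses in by_user.values():
        for key, value in map_user_views(businesses):
            shuffled[key].append(value)

    return dict(reduce_pair_counts(key, values) for key, values in shuffled.items())


def also_viewed(pair_counts, business_id):
    """Rank the businesses co-viewed with business_id, most shared viewers first."""
    related = []
    for (a, b), n in pair_counts.items():
        if business_id == a:
            related.append((b, n))
        elif business_id == b:
            related.append((a, n))
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(related, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    counts = run_job(VIEWS)
    print(also_viewed(counts, "cafe"))
```

On a real cluster the map and reduce steps would run on separate nodes, with the framework handling the shuffle between them; the single-process version above only shows the shape of the computation.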

As data mining engineer Dave M. wrote on the company's engineering blog, Yelp faced an intermittent-use problem with its Hadoop cluster. Mostly the cluster would sit idle, then a big job would require all the nodes, backing up smaller jobs.

While the package as written saves Yelp money, it would appear to be a gold mine for Amazon, which can now sell more services to small Hadoop users. Elasticity can be a big selling point. You might think of these as innocent-bystander benefits. Hadoop can also benefit from the flexibility.

This brings up another point: the importance of open source to new web services.

Everyone knows the story of Facebook, but many probably don't know that the story of Yelp is similar. Code drives services, operations drive revenue, so coders have the chance to get a lot more money and power than they ever could before. (Get your first billion before you get your first suit.)

Could it be that coders are the biggest winners from the open source revolution? Discuss.
