Yahoo on Thursday launched Search Direct, its answer to Google Instant, and the way the feature works under the hood shows how the company has rejiggered its architecture to be more federated.
Like most large companies, Yahoo has a bevy of legacy infrastructure. Todd Papaioannou, Yahoo's cloud architect, joined the company about a year ago from Teradata. The aim was to consolidate platforms and create more services. "We wanted to invest in reusable services and analyze data with Hadoop," said Papaioannou in an interview. Yahoo has 400,000 machines in its infrastructure and 200 petabytes of data, which it uses to customize the user experience. For instance, Yahoo's home page delivers 3 million different versions a day for users.
Papaioannou's cloud effort is designed to allow Yahoo to launch new products faster, speed up development cycles and innovate on a stable platform. Search Direct is one example of how Yahoo's internal cloud infrastructure is paying off.
In a blog post, Yahoo explained some of the moving parts:
- Search Direct pulls from two back-end systems. The first, dubbed Gossip, generates likely query matches. The other, NRTI, federates content from Yahoo's content repositories. NRTI is a service that interacts with various databases independently.
- The front end framework is a widget that can be dropped into any Yahoo property.
- Search Direct pulls from a Remote Module Publisher Service, which renders content.
- The engine controlled by Search Direct is also shared.
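
To make the split between the two back ends concrete, here is a minimal sketch of a front end that asks one service for likely queries and another for content gathered from several independent repositories. It is an illustration only, not Yahoo's code: the class names, the `search_direct` function, and the sample data are all hypothetical, and the real Gossip and NRTI interfaces are not described in the blog post.

```python
from dataclasses import dataclass


@dataclass
class SuggestionBackend:
    """Hypothetical stand-in for the query-matching role the article assigns to Gossip."""
    known_queries: list[str]

    def suggest(self, prefix: str, limit: int = 3) -> list[str]:
        # Return known queries that start with what the user has typed so far.
        prefix = prefix.lower()
        return [q for q in self.known_queries if q.startswith(prefix)][:limit]


@dataclass
class ContentBackend:
    """Hypothetical stand-in for the federation role the article assigns to NRTI."""
    repositories: dict[str, dict[str, str]]

    def lookup(self, query: str) -> dict[str, str]:
        # Ask each repository independently and keep only the ones with a hit.
        results = {}
        for name, repo in self.repositories.items():
            if query in repo:
                results[name] = repo[query]
        return results


def search_direct(prefix: str, suggestions: SuggestionBackend,
                  content: ContentBackend) -> dict:
    """Combine both back ends: likely queries plus content for the top match."""
    matches = suggestions.suggest(prefix)
    top_content = content.lookup(matches[0]) if matches else {}
    return {"suggestions": matches, "content": top_content}


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    suggestions = SuggestionBackend(["weather boston", "weather chicago", "world cup"])
    content = ContentBackend({
        "weather": {"weather boston": "54F, cloudy"},
        "news": {"world cup": "final on Sunday"},
    })
    print(search_direct("wea", suggestions, content))
```

Because the front end only depends on the two small interfaces, the same widget could in principle be reused on any property that supplies those services, which is the kind of reuse the bullets above describe.
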
Papaioannou said that the user won't notice most of Yahoo's cloud changes, which run in the background. However, Yahoo's IT infrastructure has been streamlined a good bit over the last year. If you see Yahoo stepping up its product launches, you'll know why.