Microsoft's Trinity: A graph database with web-scale potential

Summary: Microsoft's annual internal TechFest research showcase kicks off on March 6. So what better time to check out Trinity, a graph database research project, from Microsoft Research?


It's a good day when I finally find new information about a Microsoft codename that I first heard a couple of years ago but could never learn more about.

One of my readers (thanks, Gregg Le Blanc) sent me a link to a Microsoft Research page on codename Trinity, which is a "graph database and computing platform."

Given that this week brings TechFest, Microsoft Research's internal showcase for Microsoft employees (with March 6 being the day that Microsoft allows select media and guests to tour some of the exhibits), it's a good time to talk about yet another Microsoft Research project.

Here's Microsoft's explanation of codename Trinity:

"Trinity is a graph database and graph computation platform over distributed memory cloud. At the heart of Trinity is a distributed RAM-based key-value store. As an all-in-memory key-value store, Trinity provides fast random data access. This feature naturally makes Trinity suitable for large graph processing. Trinity is a graph database from the perspective of data management. It is a parallel graph computation platform from the perspective of graph analytics. As a database, it provides features such as data indexing, concurrent query processing, concurrency control. As a computation platform, it provides vertex-based parallel graph computation on large scale graphs."

And here's the requisite architectural diagram:

Trinity is built on top of the distributed memory-storage layer called "memory cloud." Utility tools provided by Trinity include a "fast billion node graph generator," the Trinity Shell and various management tools.
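To make the "memory cloud" idea a bit more concrete, here is a minimal Python sketch of a partitioned, all-in-memory key-value store holding a graph as adjacency lists. It is not Trinity's API, which the Trinity page excerpt above does not show. The class name, the modulo placement rule and the four-partition setup are all assumptions made for illustration.

```python
from collections import defaultdict


class MemoryCloudSketch:
    """Toy stand-in for a distributed RAM-based key-value store.

    Hypothetical illustration only; this is not Trinity's interface.
    """

    def __init__(self, num_partitions=4):
        # Each dict plays the role of one machine's slice of RAM.
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Assumed placement rule: integer key modulo the partition count.
        return self.partitions[key % len(self.partitions)]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key, default=None):
        # A random-access lookup: pick the partition, then one dict read.
        return self._partition_for(key).get(key, default)


def load_graph(store, edges):
    """Store an undirected graph as node id -> list of neighbor ids."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
        adjacency[dst].append(src)
    for node, neighbors in adjacency.items():
        store.put(node, neighbors)


store = MemoryCloudSketch(num_partitions=4)
load_graph(store, [(1, 2), (2, 3), (3, 4)])
print(store.get(2))  # neighbors of node 2
```

Because every neighbor list is one key lookup away, following an edge costs a memory read rather than a disk seek, which is the property Microsoft's description credits for making the design suitable for large graphs.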

According to the Trinity page, the Trinity code is available only via the Microsoft intranet at this time. So why is it interesting? One potential use of Trinity is people search within a network. The Trinity applications page gives the example of searching within a "Web-scale social network," like, say, Facebook. Microsoft's Bing search engine can check a user's Facebook network to see if there's anything relevant to pull, but doing so is a massive task that needs to be completed quickly.

In the researchers' demo, which used the example of someone with 130 Facebook friends, a two-hop query (one that reaches friends of friends) could be completed in 10 milliseconds using Trinity. A three-hop query would take 100 ms, the researchers said.
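For readers who want to see what a "hop" query involves, here is a small Python sketch of a breadth-first search limited to a fixed number of hops. It is a generic textbook technique, not the researchers' implementation, and the tiny friend graph and the function names are made up for illustration. The second function gives a rough upper bound on how many people such a query can touch, under the assumption that everyone has the same number of friends and no friends overlap.

```python
def within_hops(adjacency, start, max_hops):
    """Return {person: hop distance} for everyone within max_hops of start."""
    distance = {start: 0}
    frontier = [start]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for person in frontier:
            for friend in adjacency.get(person, ()):
                if friend not in distance:  # first visit is the shortest path
                    distance[friend] = hop
                    next_frontier.append(friend)
        frontier = next_frontier
    return distance


def max_people_reached(friends_per_person, hops):
    """Upper bound on people at exactly `hops` hops, ignoring overlap."""
    return friends_per_person ** hops


# Hypothetical five-person network.
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}

print(within_hops(friends, "ann", 2))
print(max_people_reached(130, 2))  # 16900
print(max_people_reached(130, 3))  # 2197000
```

Under that simplifying assumption, going from two hops to three multiplies the work by roughly the friend count, which helps explain why each extra hop is so much more expensive.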

Another possible Trinity application is Probase, another Microsoft Research project designed to improve machine understanding of human communication. A first release of Probase was made available for download in May 2011. Trinity is the underlying infrastructure for the Probase knowledge base.

Version .06 of the Trinity manual is downloadable (as of January 2012). There's also a Hanselminutes podcast about Trinity dating from August 2011 which I never knew about until now.

Given Microsoft's increasing focus on big data and analytics, it seems like a project like Trinity could be a natural fit for one of Microsoft's product groups....

Update: Here's a post I missed last year mentioning Probase and Trinity from ReadWriteWeb, which also mentions Microsoft's no-longer-active Dryad project.


Mary Jo has covered the tech industry for 30 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).


  • Interesting

    Can't wait to see where Bing takes us next :)
  • The podcast is worth a listen!

    Hanselman's podcast with Haixun Wang is VERY interesting and well worth a listen.

    They've managed to create a graph of the entire internet and can query almost any graph segment within 100ms!
  • Data Storage

    A database needs to store its data on storage devices in a way that makes storing and accessing that data easy.