X
Home & Office

Extracting the structure of networks

Networks are used to represent the structure of complex systems, including the Internet or social networks, but often these descriptions are biased or incomplete. Now, researchers at the Santa Fe Institute (SFI) have shown that it's possible to extract automatically the hierarchical structure of networks. The researchers say their results 'suggest that hierarchy is a central organizing principle of complex networks, capable of offering insight into many network phenomena.' They also think that their algorithms can be applied to almost every kind of networks, from biochemical networks (protein interaction networks, metabolic networks or genetic regulatory networks) to communities in social networks. But read more...
Written by Roland Piquepaille, Inactive

Networks are used to represent the structure of complex systems, including the Internet or social networks, but often these descriptions are biased or incomplete. Now, researchers at the Santa Fe Institute (SFI) have shown that it's possible to extract automatically the hierarchical structure of networks. The researchers say their results 'suggest that hierarchy is a central organizing principle of complex networks, capable of offering insight into many network phenomena.' They also think that their algorithms can be applied to almost every kind of networks, from biochemical networks (protein interaction networks, metabolic networks or genetic regulatory networks) to communities in social networks. But read more...

A hierarchical network

You can see on the left "a hierarchical network with structure on many scales, and the corresponding hierarchical random graph. Each internal node r of the dendrogram is associated with a probability pr that a pair of vertices in the left and right subtrees of that node are connected. (The shades of the internal nodes in the figure represent the probabilities.) (Credit: SFI) Here is a link to a larger version of this figure.

This technique has been developed by three Santa Fe Institute (SFI) researchers, Aaron Clauset, a post-doctoral fellow at SFI, Cris Moore, who is also an associate professor of computer science at the University of New Mexico, and Mark Newman, a professor of physics at the University of Michigan.

Here is an excerpt from the SFI news release about this work. "Unlike much previous work in this area, Clauset, Moore, and Newman propose a direct but flexible model of hierarchical structure, which they apply to networks using the tools of statistical physics and machine learning. To demonstrate the practical utility of their model, they analyze networks from three disparate fields: the metabolic network of the spirochete Treponema pallidum (the bacteria that causes syphilis), a network of associations between terrorists, and a food web of grassland species. Even when only half of the connections in these networks were shown to their algorithm, the researchers found that hierarchical structure can predict missing connections with an accuracy of up to 80 percent."

A hierarchical network

You can see on the left an example of an "application of the hierarchical decomposition to the network of grassland species interactions: a, Consensus dendrogram reconstructed from the sampled hierarchical models: b, A visualization of the network in which the upper few levels of the consensus dendrogram are shown as boxes around species (plants, herbivores, parasitoids, hyperparasitoids and hyper-hyperparasitoids are shown as circles, boxes, down triangles, up triangles and diamonds, respectively)." (Credit: SFI) Here is a link to a larger version of this figure.

This research work has been published in the May 1, 2008 of Nature (Volume 453, Number 7191). Here is a link to the Nature editor's summary, "Refining the network." "Networks are now a ubiquitous tool for representing the structure of complex systems, including the Internet, social networks, food webs, and protein and genetic networks. Unfortunately, the data describing these networks are in many cases incomplete or biased. A new study provides a general technique to divide network vertices into groups and sub-groups. Revealing such underlying hierarchies makes it possible to predict missing links from partial data with higher accuracy than previous methods."

This issue of Nature carries two articles on this subject. The first one, from Sid Redner, of Boston University, is called "Networks: Teasing out the missing links" (Pages 47-48). Here is the abstract (Pages 47-48). "Focusing on the hierarchical structure inherent in social and biological networks might provide a smart way to find missing connections that are not revealed in the raw data -- which could be useful in a range of contexts." Here is another link to a 'beautiful' image showing assortative and disassortative networks.

The SFI letter to Nature appears on pages 98-101 on the same issue under the title "Hierarchical structure and the prediction of missing links in networks." Here is an excerpt from the abstract. "Here we present a general technique for inferring hierarchical structure from network data and show that the existence of hierarchy can simultaneously explain and quantitatively reproduce many commonly observed topological properties of networks, such as right-skewed degree distributions, high clustering coefficients and short path lengths. We further show that knowledge of hierarchical structure can be used to predict missing connections in partly known networks with high accuracy, and for more general network structures than competing techniques. Taken together, our results suggest that hierarchy is a central organizing principle of complex networks, capable of offering insight into many network phenomena."

For more information, here is a link to the full paper (PDF format, 21 pages, 287 KB).

Sources: Santa Fe Institute (SFI) news release, May 1, 2008; and various websites

You'll find related stories by following the links below.

Editorial standards