A program that learns from itself

A program that learns from itself

Summary: Two professors of the University of Texas at Arlington have developed a computer program that can find patterns in data represented as graphs. But what's more important is that this program can learn from its own discoveries. The first potential application is about counter-terrorism.

TOPICS: Big Data

These days, the Defense Advanced Research Projects Agency (DARPA) is making the headlines for its Grand Challenge robotic vehicle race through the Nevada desert (check the official site for the latest results -- and congratulations to all the competitors). But as usual, DARPA is also busy funding many other projects. For example, it just awarded a $500,000 grant to two professors of the University of Texas at Arlington (UTA) for their research into data mining and artificial intelligence. Their project, SUBDUE, is a computer program that can find patterns in data represented as graphs. But what's more important is that this program can learn from its own discoveries. Potential applications for such a program include counter-terrorism, but also bioinformatics, web structure mining, social network analysis or seismic events analysis.

Here are some short quotes from an article from the Shorthorn, the student newspaper of UTA.

The agency has sent Diane Cook and Larry Holder of UTA’s Computer Science Department a grant for their efforts and research into data mining and artificial intelligence.
The project is called SUBDUE, a computer program that finds interesting patterns in data represented as a graph. It can examine large chunks of data easily and examine data points and relationships between them.

The Shorthorn adds that the two professors have been working on this program for 10 years, but doesn't say they're also a married couple with two kids.

But let's go back to Subdue and how it can learn from its own discoveries.

The program isn’t just some number-crunching program. It has the ability to learn from its information. This isn't learning as nonresearchers would learn. SUBDUE takes its output and makes a graph about the positive phenomenon. Then it takes that information and compares it with more graphs. This gives the researchers a chance to compound the information and develop a "predictive model to identify emerging criminal networks."

Below is a diagram showing how Subdue discovers common substructures within relational data by evaluating their ability to compress the graph using the Minimum Description Length (MDL) principle (Credit: UTA).

How Subdue discovers common substructures

For more information about the SUBDUE project, please visit the project home page and start by reading the introduction.

SUBDUE is a graph-based knowledge discovery system that finds structural, relational patterns in data representing entities and relationships. SUBDUE represents data using a labeled, directed graph in which entities are represented by labeled vertices or subgraphs, and relationships are represented by labeled edges between the entities. SUBDUE uses the minimum description length (MDL) principle to identify patterns that minimize the number of bits needed to describe the input graph after being compressed by the pattern. SUBDUE can perform several learning tasks, including unsupervised learning, supervised learning, clustering and graph grammar learning.

You'll also find there links to many presentations and publications.

I've selected two articles published in 2005 for your reading pleasure: "Iterative Structure Discovery in Graph-Based Data," published by International Journal of Artificial Intelligence Techniques (Link, PDF format, 24 pages, 363 KB) (the diagram above comes from this paper) and "Graph-based Relational Learning with Application to Security," published by Fundamenta Informaticae (Link, PDF format, 16 pages, 149 KB)

Sources: Cole Dowden, The Shorthorn, Student Newspaper of the University of Texas at Arlington, October 4, 2005; and various web sites

You'll find related stories by following the links below.

Topic: Big Data

Kick off your day with ZDNet's daily email newsletter. It's the freshest tech news and opinion, served hot. Get it.


Log in or register to join the discussion
  • I did this in college

    I wrote software for reducing (di)graphs to K5 or K33. I WONDER if I still have that code - in Turbo Pascal . . .
    Roger Ramjet
  • Here comes Skynet!

    hasta la vista baby!
    Reverend MacFellow
    • i hate to admit it....

      but that was the first thing that came to my mind as well.
  • Can you spell "boondoggle?"

    I spent DARPA money in the '80s when symbolic AI was hot. I didn't know it was a waste then and obviously the latest generation of funders didn't learn anything from the last generation. Graph theory is nice mathematics, but it's hardly intelligence. This should be funded as pure science, if anything - but I bet the funders of mathematical theory know better than to call this "intelligence."
    • Pattern recognition...

      ...is the very core of human intelligence, however. Is this thing going to hold a conversation? No, probably not. I think the real issue is that the definition of intelligence has become a bit pliable and ill-defined.