
Computers help to choose the best algorithm

Engineers at Ohio State University have developed a new way to improve pattern recognition software. Their quick test "rates how well a particular pattern recognition algorithm will work for a given application." And they think that their method can be used in many areas including genetics, economics, climate modeling, and neuroscience.
Written by Roland Piquepaille

Engineers at Ohio State University, who normally design computer algorithms that simulate human vision, have developed a new way to improve pattern recognition software. Traditionally, this kind of software relies on a family of methods called linear feature extraction, which doesn't always work and can cost scientists weeks of wasted time. So the researchers have developed a quick test "which rates how well a particular pattern recognition algorithm will work for a given application." And they think that their method can be used in many areas, including genetics, economics, climate modeling, and neuroscience.

Why did they take this new approach? Simply because they were frustrated. Here is an explanation given by Aleix Martinez, assistant professor of electrical and computer engineering at Ohio State, and creator of the Computational Biology and Cognitive Science Lab.

The majority of pattern recognition algorithms in science and engineering today are derived from the same basic equation and employ the same methods, collectively called linear feature extraction, Martinez said.
But the typical methods don't always give researchers the answers they want. That's why Martinez has developed a fast and easy test to find out in advance which algorithms are best in a particular circumstance.
"You can spend hours or weeks exploring a particular method, just to find out that it doesn't work," he said. "Or you could use our test and find out right away if you shouldn't waste your time with a particular approach."

And here are some brief quotes about their results.

Martinez and [doctoral student Manli Zhu] tested machine vision algorithms using two databases, one of objects such as apples and pears, and another database of faces with different expressions. The two tasks -- sorting objects and identifying expressions -- are sufficiently different that an algorithm could potentially be good at doing one but not at the other.
The test rates algorithms on a scale from zero to one. The closer the score is to zero, the better the algorithm. The test worked: An algorithm that received a score of 0.2 for sorting faces was right 98 percent of the time. That same algorithm scored 0.34 for sorting objects, and was right only 70 percent of the time when performing that task. Another algorithm scored 0.68 and sorted objects correctly only 33 percent of the time. "So a score like 0.68 means 'don't waste your time,'" Martinez said. "You don't have to go to the trouble to run it and find out that it's wrong two-thirds of the time."

This research work has been published in IEEE Transactions on Pattern Analysis and Machine Intelligence under the title "Where Are Linear Feature Extraction Methods Applicable?" (Volume 27, Issue 12, pp. 1934-1944, December 2005). Here is the beginning of the abstract.

A fundamental problem in computer vision and pattern recognition is to determine where and, most importantly, why a given technique is applicable. This is not only necessary because it helps us decide which techniques to apply at each given time. Knowing why current algorithms cannot be applied facilitates the design of new algorithms robust to such problems. In this paper, we report on a theoretical study that demonstrates where and why generalized eigen-based linear equations do not work.
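
To make the phrase "generalized eigen-based linear equations" in the abstract more concrete, here is a minimal Python sketch of one textbook member of that family, Fisher's linear discriminant analysis, which finds projection directions v by solving S_B v = lambda S_W v. This is not the researchers' test or their new algorithm. The function name lda_directions and the toy data are hypothetical and chosen only for illustration, and the sketch assumes the within-class scatter matrix is invertible.

```python
import numpy as np


def lda_directions(X, y, n_components=1):
    """Textbook Fisher LDA (illustrative only, not the method from the paper).

    Solves the generalized eigenproblem S_B v = lambda S_W v by rewriting it
    as the ordinary eigenproblem of S_W^{-1} S_B. Assumes S_W is invertible.
    """
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    # Keep the directions with the largest eigenvalues (best class separation).
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvals.real[order], eigvecs.real[:, order]


# Hypothetical toy data: two classes that differ only along the second axis,
# while most of the variance lies along the first axis.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=[3.0, 0.5], size=(200, 2))
class_b = rng.normal(loc=[0.0, 2.0], scale=[3.0, 0.5], size=(200, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 200 + [1] * 200)

eigvals, W = lda_directions(X, y)
projected = X @ W  # one extracted feature per sample
print("extracted direction:", W[:, 0])
```

On data like this, the extracted direction should point mostly along the second axis, where the classes actually differ. The question the paper raises is when directions obtained this way can be trusted to separate the classes, and when they cannot.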

The full paper is also available (PDF format, 20 pages, 396 KB). Here are some excerpts from the conclusion.

In this paper, we have shown that linear feature extraction algorithms are not always guaranteed to minimize the Bayes error even if the metrics used do so in principle. We have shown that the sanity check is not that of whether the data is homoscedastic or heteroscedastic (which had been our common practice).
We have then used our results to define a new robust algorithm (of such linear methods). And, we have discussed how these results can be used to improve the design of current algorithms.
The results reported in this article will help scientists in computer vision to better understand their algorithms and, most importantly, improve upon them. Obviously, feature extraction techniques are applicable to a large variety of problems in science and, therefore, one can expect that the result reported in this paper will also impact areas of research as diverse as genetics, economics, climate modelling and neuroscience.

If you want to read this paper, which makes no explicit mention of the test the researchers built, you'll have to brush up on your math skills. Anyway, this research project looks promising.

Sources: Ohio State University news release, January 24, 2006; and various web sites
