Crowdsourcing may have just helped close the "analogy gap" for computers

A huge roadblock for true AI has vexed computer scientists for decades, and now it's falling
Written by Greg Nichols, Contributing Writer

To paraphrase Arthur Schopenhauer, genius is seeing what everyone else sees and thinking what no one else has thought. Put another way, genius is breaking down the usual silos that isolate ideas and knowledge into specific fields and purviews.

It's an elegant definition. Thanks to work about to be presented by researchers at Carnegie Mellon University's School of Computer Science and the Hebrew University of Jerusalem, it may soon apply to AI.

The researchers have just given computers the capacity to mine patent databases and other research records to repurpose old ideas for new problems. To do it, they had to devise a method to teach computers to make analogies.

"After decades of attempts, this is the first time that anyone has gained traction computationally on the analogy problem at scale," said Aniket Kittur, associate professor in CMU's Human-Computer Interaction Institute.

"Once you can search for analogies, you can really crank up the speed of innovation," added Dafna Shahaf, a CMU alumna and computer scientist at Hebrew University. "If you can accelerate the rate of innovation, that solves a lot of other problems downstream."

Analogies, which are a way of drawing comparisons between things that aren't easily compared, lie at the heart of innovation. A spokesman for CMU offered me a few illustrative examples, including the case of Jorge Odon, an Argentinian car mechanic who invented a device for delivering babies that's much safer than forceps. The idea came to him after watching someone use a trick to remove a loose cork from a wine bottle.

The leap it takes to move from the stuck cork to a difficult birth, which may happen spontaneously and unconsciously for humans, has proven elusive for machines. The problem is that computers don't understand the world on the deep semantic level we do.

"Researchers have tried handcrafting data structures, but this approach is time consuming and expensive," said the CMU spokesman, "not scalable for databases that can include nine million U.S. patents or 70 million scientific research papers."

The researchers tried something different. Kittur has spent years studying how crowdsourcing can be used to find analogies. He and Shahaf, along with fellow researchers, hired workers through Amazon Mechanical Turk and asked them to look through products on Quirky.com, a product innovation site, and then find analogous products on the same site. The workers then noted precisely which words caused them to connect disparate products, mapping each pathway.

"We were able to look inside these people's brains because we forced them to show their work," explained Joel Chan, a post-doctoral researcher at CMU.

Based on insights gleaned from the crowdsourced research, computers equipped with deep learning AI were able to analyze additional product descriptions and identify new analogies. By seeding the process of forming analogies in a specific ecosystem like the product website, the researchers effectively taught the computers to mimic the human mind's expansive capacity for comparison.
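To make the idea concrete, here is a toy sketch of how crowd-highlighted words could steer a similarity search. It is not the researchers' system, which relied on deep learning. It swaps in a plain bag-of-words comparison, and the product descriptions, the highlighted words, the `boost` value and every function name are invented for illustration.

```python
import math
from collections import Counter

def weighted_vector(description, highlighted, boost=3.0):
    """Bag-of-words vector in which crowd-highlighted words count extra.

    `boost` is an arbitrary illustrative weight, not a value from the research.
    """
    counts = Counter(description.lower().split())
    return {w: c * (boost if w in highlighted else 1.0) for w, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_analogous(query, candidates, highlighted):
    """Return the name of the candidate whose weighted vector is closest to the query."""
    qv = weighted_vector(query, highlighted)
    return max(
        candidates,
        key=lambda name: cosine(qv, weighted_vector(candidates[name], highlighted)),
    )

# Hypothetical data, loosely inspired by the cork-and-birth example above.
QUERY = "inflate a bag to grip and extract a stuck cork from a bottle"
CANDIDATES = {
    "birth_aid": "inflate a sleeve to grip and extract a baby during delivery",
    "wine_rack": "wooden rack to store a bottle of wine and a cork",
}
# Words a crowd worker might flag as the reason two products feel analogous.
HIGHLIGHTED = {"inflate", "grip", "extract"}

if __name__ == "__main__":
    print(most_analogous(QUERY, CANDIDATES, HIGHLIGHTED))
```

In this sketch the highlighted words describe what a product does rather than what it is made of, so the match is pulled toward the product that works the same way, not the one that merely mentions corks and bottles.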

According to the team, the same approach can be used to tailor computer programs to find analogies between patent applications and literature on problems currently facing the world. Tools that could help solve emerging problems may already exist, even if no one has yet made the connection.

The research team will present its findings on Thursday, Aug. 17, at KDD 2017, the Conference on Knowledge Discovery and Data Mining, in Halifax, Nova Scotia, where the research paper has won both Best Paper and Best Student Paper awards.
