Xerox researchers say they've developed a text-mining tool that's tuned to the way humans think, speak and ask questions.
Type "what Steve Jobs said yesterday" into the FactSpotter tool and the search software will hunt through documents and return a handful of relevant answers, instead of churning out countless articles containing the Apple CEO's name.
But the FactSpotter software, unveiled last week, will not be available to the public over the Internet or otherwise--only to customers of document management company Xerox, which developed the tool.
Jean-René Gain, director and general manager of marketing, strategy and alliances at Xerox, told Silicon.com that Xerox will not sell FactSpotter as a standalone application--only as an embedded application to its customers.
"We are not taking on Google with this," Gain said. "It is an aside option to consider, but we need this technology to differentiate ourselves" from competitors.
Mario Jarmasz, technology showroom engineer at Xerox, said: "This is completely different from searching on Google because we can drill down to certain levels of detail."
The FactSpotter tool, due to be available by 2008, will first be offered to the document-heavy legal and litigation market.
Xerox predicts the text-miner software will be useful in other situations where information must be retrieved from a massive database, including corporate and government searches, drug discovery, fraud detection and risk management.
Christopher Dance, laboratory manager at Xerox, told Silicon.com that FactSpotter could also be used to manage the vast number of documents produced during large mergers and acquisitions.
FactSpotter can search at a rate of 2,000 documents per second. Dance said the next stage of development will be to speed up the software.
The tool uses a linguistic engine that analyzes the meaning of words and the construction of phrases and sentences to work out exactly what a user is hunting for.
FactSpotter also recognizes concepts in a search term. To use the previous example, when a person types in "what Steve Jobs said yesterday," the tool will break down the sentence and recognize "Steve Jobs" as a person and "yesterday" as a time.
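Xerox has not published how FactSpotter performs this step, so the following is only a toy sketch of the general idea of labeling parts of a query as a person or a time. The lookup tables, the `tag_query` function and the label names are all hypothetical, and a real linguistic engine would rely on parsing and semantic analysis rather than fixed word lists.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup tables for illustration only; these are assumptions,
# not anything described by Xerox.
KNOWN_PEOPLE = ("steve jobs",)
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def tag_query(query, today):
    """Return (text, label, value) tuples for concepts found in the query."""
    concepts = []
    lowered = query.lower()

    # Label known names as PERSON, keeping the casing the user typed.
    for name in KNOWN_PEOPLE:
        match = re.search(r"\b" + re.escape(name) + r"\b", lowered)
        if match:
            original = query[match.start():match.end()]
            concepts.append((original, "PERSON", None))

    # Label relative-day words as TIME and resolve them to a calendar date.
    for word, offset in RELATIVE_DAYS.items():
        if re.search(r"\b" + word + r"\b", lowered):
            concepts.append((word, "TIME", today + timedelta(days=offset)))

    return concepts


if __name__ == "__main__":
    # The reference date is arbitrary and chosen only for the example.
    for concept in tag_query("what Steve Jobs said yesterday", date(2024, 1, 2)):
        print(concept)
```

In this sketch, "yesterday" is resolved against a reference date supplied by the caller, which is one simple way a search tool could turn a relative time expression into a concrete filter on document dates.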
Xerox is "trying to make a computer understand text like a human being," said Frédérique Segond, parsing and semantics area manager at the company.
Segond added that FactSpotter is the next step in document search and uses "Web 3.0" technology that connects data, whereas Web 2.0 applications only collect it.
Gemma Simpson of Silicon.com reported from London.