Google + reCAPTCHA could raise bar in anti-bot, anti-spam battle

Google buys an excellent crowd-sourcing tool and, by default, gets to raise the bar significantly in the fight against bots and spam.

Locked in a cat-and-mouse game with spammers who use bots to defeat anti-fraud mechanisms and create fake accounts, Google today announced a deal to acquire reCAPTCHA, a company that provides those squiggly words at login screens (see image at right).

The ReCAPTCHA deal isn't exactly a security transaction.  Strategically, it gives Google an excellent crowd-sourcing tool to beef up its already impressive machine-vision algorithms (think book-scanning and maps) but, in the long run, the ability to use CAPTCHAs that are near-impossible for bots to decipher allows Google to raise the bar significantly in the fight against bots and spam.

According to Adam O'Donnell, director of emerging technologies at anti-spam firm Cloudmark, believes this is a very smart purchase by Google.

"Google already has the best computer-vision techniques.  The way ReCAPTCHA works, this means that Google will only be presenting CAPTCHA words that are very difficult for a bot to defeat," O'Donnell explained.

"By pushing up that boundary, it will make CAPTCHA technology much better."

The words presented by the ReCAPTCHA service come from scanned printed material (archival newspapers and old books).   As Google explains here, computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.

In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). This technology also powers large scale text scanning projects like Google Books and Google News Archive Search. Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users. So we'll be applying the technology within Google not only to increase fraud and spam protection for Google products but also to improve our books and newspaper scanning process.

CAPTCHAs have served to slow down spammers and phishers but in many cases, they are easily defeated by bots or humans hired to manually solve text in the squiggly-lined images.

[ Dancho Danchev: Google's CAPTCHA experiment and the human factor ]

Earlier this year, Researchers at Google recently released a paper detailing a new CAPTCHA system consisting of correct image rotation (Socially Adjusted CAPTCHAs) whose main purpose is to make it easier for humans, and much harder for bots to recognize them.