Google's CAPTCHA experiment and the human factor

Any research risks irrelevance if it starts with the wrong research questions, takes the wrong perspective, or, as in this case, fights the wrong enemy: automated bots attempting to recognize CAPTCHAs.

Written by Dancho Danchev, Contributor April 20, 2009 at 9:00 p.m. PT

Researchers at Google recently released a paper detailing a new CAPTCHA system based on rotating an image to its correct orientation (Socially Adjusted CAPTCHAs), whose main purpose is to make CAPTCHAs easier for humans and much harder for bots to solve. But because this and many other research papers focus on "bots vs CAPTCHAs", the research overlooks a growing trend, one that -- if the new approach is implemented -- would make the new CAPTCHA far easier to abuse than the one it replaces.

How come? Despite the persistent attempts by malware-infected hosts to recognize CAPTCHAs, at the end of the day it is a data entry team capable of solving 200,000 CAPTCHAs and charging $2 per 1,000 entries that drives the CAPTCHA solving economy.
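To put those figures in perspective, here is a minimal sketch of the arithmetic. It assumes the $2 per 1,000 rate applies uniformly to the whole batch, and the function name is purely illustrative.

```python
def solving_cost_usd(captchas: int, rate_per_thousand_usd: float = 2.0) -> float:
    """Cost of having `captchas` images solved by hand.

    Assumption: a flat price per 1,000 solved entries, as quoted in the
    article ($2 per 1,000). Real vendors may price differently.
    """
    return captchas / 1000 * rate_per_thousand_usd


# The team's quoted capacity of 200,000 CAPTCHAs, priced at $2 per 1,000.
print(solving_cost_usd(200_000))  # 400.0
```

At that price, the full 200,000-CAPTCHA batch costs a customer about $400, which is why human solving scales so comfortably.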

A lot has changed since "Inside India's CAPTCHA solving economy" was published last year.

Recognition rates have improved -- remember, you have to pass a CAPTCHA solving speed test in order to become a qualified CAPTCHA solver -- and the vendors of these services, primarily boutique shops and a few consolidated ones, have gone mainstream. So much so that Russian-based CAPTCHA solving services are now outsourcing the process to Indian workers and charging their customers more than they pay their Indian colleagues.

In February this year, a novel approach was introduced by a Russian boutique vendor of CAPTCHA solving services - a community-driven revenue sharing scheme for CAPTCHA breaking. The concept mimics reCAPTCHA's ease of implementation and ubiquity, but with a malicious purpose in mind. It not only allows webmasters to implement CAPTCHA solving forms on their registration pages, but also offers idle forum/community members the opportunity to solve CAPTCHAs and earn revenue in the process, with the successfully solved CAPTCHAs fed back into the vendor's system to fulfill yet another bulk request for bogus account registration.

Go through related CAPTCHA posts: Microsoft's CAPTCHA successfully broken; Gmail, Yahoo and Hotmail's CAPTCHA broken by spammers; Spammers attacking Microsoft's CAPTCHA -- again; Spam coming from free email providers increasing; Gmail, Yahoo and Hotmail systematically abused by spammers

Perhaps even more disturbing is the fact that these vendors are naturally Web 2.0 aware, and are clearly working with some of the most popular vendors of blackhat search engine optimization and automatic account registration/spamming tools by offering them the capability to empower their customers with CAPTCHA solving capabilities through API keys.

A practical example of how these human networks efficiently exploit CAPTCHA systems originally designed to fight bots, and facilitate cybercrime in the process, is the social networking worm Koobface (Koobface Facebook worm still spreading; Dissecting the Latest Koobface Facebook Campaign; Dissecting the Koobface Worm's December Campaign; The Koobface Gang Mixing Social Engineering Vectors).

Koobface is eating every social network's internal CAPTCHA barrier for breakfast, not because the Koobface gang is taking advantage of a CAPTCHA recognition algorithm, but because it's relying on CAPTCHA solving services. Sergei Shevchenko at ThreatExpert demonstrated the process in December 2008, and pointed out that:

"In the real test, Facebook.com asked the Koobface to resolve the CAPTCHA image that reads "suffer accorn" - this image was pretty noisy for image recognition algorithms to resolve it successfully. But Koobface does not attempt to resolve it by itself. It submits this image to its C&C server. The server replies correct answer in about 34 seconds. Once the answer is received, Koobface submits the message via Facebook's compromised account including correct CAPTCHA answer."

With human networks and bots clearly converging (see graph), Sergei also discussed a very pragmatic solution for defeating Koobface back then - injecting a large number of successfully accepted CAPTCHA images into Koobface's command and control server, having them resolved by the CAPTCHA solving vendor, and the bill sent to the Koobface gang:

"Detailed analysis of traffic between Koobface and its command-and-control server allowed tapping into its communication channel and injecting various CAPTCHA images in it to assess response time and accuracy. The results are astonishing – the remote site resolved them all.

But here is a twist: uploading a large number of random CAPTCHA images into its communication channel will load its processing capacity, potentially up to a denial-of-service point. Well, if not that far, then at least it could potentially harm its business model, considering that the cost of resolving all those injected images would eventually be paid by the Koobface gang."
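A rough sketch of what such an injection could cost the gang is shown below. It rests on loud assumptions: the roughly 34-second response seen in the single test above is treated as the time one solver needs per image, the $2 per 1,000 rate quoted earlier is assumed to be what the gang pays, and the worker count is a hypothetical parameter. None of these are confirmed for Koobface's actual vendor.

```python
def injection_impact(
    images: int,
    seconds_per_solve: float = 34.0,     # assumption: one observed response time
    rate_per_thousand_usd: float = 2.0,  # assumption: rate quoted earlier in the article
    workers: int = 1,                    # hypothetical number of solvers working in parallel
) -> tuple[float, float]:
    """Return (bill in USD, hours needed to clear the injected backlog)."""
    bill_usd = images / 1000 * rate_per_thousand_usd
    backlog_hours = images * seconds_per_solve / workers / 3600
    return bill_usd, backlog_hours


# Example: 100,000 injected images handled by 10 solvers.
bill, hours = injection_impact(100_000, workers=10)
print(f"bill: ${bill:.2f}, backlog: {hours:.1f} hours")
```

Under these assumptions the bill itself is modest, so the larger effect of a flood of junk images would likely be the hours of solver capacity it ties up.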

The ongoing arms race is not bots vs CAPTCHAs; it's human networks efficiently exploiting systems originally meant to distinguish between humans and bots. No CAPTCHA can survive a human, since it was meant to be recognized by one. Making it easier for humans to recognize, as in Google's recent experiment, therefore ultimately makes it easier for the CAPTCHA solving economy to scale.

CAPTCHA is in pain; humans, not bots, are slowly killing it. What do you think?
