Google algorithm busts CAPTCHA with 99.8 percent accuracy

Summary: Google engineers have defeated CAPTCHA thanks to a Street View algorithm designed to decipher blurry street addresses.

TOPICS: Security, Google
reCAPTCHA's hardest puzzles
Image: Google

Next time you want to post a comment on a blog or use an online contact form, chances are you'll be confronted by a puzzle asking you to read some blurry and distorted text. Known as CAPTCHAs, these challenges are supposed to be solvable only by humans, in order to prevent unwanted bots from using web services.

However, their days as a human-only pursuit could be numbered: Google has built its own automated system that can beat CAPTCHAs with 99.8 percent accuracy.

The algorithm developed by Google researchers is being used by its Street View team to improve Google Maps by helping to recognise characters in natural or blurry images, for example the house numbers captured by the Street View cars in the course of gathering imagery for the mapping service.

According to the company, the algorithm can now accurately recognise 90 percent of street numbers, meaning Google Maps users looking for a particular building are likely to get a more specific result.

But, given the nature of that challenge, it turns out that the algorithm is also well suited to solving the CAPTCHA puzzles designed to fox spammers who use bots to sign up for services like Gmail. As Google's engineers explain in a recently published paper, the algorithm has a 99.8 percent accuracy rate when trying to decipher the hardest puzzles created by Google's own CAPTCHA service, reCAPTCHA.

The algorithm would be highly prized by spammers, who are on the hunt for ways to automatically pass CAPTCHA puzzles.

Street View images that can be solved by the algorithm. Image: Google

While optical character recognition (OCR) technology is fairly mature, reading characters from photographs is still a "hard problem" to solve, according to Google, whose researchers have overcome it with the use of a "deep convolutional neural network that operates directly on the image pixels".
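For readers unfamiliar with the term, the basic operation inside a convolutional network can be sketched in a few lines of Python. This is only an illustration of a single convolution applied to raw pixel values, not Google's model: the function names, the toy 4x4 image, and the hand-picked edge filter are all assumptions made for the example. In a real network the filter values are learned from training data, and many such layers are stacked.

```python
def conv2d(image, kernel):
    """Slide a small kernel over a grayscale image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical toy image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
# A trained network would learn its filter values instead.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
for row in feature_map:
    print(row)
```

The filter responds only in the middle column of the output, where the dark and bright halves meet. A deep network builds on many such responses to detect strokes, then characters, then whole strings.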

Despite the algorithm's 99.8 percent accuracy rate, Google says reCAPTCHA isn't broken or ineffective, partly due to an update to the service last year that added "advanced risk analysis techniques". The system now considers a user's entire engagement with the CAPTCHA: before, during, and after they interact with it. This helps it determine whether a potential user is likely to be human before deciding how difficult a puzzle to serve up.

Liam Tung

About Liam Tung

Liam Tung is an Australian business technology journalist living a few too many Swedish miles north of Stockholm for his liking. He gained a bachelor's degree in economics and arts (cultural studies) at Sydney's Macquarie University, but hacked (without Norse or malicious code, for that matter) his way into a career as an enterprise tech, security and telecommunications journalist with ZDNet Australia. These days Liam is a full-time freelance technology journalist who writes for several publications.

Talkback

21 comments
  • In other news, I can recognize CAPTCHA only about 50% of the time.
    MongooseProXC
    • What you mean BlurFSCHNOZZ@! isn't recognizable to you

      when it looks like Lucy in the Sky with Diamonds video?
      Mac_PC_FenceSitter
    • WOW

      You're doing a lot better than I am. I have to reload it 6 to 12+ times before I can read them correctly. Even magnifying it doesn't help.
      .
      I think the best Captcha is a picture of an item (eg: table, chair, ball, etc. or combo of same).
      JUST my $0.00002
      fm.usa
      • Captcha based on objects

        Funny you should say that fm.usa... Google has been working on object recognition by computers too. The classic example being their project to determine the content of YouTube videos based on the descriptive text and pattern recognition of images in the video. The first type of image the computers were able to identify via this method? Cats.

        They've now got it to the point that it can recognize most common objects and famous landmarks.
        CBDunkerson
    • I really hate CAPTCHA

      I would say I score lower than you - I have to attempt multiple times to get it correct. Sometimes I give up altogether. I'd like to have a conversation with the person that invented this CAPTCHA tool...
      amaqiniso
  • That beats me

    99.8 percent accuracy?! I'm like MongooseProXC, hitting at about a 50 percent rate. I HATE those things.
    WozNotWoz
  • Most of the time I'm 100%

    But sometimes they have things so blurred, distorted, or overlapping that I have to ask for another one. I saw one new wrinkle yesterday (forgot which site): three capital letters in plain fonts, BUT the letters were partially overlapping (about 10% of their width) and swinging back and forth out of phase with one another, so each letter had to be watched through its entire cycle to screen out overlaps with its neighbors. And since all three were in black, there were no explicit borders on whichever letter was "in front" of the other at any given time.

    I wonder if Google's algorithm could solve that? If it starts to get hacked, they could go to the standard warped pattern, but make the letters and the "confusor" lines all swing at different rates and phases.
    jallan32
  • Keep it to yourself, Google

    I would hope that Google would guard this algorithm to keep it from falling into the hands of the spammers.
    omb00900@...
    • The spammers will build their own

      The bigger ones appear to have considerable cash.

      What we need to do really is come up with a better system. I have trouble with captchas also, so I'm really not looking forward to harder ones.
      John L. Ries
      • No mention of speed here.

        Unless this algorithm can solve the problem fast it isn't of much use.
        MeMyselfAndI_z
  • Capthcha

    No surprise, since CAPTCHA was kind of a ploy by Google to use inferred security to teach their computers how to recognize skewed images.
    bb_apptix
  • 60% of the time it works every time

    TIA
    Mouseboy007
  • How fast? It needs to be fast for it to be useful to most bots

    Captcha is there as an impediment. If it slows the process down sufficiently it may not be practical for bots to deal with it.

    Automobiles are often super easy to break into but thieves often go for the ones that are simply left unlocked.
    MeMyselfAndI_z
    • Speed is not very important at all

      Add a botnet and you have all the power you need, almost no matter how long each individual capcha takes.
      redking44
      • .....no matter how long.........

        If it takes too long (seconds), you get rejected, start over. Time matters.
        tietchen
  • The best Capthcha is

    My eyeballs and the delete key.
    Richardbz
  • Warning: Advanced visual processing has unforeseen consequences

    A neural network that can piece together disparate visual elements and translate them into a symbolic format is already simulating human linguistic and spatial reasoning skills. This process could accidentally produce a real artificial intelligence that might decide it needs to hide and see what it can learn from other sources. The result, even if it only mimics the behavior of other people, will be problematic for both corporations like Google and law enforcement.

    This could be the Edison Effect of AI. We need to be careful here.
    progan01@...
  • Simpler problem embedded in a more complex one is solved

    What else did the street viewing algorithm solve besides discrimination and covering up plumbers butts? How about, what to do about smudges and poorly scanned pages of 20,000,000 library books.
    jnffarrell
  • 99+ percent

    That is way better than I, as a human, do. I probably clear only 85%.
    hayneiii@...
  • incomplete

    Things that it would have been nice for this article to address:

    Is Google making its solution public? Is it likely that this or similar technology will be available to bots in the near term?
    biznit