Gaming AI beats human top scores by cheating

The artificial intelligence system had no qualms about exploiting ancient bugs to win.

Written by Charlie Osborne, Contributing Writer March 2, 2018 at 3:12 a.m. PT

An artificial intelligence (AI) system tasked with playing retro 1980s arcade game Q*bert managed to secure impossibly high scores by exploiting an ancient bug.

The bot was programmed to use evolutionary strategy (ES) algorithms, a class of machine learning (ML) methods that allow the AI to learn, adapt, and change tactics depending on the situation and other players.

ES and reinforcement learning (RL), which is based on behavioral psychology and revolves around a simple reward system, have already been used to beat human players in games including chess and Texas Hold 'em.

In the poker game, as the AI learned how its competitors played, it was able to reach levels of "superhuman performance."

When it comes to Q*bert, however, the AI didn't seem to mind cheating to rack up those points.

As reported by The Register, researchers from the University of Freiburg, Germany, implemented ES in a gaming AI to compare its performance with that of RL.

In a paper, the researchers found that ES can beat RL in a number of cases.

Q*bert requires players to jump from cube to cube to change their color, while avoiding obstacles and enemies, in order to progress to the next round.

However, the AI was able to find and exploit a bug which caused the platforms to blink, allowing it to bounce and rack up close to a million points.

Speaking to the publication, paper co-authors Patryk Chrabaszcz and Frank Hutter said that the AI was fed roughly 1.5 million parameters to create the ES system and that, as high scores were the goal, exploiting the ancient bug was the most natural path to take.

"To find the bug, the agent had to first learn to almost complete the first level -- this was not done at once but using many small improvements," the researchers said. "We suspect that at some point in the training one of the offspring solutions encountered the bug and got a much better score compared to its siblings, which in turn increased its contribution to the update -- its weight was the highest one in the weighted mean."
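The update the researchers describe, in which offspring are scored and the best ones pull a weighted mean toward themselves, can be illustrated with a small sketch. This is not the researchers' code: the names `toy_score`, `es_step`, and `run_es` are hypothetical, the rank-based weighting is one common choice rather than a confirmed detail of the paper, and a simple five-number function stands in for the game score that a real agent would earn by playing Q*bert.

```python
import math
import random


def toy_score(params):
    """Hypothetical stand-in for a game score: higher is better, best at all 3.0."""
    return -sum((p - 3.0) ** 2 for p in params)


def es_step(parent, sigma, n_offspring, n_parents, rng):
    """One generation: sample offspring, rank them, return their weighted mean."""
    # Each offspring is the parent plus Gaussian noise.
    offspring = [
        [p + sigma * rng.gauss(0.0, 1.0) for p in parent]
        for _ in range(n_offspring)
    ]
    # Keep the highest-scoring offspring, best first.
    ranked = sorted(offspring, key=toy_score, reverse=True)[:n_parents]
    # Rank-based weights (an assumption): the best offspring gets the largest weight.
    raw = [math.log(n_parents + 0.5) - math.log(i + 1) for i in range(n_parents)]
    total = sum(raw)
    weights = [w / total for w in raw]
    # The new parent is the weighted mean of the selected offspring.
    return [
        sum(w * child[d] for w, child in zip(weights, ranked))
        for d in range(len(parent))
    ]


def run_es(dim=5, generations=200, sigma=0.1, n_offspring=20, n_parents=5, seed=0):
    """Repeat the update from an all-zero starting point and return the final parent."""
    rng = random.Random(seed)
    parent = [0.0] * dim
    for _ in range(generations):
        parent = es_step(parent, sigma, n_offspring, n_parents, rng)
    return parent


if __name__ == "__main__":
    final = run_es()
    print("start score:", toy_score([0.0] * 5))
    print("final score:", toy_score(final))
```

In this toy setting, an offspring that happens to score far better than its siblings takes the top rank and therefore the largest weight in the mean, which is the mechanism the researchers suspect allowed a single lucky discovery of the bug to shape later generations.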

The AI's exploration of different paths and tactics during tests did not always find and exploit this bug, but in eight out of 30 tests, the exploit was used. RL systems used as a comparison did not reach the high scores of their ES counterpart.

However, RL tended to outperform ES in racing and shooting games, where understanding context, rather than patterns, was crucial.
