Google Books and Scholar users beware: AI-generated nonsense is flooding search results

The book and academic search engines are including titles produced entirely by ChatGPT. Here's how to identify them.
Written by Artie Beaty, Contributing Writer

Do you use Google Books to find books on certain topics? Or Google Scholar to dive into academic research? Here's something you should know: These sites, which let users "search the world's most comprehensive index of full-text books" and search academic literature across any discipline, have started indexing low-quality, AI-generated books that are presented as the work of real, human authors.

This troubling trend was first spotted by 404 Media, which used a simple trick to track down AI-generated books. If you query ChatGPT about current events, you'll often be greeted with the phrase, "As of my last knowledge update." That's just OpenAI's way of letting you know that the chatbot's knowledge stops at a cutoff date and that it may not have information on anything more recent.

If you search Google Books for "As of my last knowledge update," you'll run across books that apparently include ChatGPT-generated text verbatim. A quick search for that phrase turns up page after page of titles. Some of the books are about ChatGPT and quote the phrase to discuss the chatbot's limits, but dozens of others are trying to pass off AI-generated writing as the work of a human author.

For example, one book about the Boston Marathon bombing used the phrase "As of my last knowledge update in September 2021, the case continued to be subject to legal proceedings, and the ultimate outcome was still uncertain" when addressing the attack's perpetrators. The "author" of that book has 50 other works, including titles about the Cold War, 9/11, America's founding fathers, ancient Rome, famous boxers, and famous Native Americans.
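The same trick can be applied to any text you already have on hand, such as an excerpt copied from a book preview. The snippet below is a minimal sketch, not the method 404 Media used: the function name `find_telltale_phrases` and the phrase list are illustrative assumptions, and a match only flags a passage for a closer look, since books about ChatGPT quote the phrase legitimately.

```python
import re

# Phrase reported in the article. Extending this list is an assumption;
# add other chatbot boilerplate only if you have seen it yourself.
TELLTALE_PHRASES = ["as of my last knowledge update"]


def find_telltale_phrases(text, phrases=TELLTALE_PHRASES):
    """Return (phrase, snippet) pairs for each telltale phrase found in text."""
    # Collapse line breaks and repeated spaces so a phrase split across
    # lines in a scanned or copied page still matches.
    normalized = re.sub(r"\s+", " ", text)
    hits = []
    for phrase in phrases:
        pattern = re.escape(phrase)
        for match in re.finditer(pattern, normalized, flags=re.IGNORECASE):
            # Keep a little surrounding context to help a human reviewer.
            start = max(0, match.start() - 40)
            end = min(len(normalized), match.end() + 40)
            hits.append((phrase, normalized[start:end]))
    return hits


if __name__ == "__main__":
    sample = (
        "As of my last knowledge update in September 2021, the case "
        "continued to be subject to legal proceedings."
    )
    for phrase, snippet in find_telltale_phrases(sample):
        print(f"Found '{phrase}' in: ...{snippet}...")
```

A hit is a prompt for human review rather than proof: check whether the surrounding text is discussing the chatbot or simply reproducing its output.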

Every one of those titles was published in 2023 (ZDNET's own Jack Wallen took 30 years to publish that many books) and runs between 50 and 100 pages. Browsing through them, I found that every one offered a superficial narrative that at best resembled a Wikipedia entry and at worst looked like ChatGPT spitting out facts. A quick search online also revealed that these books are for sale on Amazon and at other retailers.

When I plugged the same phrase into Google Scholar, which is supposed to be a repository for human research, I got back 19 pages of results, including papers on at-risk youth, diabetes, autism, COVID-19, and airline pilot fatigue.

AI-generated content spreading on the web is nothing new. It's a bit worrisome, however, to see it show up alongside human-written work inside resources that are supposed to be reliable, like Google Books and Google Scholar.

Speaking to 404 Media, Google said it would "continue to evaluate our approach as the world of book publishing evolves" but didn't mention removing these results from either service.
