Meta warns its new chatbot may forget that it's a bot

Meta has released BlenderBot for all to test - but maybe don't believe everything it says.
Written by Liam Tung, Contributing Writer
Image: Getty Images/Josef Kubes / EyeEm

Meta has released its BlenderBot 3 chatbot to the public with baked-in "safety recipes" that it claims have reduced offensive responses by up to 90%, but the bot can still tell lies and be rude.

The BlenderBot 3 demo is accessible online for US-based visitors and should be available in other countries soon.

"BlenderBot 3 is capable of searching the internet to chat about virtually any topic, and it's designed to learn how to improve its skills and safety through natural conversations and feedback from people 'in the wild'," Meta says in a blog post.

A key aim of Meta's research in releasing the chatbot to the public is to develop safety measures for it.

"We developed new techniques that enable learning from helpful teachers while avoiding learning from people who are trying to trick the model into unhelpful or toxic responses," it says, referring to earlier research in automatically distinguishing between helpful users and trolls. 

Microsoft's 2016 mishaps with the public beta of its Tay.ai chatbot showed exactly what can happen when humans interact with a chatbot that can be trained to make awful and racist comments.

Meta warns that BlenderBot 3 is also capable of saying some bad things. This seems to be Meta's main unresolved problem, despite having a model that can learn from feedback.

"Despite all the work that has been done, we recognize that BlenderBot can still say things we are not proud of," it says on a BlenderBot 3 FAQ page.

"This is all the more reason to bring the research community in. Without direct access to these models, researchers are limited in their ability to design detection and mitigation strategies."

Meta is encouraging users to report when the chatbot says anything offensive. It also warns the bot can make false or contradictory statements. Chatbots can even forget that they are bots and experience "hallucinations", Meta's term for when a bot confidently says something that is not true.

"Unfortunately yes, the bot can make false or contradictory statements. Users should not rely on this bot for factual information, including but not limited to medical, legal, or financial advice," it notes.

"In research, we say that models like the one that powers this bot have 'hallucinations', where the bot confidently says something that is not true. Bots can also misremember details of the current conversation, and even forget that they are a bot."

It's called BlenderBot because Meta's previous research found that teaching a bot to "blend" many conversational skills improved performance more than training it to learn one skill at a time.

Google is aiming to improve the "factual groundedness" of chatbots and conversational AI through LaMDA, or "Language Model for Dialogue Applications", which it unveiled in mid-2021. Google trained LaMDA on dialogue, aiming for it to engage in free-flowing conversations. It released LaMDA 2 at its I/O conference in May and offered researchers the AI Test Kitchen app to let others experience what it "might be like to have LaMDA in your hands".

LaMDA hit the spotlight in June after Google engineer Blake Lemoine publicly released the document "Is LaMDA Sentient?", which he'd shown to Google executives in April. In it, he suggested the model might be "sentient". Many disagreed that LaMDA had reached sentience.

LaMDA is a 137-billion-parameter model that took almost two months of running on 1,024 of Google's Tensor Processing Unit chips to train. But Google hasn't released LaMDA to anyone but its own engineers.

Meta says BlenderBot 3 is a 175-billion-parameter "dialogue model capable of open-domain conversation with access to the internet and a long-term memory." Making it publicly available allows Meta to study its capabilities in a more diverse setting than research studies allow.

Meta notes that it combined two machine learning techniques, SeeKeR and Director, to build conversational models that learn from interactions and feedback.

"Initial experiments already show that as more people interact with the model, the more it learns from its experiences and the better and safer it becomes over time — though safety remains an open problem," Meta notes. 
