Twitter creates 'Safety Mode' to temporarily block accounts caught insulting users

The feature will only be available to a small group of English-language users on iOS, Android and Twitter.com.
Written by Jonathan Greig, Contributor

Twitter is rolling out a new feature called Safety Mode that temporarily blocks accounts for seven days if they are found to be insulting users or repeatedly sending hateful remarks.

The feature will only be available to a small group of English-language users on iOS, Android and Twitter.com, the company explained in a blog post on Wednesday. 

Accounts will also be blocked if they send "repetitive and uninvited replies or mentions," according to Twitter senior product manager Jarrod Doherty. 

"When the feature is turned on in your Settings, our systems will assess the likelihood of a negative engagement by considering both the Tweet's content and the relationship between the Tweet author and replier," Doherty said. 

"Our technology takes existing relationships into account, so accounts you follow or frequently interact with will not be autoblocked. Authors of Tweets found by our technology to be harmful or uninvited will be autoblocked, meaning they'll temporarily be unable to follow your account, see your Tweets, or send you Direct Messages."
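The decision flow Doherty describes can be pictured with a small sketch. This is an illustration only, not Twitter's implementation: the class name, the method names, the `harm_score` input and the 0.8 threshold are all hypothetical, and the article does not say how Twitter scores a Tweet's content or what counts as "frequently interact." Only the seven-day duration, the exemption for accounts you follow or frequently interact with, and the ability to undo an autoblock come from the announcement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Set

AUTOBLOCK_DURATION = timedelta(days=7)  # seven-day block, as reported
HARM_THRESHOLD = 0.8  # hypothetical cutoff; Twitter has not published one


@dataclass
class SafetyModeAccount:
    """Hypothetical model of a user who has turned Safety Mode on."""
    following: Set[str] = field(default_factory=set)
    frequent_contacts: Set[str] = field(default_factory=set)
    autoblocked_until: Dict[str, datetime] = field(default_factory=dict)

    def has_relationship(self, author: str) -> bool:
        # Accounts you follow or frequently interact with are exempt.
        return author in self.following or author in self.frequent_contacts

    def review_reply(self, author: str, harm_score: float, now: datetime) -> bool:
        """Autoblock `author` when a reply looks harmful and there is no
        existing relationship. Returns True if an autoblock was applied.

        `harm_score` stands in for whatever content assessment the real
        system performs; here it is just a number between 0 and 1.
        """
        if self.has_relationship(author):
            return False
        if harm_score < HARM_THRESHOLD:
            return False
        self.autoblocked_until[author] = now + AUTOBLOCK_DURATION
        return True

    def is_autoblocked(self, author: str, now: datetime) -> bool:
        # The block is temporary: it lapses once the expiry time is reached.
        expiry = self.autoblocked_until.get(author)
        return expiry is not None and now < expiry

    def undo_autoblock(self, author: str) -> None:
        # Autoblocks can be reviewed and undone by the user at any time.
        self.autoblocked_until.pop(author, None)


if __name__ == "__main__":
    now = datetime(2021, 9, 1)
    account = SafetyModeAccount(following={"friend"})
    print(account.review_reply("friend", 0.99, now))    # followed account
    print(account.review_reply("stranger", 0.95, now))  # no relationship
    print(account.is_autoblocked("stranger", now + timedelta(days=3)))
```

In this sketch the relationship check runs before the content check, so an account the user follows is never autoblocked regardless of the score. That ordering is one reading of Doherty's description, not a confirmed detail.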


A screenshot of what Safety Mode will look like. 


Doherty added that unwelcome Tweets have gotten in the way of the kinds of conversations Twitter wants its users to continue having, prompting the creation of the Safety Mode tool and other features added in recent years to protect people. 

Users can see details about the Tweets and accounts flagged by Safety Mode. Before an autoblock period ends, Twitter will notify the user and send a recap of the situation. 

"We won't always get this right and may make mistakes, so Safety Mode autoblocks can be seen and undone at any time in your Settings. We'll also regularly monitor the accuracy of our Safety Mode systems to make improvements to our detection capabilities," Doherty explained. 

"We want you to enjoy healthy conversations, so this test is one way we're limiting overwhelming and unwelcome interactions that can interrupt those conversations. Our goal is to better protect the individual on the receiving end of Tweets by reducing the prevalence and visibility of harmful remarks." 

In recent years, Twitter has worked with human rights groups and mental health organizations to get feedback about its platform and the changes needed to better protect users from discrimination, racism, sexism and other issues that have become rampant on the site. 

Twitter also created a Trust and Safety Council, which it said pushed for changes to Safety Mode that would make the feature less likely to be manipulated. The council also nominated Twitter accounts to join the inaugural group of users that will have access to Safety Mode, with a particular emphasis on providing the tool to people from marginalized communities and female journalists.

Digital human rights group Article 19 -- which is a member of the Trust and Safety Council -- said it provided feedback on Safety Mode "to ensure it entails mitigations that protect counter-speech while also addressing online harassment towards women and journalists."

"Safety Mode is another step in the right direction towards making Twitter a safe place to participate in the public conversation without fear of abuse," Article 19 said in a statement.

Doherty noted that Twitter has taken part in other discussions about ways women can customize their experience on the site through tools like Safety Mode. Twitter will see how the tool is used and make adjustments as it rolls it out to the larger Twitter user base.

The site has been making changes in recent months to cut down on the disinformation and abuse that have caused outrage among users for many years. In August, the site announced that it was conducting a test that would allow users in the US, South Korea and Australia to report misleading Tweets, which have gained prominence during the COVID-19 pandemic and subsequent vaccine rollout.
