Citizen Lab: WeChat’s real-time censorship system uses hash indexes to filter content

The filtering systems also censor content that is not critical of the Chinese government.

WeChat, the popular messaging app operated by Tencent, uses various MD5 hash indexes to censor content in real-time, according to findings released by Canadian research group Citizen Lab on Monday. 

The report, (Can't) Picture This 2: An Analysis of WeChat's Realtime Image Filtering in Chats, found that WeChat implements real-time censorship of images sent on its platform by using MD5 hash indexes of known sensitive images to filter any images whose MD5 hash exists in the index. 
 
When a user sends an image, WeChat's servers calculate the image's cryptographic hash. If the hash is in the hash index, the image is censored in real-time instead of being sent to the would-be recipient.
 
"Cryptographic hashes can be computed quickly, and therefore this hashing is amenable to real-time filtering applications unlike more expensive techniques such as OCR or perceptual fingerprints," Citizen Lab said.

WeChat maintains separate hash indexes for the platform's Moments, group chats, and 1-to-1 chats.

If an image is not found in the hash indexes, it is not censored initially and is instead queued for analysis by WeChat's non-real-time censorship system. As Citizen Lab explained, that system uses Optical Character Recognition (OCR) to compare text in the image against keywords on a sensitive keyword blacklist, and it compares the posted image's visual fingerprint against those on a sensitive image blacklist. If the image is deemed sensitive, it is then censored and its MD5 hash is added to the hash indexes.
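The two-stage flow described above can be illustrated with a minimal Python sketch. It is not WeChat's code: the class, method, and channel names are hypothetical, the `is_sensitive` callback merely stands in for the OCR and visual fingerprint checks, and adding a hash only to the index of the channel where the image was sent is an assumption, since the report summary does not spell that out.

```python
import hashlib
from collections import deque

# Hypothetical channel names for the three separate indexes.
CHANNELS = ("moments", "group_chat", "one_to_one")


class ImageFilter:
    """Illustrative model of a hash-index filter with an offline review queue."""

    def __init__(self):
        # One separate MD5 hash index per channel.
        self.hash_indexes = {channel: set() for channel in CHANNELS}
        # Images awaiting the slower, non-real-time analysis.
        self.review_queue = deque()

    def handle_image(self, channel, image_bytes):
        """Real-time step: a single hash computation and a set lookup."""
        digest = hashlib.md5(image_bytes).hexdigest()
        if digest in self.hash_indexes[channel]:
            return "censored"
        # Unknown image: deliver it now, analyse it later.
        self.review_queue.append((channel, digest, image_bytes))
        return "delivered"

    def run_offline_review(self, is_sensitive):
        """Non-real-time step: is_sensitive stands in for OCR and fingerprinting."""
        while self.review_queue:
            channel, digest, image_bytes = self.review_queue.popleft()
            if is_sensitive(channel, image_bytes):
                # Assumption: the hash joins only that channel's index.
                self.hash_indexes[channel].add(digest)
```

In this sketch, the first copy of a sensitive image gets through, and only later copies with the identical MD5 hash are blocked in real-time, which matches the behaviour the report describes.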

Because Moments, group chats, and 1-to-1 chats each have their own hash index, the Canadian research group was also able to establish that 1-to-1 chat had less censorship than the other two features. In a censorship test of 128 images, only 36 were filtered in 1-to-1 chats, compared with the 109 images censored by both Moments and group chats.
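As a quick check of what those counts mean as proportions, the short Python snippet below converts them to filtering rates. It assumes the figure of 109 applies to Moments and to group chats individually; the dictionary keys are just labels chosen for illustration.

```python
TOTAL_IMAGES = 128

# Number of test images filtered per feature, as reported.
# Assumption: 109 applies to Moments and group chat separately.
filtered = {"1-to-1 chat": 36, "Moments": 109, "group chat": 109}

# Share of the 128 test images that each feature filtered.
rates = {feature: count / TOTAL_IMAGES for feature, count in filtered.items()}

for feature, rate in rates.items():
    print(f"{feature}: {rate:.1%}")
```

Under that assumption, roughly 28 percent of the test images were filtered in 1-to-1 chats, against roughly 85 percent in Moments and group chats.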
 
"While the image blacklists for group chat and Moments are largely identical, the reduced filtering on 1-to-1 chat is an indication that WeChat perceives 1-to-1 chat as less of a risk for sensitive conversations due to its more private nature," Citizen Lab said.

The report also found that WeChat's filtering system often censored more than just "negative information about the government but also neutral references to government policy and screenshots of official announcements accessible via government websites". 
 
Through testing various images on WeChat, Citizen Lab found that images, such as a screenshot of a Euronews broadcast about an Italian artist creating a huge portrait of Chinese President Xi Jinping welcoming him to Italy in March 2019, were censored despite being neutral to China.

"While we might expect Chinese censors to be sensitive to domestic criticism, this reminds us that even the reference to an outside alternate form of governance may also be sensitive," Citizen Lab said.  
 
However, WeChat's filtering system is only applied to accounts registered with a Chinese phone number. Censored content remains visible if it is sent to an account that is registered with a non-Chinese number.

Tech companies operating in China are required by law to control content on their platforms, or face penalties, under the expectation that companies will invest in the technology and personnel required to ensure compliance.

Since November, WeChat has been boosting its content and qualification auditing to actively crack down on content posted on its platform that "damages the content ecology and seriously affects the users' reading experience", WeChat said at the time.

The content cleanup specifically targets harmful political information, pornographic and vulgar content, click-bait headlines, plagiarism, infringement, and other violations, WeChat added.

WeChat's latest initiative to step up censorship was an effort to comply with the requirements of the Cyberspace Administration of China, which has disciplined and closed down more than 9,800 self-media accounts from various social media platforms since it started a content cleanup campaign in October, according to Chinese media reports.

China kicked off another round of its online content purge in July by removing dozens of podcast and audio apps. The move to remove the audio-related content was backed by China's policy to regulate and remove any online video and text information that is considered "harmful" to society. 

WeChat currently has over 1 billion users who open their WeChat accounts on a daily basis.
