GitHub: We won't take down any of your content unless we really have to

The latest transparency report from Microsoft's open-source code-sharing platform places freedom of expression above all else.

GitHub's transparency report: How and when it removes or blocks content

It's not just social-media giants that are working hard to convince their users that transparency is their guiding principle – albeit with mixed results. Microsoft-owned GitHub is also beating that drum. 

The code-sharing platform, which doubles as a developers' social network, has just released its 2019 transparency report, detailing how and to whom it discloses user information, and on which grounds it removes or blocks content. 

Abby Vollmer, senior manager of policy at GitHub, said the organization favors keeping as much content on the platform as possible, rather than removing information. GitHub believes, indeed, that content moderation can raise free-expression concerns.

"Being transparent about content-removal policies, and restricting content removal as narrowly as possible, are among the United Nations free speech expert's recommendations to platforms for promoting free expression in content moderation online," said Vollmer. "At GitHub, we do both." 


Much of the public spotlight has recently focused on getting rid of harmful content on social media, such as child nudity or terrorist propaganda. In the second half of 2019, for example, Instagram reported that it had taken action against more than 1.6 million pieces of content containing depictions of suicide or self-harm. 

However, the user content shared on GitHub differs from the posts published on networks like Instagram or Twitter. The platform hosts and shares software code, and lets developers 'fork' each other's projects, change them and merge them together – all in the spirit of open source.

More often than not, the requests that GitHub receives relating to user accounts and content are therefore legal requests from law-enforcement agencies, in the context of criminal investigations – rather than civil litigants reporting harmful content. Or, said Vollmer, the platform handles requests related to copyright infringement, which need to be treated with caution given the patent-free nature of open source.

According to Vollmer, GitHub therefore takes extra care before responding to requests to access user information or block content. In 2019, for instance, almost 96% of the requests the platform received to disclose user information came from law enforcement. 

Vollmer said GitHub only releases information to third parties "when the appropriate legal requirements have been satisfied", which usually means a subpoena, court order or search warrant is necessary.

Of the 218 requests the platform received in 2019, only 165 were fulfilled. However, Vollmer noted that GitHub received over three times as many requests to disclose user information in 2019 as it did in 2018.

When it comes to requests to remove or block content judged unlawful, GitHub consistently checks that the notice comes from an official government agency, that it was sent by an official, and that the specific basis of illegality has been identified, before removing the content.

"We block the content in the narrowest way possible," said Vollmer. "For instance, we would block content only in the jurisdiction(s) where the content is illegal – not everywhere."

Of the 16 government requests for take-downs that GitHub processed in 2019, half came from Russia and another six came from China. In 2018, the platform processed nine requests, which all came from Russia.

Another type of request to take down content can be motivated by copyright concerns – and filed by copyright holders, not necessarily governments. 

GitHub said it processed 1,762 copyright notices in 2019, which led to 14,320 projects being taken down. Although the number seems high, it represents only about one-hundredth of a percent of the repositories on GitHub. 
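As a rough sanity check of that "one-hundredth of a percent" figure: the report itself doesn't state a repository total, but assuming GitHub's publicly cited count of roughly 100 million repositories around this period (an assumption, not a figure from the report), the arithmetic works out:

```python
# Rough sanity check of the "one-hundredth of a percent" claim.
# The 100-million repository total is an assumption based on GitHub's
# publicly stated figures circa 2018-2019, not taken from the report.
taken_down = 14_320        # projects removed via copyright notices in 2019
total_repos = 100_000_000  # assumed repository count

share_percent = taken_down / total_repos * 100
print(f"{share_percent:.4f}% of repositories")  # ≈ 0.0143%
```

At that scale the share is on the order of 0.01%, consistent with the article's characterization.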

Copyright issues have long been a sticking point for GitHub, where content is mostly free for anyone to use. In 2018, for example, the EU moved to mandate content filters for all internet content distributors to spot copyright infringement. 

In the case of open source, where some developers use copyright as a tool to grant other developers freedom of distribution, the new rules could have led to the disruption of the entire ecosystem. 

Vollmer said at the time that the EU's proposals were "overly broad in their scope and, as applied to GitHub and our user community, could be so cumbersome as to prevent developers from being able to launch their work". 


When a copyright notice is submitted, GitHub lets the user who posted the allegedly infringing content send a counter-notice asking the platform to reinstate the content, if they believe the take-down was a mistake. 

Vollmer reported 37 counter-notices in 2019, and also noted that the company received many "incomplete" or "insufficient" notices on copyright infringement, which GitHub didn't act on.

GitHub's latest report is the fifth of the platform's annual transparency assessments. However, with 40 million users, it seems that the organization has less on its plate than social-media networks like Facebook. 

Mark Zuckerberg's platform has 2.38 billion users, recently registered hundreds of thousands of government requests for user data, and took down almost 2.6 million pieces of content based on copyright reports.