A common "script kiddie" technique to find vulnerable online computer systems is to scan a range of IP addresses for responsive known services, such as Telnet or SSH, and then attempt to log in using the default username and password. A crude physical analogy would be a burglar who walks from house to house in a neighbourhood, checking to see whether anyone has forgotten to put a lock on their door.
With an opportunistic attack, given enough "neighbourhoods" and enough time, one could potentially gain an insight into how poorly protected people are. However, with the burglar being a single person, doing so would take them a prohibitively long time — unless, theoretically, they were able to recruit vulnerable households and send them to different neighbourhoods to do the same.
That was the idea behind the Internet Census 2012 and the Carna botnet, an illegal project that, in addition to answering the big question of how many IPv4 addresses are in use, highlighted just how many people left their metaphorical front doors unlocked by using default passwords and user logins.
According to an anonymous paper published on Bitbucket, the Carna botnet outsourced the task of port scanning millions of IP addresses to vulnerable machines, beginning with a scan of about 100,000 IP addresses.
"With 100,000 devices scanning at 10 probes per second, we would have a distributed port scanner to port scan the entire IPv4 internet within 1 hour," the paper said.
The botnet itself did not spread using any sophisticated techniques, instead trying the username/password combinations of admin/admin, root/root, and both usernames with blank passwords. Between March and December 2012, the Carna botnet consisted of 420,000 clients.
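The paper's one-hour estimate can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope calculation, not anything taken from the paper's code: it assumes a single probe per address (a real port scan probes several ports per host, so it would take longer) and uses the article's figure of roughly 3.6 billion non-reserved IPv4 addresses. The function name `sweep_seconds` and the extrapolation to 420,000 clients are illustrative assumptions.

```python
# Back-of-the-envelope check of the scan-time estimate quoted from the paper.
# Assumption: one probe per address; real scans send many probes per host.

PROBES_PER_SECOND = 10              # per device, from the paper's quote
NON_RESERVED_IPV4 = 3_600_000_000   # "3.6 billion or so", per the article


def sweep_seconds(devices, rate=PROBES_PER_SECOND, addresses=NON_RESERVED_IPV4):
    """Seconds for `devices` scanners to send one probe to every address."""
    return addresses / (devices * rate)


planned = sweep_seconds(100_000)     # fleet size used in the paper's estimate
full_fleet = sweep_seconds(420_000)  # hypothetical: every reported client scanning

print(f"100,000 devices: {planned / 60:.0f} minutes")
print(f"420,000 devices: {full_fleet / 60:.1f} minutes")
```

Under these assumptions, 100,000 devices cover the address space in 60 minutes, which matches the paper's claim; the 420,000-device figure is an extrapolation and not a number the paper reports.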
Despite having control of such a large network, the botnet remained relatively benign, as its intent was to collect data for research. The paper states that the author had "no interest to interfere with default device operation", and the botnet made no permanent changes to the machines it infected. Removing any uploaded files was a simple matter of rebooting the device, and a number of precautions were taken to ensure that the binaries would not interfere with normal device usage.
"Our binaries were running with the lowest possible priority, and included a watchdog that would stop the executable in case anything went wrong. Our scanner was limited to 128 simultaneous connections, and had a connection timeout of 12 seconds," the paper said.
"We also uploaded a readme file containing a short explanation of the project, as well as a contact email address to provide feedback for security researchers, ISPs, and law enforcement who may notice the project."
While the census found that the majority of vulnerable devices were consumer routers, the paper notes that the vulnerable devices also included a few industrial control systems and border gateway protocol (BGP) routers, which help to determine country-wide routes across the wider internet. In simplistic terms, these routers help ISPs to find optimal routes through which to direct traffic, at times for an entire country, so compromising them could make it possible to cut entire nations off from the rest of the world. This protocol has been subject to attack in the past, including an incident last year and another in 2010 that affected 15 percent of the world's traffic.
Although Carna spread to 420,000 devices, it does not represent the total number of vulnerable ones; technical limitations, such as insufficient space for Carna's binaries, meant that it could not run on certain devices. From the unique hardware addresses (MAC addresses) the botnet collected from vulnerable devices, the paper suggests that there were about 1.2 million unprotected devices in its census, and many more devices that were simply unable to report their hardware address.
"A lot of devices and services we have seen during our research should never be connected to the public internet at all. As a rule of thumb, if you believe that 'nobody would connect that to the internet, really nobody', there are at least 1,000 people who did. Whenever you think, 'that shouldn't be on the internet, but will probably be found a few times', it's there a few hundred thousand times. Like half a million printers, or a million webcams, or devices that have root as a root password."
As insecure as these systems were, and despite the significant consequences that their exploitation could have brought, the author chose to ignore all traffic passing through them that was not relevant to the study, in order to respect users' privacy.
A twist in the study came during Carna's initial scan of a few thousand devices, when it discovered another bot, called Aidra, that the paper's author said was "clearly made for malicious actions". After examining what targets Aidra was looking at, Carna's author decided that, on machines Carna infected, it would close the telnet service that Aidra was using to spread.
In this manner, if Carna infected a machine it knew was being targeted by Aidra, it would gather its research data, then "auto-immunise" the machine against the impending Aidra attack, thereby protecting the victim from being abused.
"We figured that the collateral damage as a result of this action would be far less than Aidra exploiting these devices."
However, Carna's policy of ensuring that no changes were ever made permanent meant that restarting the device would remove Carna as well as any protection it offered against Aidra.
The big question of the census, however, was how many of the 3.6 billion or so non-reserved IPv4 addresses are actually in use.
"That depends on how you count. 420 million pingable IPs plus 36 million more that had one or more ports open, making 450 million that were definitely in use and reachable from the rest of the internet. 141 million IPs were firewalled, so they could count as 'in use'. Together, this would be 591 million used IPs. 729 million more IPs just had reverse DNS records. If you added those, it would make for a total of 1.3 billion used IP addresses. The other 2.3 billion addresses showed no sign of usage."
As for the rest of the data, the author has made it available for anyone who is willing to download the 568GB compressed package. Uncompressed, the data amounts to 9TB of raw logfiles, covering results from ping, reverse DNS, port scan, traceroute, and TCP/IP fingerprint tests.
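The headline counts that the paper derives from its ping, port scan, and reverse DNS results can be tallied in a few lines. The sketch below uses only the rounded figures quoted in this article; the variable names are illustrative, and the one-million granularity is an assumption forced by that rounding. Note that the first two categories sum to 456 million, while the paper states the subtotal as 450 million, presumably a rounded figure; the later totals follow the paper's own 450 million.

```python
# Tally of the "used address" counts quoted from the paper.
# All inputs are the rounded figures given in the article.

MILLION = 1_000_000

pingable = 420 * MILLION
open_ports_only = 36 * MILLION     # not pingable, but one or more ports open
firewalled = 141 * MILLION
reverse_dns_only = 729 * MILLION
non_reserved = 3_600 * MILLION     # "3.6 billion or so"

reachable_sum = pingable + open_ports_only   # straight sum of the two counts
reachable_quoted = 450 * MILLION             # subtotal as stated in the paper

in_use = reachable_quoted + firewalled       # reachable plus firewalled
with_rdns = in_use + reverse_dns_only        # adding reverse-DNS-only addresses
no_sign_of_use = non_reserved - with_rdns    # remainder of the address space

for label, value in [
    ("reachable (summed)", reachable_sum),
    ("reachable (quoted)", reachable_quoted),
    ("in use", in_use),
    ("in use incl. reverse DNS", with_rdns),
    ("no sign of usage", no_sign_of_use),
]:
    print(f"{label:>26}: {value / MILLION:,.0f} million")
```

The results (591 million, 1,320 million, and 2,280 million) agree with the paper's quoted 591 million, 1.3 billion, and 2.3 billion once rounded.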
"We hope other researchers will find the data we have collected useful, and that this publication will help raise some awareness that while everybody is talking about high-class exploits and cyberwar, four simple stupid default telnet passwords can give you access to hundreds of thousands of consumer as well as tens of thousands of industrial devices all over the world."