The technologies at the heart of most data leaks
AWS S3
Amazon Simple Storage Service, or Amazon S3, is a data storage service offered as part of Amazon Web Services (AWS). Over the past few years, S3's complicated controls and settings have made it a nightmare to work with and have led to many incidents in which companies left S3 buckets exposed online, leaking sensitive data along the way. Here's a short list of the biggest incidents:
- An unsecured S3 server exposed thousands of FedEx customer records
- An AWS S3 error exposed GoDaddy business secrets
- Accenture left a huge trove of highly sensitive data, including "keys to the kingdom," on exposed servers
- Customer records for at least 14 million Verizon subscribers, including phone numbers and account PINs, were exposed via an S3 bucket
- A Verizon AWS S3 bucket containing over 100 MB of data about the company's internal billing system was also found exposed online
- An exposed S3 database leaked the personal details of job applicants who held Top Secret government clearance
- Another S3 server exposed the details of 198 million American voters
- National Credit Federation leaked US citizen data through unsecured AWS bucket
- Nigerian airline Arik Air also leaked customer data via an exposed S3 bucket
- Pocket iNet ISP exposed 73GB of data including secret keys, plain text passwords
- An S3 leak at Alteryx left 123 million American households exposed to fraud and spam
- AgentRun, an insurance startup, also leaked sensitive customer health data via a misconfigured Amazon S3 bucket
- Donald Trump's campaign website also leaked intern resumes via an S3 bucket
- Spyware firm SpyFone also left customer data, recordings exposed online via an S3 server
- Booz Allen Hamilton, a top DOD contractor, leaked 60,000 files, including employee security credentials and passwords to a US government system
- An AWS S3 server leaked the personal details of over three million WWE fans who registered on the company's sites
- An auto-tracking company leaked the details of over half a million cars and car owners
- Voting machine firm Election Systems & Software (ES&S) left an S3 bucket exposed online that contained the personal records of 1.8 million Chicago voters
- Dow Jones leaked the personal details of 2.2 million customers
- An S3 bucket leaked data of thousands of Australian government and bank employees
- Password manager Keeper also exposed an S3 server
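
One setting behind incidents like these is a bucket policy that grants access to everyone. As a minimal sketch, not an official AWS tool, the Python function below flags policy statements that allow a wildcard principal. The function name `public_statements` and the sample policy are hypothetical and not taken from any real incident; the `Statement`, `Effect`, and `Principal` fields follow the standard AWS policy JSON layout. The check ignores `Condition` blocks and ACLs, so a hit is a prompt for review rather than proof of exposure.

```python
import json


def _is_wildcard(principal):
    """Return True if a Principal element matches everyone."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_statements(policy_json):
    """Return the Allow statements in a bucket policy that apply to any principal."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow" and _is_wildcard(s.get("Principal"))
    ]


# Hypothetical policy for an example bucket, used only for illustration.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

if __name__ == "__main__":
    for statement in public_statements(SAMPLE_POLICY):
        print("Publicly allowed:", statement.get("Sid"), statement.get("Action"))
```

In the sample, only the `PublicRead` statement is flagged, because `TeamOnly` names a specific account.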

MongoDB
MongoDB, a NoSQL database solution, has been at the heart of many data leaks, probably as many as AWS S3 incidents, if not more. Here's a small --and very incomplete-- list:
- MongoDB server leaks 11 million user records from California e-marketing service
- MongoDB server leaks data of nearly 700,000 Amex India customers
- Data management firm Veeam leaks 445 million records
- Garmin's Navionics exposed data belonging to thousands of customers
- CVs containing sensitive info of over 202 million Chinese users left exposed online
- French news site L'Express exposed reader data online
- OCR software dev exposes 200,000 customer documents
- MongoDB server exposes babysitting app's database
- Health care data of two million people in Mexico exposed online
- Almost 9.5 million PII records leaked by data aggregator Adapt
- Children's charity Kars4Kids leaks info on thousands of donors
- Database of abandoned iOS app exposes details for 198,000 users
- Microsoft careers website was leaking data via a misconfigured MongoDB database
For securing MongoDB servers, the advice listed in this blog post covers the first steps that most database admins should take.
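
Two of the most common first steps are restricting which network interfaces the server listens on and turning on access control. The sketch below checks both, assuming the mongod configuration file has already been parsed into a nested Python dict (the parsing step is left out). `net.bindIp`, `net.bindIpAll`, and `security.authorization` are mongod configuration options; the function name `mongod_config_warnings` is hypothetical.

```python
def mongod_config_warnings(config):
    """Return warnings for a mongod configuration given as a nested dict."""
    warnings = []
    net = config.get("net") or {}
    security = config.get("security") or {}

    # bindIp is a comma-separated list; 0.0.0.0 and :: mean "all interfaces".
    bind_ips = [ip.strip() for ip in str(net.get("bindIp", "")).split(",") if ip.strip()]
    if net.get("bindIpAll") or any(ip in ("0.0.0.0", "::") for ip in bind_ips):
        warnings.append("listens on all network interfaces")

    if security.get("authorization") != "enabled":
        warnings.append("access control (security.authorization) is not enabled")

    return warnings


if __name__ == "__main__":
    # Hypothetical configuration resembling many of the exposed servers.
    for warning in mongod_config_warnings({"net": {"port": 27017, "bindIp": "0.0.0.0"}}):
        print("WARNING:", warning)
```

A configuration that raises both warnings is reachable from any network and accepts unauthenticated clients, which is the combination behind leaks of this kind.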
ElasticSearch
ElasticSearch is a distributed search engine, which in lay terms can be boiled down to a "very powerful search system." It is ubiquitous because it's very good at its job.
Developed to be deployed on internal networks, ElasticSearch installations have suffered from the same problems that have plagued MongoDB installations: companies install them and forget to add a password or a firewall to protect the indexed data, which in many cases includes highly sensitive information.
ElasticSearch has been at the heart of a large number of breaches recently, but we're only going to list a few from the past few months:
- FitMetrix user data exposed via passwordless ElasticSearch server cluster
- Sky Brasil exposes data of 32 million subscribers
- Brazil's largest professional association suffers massive data leak
- Real-time location data for over 11,000 Indian buses left exposed online
- ElasticSearch server exposed the personal data of over 57 million US citizens
- Online casino group leaks information on 108 million bets, including user details
- VOIPO database exposed millions of call and SMS logs, system data
- Millions of bank loan and mortgage documents have leaked online
- Data management giant Rubrik leaked a massive database of client data
For information on securing an ElasticSearch cluster, please go to this guide.
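
A quick way to confirm that a cluster is not answering anonymous requests is to send it one request without credentials and look at the HTTP status code. The sketch below does that with Python's standard library. The function names `classify_status` and `probe` are hypothetical, and the mapping from status codes to verdicts is a simplifying assumption: a 200 means the server answered without credentials, while 401 or 403 suggests that authentication is enforced. Run it only against servers you administer.

```python
import urllib.error
import urllib.request


def classify_status(status_code):
    """Map an HTTP status code from an unauthenticated request to a rough verdict."""
    if status_code == 200:
        return "open: answered without credentials"
    if status_code in (401, 403):
        return "protected: credentials required"
    return "inconclusive: HTTP %d" % status_code


def probe(base_url, timeout=5):
    """Send one unauthenticated GET to base_url and classify the response."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError) as err:
        return "unreachable: %s" % err


if __name__ == "__main__":
    # 9200 is the default ElasticSearch HTTP port; the host is a placeholder.
    print(probe("http://localhost:9200/"))
```

An "unreachable" result from outside the network is the desired outcome for a cluster that is meant to be internal only.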
Kibana
Kibana is a software package that works as a visual interface (GUI) for viewing ElasticSearch data. It is almost always installed with ElasticSearch clusters.
Many of the security breaches reported as being caused by ElasticSearch are, in reality, caused by admins leaving the Kibana interface without a password, while the ElasticSearch server underneath is well-secured. The opposite scenario is also valid, where Kibana has a password, but the ElasticSearch server is left wide open on the internet.
It's hard to distinguish after the fact which of the ElasticSearch breaches were caused by Kibana and which by ElasticSearch itself. We created a separate slide so that ElasticSearch server owners understand they need to password-protect their Kibana apps as well as the ElasticSearch server that runs beneath them.
rsync
Rsync is a data backup utility that allows computers to synchronize and transfer files between different workstations. It's an internet-based service, which automatically means it's almost guaranteed that someone misconfigured it at least once. And, they have.
The most notorious incidents in which rsync misconfigurations led to data breaches include (1) a 2016 case in which a leaky rsync server exposed new evidence about an inmate's suicide, (2) a spam mail operator who leaked 1.37 billion email addresses via an rsync service, and (3) the Oklahoma Department of Securities, which leaked details about FBI investigations earlier this year.
Apache CouchDB
CouchDB is a lesser-known open source NoSQL database solution. It is coded in Erlang, and is currently developed by the Apache Software Foundation.
Just like its other database brethren, CouchDB can be misconfigured and can leak data. Past data breaches caused by CouchDB instances include the Thomson Reuters World-Check database of people suspected of terrorist activities, a database for managing alarm systems at Oklahoma banks and government agencies, and a CouchDB storing details for 154 million US voters.
The CouchDB security guide is available here.
etcd
Etcd is a database server that is most often used in corporate and cloud computing environments. It is a standard part of CoreOS, an operating system developed for cloud hosting environments, where it is used as part of the OS' clustering system. CoreOS uses an etcd server as central storage for passwords and access tokens for applications deployed via its clustering/container system.
The technology is relatively new, and there has been only one leak caused by an etcd database recorded to date --that of Finnish phone maker Nokia. However, the potential for more is there, as there are over 2,200 etcd servers currently exposed online, some of which may be freely accessible to anyone.
Firebase
Firebase is a Backend-as-a-Service offering from Google that contains a vast collection of services that mobile developers can use in the creation of mobile and web-based applications.
An Appthority report from June 2018 found that thousands of iOS and Android mobile applications were exposing over 113 GB of data via 2,271 misconfigured Firebase databases.
JIRA
Jira is a proprietary issue tracking product developed by Atlassian. It has a reputation for being hard to configure because of the terms it uses in its UI controls. Over the past few years, organizations that accidentally made Jira boards public include NASA, the United Nations, and... thousands more.
Trello
Trello is a web-based project management application owned by Atlassian. It suffers from the same confusing wording in various UI controls, which sometimes leads to accidental exposures of companies' internal project boards. Accidental exposures have happened at the United Nations, the UK government, and the Canadian government.
Sometimes, the accidental exposure of a Trello board can become much worse if employees post passwords for other services and servers on that board, which appears to be a common practice.
Kubernetes
Kubernetes is a relatively new container orchestration system that's usually deployed on cloud server infrastructure. It's used for managing large IT networks and for quickly deploying app containers across multiple servers. If left exposed online, such systems usually expose the keys to the kingdom, allowing attackers to access existing server containers or deploy new ones with specific tasks in mind.
The most prominent companies that suffered a breach because of an exposed Kubernetes instance include Tesla Motors and Weight Watchers.
Although there are over 20,000 Kubernetes systems currently reachable online, most are properly secured, and there have been very few leaks that originated from Kubernetes so far.
But give it time! The technology is still new, and no doubt there will be many snafus in the coming months and years. If Kubernetes admins are looking into securing such systems, this page is the first place to go.
Docker
Docker, a tool designed to make it easier to create, deploy, and run applications (as containers), is one of the most popular technologies in cloud server environments these days.
Despite a 2017 joint report from Cisco and Rapid7 finding that there were over 1,000 Docker containers left exposed online without any authentication, there haven't been any major data breaches reported via this technology so far.
Nonetheless, the potential for abuse is there, and it's being exploited already. Not by data thieves, but by crypto-mining groups.
To avoid any potential Docker container hijacks --which in turn can lead to data thefts or accidental leaks-- there are some steps that server owners can take.
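
One such step is making sure the Docker daemon does not accept remote API connections without TLS client verification. As a minimal sketch, the function below inspects the contents of a daemon configuration file (`daemon.json`) and lists any `tcp://` listeners that are not covered by the `tlsverify` setting. `hosts` and `tlsverify` are daemon configuration keys; the function name `exposed_tcp_hosts` and the sample configuration are hypothetical. The check cannot see listeners passed as command-line flags, and it also flags loopback addresses, which carry less risk.

```python
import json


def exposed_tcp_hosts(daemon_json):
    """Return tcp:// listeners from a Docker daemon.json that lack TLS client verification."""
    config = json.loads(daemon_json)
    if config.get("tlsverify"):
        return []  # clients must present a certificate signed by the configured CA
    return [host for host in config.get("hosts", []) if host.startswith("tcp://")]


if __name__ == "__main__":
    # Hypothetical configuration that exposes the API on every interface.
    sample = '{"hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]}'
    for host in exposed_tcp_hosts(sample):
        print("Unauthenticated listener:", host)
```

Anyone who can reach an unauthenticated listener can start containers on the host, which is how crypto-mining groups have been abusing exposed systems.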
Redis
Redis is an open source in-memory data structure store that can be used as a database, cache system, and message broker. By design, Redis does not come with any default authentication system, and all data stored inside its memory is stored in clear text.
Over the past several years, there have been numerous reports warning that there are tens of thousands of Redis servers currently available online without a password.
While a small number of companies have lost data to hackers after leaving Redis servers exposed online, most hacker groups have focused on using these servers for crypto-mining operations, mainly because Redis servers tend to run on more powerful hardware than other database systems.
A 2018 Imperva study found that 75 percent of all Redis servers left online without a password had already been infected with one or more types of malware. While companies might not be interested in securing servers against malware attacks, these infections are still considered breaches, and companies will be forced to send breach notifications when such an incident (considered an intrusion) is detected. So, in the end, it wouldn't hurt server owners to take a look at the Redis security page and follow the tips and advice on that page.
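
To see whether a Redis server is one of those running without a password, it is enough to send it a `PING` before authenticating and read the first reply. The sketch below does this over a plain socket. The function names `classify_redis_reply` and `probe_redis` are hypothetical, and the reply prefixes it checks (`+PONG`, `-NOAUTH`, `-DENIED`) are the ones Redis servers commonly return, treated here as an assumption rather than an exhaustive list. Run it only against servers you administer.

```python
import socket


def classify_redis_reply(reply):
    """Interpret the raw reply to an unauthenticated PING."""
    if reply.startswith(b"+PONG"):
        return "open: no password required"
    if reply.startswith(b"-NOAUTH"):
        return "protected: password required"
    if reply.startswith(b"-DENIED"):
        return "protected: protected mode refused the connection"
    return "inconclusive"


def probe_redis(host, port=6379, timeout=5):
    """Send an inline PING to a Redis server and classify the first reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"PING\r\n")
        return classify_redis_reply(conn.recv(1024))


if __name__ == "__main__":
    # Classify a canned reply so the example runs without a live server.
    print(classify_redis_reply(b"-NOAUTH Authentication required.\r\n"))
```

A server that answers `+PONG` to an anonymous client will also accept commands that read or overwrite its data.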