Data breaches are so common now that your eyes may glaze over at the news of yet another public exposure of personally identifiable information (PII) and customer records.
Even so, some cases still stand out -- such as the discovery of a database described as "perhaps the biggest and most comprehensive email database I have ever reported" by the researcher who uncovered it.
According to Bob Diachenko, who worked alongside security researcher Vinny Troia, the 150GB MongoDB instance in question contained four separate collections of data.
In total, Diachenko and Troia found 808,539,939 records. The largest collection, named "mailEmailDatabase," was separated into three sections:
- Emailrecords (798,171,891 records)
- emailWithPhone (4,150,600 records)
- businessLeads (6,217,358 records)
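As a quick sanity check on the figures above, the short Python sketch below sums the three listed section counts and compares the result with the reported total. The variable names are illustrative only; the numbers are the ones quoted in this article.

```python
# Record counts as quoted in the list above.
section_counts = {
    "Emailrecords": 798_171_891,
    "emailWithPhone": 4_150_600,
    "businessLeads": 6_217_358,
}
reported_total = 808_539_939  # total quoted by the researchers

# Sum the listed sections and compare against the reported figure.
listed_total = sum(section_counts.values())
difference = reported_total - listed_total

print(f"Sum of listed sections: {listed_total:,}")
print(f"Reported total:         {reported_total:,}")
print(f"Difference:             {difference:,}")
```

The three listed sections add up to 808,539,849 records, 90 fewer than the reported total. The researchers' figures do not say where the remainder sits; since the instance reportedly held four collections, it may lie outside the three sections listed here.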
The information on offer was "more detailed than just the email address and included PII," the researchers say, with information relating to ZIP codes, phone numbers, physical addresses, email addresses, genders, user IP addresses, and dates of birth all available to anyone with an Internet connection.
After cross-referencing the database with records obtained from Troy Hunt's HaveIBeenPwned database -- a collection of known leaks and exposures which visitors can use to find out if they have been involved in a data breach -- Diachenko was able to ascertain that the database was not just a bulk dump of stolen information, as was the case with the Collection 1 leak.
"Although not all records contained the detailed profile information about the email owner, a large number of records were very detailed," the researcher added.
The MongoDB instance did provide some clues as to whom the data may belong -- namely, a company called "Verifications.io."
At the time of writing, the company's website is unavailable, but cached pages show that Verifications.io describes itself as an email marketing firm with a particular specialization in circumventing spam traps and hard bounces.
One service the company offers is called "Enterprise Email Validation," which allows customers to upload email lists for marketing and verification purposes. To validate an address, the service simply sends it a test email; if that email bounces, the message is added to a bounce list for testing later.
However, these messages appear to have been stored in plaintext, without any form of encryption, once uploaded to the service.
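The validation flow described above can be sketched roughly as follows. This is a minimal illustration, not Verifications.io's actual implementation: `validate_list`, `fake_send`, and the example addresses are all hypothetical, the delivery step is a stand-in that never touches the network, and recording the bounced address (rather than the message itself) is an assumption about what the bounce list holds.

```python
from typing import Callable, Iterable, List, Tuple


def validate_list(
    addresses: Iterable[str],
    send_test: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split addresses into (validated, bounce_list).

    send_test is a hypothetical delivery function that returns True
    when the test email is delivered and False when it bounces.
    """
    validated: List[str] = []
    bounce_list: List[str] = []
    for address in addresses:
        if send_test(address):
            validated.append(address)
        else:
            # Assumption: bounced entries are kept for testing later.
            bounce_list.append(address)
    return validated, bounce_list


def fake_send(address: str) -> bool:
    """Stand-in for real delivery: anything at invalid.example bounces."""
    return not address.endswith("@invalid.example")


uploaded = ["alice@example.com", "bob@invalid.example", "carol@example.org"]
validated, bounce_list = validate_list(uploaded, fake_send)
print("validated:", validated)
print("bounce list:", bounce_list)
```

Passing the delivery step in as a function keeps the sketch self-contained; a real service would replace `fake_send` with actual mail delivery and bounce handling.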
While a list of email addresses and some PII may not seem like a big deal, Diachenko laid out a potential attack vector in which such a database would prove invaluable to threat groups.
A hacker who had drawn up a list of companies to compromise and obtained a list of potentially usable credentials could, rather than brute-forcing each one, upload all of the email addresses to a service such as Verifications.io.
By doing so, the threat actor saves time and reduces the chance of being exposed, while the service validates their email cache to find the true targets worth pursuing -- and provides PII which could be used in identity theft or social engineering attacks.
The researchers reported their findings to Verifications.io, which pulled its website offline in response. The database was also taken down on the same day.
"In the response they identified that what I had discovered was public data and not client data, so why close the database and take the site offline if it indeed was 'public'?" Diachenko noted. "In addition to the email profiles this database also had access details and a user list of (130 records), with names and credentials to access FTP server to upload / download email lists (hosted on the same IP with MongoDB). We can only speculate that this was not meant to be public data."