
VMs and backups: are yours securely stored, or indexed in Google?

Through recent research, I've unearthed an alarming number of VMs and backups in Google's index from companies, universities, and home users. Make sure your directories are protected!
Written by Stephen Chapman, Contributor

Question: Are you storing any virtual machines or HDD/SSD backups on systems that are connected to the Web? If so, then you might want to take a moment -- after reading this article, of course -- to make doubly sure that the locations you're storing them in are closed off to search engine spiders and URL traversing (modifying a URL in an attempt to land in different directories).

Recently, I decided to undertake installing OS X inside of a VM. Much trial and error later, I had achieved what I sought to do... but then, curiosity began beating upon my brain, and I started to wonder if I could find any open directories, via Google, with OS X VMs that others had created. Well, it wasn't long after finding a few of them that I decided to see just how far the VM rabbit hole went, so I proceeded to search for VMs of OS X, Windows, and Linux. The results were alarming, to say the least.

Below is a very low-impact sampling of the type of havoc that could be wreaked if someone got hold of, say, your Windows 7 VM install:

Windows 7 Enterprise SP1 and Office 2010 licenses, as well as VPN specs

Apologies for the low-res quality of that image, but all you need to take from it is that I extracted a Windows 7 Enterprise key, an Office 2010 Professional Plus key, and pre-defined VPN specifications. There were other internal tools packed in the VM, but they were of no interest to me. From this point, however, finding Windows 7 Enterprise media and Office Professional Plus 2010 media is a cinch. Naturally, I deleted the VM and anything that stemmed from it, save for the blurry screenshot above. My intentions aren't nefarious, but suffice it to say, plenty of others' are.

Now, I was going to dive into some specific search queries to show you just how easy it is to go about finding all of these VMs, but I've made the decision to nix that plan. Instead, I'll throw out one very basic example:

intitle:index.of Linux.vmdk

That's a ridiculously basic search, but you get the idea. There are tons of unique terms and phrases one can use with advanced search operators to reel in some serious VMs, and it's not just Google that you can do this with. There are many specialty search engines out there that index even more of this type of stuff.

Now, what about backups? Whatever your backup solution may be (Norton, Acronis, etc.), suffice it to say that if you think finding VMs in Google is bad, then you should see the innumerable backup images that are out there: hundreds of thousands of gigs, just begging to be downloaded and rummaged through. And unlike the example above, I won't be providing a single sample query for this one. Put simply, I am astonished by the number of NAS shares and open directories that are sharing backups -- plenty of which are unencrypted, no less -- from individuals, companies, and schools.

In closing, the reality of the situation is that I doubt very many of you reading this will find that you have VM/backup shares indexed in Google; however, the gravity of the situation should motivate you to make sure such a thing doesn't happen in the first place -- that is, forward this to your IT team or verify site permissions yourself.

Also, try running some site: queries in Google with your domain name and see what you can unearth (example: site:yoursitehere.com). Add relevant keywords to get more specific, if need be (example: site:yoursitehere.com vmdk), or simply negate terms that appear often throughout your site so that you don't see any pages containing them in your search results (example: site:yoursitehere.com -keyword1 -keyword2). Lastly, searching for open directories on your site can yield worthy results, too (example: site:yoursitehere.com intitle:index.of).
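
If you have more than one domain to check, the queries above are easy to generate in bulk. Here's a minimal Python sketch that assembles them as plain strings for you to paste into Google; the function name and its parameters are my own invention for illustration, and it doesn't contact any search engine itself.

```python
def build_audit_queries(domain, keywords=(), negated_terms=()):
    """Return Google query strings for auditing a single domain.

    Hypothetical helper: it only formats the operators described above
    (site:, -term, intitle:index.of) and performs no searches itself.
    """
    base = f"site:{domain}"
    queries = [base]

    # One narrower query per keyword of interest (e.g. "vmdk").
    for keyword in keywords:
        queries.append(f"{base} {keyword}")

    # A single query that excludes terms appearing all over the site.
    if negated_terms:
        negations = " ".join(f"-{term}" for term in negated_terms)
        queries.append(f"{base} {negations}")

    # Finally, look for open directory listings on the domain.
    queries.append(f"{base} intitle:index.of")
    return queries


if __name__ == "__main__":
    for query in build_audit_queries(
        "yoursitehere.com", ["vmdk"], ["keyword1", "keyword2"]
    ):
        print(query)
```

Run as-is, it prints the four example queries from the paragraph above, one per line.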

Through my own consulting endeavors, I've audited quite a few sites that turned out to have sensitive information sitting in publicly accessible directories (Internet- and intranet-facing alike). From there, it's simply a matter of a search engine spider crawling its way to such directories, then indexing them for all the world to potentially see.
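
To give a rough idea of what a spider "sees" when it lands on one of these directories, here's a small Python sketch that takes the HTML of a page you've already fetched from your own server and reports any links that look like VM disks or backup images. Everything in it is an assumption for illustration: the names are hypothetical, the extension list is just a handful of common formats, and the "Index of" title check only matches the default listing style that the intitle:index.of query targets.

```python
from html.parser import HTMLParser

# Illustrative list only (assumption): a few common VM disk and
# backup image extensions. Adjust it to whatever your tools produce.
RISKY_EXTENSIONS = (".vmdk", ".vhd", ".vdi", ".tib", ".gho")


class ListingParser(HTMLParser):
    """Collects the page title and every link target on the page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def find_exposed_images(html):
    """Return risky-looking links if the page resembles an open directory."""
    parser = ListingParser()
    parser.feed(html)
    # Default auto-generated listings are typically titled "Index of /path".
    if not parser.title.strip().lower().startswith("index of"):
        return []
    return [
        link for link in parser.links
        if link.lower().endswith(RISKY_EXTENSIONS)
    ]


if __name__ == "__main__":
    sample = (
        "<html><head><title>Index of /backups</title></head><body>"
        '<a href="../">Parent Directory</a>'
        '<a href="Linux.vmdk">Linux.vmdk</a>'
        '<a href="notes.txt">notes.txt</a>'
        "</body></html>"
    )
    print(find_exposed_images(sample))
```

An empty result only means this particular heuristic found nothing. A custom listing template or an unusual file extension would slip right past it, so treat it as a first pass rather than a verdict.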

Be cognizant of where your data lives and how it's accessed! And understand that search is incredibly powerful, even if you don't quite understand how it works.

