Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

Summary: PHP, used by millions of sites around the Web, has a not-so-hidden secret on its Web site: a directory full of pirated content, config files containing user name and password information, and more.

Update: The directory has now been taken care of; however, for the time being, Google's cache of the directory remains intact.

It's interesting what a night of advanced Google querying can yield. On the heels of running across USA Today's prototype Windows 8 application in a designer's profile (thanks to advanced querying), I've now stumbled upon a directory on PHP's official Web site that contains a number of pirated Blu-ray movie rips, config files with user names and passwords, games, music, and more.

If you like CSI or taking part in investigative research, you're going to love this post. I'm not just going to show you the directory; I'm also going to break down how I found it, explain how things like this get found and indexed by Google, and wrap it all up with key takeaways. But before I delve into all of that, let's begin with a screen shot of the directory for quick reference:

Now, as you can see, the most obvious files within the directory are the Blu-ray titles and music albums. Admin types will quickly notice the wget logs and the folder titled "config.save" -- both containing telling information, as you would expect. The directory "fsx" contains a copy of Microsoft's Flight Simulator X, and the rest of what you see there is source content for Web sites.

So, where did this directory come from, and how did Google find it to index? After all, this is obviously a directory that wasn't meant for public consumption. Well, put simply, a link to this directory has been posted in some form or fashion somewhere on the Web. But how do we find out where that is? Sometimes, you simply can't... but that won't stop us from trying!

First, Google provides an operator that shows us sites that link to an address you specify. Since the directory we're interested in is "id.php.net/downloads/", we'll try that operator (link:) to see if we can find any sites that link to it by leveraging the following query:

link:id.php.net/downloads

Hmm.
That returned no results, so let's remove "/downloads" from the query and see what that does:

link:id.php.net

That didn't return any relevant results, either, but why? Well, it could be due to a number of factors -- anything from "nofollow" (an attribute you can set on a link to tell Google, "I don't want to pass any link juice from my site to this site"); to link: operator fickleness; to Google simply not wanting to show every site in its index that links to a given site.

With that in mind, let's try searching for the URL as a textual string on other sites instead. It's completely possible that a link to this directory resides somewhere on PHP.net itself, but we'll start this leg of our investigation by excluding results from php.net:

"id.php.net/downloads" -site:php.net

Looking through the results, we see numerous sites that contain the text "http://id.php.net/downloads.php". That's not *quite* what we're looking for, but let's visit that URL to see if it redirects to "http://id.php.net/downloads/", which could be the answer to our question. Nope! It does redirect, but to a page completely unrelated to what we've discovered.

Continuing down the list of results in Google, we finally come to a result yielding a rather interesting URL structure:

http://id.php.net/downloads/src/redhat/RHEL_5.5/x86/rhel-server-5.5-i386-dvd.iso

This is quite promising, so let's look at the cached version of the page in the results. Pressing CTRL + F to bring up a search box in the browser and searching for id.php takes us right to the spot on the page. Is Google smart enough to recognize a URL in textual format, parse it, and try to crawl it? Absolutely. But before we call it a day, let's go back over to the search results to see what else we find.

Ah, yes.
In the search results, we see a result with the following link:

http://id.php.net/downloads/src/redhat/

Let's visit the cached version of that page and once again press CTRL + F to search for id.php and locate it on the page. What do we see? A hyperlink! This looks like a much more likely candidate for how Google found its way to the id.php.net/downloads directory: Google found this site, crawled it, saw that hyperlink, followed it, then crawled that directory. And because the index at that URL contains a link to "parent directory," *everything* within id.php.net/downloads that's not 403'd (or otherwise hidden/access denied) can now be discovered by Google -- and thus indexed by Google.

Now, viewing the source code of the page that hyperlink was discovered on, we see no signs of "nofollow." Theoretically, that means that performing the link:id.php.net search query *should* have shown this page as a result. But all it really means is that the link: operator is extremely fickle, or Google simply doesn't want us to see very much when using the link: operator. Or a little of both.

Most of you won't know this, but we've actually lucked out in terms of finding some good examples of how this directory may have been discovered by Google. There are plenty of other scenarios in which this directory could exist in Google's index with all traces of a path to it having disappeared. For instance, the Web site on which Google initially discovered the directory could have been nixed, with Google updating its index to reflect that. Or, Google could have found its way to that directory from a hyperlink using anchor text like "click here," yet decided not to return that site in a link: query.

Interestingly, other *.php.net sub-domains that I tried all redirected properly, such as http://us3.php.net/downloads.php and http://us3.php.net/downloads, so why has this one slipped through the cracks?
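
To make the crawl path described above a little more concrete, here is a minimal Python sketch of how a crawler might get from one hyperlink to a directory listing and then to its parent. It is not Google's actual logic. The HTML snippets, the example.org page, and the names LinkCollector and followable_links are all made up for illustration; only the id.php.net URLs come from the search results above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, rel tokens) for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            # rel may be missing or empty; normalize it to a list of tokens.
            rel = (attrs.get("rel") or "").lower().split()
            self.links.append((href, rel))


def followable_links(page_url, html):
    """Return absolute URLs for links that are not marked rel="nofollow"."""
    parser = LinkCollector()
    parser.feed(html)
    return [
        urljoin(page_url, href)
        for href, rel in parser.links
        if "nofollow" not in rel
    ]


# Hypothetical third-party page that links to the sub-directory.
THIRD_PARTY_HTML = (
    '<a href="http://id.php.net/downloads/src/redhat/">RHEL mirror</a> '
    '<a rel="nofollow" href="http://example.com/ad">sponsored</a>'
)

# Hypothetical directory listing; the markup is invented for illustration.
LISTING_URL = "http://id.php.net/downloads/src/redhat/"
LISTING_HTML = """
<h1>Index of /downloads/src/redhat</h1>
<a href="/downloads/src/">Parent Directory</a>
<a href="RHEL_5.5/">RHEL_5.5/</a>
"""

# Step 1: the crawler reads the third-party page and keeps the followable link.
print(followable_links("http://example.org/post", THIRD_PARTY_HTML))

# Step 2: the listing it lands on exposes both a child and the parent directory.
print(followable_links(LISTING_URL, LISTING_HTML))
```

Repeating the second step on each parent listing would walk a crawler up to id.php.net/downloads/, and from there down into every sub-directory that isn't access-restricted.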
Also, the config files in the config.save directory contain some easily decrypted (for those who know how) information -- all of which is cached in Google. This serves as an example that, in some cases, your data can end up exposed to others in the most unpredictable of ways.

Lastly, a peek at the wget files in Google's cache reveals the source of the pirated content residing in that directory:

http://linux1.hk.psn.net.id/~buset/

Overall, this was a pretty big find -- especially for one that happened by chance. As of this moment, the contents of that entire directory (including its sub-directories) are indexed by Google, so even if the admins over at PHP.net cut off access to the directory itself, they will still have to wait for Google's index to reflect the change.

To note, this directory appears to have been publicly accessible since at least 2009. The picture below from archive.org shows a snapshot of the directory from May 30, 2009:

And with that, I'll wrap up this post. As you can see, it's surprisingly simple for an otherwise private section of a site to be indexed in Google when a file within such a directory is linked to from somewhere discoverable and crawlable.

To note, I'm planning to start a weekly series where I reveal directories I find that contain pirated content sitting on people's personal and professional Web sites. I want to show just how commonplace these instances really are, and I hope the awareness I create will be significant. In the meantime, be sure to check out part one and part two of my "search ninja" series to learn for yourself how to discover such wondrous results in Google and other places! Additionally, have a look at the related posts below to read my other case studies that expose data via in-depth Google querying.

Thanks for reading!

-Stephen Chapman
SEO Whistleblower

Topics: Hardware, Browser, Google, Mobility, Software Development

Talkback

18 comments
  • Have you not heard of mirrors before?

    id.php.net is simply a mirror to 3rd party hosting hosted by http://www.pesat.net.id/ .

    PHP is a community-owned and community-organised website and can't keep an iron-fisted grip on all the servers that kindly offer to host mirrors for them.

    Since you didn't mention any of this information, however, I can only imagine that you wanted to sensationalize the article as much as possible.
    boopan
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @boopan Your Kung Fu style is much stronger!
      Parassassin
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @boopan In other words, the open source you download from any mirror may have malware in it because there are no quality controls on the software?
      Your Non Advocate
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @facebook@... That's why md5 hashes are always supplied with most open source downloads - for you to check that the file you downloaded is correct.

        I would also imagine they would have an automated process in place to make sure that the mirrors supplied are updated with the correct files. If that doesn't happen, then they wouldn't be listed as an official mirror.
        boopan
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @facebook@...

        There you go... either trolling or just talking out of the wrong end again.
        techadmin.cc@...
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @boopan Or, wait for it..., he's not intimately familiar with the inner workings of how the PHP organization operates?
      Within Rafael
  • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

    How do you know those files were pirated? There's an implied accusation there and, for all you know, those are all valid files. Now, if someone else, such as yourself, finds them and downloads them, then they would be pirated. But I think you should be a bit more careful with your accusations.
    Techknowledgie
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @Techknowledgie Step 1: Take a filename. Step 2: Paste it into your favorite search engine.
      Within Rafael
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @Within Rafael The fact you can search for something by no means whatsoever excludes anyone from having a legal license on storing and playing it...
        boopan
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @boopan Correct, but it also doesn't exclude them from potential legal woes for inadvertently serving such content -- be it some major company's Web site or your next door neighbor's personal server. Never mind the Cisco router config files that contain user names and easily-decrypted passwords -- all of which are still cached in Google, despite having been removed locally.
        StephenChapman
  • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

    This article is obviously self-serving, written in a sophomoric-at-best style, contains numerous grammatical errors, and is filled throughout with speculation and factual errors. Any argument to the latter points is nil because at no point does the author cite any factual references for his claims outside of screenshots and basic Google searches. Further, as the author has found it necessary to blame and defame the php.net folks and claim they are responsible, directly or indirectly, I see an obvious legal liability here.

    Really not the best article I've seen on ZDNet. In fact, probably one of the poorest in all its years.
    casual_observer
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @casual_observer Thanks for reading! And I suppose the reason they removed the directory/mirror is due to them having absolutely no responsibility in the matter, right?
      StephenChapman
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @StephenChapman Perhaps a better word should be used than "responsibility" here. You're using it interchangeably with "culpability," which is not the case. Instead, if you used it to indicate that they acted in a responsible manner, I think you'd be more accurate.

        By the way, perhaps it's in the concise nature of your wording, but I find your responses to the two comments to which you've replied thus far to be rather sanctimonious; not much more than flames in return for varying degrees of criticism. And as it does not go without notice, one can appreciate the lack of reply directly addressing the inference that your personal computer harbors pirated data. Seems befitting to blame others for the transgressions we hope others will never notice of ourselves.
        casual_observer
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @casual_observer Spin it however you want, but what I meant by "responsibility" is fully applicable. Yes, my replies were admittedly curt. And uhohok's red herring, in regards to my computer, is simply not worth addressing. By extension, neither is your philosophical wandering based upon it.
        StephenChapman
  • Wonder when the author will be compromised

    I wonder how long it will be before the author's own computer is compromised. I have a sneaking suspicion that he will also have "pirated" applications/music/video/etc. on his machine. Ahh, can't wait for the world to see that! :)
    uhohok
    • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

      @uhohok There's a big difference between someone trying to compromise my computer vs. easily finding content on people's servers via simple Google queries. Sounds like you have a personal issue with me that the world couldn't care less about. Sorry to disappoint.
      StephenChapman
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @StephenChapman
        Actually, that is not correct. I have had clients whose machines were hacked and had files uploaded to them that didn't belong, and then Google came along and indexed them for the world to see. Sometimes it can take an hour or less for this to happen. So no, the fact that Google found the information does not mean that it was put there with the full knowledge or permission of the server owner. The fact that you couldn't find a link to the information anymore tells me that someone without the authority to do so probably put the files there (aka hacked the server or found a directory with overly permissive permissions).
        tim.w.jung@...
      • RE: Unsecure directory on PHP.net contains Blu-ray movies, usernames and passwords, and more

        @tim.w.jung I appreciate the perspective, but read a bit more closely and you will see that I did, in fact, find a link to the site from elsewhere. Additionally, the files were transferred there by the very people serving that mirror of PHP.net. This was demonstrated by the information found within the WGET logs. No hacking took place there whatsoever. I understand it can happen, but it didn't in this scenario. There's plenty of historicity and log information that was within that directory to show that what was there, was there intentionally. Obviously, it was not with PHP.net's knowledge, but if signing up to mirror their site is as selective of a process as their "Mirrors" page suggests, then they should consider running semi-frequent checks to monitor the integrity of all such hosts once approved.
        StephenChapman