10 years of Windows file changes

It's a truism in storage: consumers' average files are getting bigger, making bandwidth more important than IOPS. But new research shows that's not true - among other interesting results.

A recent paper, A Study of Practical Deduplication (pdf), by William Bolosky of Microsoft Research and Dutch Meyer of the University of British Columbia, looked at how Windows file systems have evolved in the last decade. The paper was presented at the USENIX FAST '11 conference and won the Best Paper award.

  • Median file sizes aren't changing. Yes, the largest files are larger - think audio and especially HD video - but small files continue to proliferate, keeping the median file size unchanged for 30 years.
  • Mean file sizes are larger. While median sizes remain the same, the mean file size has tripled in 10 years to 318 KB.
  • Average file system capacities have tripled. In 2000 few Windows machines had more than 50 GB in their file systems. The systems in the new study averaged 194 GB of capacity.
  • The variety of file types is increasing. The 10 most popular file extensions account for less than 45% of capacity vs over 50% in 2000. Files with no extension are now the most common.
  • Defrag works. The background defrag built into Windows works. Researchers found fewer than 4% of all files were fragmented.
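
To see how the median can stay put while the mean triples, here is a minimal sketch. The file sizes and the `summarize` helper are made up for illustration; they are not data or code from the paper.

```python
from statistics import mean, median

# Hypothetical file sizes in KB (illustrative only, not data from the paper).
sizes_2000 = [1, 2, 4, 4, 4, 8, 16, 64, 857]
# Same small files ten years on; only the single largest file has grown.
sizes_2010 = [1, 2, 4, 4, 4, 8, 16, 64, 2777]

def summarize(sizes):
    """Return (median, mean) of a list of file sizes."""
    return median(sizes), mean(sizes)

for label, sizes in (("2000", sizes_2000), ("2010", sizes_2010)):
    med, avg = summarize(sizes)
    print(f"{label}: median = {med} KB, mean = {avg:.1f} KB")
```

In this toy population the median is 4 KB in both years, while one big file is enough to roughly triple the mean. That is the skewed-distribution effect the first two bullets describe.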

The Storage Bits take

The popularity of SSDs isn't just because they're cool: the proliferation of small files - and the IOPS needed to access them - needs the fast random read performance of SSDs. Seagate is on the right track with their hybrid flash/disk drives.
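
A back-of-the-envelope sketch of why small files make IOPS matter more than bandwidth. Everything here is an assumption for illustration: the `read_time_seconds` helper, the 4 KB file size, the round-number device figures, and the simplification that each file costs exactly one random I/O.

```python
def read_time_seconds(num_files, file_kb, iops, bandwidth_mb_s):
    """Rough cost of reading many small files.

    Returns (seek_seconds, transfer_seconds), assuming one random I/O
    per file and ignoring caching, queuing, and metadata lookups.
    """
    seek_seconds = num_files / iops
    transfer_seconds = (num_files * file_kb / 1024) / bandwidth_mb_s
    return seek_seconds, transfer_seconds

# Assumed round-number device figures, not measurements.
hdd = read_time_seconds(10_000, 4, iops=100, bandwidth_mb_s=100)
ssd = read_time_seconds(10_000, 4, iops=10_000, bandwidth_mb_s=200)

for name, (seek, transfer) in (("disk", hdd), ("ssd", ssd)):
    print(f"{name}: {seek:.1f} s waiting on random I/O, {transfer:.2f} s moving data")
```

Under these assumptions the disk spends far longer positioning for each small file than it does transferring bytes, so extra bandwidth would barely help while extra IOPS would.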

While the amount of stored data isn't growing as fast as storage capacity, the tripling of file system capacity points up the need for higher data integrity. The more data you store the more likely our crummy file systems are to corrupt your data.

And finally, it's good to see that the background defrag built into Windows and Mac OS - though the latter wasn't included in the study for some reason - actually works. Sometimes problems do get solved.

Comments welcome, of course. BTW, it turns out that simple whole file deduplication combined with sparse file support is an effective - and much simpler - way to deduplicate data.
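
For the curious, whole file deduplication can be sketched in a few lines: hash every file's contents and treat files with the same digest as one. This is a rough illustration, not the paper's implementation. The function names are hypothetical, SHA-256 is my choice of hash, and sparse file support is not modeled at all.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Map each content digest to the list of paths sharing it (2+ paths only)."""
    by_digest = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}

def bytes_saved(duplicates):
    """Bytes reclaimed if only one copy of each duplicated file were kept."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Calling `bytes_saved(find_duplicates(some_directory))` gives a rough upper bound on what whole file dedup would reclaim for that tree. A real system would also need to handle files changing during the scan and would store one copy with references rather than just reporting.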

Topics: Operating Systems, Software, Storage, Windows

Talkback

42 comments
  • Robin

    Most people take the "average" value to be the first statistical moment, also known as the "mean" value. Perhaps you meant to refer to the "median" value, which is defined as the value where half the population has greater values and half has smaller values.

    Generally, a standard Gaussian or "bell" curve gives the same value for mean and median. However, these can differ greatly for skewed statistical populations.

    Just FYI
    jacarter3
    • Thank you

      @jacarter3

      I thought Robin could have been a bit more precise in his terminology. When I read "average", and then "mean", it seemed like a contradiction.
      Economister
    • RE: 10 years of Windows file changes

      @jacarter3
      My bad. I used the term as it was in the paper without thinking. Corrected. Thanks!

      THINKING: don't leave home without it.
      Robin Harris
      • RE: 10 years of Windows file changes

        @Robin Harris
        Still reads funny in the second bullet point though... Actually, the last "average" in the first bullet point probably should be changed too.
        -bob-