Building a 500PB data center

Summary: Yes, as in 500,000TB, or half an exabyte. And it is a small company - not the NSA or Amazon - building it. Big data is closer, smaller and cheaper than you imagine.

30 years ago a terabyte of disk storage would have filled a data center - if you could afford the $20,000/GB price and live with the 25,000-hour MTBF. Today you can buy a 1TB USB 3.0 thumb drive, and a small company can start building out a 500PB data center.

That small company is Backblaze, the online backup company, which currently stores about 40PB of customer data. They ran out of space at their old data center and moved into a much larger - and less earthquake-prone - site near Sacramento, California.

[Photo: a row of Backblaze storage pod racks at the new Sacramento data center.]

Using their open source storage pods, Backblaze expects the new site to support 500PB when fully built out. In the photo, each rack holds 450 hard drives - 1.8PB with 4TB drives. The row is 8 racks wide, for a total raw capacity of 14.4PB.
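
As a quick sanity check, here is that capacity math in a few lines of Python. This is a sketch: the 45-drive pod and 10-pods-per-rack breakdown are my assumptions based on Backblaze's published pod design; the drive size and rack count come from the article.

    # Capacity math for one Backblaze rack row.
    # Assumed: 45-drive storage pods, 10 pods per rack.
    # From the article: 4TB drives, 8 racks per row.
    DRIVES_PER_POD = 45
    PODS_PER_RACK = 10
    TB_PER_DRIVE = 4
    RACKS_PER_ROW = 8

    drives_per_rack = DRIVES_PER_POD * PODS_PER_RACK      # 450 drives
    pb_per_rack = drives_per_rack * TB_PER_DRIVE / 1000   # 1.8 PB
    pb_per_row = pb_per_rack * RACKS_PER_ROW              # 14.4 PB

    print(f"{drives_per_rack} drives/rack = {pb_per_rack:.1f}PB; "
          f"{RACKS_PER_ROW} racks = {pb_per_row:.1f}PB raw")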

Backblaze chose their new site with an eye not only on earthquakes, common in California, but also on flooding. It isn't well known, but in 1862, during the Civil War, a mega-flood turned California's Central Valley into a lake 300 miles long and 20 miles wide.

The Storage Bits take

There was a rumor last year that the NSA's new Bluffdale, Utah facility might support 1 yottabyte - 1,000,000 exabytes - of storage. Given that the Backblaze racks are fairly dense - I estimate almost 300TB/sq. ft. - they'd need over 69 million of the 8-rack rows in the photo to get to 1YB. Plus a lot of power.

But that would be over 3 billion square feet - about 77,000 acres - of data center space for storage alone. Given that the data center space in Bluffdale is 100,000 sq. ft., it seems impossible for them to store 1YB. At that density they could manage about 30 exabytes, which is still respectable.
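
Here is the back-of-envelope version of that estimate in Python. The ~300TB/sq. ft. density is my own rough figure from the photo, so treat the outputs as order-of-magnitude estimates.

    # Yottabyte sanity check. Density is an eyeball estimate from
    # the rack photo; row capacity comes from the article.
    YB_IN_TB = 10**12           # 1 yottabyte = 10^12 TB
    PB_PER_ROW = 14.4           # one 8-rack row
    TB_PER_SQFT = 300           # estimated rack-footprint density
    BLUFFDALE_SQFT = 100_000    # reported Bluffdale floor space

    rows_for_1yb = YB_IN_TB / (PB_PER_ROW * 1000)         # ~69 million
    sqft_for_1yb = YB_IN_TB / TB_PER_SQFT                 # ~3.3 billion
    bluffdale_eb = BLUFFDALE_SQFT * TB_PER_SQFT / 10**6   # ~30 EB

    print(f"rows for 1YB: {rows_for_1yb:,.0f}")
    print(f"floor space for 1YB: {sqft_for_1yb:,.0f} sq ft")
    print(f"Bluffdale at that density: {bluffdale_eb:.0f} EB")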

It was inconceivable 10 years ago that a small, low-cost storage company could ever build a 500PB facility. That shows how far we've come in the last decade, in both price and density.

Disk drives are one of the great success stories of modern technology. Learning to manage all the data that is stored on them will be the great challenge of the next 20 years.

Comments welcome, as always. What year did you learn what a petabyte was? For me, I think it was the late '90s.

Talkback

6 comments
  • The breakthrough started with IBM

    And their "pixie dust" surface manufacturing for disk drives. After that, storage capacity exploded.

    And the storage available doesn't have to be in disk drives. Most of the data stored is actually archived on tapes.

    When I started in the supercomputer industry (1991), the center had 3 silos of 9000 tapes (800MB each) - total storage capacity around 6TB... and that was front-ended by only 90GB of disk.

    When I left (2001), the same 3 silos were passing 400TB of active data (by then each tape held about 1.5TB)... and preparations were being made for the next round - going to DLT drives and expecting petabyte storage - with the tape front-ended by three 10TB filesystems.

    So getting tremendous data storage is quite possible. Tape is a LOT denser than disk, or anything else for that matter - leaving only theoretical storage devices with higher density.
    jessepollard
  • Tape denser than disk?

    Most tape cartridges are about the size of a 3.5" disk, and the current gen of LTO holds 2.5TB raw and 5-6TB compressed - not much better than today's 4TB drives.

    And 6TB drives are coming this year. Nor have there been any commercial announcements for SMR disks - now sampling with cloud-scale customers like Facebook - whose capacity I expect to reach 10-12TB. So it's close to a wash.

    Tape is certainly cheaper: LTO-5 cartridges are $26 each in bulk. So that's a win.

    Cheers,

    Robin
    R Harris
  • File systems inadequate

    Most modern file systems have a limit of just over 4 billion files per volume (a few have even higher limits). But while they can theoretically store that many files, users run into significant problems once the count reaches the few-million range. An ext3 inode is 256 bytes, and an NTFS file record segment (FRS) is 4096 bytes on newer 4K-sector drives. That means you need 2.6GB of RAM for ext3, or a whopping 40GB for NTFS, just to hold the metadata table in memory for a mere 10 million files. Searches also become painfully slow at that scale.
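
    A quick sketch of that arithmetic in Python, taking the 256-byte inode and 4KB FRS figures above at face value:

        # RAM needed to hold all file metadata for 10 million files,
        # using the per-record sizes cited above.
        FILES = 10_000_000
        for fs, record_bytes in (("ext3 (256B inode)", 256),
                                 ("NTFS (4KB FRS)", 4096)):
            gb = FILES * record_bytes / 10**9
            print(f"{fs}: {gb:.1f} GB of metadata")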

    Clearly, the "disk full" error is becoming less of a problem with lots of options for cheap, high capacity storage, but the "file not found" condition is going to get worse, not better.
    Andy Lawrence
    • Modern file system

      Maybe a modern filesystem would be something like ZFS? EXT3 and NTFS need not apply.

      ZFS is a 128-bit file system, so it can address 1.84 × 10^19 times more data than 64-bit systems such as Btrfs. The limitations of ZFS are designed to be so large that they should not be encountered in the foreseeable future.

      Some theoretical limits in ZFS are:

      2^48: number of entries in any individual directory
      16 exbibytes (2^64 bytes): maximum size of a single file
      16 exbibytes: maximum size of any attribute
      256 zebibytes (2^78 bytes): maximum size of any zpool
      2^56: number of attributes of a file (actually constrained to 2^48 for the number of files in a ZFS file system)
      2^64: number of devices in any zpool
      2^64: number of zpools in a system
      2^64: number of file systems in a zpool
      alpharob1
  • How does this fit in with the 360TB 5-dimensional crystal storage

    that was recently demoed? (Although, admittedly, that was experimental, not production or even a prototype?)

    And what about memristor technology? How is that coming along and how do its storage densities compare?

    Seems to me that within a few years (maybe 5-7) some new technologies will be providing huge leaps in speed and density, not just evolutionary improvements.
    Rick_R
    • Crystal storage

      Hitachi showed something along the same lines in the last 18 months as well. Think long-term archive storage, since it's only writable once.

      Robin
      R Harris