Inside Amazon

You're running one of the world's busiest e-commerce sites, handling up to 4 million checkouts per day. Response time is critical. Every page is customized on the fly using over 150 network services. And the system must manage failures of any component, including entire data centers.

You are taking real money and shipping real goods. Quality of Service is your lifeblood. How does Amazon do it?

Enterprise reliability at massive scale

Like Google (see Google's three rules), Amazon:

  • Uses commodity servers
  • Embraces failure
  • Architects for scale

Yet Amazon raises the ante on Google: application developers can tune the storage infrastructure, Dynamo, to meet application needs.

Scalable architecture

Here's an illustration from a recent Amazon paper on Dynamo, their main storage infrastructure.

[Figure: illustration from the Amazon Dynamo paper]

Unique features

Highly automated. No manual intervention is required to add or remove storage nodes - the system handles discovery and data redistribution automatically.
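To make the "automatic redistribution" idea concrete, here is a minimal sketch of consistent hashing, the partitioning technique the Dynamo paper describes. Everything named here (the `Ring` class, the `tokens` count, the node names, the use of SHA-256) is my own assumption for illustration, not Amazon's code.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a position on the ring (SHA-256 chosen only to spread keys)."""
    return int(hashlib.sha256(label.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hash ring; a hypothetical illustration, not Dynamo itself."""

    def __init__(self):
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, node: str, tokens: int = 64) -> None:
        # Each node claims several positions so load spreads evenly.
        for i in range(tokens):
            p = _point(f"{node}#{i}")
            bisect.insort(self._points, p)
            self._owner[p] = node

    def remove_node(self, node: str) -> None:
        # Dropping a node's positions hands its keys to the next node clockwise.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        # A key belongs to the first position clockwise from its hash.
        i = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"cart:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add_node("node-d")  # no manual rebalancing step
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The point of the design: when `node-d` joins, the only keys that change owner are the ones that now belong to `node-d`. Everything else stays put, which is what lets nodes come and go without an operator reshuffling data.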

99.9th percentile performance. Amazon's best customers also have the most data: recently viewed items, wish lists, long histories. Instead of measuring average performance, Amazon looks at performance at the far end of the distribution to ensure that all customers, not just the majority, have a good experience.
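A quick sketch of why the average hides the problem. The latency numbers below are made up for illustration, and `percentile` is a hypothetical helper using the simple nearest-rank method, not anything taken from the paper.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of the sample count)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical data: 9,980 fast requests and 20 slow ones (heavy customers).
latencies_ms = [12] * 9980 + [900] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
median_ms = percentile(latencies_ms, 50)
p999_ms = percentile(latencies_ms, 99.9)
```

With this data the mean comes out under 14 ms and the median is 12 ms, so both look healthy, while the 99.9th percentile is 900 ms. The customers in that tail are exactly the ones with the longest histories.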

Tunable trade-offs. You can't maximize availability and data consistency at the same time: the mechanisms that ensure consistency hobble availability. Amazon gives developers some knobs so they or their applications can tune the system for fast reads or fast writes, and for availability vs. cost. Availability is key for Amazon's 7x24 business model, so Dynamo is designed for eventual consistency.
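The Dynamo paper expresses these knobs as three numbers: N (replicas per key), R (replicas that must answer a read) and W (replicas that must acknowledge a write). Here is a rough sketch of the arithmetic behind them; the `QuorumConfig` class and its method names are hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical holder for the N/R/W knobs; not an actual Dynamo API."""
    n: int  # replicas stored per key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("need 1 <= r <= n and 1 <= w <= n")

    def overlaps(self) -> bool:
        """True when every read set shares at least one replica with every write set."""
        return self.r + self.w > self.n

    def read_tolerates(self) -> int:
        """Replicas that can be down while reads still succeed."""
        return self.n - self.r

    def write_tolerates(self) -> int:
        """Replicas that can be down while writes still succeed."""
        return self.n - self.w


balanced = QuorumConfig(n=3, r=2, w=2)     # the paper cites (3, 2, 2) as a common setup
fast_reads = QuorumConfig(n=3, r=1, w=3)   # cheap reads, writes need every replica
fast_writes = QuorumConfig(n=3, r=1, w=1)  # cheap both ways, reads may be stale
```

Lowering R or W makes that operation faster and more available, because fewer replicas have to respond. Once R + W is no longer greater than N, a read can miss the latest write, which is the eventual-consistency trade the article describes.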

Decentralization. Amazon's system is designed to withstand the loss of many components, up to and including data centers. Dynamo clusters are distributed across data centers linked by fast pipes. All data is stored in multiple data centers.

Heterogeneity. Systems management is based on application performance. Powerful new systems get more work than old systems, but all applications get their work done on time.
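One way to get "powerful systems do more work" is to give each machine ring positions in proportion to its capacity, which is how the Dynamo paper uses virtual nodes. A small sketch under that assumption; the machine names, the capacity units and `tokens_per_unit` are all made up for illustration.

```python
import bisect
import hashlib
from collections import Counter


def point(label: str) -> int:
    """Position on the hash ring (SHA-256 chosen only to spread keys)."""
    return int(hashlib.sha256(label.encode("utf-8")).hexdigest(), 16)


def build_ring(capacity, tokens_per_unit=200):
    """Give each node ring positions in proportion to its capacity units."""
    return sorted(
        (point(f"{node}#{i}"), node)
        for node, units in capacity.items()
        for i in range(units * tokens_per_unit)
    )


def owner(ring, key):
    """The first ring position clockwise from the key's hash owns the key."""
    i = bisect.bisect_right(ring, (point(key), "")) % len(ring)
    return ring[i][1]


# Hypothetical fleet: the new machine is rated at three times the old one.
capacity = {"old-box": 1, "new-box": 3}
ring = build_ring(capacity)

counts = Counter(owner(ring, f"item:{i}") for i in range(20000))
```

With these settings `new-box` should end up owning roughly three quarters of the keys. Nobody has to assign work by hand; the capacity number is the only thing an operator sets.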

Implications for the enterprise

Your friendly $250,000 a year storage sales rep would be shocked to learn that a loosely coupled storage system built of commodity parts can provide mission-critical availability and performance. As the Amazon team reports:

Many Amazon internal services have used Dynamo for the past two years . . . . In particular, applications have received successful responses (without timing out) for 99.9995% of its requests and no data loss event has occurred to date.

I'd wager that matches the best that EMC, IBM and HP can do at that volume.

The Storage Bits take

Today's massive scale Internet Data Centers point the way to a revolution in enterprise computing. Instead of feature-rich products that attempt to handle every eventuality, the IDCs have the scale to architect their infrastructures for the jobs they need done.

Amazon's storage doesn't use RAID, relational databases or fancy interconnects. The intelligence goes into optimizing the architecture of the software rather than attempting to build bulletproof hardware, which, at their scale, is hopeless anyway.

Given the growth of data, I predict that in 10 years most enterprises will be running at least part of their business on similar architectures.

Comments welcome, of course. Werner Vogels, Amazon's CTO, kindly sent me the link to this paper. I'll have a longer description of it on StorageMojo later this week.

Topics: Storage, Amazon, Data Centers, Hardware

About

Robin Harris has been a computer buff for over 35 years and selling and marketing data storage for over 30 years in companies large and small.

Talkback

11 comments
  • How does Amazon do it

    Robin, I recently started following your blogs and have contributed to two - "How Microsoft puts your data at risk" (http://blogs.zdnet.com/storage/?p=169) and "Data corruption is worse than you know" (http://blogs.zdnet.com/storage/?p=191).

    Thanks for your contribution to this misunderstood area. I was amused by some of the responses to "How Microsoft puts your data at risk". Some people refuse to recognize and address the problem. Their systems will eventually reflect this neglect.

    Back in the old days, when I was writing test software for missile systems, we had to tell the US Air Force that there was no way to build a computer controlled system with a single data path that would be 100% accurate. The way to achieve the best result was to have redundant data storage and retrieval paths. The contracting officer did not like the answer because of the cost of storage at that time, but as Amazon has shown, redundant data paths do give very high reliability. On some US Navy systems, we used an extension of Hamming codes built into the disk controller software so that we could detect and correct two bit errors and detect 3 bit errors.

    With the lower cost of storage today, every computer manufacturer should offer some type of data reliability option like these. Those that value the protection would willingly pay for such an option because of the reduced cost of trying to recover from data loss due to the inevitable corruption of data.

    Regards, Jack Cole
    hmcm@...
    • Thanks!

      Jack,

      I love to see the different reactions for the insight into consumer psychology.
      Between this and the work I do with small businesses here in the wilds of northern
      Arizona I am getting a real education on the world outside Silicon Valley - and a
      lot more sympathy for regular folks dealing with computers.

      If you haven't already you'll also enjoy StorageMojo (http://storagemojo.com/),
      my storage geek blog.

      Robin
      R Harris
  • Amazon - Whizzy technology is wasted by dire delivery times

    All this whizzy technology is great, but it is completely wasted when Amazon's warehouse monkeys annoy the crap out of customers by taking 3-5 days to pick and pack an order for dispatch......unless you pay extra for express delivery of course which is a rip off when Tesco.com charge you £5 to deliver *anything* up to a Plasma TV or Refrigerator.

    I'll vote with my feet and take my business elsewhere.
    neil.postlethwaite
    • uh....and?

      All of which has absolutely nothing to do with Amazon's method of dealing with customers' web experience.
      lutherlarry
    • Tesco - a terrible example

      I'm in Italy and I'll bet they can't get a fridge to me for a fiver, or within a week. Tesco is one of the British food retailers who fleece their customers by charging more for half decent food. Healthy Options? What's the "cheaper" stuff (not very cheap though)? Unhealthy Options?
      I get loads of stuff, books, CDs, DVDs, from Amazon.co.uk and its associated sellers, and rarely wait more than a week. Sometimes stuff turns up within four or five days. So what's the gripe?
      pvandck
    • Tescos.opoly.com have placed your cheque in the post!!

      Hey

      Yeh like read how much customers of Tesco Direct rave about the quality of their customer services... not!!

      Or how Tescos put out of date food on the vehicles doing home deliveries...

      Apart from that I get my veggies from Sainsburys at least they aren't all rotting like Tescos... Example rotting carrots with tops at my local branch...

      Getting back to ... Oh yes talking about Amazon (I had almost forgotten)... No concerns there... Their website works and I have had a 100% positive customer experience...

      Bob Wya

      PS I do shop at Tescos but would NOT USE TESCO DIRECT!!
      PPS You vote with your mouse/fingers not your feet (unless you are v. disabled)!!
      Bob Wya
  • fool

    you're in some sort of weird fantasy world. Your order goes in the queue and comes out the other end. You can't blame the warehouse workers, they are just pulling pick lists and retrieving items from the warehouse. The guys in the warehouse are invariably the coolest and smartest people in the company. When they are busy, your order will take a while. Once I ordered something from Amazon with free shipping in November and I didn't get it until after Christmas. No biggie, I expected it. Another time I ordered from Amazon, again free shipping, and it showed up two days later.

    I don't work for Amazon, but I have worked in grocery wholesale. The grocers in my local area all suck (they buy their groceries from my former employer!) so I am buying more and more of my groceries from Amazon, the prices are much better than the supermarket, and I don't have to deal with traffic and the like.
    frantaylor
  • RE: Inside Amazon

    Google released a statement on this article. It goes like this: Pppphhhhfffttt.... :-P

    >)
    mcc99@...
  • RE: Inside Amazon

    A MUST READ article!
    Thank you Robin.
    Scott "Buck" Johnson, VIP Consulting LLC
    zingozango
  • RE: Inside Amazon

    I've spent thousands of dollars via Amazon. Not one problem in over 8 years of use. Their vendors are reputable and their service the best. Oh there are some slow shippers but I have never lost a shipment. Remember that "Shipment happens."
    dfarrich@...