I recently attended a discussion of the key components of the phenomenon known as Amazon Web Services (AWS), which I've turned into a basic walkthrough for anyone considering making a first move into cloud computing.
AWS is Amazon's fast-growing cloud service: each day AWS adds server capacity equivalent to what Amazon ran as a whole when it was a $7bn company back in 2003. That is some heavy computational lifting, and at these volumes Amazon benefits from great economies of scale, as do its customers.
The idea of utility computing is that, like electricity, you plug in and the service is available on demand, regardless of geography. On top of that, companies pay for services as they use them, which can be cheaper than signing a big, up-front contract: they simply calculate what they need in the way of storage, compute capability and content delivery, with options including security, backup, DNS, database, storage, load balancing, workflow, monitoring, networking and messaging.
The idea of AWS is that you provision only what you need at any given time. This, of course, is the opposite of current practice at many companies, which tend to provision for the worst-case scenario. It is worth noting, though, that as with many cloud services AWS is most cost-effective for spiky or one-off workloads; where demand is predictable day to day, it may be cheaper to run the workload in-house.
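To make the spiky-versus-steady point concrete, here is a back-of-the-envelope sketch in Python. All of the hourly and monthly rates below are made-up placeholders, not real AWS prices; only the shape of the comparison matters.

```python
# Rough comparison of pay-per-hour cloud cost vs. a fixed in-house server.
# The rates are hypothetical placeholders, not real AWS pricing.

def monthly_cost_on_demand(hours_used, rate_per_hour):
    """Pay only for the server-hours actually consumed."""
    return hours_used * rate_per_hour

def monthly_cost_in_house(fixed_monthly):
    """A provisioned server costs the same whether busy or idle."""
    return fixed_monthly

# A spiky month-end job: 8 servers for 20 hours = 160 server-hours.
spiky = monthly_cost_on_demand(hours_used=160, rate_per_hour=0.50)

# A steady service: one server running 24/7 = 720 server-hours.
steady = monthly_cost_on_demand(hours_used=720, rate_per_hour=0.50)

in_house = monthly_cost_in_house(fixed_monthly=200.0)

print(spiky < in_house)   # spiky workload: on-demand wins
print(steady < in_house)  # steady workload: in-house wins at these rates
```

At these (invented) rates the one-off job is cheap on demand, while the always-on service crosses the break-even point and favours owned hardware.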
There are four basic use cases for AWS, in terms of why a customer might want to use the service:
• On and off. Spin up capacity for a one-off workload such as a month-end run, then shut it down
• Variable workloads. Build an infrastructure that flexes with what demand for a particular business actually looks like
• Fast growth. Start small but be ready to add capacity quickly
• Predictable peaks. Provision extra capacity only for known busy periods
AWS spans nine regions: four in the US (including one wholly devoted to the US government) and five around the world - Europe, South America, Singapore, Tokyo and Sydney. These are backed by 26 availability zones and 46 edge locations.
Next up is deciding on network infrastructure:
• Direct Connect — dedicated connection to AWS
• VPN Connection — secure internet connection to AWS
• Virtual Private Cloud — private, isolated section of the AWS Cloud
• Route 53 — a high availability and scalable domain name service
Customers choose their type of connection based on data-protection requirements and cost, and can also carve out a virtual private cloud space of their own.
On top of this, Amazon's range of software and services is also extensive and is made up of a number of elements:
EC2 — Elastic Compute Cloud — Amazon's cloud hosting service, providing virtual servers on demand.
Auto-scaling — The automatic provisioning of compute resources based on demand, configuration or scheduling. The feature itself is free of charge; you pay only for the resources you scale up or down to.
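At its core, auto-scaling is a feedback rule of roughly the shape below. The CPU thresholds, step size and instance limits are hypothetical, and real policies are configured through AWS rather than written by hand, but the logic is this:

```python
# Toy model of a threshold-based scaling policy. The 70%/30% thresholds
# and the one-instance step are illustrative assumptions, not AWS defaults.

def desired_instances(current, cpu_utilisation, min_n=1, max_n=10):
    """Decide the next fleet size from the current size and average CPU load."""
    if cpu_utilisation > 70.0:      # too busy: scale out by one
        current += 1
    elif cpu_utilisation < 30.0:    # mostly idle: scale in by one
        current -= 1
    # Never go below the floor or above the ceiling.
    return max(min_n, min(max_n, current))

print(desired_instances(2, 85.0))   # busy -> grow to 3
print(desired_instances(2, 10.0))   # idle -> shrink to 1
print(desired_instances(10, 95.0))  # already at the ceiling -> stays 10
```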
Elastic Load Balancing — Distributes incoming traffic across multiple instances in multiple availability zones, so you can build applications that scale out.
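The distribution a load balancer performs can be sketched as a simple round-robin over the instances behind it. Real ELB behaviour is more sophisticated than this, and the instance IDs here are invented:

```python
import itertools

# Hypothetical instance IDs spread across two availability zones.
instances = ["i-az1-a", "i-az1-b", "i-az2-a", "i-az2-b"]

# Hand each incoming request to the next instance in turn.
dispatcher = itertools.cycle(instances)
assigned = [next(dispatcher) for _ in range(8)]

# Eight requests split evenly: each instance handles two.
print({i: assigned.count(i) for i in instances})
```

The point is that no single machine (or availability zone) takes all the traffic, so one failure degrades the service rather than stopping it.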
S3 — Simple Storage Service — This is an object storage service trusted by many companies, large and small, to look after their data storage needs. NASA is one of its better-known users.
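One thing worth understanding about object storage is that S3 keeps objects in a flat keyspace; the "folders" shown in management tools are just key prefixes. This local sketch mimics how a prefix-and-delimiter listing groups keys — it is a stand-in for the idea, not the S3 API:

```python
# Simulate S3's flat keyspace: "folders" are only key prefixes, and a
# delimiter listing groups deeper keys into common prefixes.

def list_keys(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes), as a delimiter listing would."""
    objects, common = [], set()
    for k in keys:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        if delimiter in rest:
            # Collapse everything below the next delimiter into one "folder".
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(k)
    return objects, sorted(common)

bucket = ["logs/2013/01.gz", "logs/2013/02.gz", "index.html"]
print(list_keys(bucket))                  # (['index.html'], ['logs/'])
print(list_keys(bucket, prefix="logs/"))  # ([], ['logs/2013/'])
```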
Elastic Block Store — Block storage volumes ranging from 1GB to 1TB in size. A volume behaves like a virtual, replicated hard drive, and can be detached from one machine and attached to another.
Relational Database Service — With Amazon RDS, Amazon aims to offer a choice of different brands of this basic and vital tool: Oracle, MySQL and Microsoft SQL Server. It is typically used for workloads such as forum threads, site content and project configuration data.
DynamoDB — This is Amazon's managed NoSQL database. It gives you the ability to provision the performance you require on a table-by-table basis, so you can focus on questions such as how your data is structured and how that structure changes over time. Writes are synchronously replicated across multiple facilities.
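Provisioning performance per table means asking for capacity in the units DynamoDB defines: at the time of writing, one write capacity unit covers a 1KB write per second, and one strongly consistent read unit covers a 4KB read per second. A rough sizing helper, with illustrative item sizes and request rates:

```python
import math

# Sizing sketch based on DynamoDB's documented capacity units:
# writes are metered in 1 KB slices, strongly consistent reads in 4 KB slices.

def write_units(item_kb, writes_per_sec):
    """Capacity units needed to sustain the given write rate."""
    return math.ceil(writes_per_sec) * math.ceil(item_kb / 1.0)

def read_units(item_kb, reads_per_sec):
    """Capacity units needed to sustain the given (strongly consistent) read rate."""
    return math.ceil(reads_per_sec) * math.ceil(item_kb / 4.0)

# Hypothetical table: 3 KB items, 50 writes/sec and 50 reads/sec.
print(write_units(item_kb=3, writes_per_sec=50))  # 150 write units
print(read_units(item_kb=3, reads_per_sec=50))    # 50 read units
```

Note the asymmetry: the same item costs three write units but only one read unit, which is why write-heavy tables are provisioned quite differently from read-heavy ones.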
Amazon Redshift — This is one of the company's newer web services. Launched in February, it is a fully managed, petabyte-scale data warehouse service. Companies can analyse their data using existing business intelligence tools, and it is optimised for datasets ranging from a few hundred gigabytes to a petabyte or more. According to Amazon it costs "less than $1,000 per terabyte per year", which, if correct, is much cheaper than other established databases. It also backs up automatically to S3.
It scales from the smallest environment, a single node holding two terabytes, all the way up to a 100-node cluster holding 1.6 petabytes.
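Taking the figures above at face value (a 100-node cluster holding 1.6PB implies 16TB per large node, and Amazon claims under $1,000 per terabyte per year), a rough sizing sketch looks like this. Treat both numbers as quoted claims, not guarantees:

```python
import math

# Back-of-the-envelope Redshift sizing using the figures quoted in the text:
# ~16 TB per large node, and a claimed price of $1,000 per TB per year.

def nodes_needed(data_tb, tb_per_node=16.0):
    """Smallest whole number of nodes that can hold the dataset."""
    return max(1, math.ceil(data_tb / tb_per_node))

def annual_cost_estimate(data_tb, usd_per_tb_year=1000.0):
    """Yearly cost at the claimed headline rate."""
    return data_tb * usd_per_tb_year

print(nodes_needed(100))          # 7 nodes for a 100 TB warehouse
print(annual_cost_estimate(100))  # 100000.0 dollars/year at the claimed rate
```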
Amazon SQS — Simple Queue Service — This is a hosted queuing system for messages passed between the components of an application. Producers place messages on a queue and consumers poll and process them, decoupling the two sides so each can scale or fail independently.
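The value of a queue is that decoupling: the producer enqueues work and moves on, and consumers drain the queue at their own pace. An in-memory stand-in (not the SQS API, and hypothetical message names) makes the pattern clear:

```python
from collections import deque

# In-memory stand-in for a message queue, to show the decoupling pattern.
queue = deque()

def produce(msg):
    """The producer appends a message and returns immediately."""
    queue.append(msg)

def consume():
    """A consumer takes the oldest message, or None if the queue is empty."""
    return queue.popleft() if queue else None

# The producer never waits for the consumers.
for i in range(3):
    produce("order-%d" % i)

# Later, a consumer works through whatever has accumulated.
handled = []
msg = consume()
while msg is not None:
    handled.append(msg)
    msg = consume()

print(handled)  # ['order-0', 'order-1', 'order-2']
```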
Amazon Simple Workflow (SWF) — This takes SQS a stage further, giving developers a way of taking existing applications and mapping their workflow into the Amazon system. The Flow Framework goes a stage further still, letting you take that workflow and push it out onto completely different systems. NASA used this workflow software when developing software for the Curiosity rover, and continues to use it now that the rover is operational.
CloudSearch — A managed service within the AWS cloud that allows users to set up, manage and scale search solutions for their websites or applications. Amazon CloudSearch lets users search large collections of data such as web pages, document files, forum posts or product information.
Elastic Beanstalk — This is a daft but surprisingly appropriate name for a clever product. Launched in December 2012, it was not yet fully finalised and at the time of going to press was still in beta. That is perhaps not too surprising, as it is an ambitious Platform as a Service (PaaS) product that lets users create applications and push them to a definable set of AWS services, including EC2, S3 and elastic load balancers, among others. Elastic Beanstalk works with Visual Studio, Eclipse and Git, as well as Microsoft .NET, PHP and Java.
The need for shared responsibility, Amazon style
So who is responsible for what in a cloud environment? Amazon says it uses a 'Shared Responsibility Model', in which AWS takes responsibility for the global infrastructure and foundation services, including the operation of its datacentres, physical site security and the quality of the service. The user is responsible for building secure applications that do not leave them exposed to hacking attacks, and for transmitting and storing data in regions that are suitable from a data-protection point of view.
Ovum analyst Laurent Lachal says the AWS model is a powerful one: "The Amazon story is a very good one and the customers love it. They have such a good story on APIs. Other companies keep their APIs hidden and secret and AWS just hands them over."
Lachal believes that this makes AWS difficult to compete against. "If you are a customer and you talk to Amazon about applications, unlike the other suppliers Amazon will show them the applications running. AWS is difficult to compete against because, on the strength of an open (API-centric) service, they understand that public cloud platforms not only run applications, but also businesses. As a result they have created a strong ecosystem of partners and customers."
And he believes that Amazon is "more open than other suppliers when it comes to sharing information and more in tune than others with customer requests, although they respond to these requests incrementally".