
Counting clouds: Which metrics matter?

If you can't measure it, you can't master it: a basic principle of management that applies just as well to cloud technology. Knowing what to measure is the first step to building a strategy.

Written by Rupert Goodwins, Contributor June 14, 2011 at 3:30 a.m. PT

While a move into the cloud is supported by a number of reasons, such as scalability, flexibility and efficiency, the largest part of any argument quickly boils down to money.

Fewer overheads, a more precise match of requirement to expenditure, and freedom from over-investment all make a strong case.

Plus — if the market develops healthily — there should be plenty of competition among cloud providers to keep prices keen.

However, all this depends on having a good knowledge of how efficiently you are using the services, whether they are in a public cloud or self-provisioned ones in a private cloud. There are four main factors in internal metrics for cloud provisioning, and knowledge of these helps generate awareness in selecting appropriate strategies for monitoring cloud usage for efficiency, relevance and profitability.

Networking


Cloud systems use networks rather differently to traditional computing. A medium-sized enterprise using self-provisioned services will make use of the web in a disparate way, with individual users and groups within the enterprise having their own particular selection of external sites and data sources. That same enterprise doing much the same work with a cloud provider will have the majority of network traffic going to that provider — it will look a lot more like a traditional point-to-point leased line.

Not all ISPs work equally well with this kind of traffic, so monitoring latency, throughput and reliability is important. A reconfiguration within the network can have big effects on your connectivity, and having the data to back up such a diagnosis will more than repay the cost of collection.
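As a rough illustration of what collecting that data might look like, the sketch below summarises a batch of round-trip probes to a provider into packet loss, median latency and a 95th-percentile figure. The function name, the probe format (milliseconds, with None for a probe that got no reply) and the sample values are all assumptions made for the example, not part of any particular monitoring tool.

```python
import math
import statistics


def summarise_probes(rtts_ms):
    """Summarise round-trip probes; None marks a probe that got no reply.

    Hypothetical helper: the input format is an assumption for this sketch.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe")
    replies = sorted(r for r in rtts_ms if r is not None)
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    if not replies:
        return {"loss_pct": loss_pct, "median_ms": None, "p95_ms": None}
    # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
    rank = max(1, math.ceil(0.95 * len(replies)))
    return {
        "loss_pct": loss_pct,
        "median_ms": statistics.median(replies),
        "p95_ms": replies[rank - 1],
    }


# Made-up sample: ten probes, one lost, one slow outlier.
sample = [20, 22, 21, None, 25, 24, 23, 120, 22, 21]
summary = summarise_probes(sample)
print(summary)
```

Keeping the median and the 95th percentile side by side is a deliberate choice: a healthy median with a poor tail is often the first visible sign that something has changed on the path to the provider.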

Storage

As with all of the metrics, understanding your storage requirements means understanding your usage model. Storage is particularly sensitive to this as, while requirements go only one way over time — up — the speed at which this happens is very variable, and providers understandably like to over-sell provisioning or, alternatively, make flexibility an expensive option.

And, if you're building a private cloud, you've got all the standard headaches of self-provisioned storage plus the need for flexibility and scalability.

These are the key metrics to watch out for:

Availability, which should be unambiguously defined in the service-level agreement (SLA) and unambiguously measured
Reliability, which is similarly fact based
The simplicity and suitability of any web-service API access for the set of actions you are likely to use
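Measuring availability unambiguously mostly comes down to agreeing the arithmetic in advance. A minimal sketch, assuming outages are logged in minutes and the SLA is expressed as a percentage over a billing period, might look like this. The function names and figures are illustrative only; a real SLA may exclude scheduled maintenance or define an outage differently.

```python
def availability_pct(period_minutes, outage_minutes):
    """Percentage of the period during which the service was up."""
    if period_minutes <= 0:
        raise ValueError("period must be positive")
    downtime = sum(outage_minutes)
    return 100.0 * (period_minutes - downtime) / period_minutes


def meets_sla(measured_pct, promised_pct):
    """True when the measured availability reaches the promised figure."""
    return measured_pct >= promised_pct


# Made-up example: a 30-day month with two outages of 12 and 30 minutes.
month_minutes = 30 * 24 * 60
measured = availability_pct(month_minutes, [12, 30])
print(round(measured, 4))
print(meets_sla(measured, 99.9), meets_sla(measured, 99.95))
```

In this example, 42 minutes of downtime in a month is enough to pass a 99.9 percent promise but fail a 99.95 percent one, which is why the exact definition matters.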

Similarly, take note of:

The access methods on offer (what filing systems are exposed, with any concomitant performance issues)
Billing (what is charged for, over what time, where price breaks are and what limits there are on scaling)
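Price breaks are easier to reason about once they are written down as a calculation. The sketch below assumes a simple tiered tariff in which each band of storage is charged at its own per-gigabyte rate. The tier boundaries and prices are invented for the example and do not correspond to any provider's price list.

```python
def tiered_cost(usage_gb, tiers):
    """Cost of usage_gb under a banded tariff.

    tiers is a list of (upper_limit_gb, price_per_gb) in ascending order;
    an upper limit of None means the band has no ceiling.
    """
    cost = 0.0
    lower = 0.0
    for upper, price in tiers:
        if usage_gb <= lower:
            break
        band_top = usage_gb if upper is None else min(usage_gb, upper)
        cost += (band_top - lower) * price
        if upper is None:
            break
        lower = upper
    return cost


# Hypothetical tariff: first 1,000 GB, next 4,000 GB, then everything above.
example_tiers = [(1000, 0.10), (5000, 0.08), (None, 0.06)]
print(tiered_cost(500, example_tiers))
print(tiered_cost(6000, example_tiers))
```

Running projected usage figures through a function like this shows where the next price break falls and how much headroom there is before costs change.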

The area that needs most attention and which is, as usual, the least well served at the moment is security.

The industry hasn't created any meaningful security metrics for the pre-cloud environment, so nobody's expecting a revolution overnight. None of the old responsibilities have gone away, and nothing can take the place of having a proper understanding between you and your cloud supplier of what security they offer, how it is ensured, what checks you can run, what lines of communication are available in an emergency, and so on.

Processor

With infrastructure-as-a-service (IaaS) providers, there are many variables in CPU provision, and this is reflected in the lack of an industry standard way to define how it is measured.


Amazon's EC2 service is one of the longest established and uses ECUs (EC2 Compute Units), each of which roughly equates to one 2006-vintage Xeon processor running at 1.7 GHz. VMware has VPUs, which probably equal one AMD core running at 2.9 GHz, but this is less well defined. Other services sell per core or per GHz.

As these are usually virtualised processors, there are further variations. You may be provisioned with only a percentage of a core, with the cloud provider's hypervisor allocating the rest to other instances. This can lead to issues with metrics, whereby an instance is limited by its CPU, but the internal CPU activity metering shows it to be running at well under full utilisation.

Here, the trick is to check the proportion of time the CPU is idle: if idle time is under 20 percent or so, you probably need more CPU. Alternatively, you can use another increasingly common metric, CPU steal, or stolen CPU percentage. This indicates how much of 'your' CPU is being used elsewhere.
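On a Linux guest, both figures can be derived from the aggregate "cpu" line of /proc/stat, which reports cumulative time spent in each state, including steal on kernels that support it. The following is a minimal sketch that compares two samples of that line and returns each state's share of the interval. The sample lines are made up, and the helper names are hypothetical; on a live system the two lines would be read from /proc/stat a few seconds apart.

```python
# First eight counters on the aggregate "cpu" line of /proc/stat.
# Guest time is already counted within user time, so it is left out.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a /proc/stat 'cpu' line into a dict of cumulative counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    if len(parts) < 1 + len(FIELDS):
        raise ValueError("kernel does not report a steal counter")
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def cpu_shares(before, after):
    """Percentage of the interval spent in each state between two samples."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * value / total for name, value in delta.items()}


# Made-up samples standing in for two reads of /proc/stat.
first = parse_cpu_line("cpu  1000 0 500 8000 100 0 0 400 0 0")
second = parse_cpu_line("cpu  1600 0 700 8100 100 0 0 500 0 0")
shares = cpu_shares(first, second)
print("idle %.1f%%, steal %.1f%%" % (shares["idle"], shares["steal"]))
```

In this invented interval the instance is idle only 10 percent of the time and loses another 10 percent to steal, the kind of pattern that suggests it is CPU-bound even though its own utilisation figure is below 100 percent.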

There's no substitute, however, for benchmarking. As with physical servers, the choice and relevance of any particular benchmark is heavily dependent on the workload profile of your actual tasks, and there's yet to be any form of cross-platform benchmark worthy of the name. Any worthwhile metrics strategy will include a way of spotting and correcting CPU-bound instances and checking SLA promises.

Energy

This final metric is the most unusual. As the major limiting factor in datacentre engineering, per-CPU and per-server energy consumption is something that cloud providers are intensely aware of. It's also something they are most unwilling to communicate, except disguised in overall costs.

As energy usage starts to move towards a smart model, with enterprises prepared to accept responsibility for and expect benefits from much closer attention paid to IT energy consumption, cloud providers should be giving information to help tune this aspect of corporate computing. They don't, and won't unless pressure is put on them by customers.
