What your company can learn about data centers from the tech giants

Learning vicariously from organizations with tens of thousands of servers deployed around the world can help you avoid costly mistakes in server deployment and management.
Written by James Sanders, Contributor

Industry leaders such as Microsoft, Google, and Amazon have massive server deployments around the world. Likewise, Netflix -- which is famously cloud-first for programmatic operations -- still relies on its own infrastructure for its content delivery network (CDN). Unless you work for one of these companies, your data center deployments are unlikely to be quite that large. That said, these massive deployments offer valuable lessons that apply to smaller-scale deployments, and learning how to scale the practices of hyperscale companies down to the size of your organization is valuable in its own right.

Failover migration is vital to get right the first time

For cloud computing providers, storage is made highly available on instances, making it possible to spin up an identical instance on a different node in the event of a node failure. This availability is vital for failover migration: without it, moving an instance to a different node requires manual troubleshooting to determine the origin of the issue and to ensure customers will not lose data when the instance is spun up elsewhere. Naturally, that prevents IT staff from designing VMs to automatically spin up on a different node, and requiring IT staff to manually investigate issues can lead to downtime for customers (or, for internal use cases, applications).

Capacity growth is not necessarily linear

Computing and storage requirements are often influenced by patterns that exist in business operations. Rather than assuming that the amount of data your business generates grows evenly with the number of days worked, expect data growth to track external milestones. For example, an accounting firm is likely to see larger increases in data storage in the first quarter of a given year, as supporting documentation for tax filings is submitted in advance of an April filing deadline.

Bart McDonough, CEO of managed IT and cybersecurity firm Agio, suggests that companies "be very deliberate about capacity planning," adding that "You should never be surprised by depletions of resources bandwidth and storage." Depending on the business's size and circumstances, these levels should be assessed on a regular and deliberate basis, whether that is weekly, quarterly, or otherwise.
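To make the point about non-linear growth concrete, here is a minimal sketch of a capacity projection that uses a separate growth figure for each calendar month instead of a single average. The function name, the 36-month horizon, and all of the numbers are hypothetical and chosen only for illustration; they do not come from Agio or any other source quoted here.

```python
from typing import Optional, Sequence


def months_until_full(capacity_tb: float, used_tb: float,
                      monthly_growth_tb: Sequence[float],
                      start_month: int, horizon: int = 36) -> Optional[int]:
    """Return how many months pass before usage exceeds capacity.

    monthly_growth_tb holds one growth figure per calendar month
    (index 0 = January). start_month (1-12) is the first month counted.
    Returns None if capacity is not exceeded within the horizon.
    """
    if len(monthly_growth_tb) != 12:
        raise ValueError("expected one growth figure per calendar month")
    for elapsed in range(1, horizon + 1):
        month_index = (start_month - 1 + elapsed - 1) % 12
        used_tb += monthly_growth_tb[month_index]
        if used_tb > capacity_tb:
            return elapsed
    return None


# Hypothetical accounting firm: heavy growth in Q1, light growth otherwise.
seasonal = [4, 4, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # 21 TB per year in total
linear = [21 / 12] * 12                           # same total, spread evenly

print(months_until_full(100, 90, seasonal, start_month=1))  # 3
print(months_until_full(100, 90, linear, start_month=1))    # 6
```

With the same annual total, the evenly spread estimate suggests six months of headroom, while the seasonal figures exhaust the storage in three. That gap is the kind of surprise regular capacity reviews are meant to catch.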

Do not assume backups are bulletproof

The practical usability of backups is the Schrödinger's Cat of IT. If you run backups without periodically testing to ensure systems are restored properly from these backups without issue, do you actually have anything backed up? McDonough notes that "there are best practices we know we need to do, including checks that data storage backups and disaster recovery (DR) tests are still effective. Companies tend to only examine these in response to a problem, and less so proactively. Get these checks on the calendar, schedule them regularly, assign accountability for both the task itself and ensuring that completion is achieved."
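One way to put such a check on the calendar is a script that restores a backup to a scratch location and compares it against the source. The sketch below is a minimal illustration, not a complete DR test: the function names are hypothetical, it assumes the source files have not changed since the backup was taken, and it only covers plain files on a file system.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: str, restored_dir: str) -> List[str]:
    """List files that are missing from, or differ in, the restored copy."""
    source_root = Path(source_dir)
    restored_root = Path(restored_dir)
    problems = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_root)
        restored_file = restored_root / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"mismatch: {relative.as_posix()}")
    return problems
```

An empty list means every source file was found in the restored copy with identical contents. Anything else is a reason to investigate before the backup is actually needed.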

You do not have the purchasing power of hyperscale companies

Because of the performance requirements of hyperscale companies, much (if not all) of the hardware they deploy in their data centers is custom-designed. According to Stephen Hill, Senior Analyst for Applied Infrastructure at 451 Research, "much of the value proposition of a mega-scale environment can come from designing them with an eye for extreme efficiency of the physical factors; managing a delicate balance between compute density, power, and cooling. For them, a percentage point or two of improvement can have a major impact on their bottom line."

Hill notes that smaller organizations will not be able to match hyperscalers in terms of hardware customization, but that automation can reduce management overhead for deployments at smaller organizations just as effectively as at hyperscale companies.

Hanging on to old hardware is bad for security and morale

While hyperscale companies do not publicly disclose their hardware lifecycles, Holger Mueller, Principal Analyst & VP at Constellation Research, estimates that most systems have a two- to four-year lifecycle at hyperscale companies, and indicates that this is good guidance for enterprises as well.

Allowing hardware to stay in service beyond that time frame can have negative implications for security, as well as demoralize IT staff who must dedicate time to fixing that hardware. As business requirements vary, Mueller notes that "CIOs need to ask themselves what are the performance and security implications to let hardware linger longer than that."
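As a rough sketch of how that guidance could be applied to an inventory, the snippet below flags machines that have been in service longer than a chosen limit. The hostnames, dates, and the four-year default are hypothetical; the default simply takes the upper end of the two- to four-year estimate, and your own threshold may differ.

```python
from datetime import date
from typing import Dict, List


def overdue_for_refresh(in_service: Dict[str, date], today: date,
                        max_age_years: float = 4) -> List[str]:
    """Return hostnames whose time in service exceeds max_age_years."""
    overdue = []
    for hostname, deployed in in_service.items():
        # Approximate age in years; 365.25 averages out leap years.
        age_years = (today - deployed).days / 365.25
        if age_years > max_age_years:
            overdue.append(hostname)
    return sorted(overdue)


# Hypothetical inventory: hostname -> date the server entered service.
inventory = {
    "db-01": date(2014, 3, 1),
    "web-01": date(2017, 6, 1),
}
print(overdue_for_refresh(inventory, today=date(2018, 11, 1)))  # ['db-01']
```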

