We're in the cloud. What happens when something goes wrong?

ITIL was great for smoothing technology delivery, but the rise of cloud has made things complicated.

Written by Joe McKendrick, Contributing Writer June 10, 2022 at 3:53 a.m. PT

Dark clouds over water
Joe McKendrick

A few years back, I helped conduct a survey in which we asked IT managers how they first learn of incidents or slowdowns in their services. The leading method was via phone calls or emails from users or customers. The second leading method was via phone calls or emails from (gulp) their top executives.

That's why methodologies and certification programs such as the Information Technology Infrastructure Library (ITIL) came about, giving IT teams a proven and standardized roadmap for delivering applications and functions as reliable services. However, with the growing use of outside cloud services, ITIL -- designed in the on-premises era -- may be stretched beyond its limits.

The proliferation of cloud means more complications when it comes to running things smoothly and as a service, according to a new report out of Constellation Research. "Most enterprise IT teams are struggling to cope with the newer cloud operations -- demand-based scaling, cloud-native monitoring, observability, and incident management," says Andy Thurai, Constellation analyst and author of the report. "Most enterprises today are still not set up to handle all the IT-related incidents, or crises, in real-time. Classic legacy enterprises are set up to deal with IT incidents in old-fashioned ways, without considering the cloud, software-as-a-service nuances, or the social media venting and demand by customers that puts pressure on enterprises to fix the incidents faster than ever."

The old-fashioned method, "raising a ticket and waiting for it to progress through support levels to reach the proper subject matter expert to solve that incident, can be a disaster waiting to happen," he cautions.

Thurai points to an emerging generation of tool vendors that ostensibly cater to the hybrid environments seen at many enterprises.

Thurai provides the following guidelines for handling incidents:

Avoid incidents when possible.
Be prepared for unexpected and unplanned outages.
Identify the incident before the customers do.
Act quickly and decisively to solve the problem immediately.
Take ownership of the incident. Communicate well and in full. Own the story in digital channels.
Capture all details about the incident.
Do a blameless detailed postmortem.
Invest in proper observability tools.
Invest in a centralized incident management system.
Invest in AIOps tools.
Break things regularly and see if your theory holds.
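
As one illustration of the guideline to identify the incident before the customers do, here is a minimal Python sketch of a rolling error-rate check. It is not taken from Thurai's report: the class name `ErrorRateMonitor`, its parameters, and the threshold values are all hypothetical, and a real deployment would more likely lean on an observability platform than on hand-rolled code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ErrorRateMonitor:
    """Hypothetical helper: flags an incident when too many recent requests fail."""

    window_size: int = 100    # assumed: how many recent requests to keep
    threshold: float = 0.05   # assumed: failure fraction that triggers an alert
    min_samples: int = 20     # assumed: do not alert on a handful of requests
    _results: deque = field(default_factory=deque)

    def record(self, ok: bool) -> bool:
        """Record one request outcome; return True if an incident should be raised."""
        self._results.append(ok)
        # Drop the oldest outcome once the sliding window is full.
        if len(self._results) > self.window_size:
            self._results.popleft()
        if len(self._results) < self.min_samples:
            return False
        failures = sum(1 for result in self._results if not result)
        return failures / len(self._results) > self.threshold


# Toy run: 30 healthy requests followed by a burst of 10 failures.
monitor = ErrorRateMonitor(window_size=50, threshold=0.10, min_samples=10)
outcomes = [True] * 30 + [False] * 10
alerts = [monitor.record(ok) for ok in outcomes]
first_alert = alerts.index(True) if True in alerts else None
print(first_alert)
```

In this toy run the alert fires on the fourth failed request, well before a user would be likely to pick up the phone. The right window and threshold depend on a service's normal traffic, so the numbers here are placeholders only.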

"Making assumptions is a dangerous thing in the digital economy," Thurai cautions. "Enterprises are one major incident away from disaster, which can happen anytime. Every business leader or board member should be asking these questions of their IT executives: If a major incident happens to us, how would we manage it? Would we be able to handle it and prove to our customers that we are worthy of their trust, or will we botch it up and cease to exist? If we are not prepared now, how can we get prepared? Ask for a plan of action and proof. Be willing to fund what's necessary to make this happen."
