Why IT growth is only leading to more burnout, and what should be done about it

Nine in ten IT managers agree that companies with a high degree of automation have the most effective incident response.
Written by Joe McKendrick, Contributing Writer

While information technology work -- development, engineering, administration, hand-holding -- is considered among the world's most preferable jobs, it is also a source of unending burnout. Major causes include excessive workloads, excessive hours worked, lack of recognition, and lack of challenges, explains Nick Kolakowski in a recent Dice report.

Let's look closer at those first two factors -- excessive workloads and hours. While there are many activities tech professionals need to pursue all at once, one thing that clogs their days up more than anything is the overwhelming number of glitches, outages, breaches, and other incidents that demand their attention. 

When systems glitch, end users and customers can get extremely frustrated. They wonder: isn't anyone minding the store? The answer, of course, is that IT teams are minding the store, but they are getting just as frustrated as anyone else with the issues that arise.

Business leaders ought to be frustrated, too, because there's a real cost to their organizations as well. Incidents can cost large companies more than $100 million a year, according to a recent analysis and survey out of Constellation Research. "Even more eye-opening, 49% of those incidents are straightforward and repetitive, and can be automated away," the report's author, Andy Thurai of Constellation Research, observes.

More than half (57%) of the 317 IT managers responding to the Constellation survey indicate that they get more incidents than they can handle. "This is alarming, especially because the number of incidents is continuing to increase and incident response teams are already overwhelmed," according to Thurai. "Poor incident response experience with repetitive, manual toil can lead to employee attrition. In fact, it is cited as the top cause of employee attrition by many incident responders."

Add a lack of organizational support and awareness to the mix. "Leadership lacks visibility into top incidents, team toil, team burnout, and incident response costs," Thurai observes. "Continuing with older practices leads to too many alerts, creating alert fatigue."

While there are some robust tools and platforms on the market that help automate and alleviate this pain, the growth of cloud, analytics, and distributed systems has made incident response only more complex. "Incidents, both major and minor, are more frequent than expected," the survey shows. Plus, "the current way of responding to incidents is broken." 

There has been only modest progress in the five years since Constellation's previous survey on this topic. In both surveys, more than one-third of respondents report more than five major incidents within the past 12 months with their production cloud infrastructures (34% this year, down slightly from 38% five years ago).

"In other words, companies have not reduced the rate of major incidents with their production cloud infrastructure," Thurai points out. He cites rising rates of cloud adoption and newer applications in deployment, and the continued prevalence of manual processes to respond to IT incidents. Couple that with a shortage of skilled IT personnel, who are already overwhelmed with multiple demands.

IT managers are almost unanimous that something needs to be done, the Constellation survey shows. Nine in ten agree that companies with a high degree of automation have the most effective incident response. The same number state that "application downtime is a top cause of customer dissatisfaction and churn."

Some of the actions Thurai recommends to reduce burnout and alert fatigue include the following:

  • Automate as much as possible: This should ultimately include self-healing capabilities. "They can be pre- or post-automation remediation measures to avoid incidents."
  • Educate and train: Help IT staff become "more knowledgeable so they can solve incidents without escalating."
  • Get the business on board: "Every board member should be asking these questions of their IT executives: If a major incident happened to us, how would you manage it? Would we be able to handle it and prove to our customers that we are worthy of their trust, or would we botch it and cease to exist? If we are not prepared now, how can we get prepared? Ask for a plan of action and proof. Be willing to fund what's necessary to make this happen."
  • Take a team approach: Thurai advocates for the "automated creation of collaboration or war rooms," as well as the "immediate creation of conference or video call links associated with the incident to reduce the need for manual intervention and save valuable time during major incidents."
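
To make the "automated creation of collaboration or war rooms" idea concrete, here is a minimal Python sketch of what such automation might look like. It is an illustration only, not anything taken from the Constellation report: the Incident and WarRoom types, the severity scale, the threshold, and the calls.example.invalid link format are all hypothetical stand-ins for whatever chat and conferencing tools a team actually uses.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import uuid


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    incident_id: str
    summary: str
    severity: int  # assumed convention: 1 is the most severe


@dataclass
class WarRoom:
    """Hypothetical result of opening a collaboration space."""
    channel_name: str
    call_link: str
    responders: List[str] = field(default_factory=list)


# Assumed policy: severity 1 and 2 count as "major" incidents.
MAJOR_SEVERITY_THRESHOLD = 2


def open_war_room(incident: Incident, on_call: List[str]) -> Optional[WarRoom]:
    """Create a war room for a major incident without manual steps.

    Returns None for minor incidents, which stay in the normal queue.
    A real system would call chat and conferencing APIs here; this
    sketch only builds the names and link locally.
    """
    if incident.severity > MAJOR_SEVERITY_THRESHOLD:
        return None
    channel_name = f"inc-{incident.incident_id.lower()}"
    # Placeholder link on a reserved, non-resolving domain.
    call_link = f"https://calls.example.invalid/{uuid.uuid4().hex}"
    return WarRoom(channel_name, call_link, list(on_call))


if __name__ == "__main__":
    incident = Incident("DB-42", "Primary database unreachable", severity=1)
    room = open_war_room(incident, ["alice", "bob"])
    print(room)
```

The point of the sketch is the design choice Thurai describes: the channel and call link are derived from the incident itself the moment it is classified as major, so responders do not spend the first minutes of an outage setting up a place to talk.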

Just as important as these more direct actions to address incident fatigue is providing a rewarding, meaningful, and -- yes, I'll say it -- fun workplace atmosphere for all levels of professionals. It's time to recognize the hard work that goes into building digital businesses.
