PagerDuty, a provider of a platform for managing IT incidents, published a report this week that finds the number of critical incidents IT teams have needed to address has increased nearly 20%.
The report finds IT teams on average experienced 105 critical incidents per month. Critical incidents are defined as those involving high-urgency requests for services that were not auto-resolved within five minutes but were acknowledged within four hours and resolved within 24 hours. Some sectors, such as online learning platforms, collaboration services, travel, non-essential retail, and entertainment services, experienced an elevenfold increase in the number of critical IT incidents that needed to be addressed. At an average of 105 critical incidents a month per organization in 2020, the annual cost per organization for these incidents is $158,760.
Based on data collected from 16,000 organizations and generated by more than 700,000 users, the report suggests the level of stress IT teams have experienced during the COVID-19 pandemic has been considerable. On average, the report finds each IT incident requires 1.2 members of an IT team about 126 minutes to resolve. The average incident costs $126 in engineering time.
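The annual cost figure follows directly from the per-incident numbers, as the short Python sketch below shows. The implied hourly rate at the end is an inference, not a number from the report: it assumes each of the 1.2 responders spends the full 126 minutes on an incident.

```python
# Figures quoted from the report.
INCIDENTS_PER_MONTH = 105        # critical incidents per organization
COST_PER_INCIDENT_USD = 126      # engineering time per incident
RESPONDERS_PER_INCIDENT = 1.2    # average team members involved
MINUTES_PER_INCIDENT = 126       # average time to resolve

# Monthly and annual cost per organization.
monthly_cost = INCIDENTS_PER_MONTH * COST_PER_INCIDENT_USD
annual_cost = monthly_cost * 12

# Assumption (not stated in the report): every responder works the
# full resolution time, so person-hours = responders * minutes / 60.
person_hours = RESPONDERS_PER_INCIDENT * MINUTES_PER_INCIDENT / 60
implied_hourly_rate = COST_PER_INCIDENT_USD / person_hours

print(f"Monthly cost per organization: ${monthly_cost:,}")
print(f"Annual cost per organization:  ${annual_cost:,}")
print(f"Implied engineering rate:      ${implied_hourly_rate:.2f}/hour")
```

Under that assumption, the report's numbers work out to roughly $50 per hour of engineering time, which suggests the $158,760 figure covers responder time only and not lost revenue or other downstream costs.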
About a third of incidents appear to have occurred outside of normal working hours, resulting in members of IT teams working the equivalent of two extra hours per day, or an extra 12 weeks of work per year. Specifically, there was a 9% increase in interruptions between 6:00 p.m. and 10:00 p.m., and a 7% increase in holiday and weekend interruptions. An interruption is defined as a non-email notification generated by an incident, such as a push notification to a mobile phone, a text message, or a phone call. The number of interruptions during normal business hours increased 5%, while there was a 3% decrease in the number of interruptions during the hours when end users and the IT staff that support them are normally sleeping.
About 10% of users of the platform experienced 19 non-working hour interruptions a month, which is ten times that of the median responder. It’s not clear to what degree that 10% represents members of the IT staff who have unique skills or are simply individuals who are going above and beyond the call of duty, notes Sean Scott, chief product officer for PagerDuty. “There’s a lot of burnout potential for these individuals,” he says.
PagerDuty defines an “overworked” responder as a member of the IT staff who has seven non-working hour interruptions a month, which is three times the median for responders.
Overall, the PagerDuty platform ingests roughly 30 million events per day, which generates about one million alerts resulting in more than 500,000 interruptions that go beyond an email notification. There are also roughly 55,000 critical incidents a day.
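Those platform-wide volumes can be cross-checked against the per-organization average cited earlier. The Python sketch below is a rough consistency check, not a calculation from the report; it assumes the 55,000 daily critical incidents are spread across all 16,000 organizations in the data set and uses a 30-day month.

```python
# Daily platform-wide figures quoted from the report.
EVENTS_PER_DAY = 30_000_000
ALERTS_PER_DAY = 1_000_000
INTERRUPTIONS_PER_DAY = 500_000      # report says "more than", so a lower bound
CRITICAL_INCIDENTS_PER_DAY = 55_000
ORGANIZATIONS = 16_000               # organizations in the data set

# How much the platform filters at each stage.
events_per_alert = EVENTS_PER_DAY / ALERTS_PER_DAY
interruption_share = INTERRUPTIONS_PER_DAY / ALERTS_PER_DAY

# Assumption (not stated in the report): critical incidents are spread
# across all 16,000 organizations, and a month has 30 days.
critical_per_org_per_month = CRITICAL_INCIDENTS_PER_DAY / ORGANIZATIONS * 30

print(f"Events per alert: {events_per_alert:.0f}")
print(f"Share of alerts that interrupt someone: at least {interruption_share:.0%}")
print(f"Critical incidents per organization per month: {critical_per_org_per_month:.0f}")
```

On those assumptions, the daily total works out to a little over 100 critical incidents per organization per month, in line with the reported average of 105.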
The report finds that the absolute volume of interruptions increased only 4% year over year in 2020. The overall percentage of IT staff being interrupted is flat or trending downward. That data suggests that, overall, companies are doing a good job of spreading the load equitably across their employees.
Of course, no two organizations are exactly alike in terms of how they manage IT. There are many more organizations that don’t have an IT incident management platform than do. IT professionals might want to take that into account when they decide what type of organization they want to work for next.