From the course: DevOps Foundations: Incident Management

First response and escalation

- [Instructor] The next part of an incident is getting the right people working on it. This starts with having people who are supposed to work on incidents. I wish this went without saying, but I've worked in places that didn't have anyone whose job included responding to production problems, and in others that just said, "Well, of course, everyone should work on them." Neither of these cunning plans made for very high uptime. When no one's responsible, it's obvious why. When everyone is responsible, still no one is. It's called the bystander effect. You should put some care into an on-call schedule. You need to balance coverage with quality of life for the engineers on call. Forcing someone to be on call all the time leads to burnout, or to them leaving for a different job. Every team has different needs, so their on-call schedules look different. Weekly rotations, assigned days, follow-the-sun stripes for international teams: there's no objectively correct way of doing it. Do what ends up…
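
As a rough illustration of the simplest of those options, a weekly rotation, here is a minimal Python sketch. It is not from the course: the function names `weekly_rotation` and `on_call_for`, the engineer names, and the start date are all hypothetical, and a real team would normally let its paging tool manage the schedule.

```python
from datetime import date, timedelta


def weekly_rotation(engineers, start, weeks):
    """Return (week_start, engineer) pairs, cycling through engineers in order.

    Hypothetical helper for illustration only.
    """
    if not engineers:
        raise ValueError("at least one engineer is required")
    return [
        (start + timedelta(weeks=i), engineers[i % len(engineers)])
        for i in range(weeks)
    ]


def on_call_for(schedule, day):
    """Return who is on call on the given day, or None if no shift covers it."""
    for week_start, engineer in schedule:
        if week_start <= day < week_start + timedelta(days=7):
            return engineer
    return None


# Assumed example data: three engineers, six one-week shifts from a Monday.
schedule = weekly_rotation(["ana", "bo", "chen"], date(2024, 1, 1), 6)
```

Because exactly one name is attached to each week, there is always a single responsible person rather than "everyone", and the cycling order spreads the load so nobody is on call all the time. A lookup that returns `None` marks a gap in coverage that the schedule still needs to fill.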
