From the course: DevOps Foundations: Incident Management

Cleaning up after

- All right, you ran the problem to ground, and the incident's over. Back to bed, right? Well, not so fast. Make sure you're leaving the system in a good state before you sign off.

First, make sure the problem's really fixed. I know that sounds obvious, but I've seen probably 100 examples of an incident responder taking an action that they assume will solve the problem, "I restarted the service like always," or "I rolled the code back to the previous version that was working," and then signing off with the incident not really resolved. A green dashboard doesn't mean service is really restored. Verify as fully as possible. Exercise the system from an end-user point of view. Watch a piece of data go in one end and out the other, or whatever's relevant for that service.

Also, you may have made a number of changes during the incident that are leaving the system in an unstable state. Did you turn off the alerts because they were bugging you? Time to turn 'em back on. Did you disable…
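The advice to watch a piece of data go in one end and out the other can be sketched as a small check. This is only an illustration, not something shown in the course: `verify_end_to_end`, `InMemoryPipeline`, and the timeout values are hypothetical names and numbers, and the in-memory pipeline stands in for whatever real write and read paths your service exposes.

```python
import time
import uuid
from typing import Callable, Optional


def verify_end_to_end(
    send: Callable[[str], None],
    fetch: Callable[[str], Optional[str]],
    timeout_s: float = 30.0,
    poll_interval_s: float = 1.0,
) -> bool:
    """Push a uniquely tagged record in and wait for it to come out the other end."""
    # A unique marker ensures we see this check's record, not stale data.
    marker = f"post-incident-check-{uuid.uuid4()}"
    send(marker)

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if fetch(marker) == marker:
            return True
        time.sleep(poll_interval_s)
    return False


class InMemoryPipeline:
    """Hypothetical stand-in for a real service so the sketch runs on its own."""

    def __init__(self, healthy: bool = True) -> None:
        self.healthy = healthy
        self._store = {}

    def send(self, record: str) -> None:
        # An unhealthy pipeline silently drops writes, the kind of failure
        # a green dashboard might not reveal.
        if self.healthy:
            self._store[record] = record

    def fetch(self, key: str) -> Optional[str]:
        return self._store.get(key)


if __name__ == "__main__":
    pipeline = InMemoryPipeline(healthy=True)
    ok = verify_end_to_end(
        pipeline.send, pipeline.fetch, timeout_s=2.0, poll_interval_s=0.1
    )
    print("verified end to end" if ok else "NOT verified: keep the incident open")
```

In practice, `send` and `fetch` would wrap the service's real entry and exit points, for example an API write and a downstream read, so the check exercises the same path an end user does.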
