From the course: DevOps Foundations: Effective Postmortems
What went well
Our first step in investigating our incident is going to be thinking about what went right. Is that surprising? It shouldn't be. So far, we've learned about Safety-II and that focusing on everyday function and how things are going right is one of the keys to creating safety. Not only that, but it's a way to combat negativity bias and attribution error, and those will get you five to 10 in the state pen.
You've already built a lot of safety into your systems. Enhancing that can be more effective than chasing the latest flaw. In other words, how can you improve your system's existing immune system, both automation and people, by building on the good things you're already doing? You are doing good things, right?
In the Extended Dreyfus Model for incident lifecycles compiled by J. Paul Reed and Kevina Finn-Braun, they characterized questions that advanced organizations might ask around incident analysis. These are questions like: What aspects of our system and team contributed to our success here? During this incident and the events leading up to it, how did we actively create and sustain success? How are you already monitoring, responding, anticipating, and learning?
Asking these questions first helps in two ways. The first is that it reminds everyone that this is a well-managed system, and things go right the vast majority of the time. The second is that you look at how to build upon your strengths for mitigation instead of coming up with crazy new schemes that might sacrifice that safety.
Examples of things that you should highlight in this section of the postmortem include: Was detection timely? Was a change tested according to procedure? Were the various processes in place followed? How was the problem fixed? How did the responder figure out what was wrong and remediate it? Did any of our safety measures prevent the issue from becoming worse?
By doing this, you move your team from the just-the-facts routine of the timeline into thoughtful analysis of your system. And you start on a positive note, reminding your team of the strengths you have to build on.