From the course: IT Service Management Foundations: Problem Management

Introduction to problem management - ITIL Tutorial

Introduction to problem management

- [Instructor] Welcome to an introduction to problem management. When it seems like everyone is involved in solving some sort of problem or another, it can be tough to understand and implement the systems, tools, roles, and responsibilities that can take your problem-solving organization from good to great. Part of the challenge is in actually understanding what the specific goal of problem management is, what its phases are, why they're important, and how it can add value to your overall customer experience.
Let's start with defining some goals. The goal of problem management as a function is to reduce the likelihood and impact of incidents by identifying actual and potential causes of incidents and managing workarounds and known errors. That's quite a mouthful, isn't it? Let me translate. If you're in charge of problem management, your job is to make sure that the world around you improves. This means we examine what hurts or hinders our customer experience and we try to make it go away as much as possible. In the end, it's all about making some kind of change. If you don't change anything as a result of going through the problem management process, you haven't accomplished your goal. So no matter what your boss says about solving problems and getting them done quickly as part of your goal, all of it's for naught if nothing changes in the end.
Problems are also sometimes confused with their cousin, incidents. A problem, although related to incidents, is managed in different ways. An incident is like a fire. It has impact, usually on users or business processes, and that fire needs to be put out so that normal business activity can take place. Problems are the cause of these fires. As such, they require an investigative approach to understand how the fire happened, how to prevent it in the future, and how to put it out faster in case it happens again. As a problem manager, this makes you the fire marshal, not the firefighter.
However, if you see a fire, by all means, put on your best incident hat and go help. Don't argue with functional definitions. At the end of the day, your customer is the one being impacted, and in service operations, we are all aligned to the same goal: ensuring customer success.
Problem management is broken up into three distinct phases. The first is problem identification. This is where we seek out previous fires and look for upcoming risks that could create one. In this phase, we often use tools that can help us perform trend analysis on past incidents and analyze current service desk hot issues. We also take feedback from sources like software developers, quality control, and project test teams. Finally, we use observational and listening skills during live outages to help us understand where we can improve our overall experience. This helps us create a map of sorts for the organization. Think of it the same way you would any other map with landmarks, parks, and buildings, each representing a service or product that your company provides.
The second phase is problem control. In problem control, we aim to prioritize our map and identify which buildings, landmarks, parks, and roads are at risk of creating major issues and headaches for our residents. In technical terms: which of your products, hardware, and services are on shaky ground and in need of maintenance? Depending on the impact of this maintenance, or technical debt as it's called, we mark these assets as high risk. For each of these high-risk resources, we will aim to understand the causes and impact of failure and then put into place a plan that provides a workaround or temporary fix so that customers are not impacted as heavily. This is almost like adding a detour sign to a road that is under heavy construction so that traffic can still get to its destination.
Our third and final phase is error control. In this phase, we manage all our known errors, temporary fixes, and our detours.
The goal is to maintain, improve, and organize them so that finding a workaround and being able to execute it successfully during an incident takes less effort. Finally, we look to implement permanent solutions, ones that are cost-effective and put a stop to the volume of incidents hitting our service desk. When a permanent solution is in place, future incidents are reduced, customers don't experience delays, and overall organizational risk is lowered. When this happens, it's time to finally remove the detour signs and duct tape holding together our temporary solutions, take a step back, and congratulate the team on a job well done.
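
The trend analysis mentioned in the problem identification phase can be illustrated with a small sketch. This is not a tool from the course: the incident records, the `problem_candidates` helper, and the threshold of two are all hypothetical, chosen only to show the idea of counting recurring incidents and flagging the repeats as candidates for a problem investigation.

```python
from collections import Counter

# Hypothetical incident records: (incident_id, affected_service, category).
# Real data would come from a service desk tool; these values are made up.
incidents = [
    ("INC-001", "email", "outage"),
    ("INC-002", "vpn", "slow"),
    ("INC-003", "email", "outage"),
    ("INC-004", "email", "login failure"),
    ("INC-005", "vpn", "slow"),
    ("INC-006", "payroll", "outage"),
]


def problem_candidates(records, threshold=2):
    """Return ((service, category), count) pairs seen at least `threshold` times."""
    counts = Counter((service, category) for _, service, category in records)
    # most_common() sorts by count, highest first.
    return [(pair, n) for pair, n in counts.most_common() if n >= threshold]


for (service, category), n in problem_candidates(incidents):
    print(f"{service}/{category}: {n} incidents")
```

In this sketch, any service and category pair that recurs is treated as a possible underlying problem. A real organization would likely also weigh impact and time window, not just raw counts.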
