Join Mark Thomas for an in-depth discussion in this video Event management, part of Cert Prep: ITIL Foundations.
- Let's start with event management, the first process. This is what I like to consider the entry point into service operation. Why? Because event management is, essentially, the process that manages events through their life cycle, giving us a sound basis for operations management. So what does that really mean to us? If you think about it, this process underpins a lot of the service operation activities. It's all about monitoring, and having monitoring tools in place to be able to watch those critical CIs, the critical configuration items.
And the whole idea here is, I don't want to catch an incident after a customer has called me and said, "Hey, this isn't working." I'd like to have a monitoring tool watching this configuration item, so that as soon as it starts to wobble, I can catch it and keep it from wobbling, and we can resolve that before there's ever been an effect on a customer. So it really streamlines a lot of the operations process. The more monitoring you can get in place the better, but you have to have the resources, and you have to have a really good understanding from a responsibilities and accountabilities perspective.
Who's watching it? What actions do we take when we see certain types of alerts out of these monitoring tools? That's why it's a very powerful process. Like I said, it really helps us catch those things that could become incidents before they ever cause any damage or any harm to our customer. So the whole idea is to detect and analyze those events. Now this is a very tool-based type of process. Of course, humans have to be involved in some of the decisions, but we set these monitoring tools, these event tools, up to classify the different alerts and notifications that we have, and the three types would be what we call informational, warning, or exception.
We'll talk in more detail about those here in just a few minutes. What the tools can do is determine an appropriate, and maybe even an automated, response. Take the air conditioner in my data center, for example. If I have to keep the temperature at 70 degrees and it goes up to 72 degrees, there's some type of event monitoring tool that takes an automated action: it turns on the air conditioner, which cools it back off. So those are the things we want to be able to do throughout the key infrastructure components, and be able to track and monitor these things.
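To make the air conditioner example concrete, here is a minimal sketch of an automated response in Python. The `CoolingController` class, its method name, and the 2-degree tolerance are assumptions made up for illustration; a real event monitoring tool would have its own configuration.

```python
from dataclasses import dataclass


@dataclass
class CoolingController:
    """Hypothetical automated response for a data-center temperature CI."""
    target_f: float = 70.0     # temperature we want to hold
    tolerance_f: float = 2.0   # assumed drift allowed before the tool acts
    cooling_on: bool = False

    def handle_reading(self, temperature_f: float) -> str:
        """Process one temperature event and return the action taken."""
        if temperature_f >= self.target_f + self.tolerance_f:
            self.cooling_on = True
            return "cooling started"
        if temperature_f <= self.target_f and self.cooling_on:
            self.cooling_on = False
            return "cooling stopped"
        return "no action"


controller = CoolingController()
# Readings drift from 70 up to 72 and back down again.
actions = [controller.handle_reading(t) for t in (70.0, 71.0, 72.0, 71.0, 70.0)]
print(actions)
```

The point of the sketch is that no human is involved: the reading of 72 triggers the cooling, and the return to 70 switches it back off.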
It's an entry point, a trigger into service operation. As we said before, we want to catch these things, so the tools alert us on what we need to know about. So what is an event? An event's a change of state. It's a detectable occurrence that is significant enough for IT management to want to be able to track and understand how those changes of state take place. Therefore, we want to be able to monitor those things. That's what we're doing as a part of event management. Okay, so let's take a look at the overall scope. What are we talking about here? What do we want to monitor? Well, we want to monitor critical CIs, the status of those configuration items.
Remember, a CI is a CI because it's required for the successful delivery of a service. Therefore, if something goes wrong with that CI, it's going to have some effect on a service, so I want to catch it there. So, you can monitor the status of those. Environmental conditions, as mentioned a few minutes ago: I can track my temperature. I can have smoke detection and those types of things to look at. I can also look at things like software licenses. I can monitor those using these monitoring tools in conjunction with, say, my CMS.
I can say, "How many deployed x, y, z servers do I have?" Let's take virtual servers: "How many deployed virtual servers do we have?" "How many licenses do we have?" Then we can actually keep track of those, and have an alert sent to me when we have a gap. We can look at security. Now, security pioneered this stuff. I mean, these folks on the security side know how to work these monitoring tools, or should anyway, because they're looking at suspicious behavior, right, seeing certain things that create alerts for investigation, and those types of things that you might have.
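The license question a moment ago, deployed count against licensed count, can be sketched in a few lines of Python. The inventory dictionary stands in for whatever the CMS would actually report; the product names and numbers are invented for illustration.

```python
from typing import Dict, List, Tuple


def license_gap(deployed: int, licensed: int) -> int:
    """Return how many deployed instances lack a license (0 if covered)."""
    return max(0, deployed - licensed)


def license_alerts(inventory: Dict[str, Tuple[int, int]]) -> List[str]:
    """Build one alert message per product that has a license gap."""
    alerts = []
    for product, (deployed, licensed) in sorted(inventory.items()):
        gap = license_gap(deployed, licensed)
        if gap > 0:
            alerts.append(f"{product}: {gap} unlicensed deployment(s)")
    return alerts


# Hypothetical data: product -> (deployed, licensed)
inventory = {"virtual-server": (120, 100), "database": (8, 10)}
alerts = license_alerts(inventory)
print(alerts)
```

Only the virtual servers raise an alert here, because the database product has more licenses than deployments.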
And, of course, we're tracking normal CI activity. Somebody logs in: that's an event, so it's kept in a log. An email is sent: normal activity. So, we want to keep track of those. When you hear people say, "Let's check the logs," this is really what we're talking about, which is the scope of event management. These monitoring tools that you have, that are watching these things, can be active or passive. Active means the tool itself is checking, pinging the CIs. Passive means the tool detects and correlates information that the CIs generate, and sets up alerts on those particular CIs from that.
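As a rough illustration of the difference between active and passive monitoring, the Python sketch below polls some CIs and, separately, listens for messages that CIs push. The function and class names, the CI names, and the "error" keyword rule are all assumptions for the example, not features of any particular tool.

```python
from typing import Callable, Dict, List


def active_poll(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Active monitoring: the tool asks each CI whether it is healthy."""
    return [ci for ci, is_up in checks.items() if not is_up()]


class PassiveCollector:
    """Passive monitoring: CIs send messages and the tool only listens."""

    def __init__(self) -> None:
        self.alerts: List[str] = []

    def receive(self, ci: str, message: str) -> None:
        # Assumed rule: any message mentioning an error becomes an alert.
        if "error" in message.lower():
            self.alerts.append(f"{ci}: {message}")


# Active: each check stands in for a ping and returns True when the CI answers.
down = active_poll({"web-01": lambda: True, "db-01": lambda: False})

# Passive: the CI reports on its own and the collector filters what it hears.
collector = PassiveCollector()
collector.receive("mail-01", "User logged in")
collector.receive("mail-01", "ERROR: queue full")
print(down, collector.alerts)
```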
So there are a couple of approaches that you can look at on that. From a key conceptual standpoint, let's talk about these three types of events that you might have. The first one, as we said, is an informational event. That just generates a log entry somewhere, and that's just tracking normal CI activity. Basically, like we said, a user logs in. That's an event that's kept in a log. Those are certain events that we're watching. The second type is called a warning. What a warning means is, we haven't really reached a threshold yet, but we've come close to that threshold.
We're within a range where we want to either take action or be aware that it's actually taking place. So that warning, for example: memory usage is within a certain percentage of the threshold. I want a warning saying, "Hey, I just thought you ought to know." And then finally, we have this thing called an exception. An exception has breached some threshold that we've determined, whether it's based on my service level agreement or set within the IT organization, so action has to be taken, and that could be in the form of an incident ticket or a problem ticket.
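The three event types can be sketched as a simple threshold check in Python, using the memory usage example. The 80 and 95 percent values are assumed numbers chosen for illustration; in practice they would come from the service level agreement or from what the IT organization has decided.

```python
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


def classify_usage(percent_used: float,
                   warn_at: float = 80.0,
                   breach_at: float = 95.0) -> EventType:
    """Classify a memory-usage reading against two assumed thresholds."""
    if percent_used >= breach_at:
        return EventType.EXCEPTION      # threshold breached, action required
    if percent_used >= warn_at:
        return EventType.WARNING        # close to the threshold
    return EventType.INFORMATIONAL      # normal activity, log only


for reading in (50.0, 85.0, 97.0):
    print(reading, classify_usage(reading).value)
```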
Now, think about what you can do here. Let's take a look. You have alerts that come out, and those usually are from these exceptions. Generally, with the activities here, you don't have to worry about memorizing every step, but the event takes place, we're notified, it gets detected, and after it's been filtered it gets logged, right, somehow, in a log. Then it comes up as one of these three types: exception, warning, or informational. If it's informational, a log entry is made, right. We may review that at some point to make sure the actions that were done were correct.
We also take a look at that next one, which is the warning. A warning, remember: we've almost reached a threshold, we're within a certain percentage, and we want to know about it. There's a correlation engine (and again, we tell the tools this) that helps us determine whether we need a human response or an auto-response, or how else we want to respond. But the exception is the one where we've breached that threshold, and now we go into, say, the incident or problem process, or even the change process.
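The flow just described (filter, log, then respond by event type) might look like this in Python. It is only a sketch: the filter rule, the response table, and the sample events are assumptions, and the correlation engine is reduced to a dictionary lookup.

```python
from typing import Dict, List


def is_significant(event: Dict[str, str]) -> bool:
    """Filter step: drop events from a source we chose not to track (assumed rule)."""
    return event.get("source") != "test-lab"


def respond(event: Dict[str, str]) -> str:
    """Stand-in for the correlation engine: pick a response per event type."""
    responses = {
        "informational": "log only",
        "warning": "notify operator",
        "exception": "open incident",
    }
    return responses.get(event["type"], "needs human review")


def process(events: List[Dict[str, str]], log: List[Dict[str, str]]) -> List[str]:
    """Filter each event, log the ones that pass, and collect the responses."""
    actions = []
    for event in events:
        if not is_significant(event):
            continue
        log.append(event)
        actions.append(respond(event))
    return actions


event_log: List[Dict[str, str]] = []
sample_events = [
    {"source": "web-01", "type": "informational"},
    {"source": "test-lab", "type": "exception"},   # filtered out
    {"source": "db-01", "type": "warning"},
    {"source": "db-01", "type": "exception"},
]
results = process(sample_events, event_log)
print(results)
```

Which response belongs to which event type is exactly the kind of thing we tell the tools up front; the table here is just one possible choice.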
You can see why this is very important for you. Event monitoring tools: could those be CIs themselves? Absolutely. If I have an event monitoring tool that is monitoring a critical CI, and that monitoring tool goes down, I have an incident on my hands. I can't see what's behind it. I can't get automated alerts. So, that's a consideration that you may also want to have. It's very important that you know what these event monitoring tools are for.
You tune them, or you create the messages that you want them to send. You route them to the right responses, and you have to know who is responsible and accountable for handling these different types of responses on the event management side. I will tell you that this is an entry point into service operation, because next it's only logical that we should talk about incident management.
ITIL® is a registered trade mark of AXELOS Limited. This ITIL Foundations course is offered by Interface Technical Training, ATO of EXIN.