Incident Management

Definitions & Scope

Incident Management is responsible for handling all Incidents. These include all kinds of malfunctions, errors and bugs, whether reported by users or technical staff or detected and reported automatically by monitoring tools and systems.

The primary goals of Incident Management are:

  • to restore normal service operation as quickly as possible after Incidents,
  • to minimize adverse impacts on business operations,
  • to maintain the best possible levels of service quality and availability.

An Incident is defined as:

  • an unplanned interruption to a service (e.g. unavailability of the mail system) OR
  • a reduction in the quality of a service (e.g. reduced network bandwidth or increased response times) OR
  • a failure of a service component which is necessary for service provision, even if the service has not been impacted yet (e.g. failure of one part of a hardware cluster)

All occurrences of such Incidents with an actual or potential negative impact on service quality are handled by Incident Management. Input to the process can come from end users, GS & IT staff, or other processes such as Event Management.

Incident Management Process Overview

The Incident Management process design faces the following challenges:

  • standardized collection and documentation of information
  • correct and consistent classification and dispatching of Incidents
  • correct prioritization of Incidents and implementation of deadlines
  • further issues (e.g. escalation, automation, involving the right people and excluding the wrong ones)

 

Classification

Incident Urgency classes:

  1. High: The damage caused by the Incident increases rapidly.
  2. Medium: The damage caused by the Incident increases considerably over time.
  3. Low: The damage caused by the Incident increases only marginally over time.

Incident Impact classes:

  1. Down: critical adverse impact on the service.
  2. Degraded: major adverse impact on the service.
  3. Affected: minor adverse impact on the service.
  4. Disrupted: small number of the population affected.

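The Priority [P] is obtained by combining Impact [I] and Urgency [U]: [P] = [U] + [I] - 1, where [U] and [I] are the class numbers listed above. Priorities therefore range from 1 (highest) to 6 (lowest).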

VIP treatment is implemented by raising the Priority by one step (e.g. a Priority 2 becomes a Priority 1, a 6 becomes a 5, etc.).
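The priority rule and the VIP adjustment can be sketched in a few lines of Python. This is only an illustration: the names `Urgency`, `Impact` and `priority` are hypothetical, and the behaviour for a VIP Incident that is already at Priority 1 (it stays at 1) is an assumption, since the text does not cover that case.

```python
from enum import IntEnum


class Urgency(IntEnum):
    """Urgency classes, numbered as in the list above."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


class Impact(IntEnum):
    """Impact classes, numbered as in the list above."""
    DOWN = 1
    DEGRADED = 2
    AFFECTED = 3
    DISRUPTED = 4


def priority(urgency: Urgency, impact: Impact, vip: bool = False) -> int:
    """Return the Priority P = U + I - 1, raised by one step for VIPs."""
    p = int(urgency) + int(impact) - 1
    if vip:
        # Assumption: Priority 1 is already the highest and cannot be raised further.
        p = max(1, p - 1)
    return p
```

For example, `priority(Urgency.MEDIUM, Impact.DOWN)` gives 2, and the same call with `vip=True` gives 1, matching the example in the text.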

The Classification Matrix

[Figure: Classification matrix]
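Because the Priority follows directly from [P] = [U] + [I] - 1, the matrix can be regenerated from the class numbers. The sketch below does that; the layout (one row per Impact class, one column per Urgency class) and the function names are assumptions and may differ from the original figure.

```python
# Class numbers and names as listed in the Classification section.
URGENCY = {1: "High", 2: "Medium", 3: "Low"}
IMPACT = {1: "Down", 2: "Degraded", 3: "Affected", 4: "Disrupted"}


def classification_matrix() -> dict[str, list[int]]:
    """Map each Impact class to its Priorities, ordered by Urgency (High, Medium, Low)."""
    return {
        impact_name: [u + i - 1 for u in URGENCY]
        for i, impact_name in IMPACT.items()
    }


def format_matrix() -> str:
    """Render the matrix as a plain-text table."""
    header = f"{'Impact / Urgency':<18}" + "".join(f"{name:>8}" for name in URGENCY.values())
    lines = [header]
    for impact_name, row in classification_matrix().items():
        lines.append(f"{impact_name:<18}" + "".join(f"{p:>8}" for p in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_matrix())
```

Under this layout the "Down" row reads 1, 2, 3 and the "Disrupted" row reads 4, 5, 6.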

Sample Restoration Deadline Matrix

[Figure: Sample restoration deadline matrix]

Page last updated on: 30 January 2017 at 17:15