Creativity & Inspiration at Work



Online Newsletter

April 2016  ~ Creating a Computer Incident Response Team

Case Study
Creating a Project Team
Developing an Incident Response Plan
Implementing the Incident Response Team
Benefits
Resources (links, books, articles, the lighter side)


What happens in your company when a critical computer system goes down?  What happens when your customer-facing website goes down unexpectedly?  What happens when your network crashes?

With technology woven through so much of our everyday business activities, any of those situations can cripple a business for minutes, hours, days or weeks if not dealt with properly.

If you don’t have a good plan in place for handling outages, this article may spark some discussions about how to handle them better.

Case Study

An organization I had recently joined found itself in deep trouble when a software patch was applied and the updated software would not work at all.  The problem was discovered just at quitting time on a Monday.  The person who first learned about the problem went home without reporting it to management or focusing any attention on it.

The next day, internal customers were livid when none of their office programs worked.  It took several days before the problem was identified and eventually fixed.  We later learned that the patch caused the software to contact a server that was no longer in service.

That incident and the lack of any organized plan for handling computer system outages became a rallying cry for a better business process to avoid such severe customer pain.  Management agreed that we needed to do a better job and supported figuring out how to do that.

Creating a Project Team

I will use the process we developed as an example.  Readers can adapt or adjust as appropriate for their company or organization.

A team was created to come up with a better business model for handling computer outages.  In our organization, we gathered a representative from each area of the information services department:

  1. Help desk
  2. Applications development
  3. Network support services
  4. Mainframe operations
  5. Desktop support services
  6. Security
  7. Website support.

Developing an Incident Response Team Plan

Over the course of a couple of months, we put together a plan that included:

  1. Purpose, Scope and Approach
  2. Definitions
    1. What is an "Incident" in our organization?
    2. What triggers activation of the Incident Response Team?
    3. Post Incident Review process
  3. Roles and Responsibilities
    1. Incident Command Leader
    2. Communication Leader
    3. Documentation Leader
    4. Help Desk Coordinator
    5. Incident Response Team Members
    6. Management Representatives
  4. Communication Plan
    1. Communication within the Incident Response Team
    2. Communication from the Incident Response Team to Impacted Stakeholders
    3. Post-Incident Review Communication and Incident Tracking
  5. Escalation Path
    1. Incident Response – Severity Level 1 – Low
    2. Incident Response – Severity Level 2 – Medium
    3. Incident Response – Severity Level 3 – High

While the organizing team was doing their job, we had a few unexpected outages that allowed us to practice what we were working on and fine-tune our process.  Over a few years, it became a very well-run process that our internal and external customers came to appreciate very much.

Implementing the Incident Response Team

We developed a standing interdisciplinary Incident Response Team (IRT) to lead problem identification, communication, documentation and resolution activities once an "Incident" was determined to have occurred.  The team had pre-defined levels of authority appropriate for incident resolution.  We assigned primary members and backup members for each role.  The IRT designation was not the person's only job; their function on the IRT was directly related to the job they already had.  

Whenever a potential issue was identified, the IRT met quickly to determine whether it was urgent enough to activate the IRT plan.  If it was, we assigned a severity level (1 – low impact, 2 – medium impact, 3 – high impact) and quickly assigned roles: Incident Command Leader, Communication Leader, Documentation Leader and Help Desk Coordinator.  We also assigned other staff and duties as appropriate for the incident.
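For readers who track incidents in software rather than on paper, the severity levels and role assignments described above could be modeled roughly as follows.  This is only a sketch: the class names, the `assign` helper and the people's names are hypothetical, not part of our original process.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    """The three severity levels used in the plan."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# The four roles assigned whenever the plan is activated.
CORE_ROLES = (
    "Incident Command Leader",
    "Communication Leader",
    "Documentation Leader",
    "Help Desk Coordinator",
)


@dataclass
class Incident:
    summary: str
    severity: Severity
    roles: dict = field(default_factory=dict)  # role name -> person

    def assign(self, role, primary, backup, primary_available=True):
        """Give the role to the primary member, or to the backup if needed."""
        person = primary if primary_available else backup
        self.roles[role] = person
        return person

    def unfilled_roles(self):
        """Core roles that nobody has been assigned to yet."""
        return [role for role in CORE_ROLES if role not in self.roles]


# Hypothetical example: names are placeholders.
incident = Incident("Office programs fail after a patch", Severity.HIGH)
incident.assign("Incident Command Leader", primary="Alex", backup="Sam")
incident.assign("Communication Leader", primary="Pat", backup="Lee",
                primary_available=False)
print(incident.unfilled_roles())
```

Keeping a primary and a backup for every role mirrors the way we staffed the team, so a role is never left empty because one person is out.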

If the incident lasted more than 2 hours or was designated a Level 3 (High Impact), the manager responsible for the affected area was assigned as Management Representative.  For example, if it was a network issue, the Network Manager became the Management Representative for the IRT.  If it was a virus, the Security Manager became the Management Representative.  If it was a software application issue, the Software Applications Manager became the Management Representative, etc.
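The escalation rule above is simple enough to write down as a small function.  The sketch below assumes hypothetical area labels ("network", "virus", "software application") and a generic fallback for areas the article does not list; adapt both to your own organization.

```python
from datetime import timedelta

ESCALATION_AFTER = timedelta(hours=2)  # "more than 2 hours"
HIGH_IMPACT = 3                        # Severity Level 3

# Examples taken from the article; the dictionary keys are assumed labels.
MANAGEMENT_REP_BY_AREA = {
    "network": "Network Manager",
    "virus": "Security Manager",
    "software application": "Software Applications Manager",
}


def needs_management_rep(severity, elapsed):
    """True if the incident is Level 3 or has lasted more than 2 hours."""
    return severity == HIGH_IMPACT or elapsed > ESCALATION_AFTER


def management_rep(area, severity, elapsed):
    """Return the Management Representative's title, or None if not needed."""
    if not needs_management_rep(severity, elapsed):
        return None
    # Fallback wording is an assumption for areas not named in the article.
    return MANAGEMENT_REP_BY_AREA.get(
        area, "Manager responsible for the affected area")


print(management_rep("network", 1, timedelta(hours=3)))
print(management_rep("virus", 3, timedelta(minutes=5)))
print(management_rep("network", 2, timedelta(hours=1)))
```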

We developed a 1-page Checklist covering Who, What, Where, Why, When and How, which helped us quickly gather the appropriate resources to deal with an issue. (See example of Checklist.)

The Communication Leader would then start the notification process to management and our internal customers giving them details about the incident: what happened, what we were doing about it, when they would see resolution and / or when we would provide an update.  They also issued the "Incident Resolved" notice when the problem was fixed.

We learned very early that when a technical problem happens, the technical people who can fix it should focus all their time and energy on fixing the problem, not dealing with customer communications and a barrage of questions.  People who can communicate with customers should be assigned the communication role so the technical people can concentrate on what they do best.

We developed communications templates that could be used and updated quickly with the current incident’s information.  We developed a process for how to notify internal customers if the network or phone system went down (e.g., walking the floors of offices, using the phone system instead of email, using walkie-talkies, using cell phones, etc.).  Our process also included when to post notices on our internal intranet as well as external customer-facing websites and social media about outages.
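A fill-in-the-blanks communication template of the kind described above might look something like this.  The wording and field names are illustrative assumptions, not our actual templates; the fields follow the details the Communication Leader sent out (what happened, what we were doing about it, expected resolution, next update).

```python
from string import Template

# Field names and wording are hypothetical placeholders.
INCIDENT_NOTICE = Template(
    "INCIDENT NOTICE (Severity $level)\n"
    "What happened: $what_happened\n"
    "What we are doing: $action\n"
    "Expected resolution: $resolution\n"
    "Next update: $next_update\n"
)

RESOLVED_NOTICE = Template(
    "INCIDENT RESOLVED\n"
    "What happened: $what_happened\n"
    "Resolution: $resolution\n"
)

notice = INCIDENT_NOTICE.substitute(
    level=2,
    what_happened="Office programs fail to start after a software patch.",
    action="The applications team is investigating the patch.",
    resolution="Not yet known",
    next_update="2:00 PM",
)
print(notice)
```

Because `substitute` raises an error when a field is left out, an incomplete notice is caught before it is sent rather than after.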

We developed a step-by-step plan for each severity level for each role: who did what, when and how.  And, when an "Incident" was called, that incident became the IRT’s highest priority.  Everyone went to work to get the issue resolved as quickly as possible.

Each business group that had a representative on the IRT also developed a written process for what they would do when there was an Incident in their focus area.  Each IRT member and backup person had a 3-ring notebook with their overall IRT and focus area written plans.  That gave them something they could grab quickly and check off what needed to be done.  If, for any reason, someone else needed to step in for a team member, everyone knew where to look to take over easily.  We even developed telephone scripts that could be used to broadcast a voicemail message to appropriate people in addition to email notifications.

After an Incident was designated, the IRT team members met quickly as needed for status updates (usually, about every 2 hours) so that we could keep impacted stakeholders updated with progress reports regularly.  Some incidents were resolved quickly; some went on for several hours or days before they could be resolved.
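As a rough illustration of that cadence, the times of the next few status updates can be computed from the moment an Incident is declared.  The function name and the example date are assumptions; the 2-hour interval is the usual spacing mentioned above and can be changed.

```python
from datetime import datetime, timedelta


def status_update_times(declared_at, count, interval=timedelta(hours=2)):
    """Return the next `count` status update times after the declaration."""
    return [declared_at + interval * n for n in range(1, count + 1)]


# Hypothetical example: an Incident declared at 9:00 AM.
declared = datetime(2016, 4, 4, 9, 0)
updates = status_update_times(declared, 3)
for when in updates:
    print(when.strftime("%I:%M %p"))
```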

After an Incident was resolved and customers notified of "all clear," we held a Post Incident Review meeting within a week to review what happened, what went well, what didn’t go well and what improvements we could make, either in the IRT process or in any other area affected.  Over time, those review meetings provided a large number of process improvements that prevented other failures.
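One possible way to keep those reviews on track is a simple record with the four review questions and a due date one week after resolution.  This is a sketch with hypothetical field names, not the form we actually used.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class PostIncidentReview:
    resolved_on: date
    what_happened: str = ""
    went_well: list = field(default_factory=list)
    did_not_go_well: list = field(default_factory=list)
    improvements: list = field(default_factory=list)

    @property
    def due_by(self):
        """The review meeting should be held within a week of resolution."""
        return self.resolved_on + timedelta(days=7)

    def is_overdue(self, today):
        """True once the one-week window has passed."""
        return today > self.due_by


# Hypothetical example dates.
review = PostIncidentReview(resolved_on=date(2016, 4, 6))
review.improvements.append("Report problems to the help desk before going home.")
print(review.due_by)
```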

Benefits

The IRT team members were committed to the process, given the authority to take action quickly and became very much respected by management and customers.  Everyone in the organization learned that if a computer problem was noticed by anyone at all, there was a clear path to let the right people know about it so it could get resolved.

One of the side benefits of our Incident Response Team process was that it helped to raise awareness in the IT staff about the customer impact of a network change that went wrong, a new application that failed, a technology change that caused unexpected customer disruption or the business impact of any other type of technology outage.  It also provided a clearly identified business process, with multiple people aware of and responsible for handling any possible service disruptions.

Computer failures happen all the time, for a wide variety of reasons.  While we can rarely predict sudden outages, as IT professionals, we can do a much better job helping customers know what to expect and we can do our very best to fix a problem as quickly as possible.  We certainly owe our customers the simple courtesy of letting them know when the technology they use every day isn’t working correctly.

Customers — internal and external — expect to be treated with respect.  Having an effective computer Incident Response Plan in place is simply good business.  Not having one in place can lead to great customer dissatisfaction, wasted time trying to identify and fix problems, and can cost a business a great deal of money and unnecessary public relations headaches.

 

  Internet Resources

Books  Disclosure: We get a small commission for purchases made via links to Amazon.

  • Incident Management for I. T. Departments. Darren O'Toole.  CreateSpace, 2015. ISBN 978-1511631747 
  • Incident Response & Computer Forensics — Third Edition.  Jason T. Luttgens, Matthew Pepe, Kevin Mandia.  McGraw-Hill, 2014.  ISBN 978-0071798686 
  • The Computer Incident Response Planning Handbook: Executable Plans for Protecting Information at Risk — 1st Edition.  N. K. McCarthy, Matthew Todd, Jeff Klaben.  McGraw-Hill, 2012.  ISBN 978-0071790390 

Articles

Related newsletter articles:
    June 2001 - Successful Project Management
    November 2006 - Project Management - Early Warning Signs
    December 2000 - Sponsoring Successful Projects
    May 2010 - The 5 Goals of a Project Manager
    November 1996 - Management vs. Leadership
    April 2001 - Consulting Skills for Managers
    June 2004 - Successful Stakeholdering
    August 2008 - Secrets of New Project Success


About our resource links:  We do not endorse or agree with all the beliefs in these links.   We do keep an open mind about different viewpoints and respect the ability of our readers to decide for themselves what is useful.


Page updated: October 16, 2023      

This page is http://www.itstime.com/apr2016.htm


© Copyright 1980 - 2015, Barbara Taylor