Canadian provincial utility Manitoba Hydro uses a root cause analysis process to determine the cause of and develop corrective actions for safety incidents at its generating facilities. Based on the results of a staff investigation, the utility diagnoses the cause of the incident and uses that information to improve safety practices.
By Ronald A. Zimmer and T. Brent Robertson
Accident prevention is the name of the game when it comes to worker safety at hydro facilities. Modern safety programs are designed to ensure that work can be accomplished without accidents, especially the ones that cause injuries to workers. One key aspect of this prevention is to ensure that accidents do not reoccur and that minor incidents are corrected before they become serious and cause injury or property loss. This is where an effective accident investigation and analysis process is critical.
Manitoba Hydro avoids use of the term “accident” whenever possible. In our organization, there are “incidents” — incidents that result in injury and incidents that do not.
Immediate causes of safety incidents are only symptoms of deeper problems in the system. Finding the underlying problem requires asking what systemic breakdown of the organization’s safety or other systems allowed the immediate cause to exist.
[Caption: Safety officers use this safety kit to investigate incidents at Manitoba Hydro facilities.]
Serious incidents are seldom the result of a single cause. Consider this example:
A fire starts in a pile of scrap wood and garbage on the hydro plant floor. If the only remedy is to extinguish the fire each time it starts but nothing is done about the accumulation of combustibles in the workplace, it is only a matter of time until the plant burns down or is seriously damaged. If a worker gets injured as a result of the fire and all that is done is to extinguish that fire, it is only a matter of time until the injuries become more serious or even fatal.
Taking the analysis of a safety incident to the root of the situation means looking beyond the circumstances immediately preceding the event. Sometimes just asking why the procedure was out of date or why the worker did not use the personal protective equipment is enough to flush out the answer. Having a root cause analysis system available that consistently leads the investigator to the underlying cause, used in conjunction with an incident investigation, is the best way Manitoba Hydro has found to accomplish this goal.
Choosing to implement root cause analysis
The move to a root cause analysis process for incident investigation grew out of an effort to enhance the accident investigation skills of Manitoba Hydro’s safety officers. In 2003, a safety review team directed by the corporate safety and health committee identified enhanced incident investigation training as one requirement for safety officers at Manitoba Hydro. Major accidents had always been investigated thoroughly, but until then recommendations were based only on the details revealed in the investigation. Those details mainly concerned immediate causes, and the resulting recommendations were ultimately subjective.
The provincial government provided further incentive to incorporate root cause analysis when a requirement to identify the direct and indirect causes of serious incidents was included in the revision of Workplace Safety legislation in Manitoba, enacted in February 2007.
The model Manitoba Hydro chose in 2004 for incident investigation and analysis is based on the loss causation model identified in the 1970s and uses the systematic cause analysis technique (SCAT) for root cause analysis.¹
Manitoba Hydro personnel quickly realized that applying this process to all incidents would require involving more than just the corporation’s 40 to 50 safety officers. Thus, training on the use of this process for incident investigation and root cause analysis was extended to include all managers, supervisors, and workplace safety and health committee members. Members of this committee are elected worker representatives who, by law, are required to participate in all inquiries, investigations, studies, and inspections pertaining to employee health and safety. The depth of participation varies from full involvement in the investigation and SCAT analysis to simply reviewing the finished product to provide worker feedback.
[Caption: Manitoba Hydro corporate safety officers use the systematic cause analysis technique (SCAT) for root cause analysis of incidents at its hydroelectric facilities. This investigation allows the utility to proactively prevent future incidents.]
The safety and health of Manitoba Hydro workers is the responsibility of line managers. The aim was to give these managers and the committee members the tools to conduct or assist with the investigation and analysis of what we call “low risk” incidents, as defined by the “evaluation of loss potential.” As a result, the potential training class grew to 500 or more people throughout the corporation.
Working to develop the process
In conjunction with the move to a more robust incident investigation process, Manitoba Hydro developed a new incident investigation policy that mandates staff to report all safety incidents and requires that supervisors become directly involved in the investigation process. In fact, the supervisor is now the key person in getting the process started. This new policy also requires that all incidents be investigated regardless of whether they result in an injury. This is a more proactive way of thinking. The implementation of controls for near misses is intended to prevent the incident from reoccurring and eliminate the risk of loss from future incidents.
Another enhancement to the process includes using a risk assessment matrix for all incidents that considers loss severity potential, probability of recurrence, and frequency of exposure to the hazard. This is called the “evaluation of loss potential.” The level of investigation and the resources assigned to this investigation are based on the outcome of the risk assessment rather than the outcome of the incident. The greater the risk of loss (e.g., personal injury, equipment damage), the more in-depth the investigation and the more resources assigned to the investigation process.
Applying the process
Ultimately, the supervisor of the injured employee (or supervisor of the working group in which the incident took place) fills out an incident investigation form that answers basic questions and completes a risk assessment. The risk assessment itself consists of three questions:
— What is the loss severity?
— What is the probability of recurrence?
— What is the frequency of exposure?
The answer to each of these questions is high, medium, or low and will complete the risk equation (risk = loss severity × probability of recurrence × frequency of exposure). For example, the breakdown of frequency of exposure is:
— High = frequently or continuously or >50 percent
— Medium = occasionally or 10 to 50 percent
— Low = very rarely/rarely or <10 percent
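As a minimal sketch, the frequency-of-exposure breakdown above could be encoded as a small lookup. The function name `exposure_rating` is hypothetical, and treating the percentage as the share of working time exposed to the hazard is an assumption; the article gives only the thresholds.

```python
def exposure_rating(percent: float) -> str:
    """Rate frequency of exposure as high, medium, or low.

    Thresholds follow the breakdown in the text: above 50 percent is high,
    10 to 50 percent is medium, and below 10 percent is low.
    Assumption: the percentage is the share of working time exposed to
    the hazard. The article does not say what the percentage measures.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent > 50:
        return "high"
    if percent >= 10:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Boundary values: 50 and 10 both fall in the medium band.
    for value in (75, 50, 10, 5):
        print(value, exposure_rating(value))
```

The two boundary values are placed in the medium band because the text describes medium as "10 to 50 percent" and the other bands as strictly above or below.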
Based on a defined matrix, risk will be determined as high, medium, or low:
— High = minimum of two high and one medium
— Medium = minimum of two medium
— Low = minimum of two low and no high
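The three matrix rules above can be sketched in code. This is one reading of the summary, not Manitoba Hydro's actual matrix: "two high and one medium" is taken to mean at least two high ratings with the third at least medium. Some combinations (for example high, medium, low) are not covered by the three rules as stated, so the hypothetical function `risk_category` returns `None` for them instead of guessing.

```python
from collections import Counter
from typing import Optional

LEVELS = ("high", "medium", "low")


def risk_category(severity: str, recurrence: str, exposure: str) -> Optional[str]:
    """Combine the three high/medium/low answers into a risk category.

    Returns None when the three rules summarized in the text do not
    cover the combination (the full matrix is not reproduced there).
    """
    ratings = (severity, recurrence, exposure)
    for rating in ratings:
        if rating not in LEVELS:
            raise ValueError(f"unknown rating: {rating!r}")
    counts = Counter(ratings)

    # High: at least two high, and the remaining answer at least medium.
    if counts["high"] >= 2 and counts["low"] == 0:
        return "high"
    # Medium: at least two medium.
    if counts["medium"] >= 2:
        return "medium"
    # Low: at least two low and no high.
    if counts["low"] >= 2 and counts["high"] == 0:
        return "low"
    return None  # combination not covered by the stated rules


if __name__ == "__main__":
    print(risk_category("high", "high", "medium"))
    print(risk_category("medium", "low", "medium"))
    print(risk_category("low", "low", "medium"))
    print(risk_category("high", "medium", "low"))
```

Returning `None` keeps the gaps visible; a real form would need the complete matrix to assign every combination a category.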
Should the evaluation of loss potential put the incident in the low risk category, the supervisor continues the investigation with a shortened version of the SCAT. If the evaluation of loss potential is medium or higher, then field safety and corporate safety officers lead the investigation using a detailed report template.
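The routing rule in the preceding paragraph is simple enough to write down directly. The sketch below is illustrative only; the names `InvestigationPlan` and `plan_investigation` are hypothetical and do not come from Manitoba Hydro's forms.

```python
from typing import NamedTuple


class InvestigationPlan(NamedTuple):
    lead: str    # who leads the investigation
    method: str  # which analysis or report format is used


def plan_investigation(risk_category: str) -> InvestigationPlan:
    """Choose who investigates and how, based on the evaluated risk."""
    if risk_category == "low":
        return InvestigationPlan("supervisor", "shortened SCAT")
    if risk_category in ("medium", "high"):
        return InvestigationPlan(
            "field safety and corporate safety officers",
            "detailed report template",
        )
    raise ValueError(f"unknown risk category: {risk_category!r}")


if __name__ == "__main__":
    print(plan_investigation("low"))
    print(plan_investigation("high"))
```

Note that the input is the evaluated risk, not the outcome of the incident, which matches the article's point that a near miss with high loss potential gets the fuller investigation.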
Consistency in all investigations, coupled with an adequate level of detail, is an essential element when applying root cause analysis. It is vital to have all of the facts and for all of the facts to tell the same story. The physical evidence must match the accounts of the event given by the people involved and any witnesses.
Two points quickly proved essential to the success of the process:
— Incidents are complex events that require advanced investigative techniques and a systematic analysis approach to ensure a correct understanding of their root causes; and
— Thorough investigation is required to ensure adequate information is available for root cause analysis. A thorough analysis is required to ensure proper development of corrective actions.
Training is critical to the success of the process. Safety officers at Manitoba Hydro receive comprehensive training so that they can support supervisors and workplace safety and health committees in their use of the process. Incident investigation training is included as one component of the four-day Safety Training for Leaders program provided to all supervisors and managers. A computer-based training module has also been developed to assist with completion of the investigation report form using this process, and incident investigation training is offered as a 6.5-hour course to supervisors, managers, and workplace safety and health committee members.
Half of the 6.5-hour training session was dedicated to teaching participants a common standard for incident investigation. The other half, including case studies, was devoted to the use of SCAT. The SCAT half proved to be the most difficult portion of the training because the material was new to most participants and much of the terminology was not “hydro language.” Terms like “loss exposure” and “critical task observation” were foreign to most employees.
The key to the use of a root cause analysis process such as SCAT is to understand that it is a systematic process. SCAT follows the loss causation model.¹ The process is sequential and needs to be followed in the proper order, without skipping steps. The root or basic cause of an incident cannot be determined until the immediate cause or causes have been defined.
The process starts at the top of the chart with identification of the incident and a risk assessment. The incident identification is a description of the loss potential if there is no control put in place to prevent the incident from reoccurring. Examples may include statements like “Equipment damage due to fire” or “Multiple worker injuries due to vehicle collision.” Next is the event, which describes the contact with the energy source and is categorized into one of 20 predetermined statements. Examples include “fall from elevation,” “contact with electricity,” “caught in,” and “struck by.”
After the type of event has been determined, the next step is to focus on the immediate cause, defined as “the circumstances that immediately precede the contact with the energy source.” The immediate causes are selected from another predetermined list of substandard acts and substandard conditions on the SCAT chart.
Because the immediate causes in the list are generic in nature, it is necessary to further describe them in terms of the incident before moving on to selection of the root causes. In most cases, there is more than one immediate cause for an event. In all cases, it is recommended to brainstorm all potential immediate causes and shortlist them by review and consensus before moving on to the root cause.
Using SCAT, root cause is defined as “the underlying reasons that allow the substandard act or substandard condition to exist.” The root cause or causes are selected from a predetermined list that is divided into personal factors (such as “lack of knowledge,” “abuse or misuse,” or “improper motivation”) and job/system factors (such as “inadequate work standards,” “excessive wear and tear,” and “inadequate communication”). Each of these categories is further defined to pinpoint the exact cause as closely as possible. A further description in terms of the specific event may also be necessary for each root cause identified.
The final step is to develop corrective actions. Corrective actions may address immediate causes, root causes, or more likely both. Those corrective actions that address root cause are the most important in this process because they address the systemic issues. These corrective actions will improve the systems in place that control the identified root cause.
For example, if the root cause was a lack of knowledge on the part of the worker, the corrective action may be additional training for all workers who perform this task. Other corrective actions may include reworking the training schedule to avoid missing the training again or reviewing the tracking process to see if there were any problems with that process.
If the root cause was inadequate preventative maintenance of a piece of equipment, it may be necessary to make adjustments to the maintenance program by reviewing the process from beginning to end. Were the maintenance standards wrong? Were the standards not followed? Were there standards at all? Who enforces these standards? Did the employees know about the standards? These and many other questions will be answered using the SCAT, and the corrective actions are based on the answers to these questions.
There is an expectation that corrective actions clearly define the desired action, the name of the person responsible for its implementation, and the expected date of completion. This makes the corrective action measurable and inserts a level of accountability into the process.
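The sequence described in this section, and the three things every corrective action is expected to state, can be captured in a small record. This is a sketch under stated assumptions, not the structure of Manitoba Hydro's form: the class and method names are hypothetical, the sample immediate cause, person, and date are invented for illustration, and requiring a root cause before any corrective action is one reading of "without skipping steps."

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class CorrectiveAction:
    action: str       # the desired action, clearly defined
    responsible: str  # name of the person responsible for implementation
    due: date         # expected date of completion

    def __post_init__(self) -> None:
        if not self.action.strip() or not self.responsible.strip():
            raise ValueError("corrective action needs an action and a named person")


@dataclass
class ScatAnalysis:
    incident: str  # loss potential if no control is put in place
    event: str     # type of contact with the energy source
    immediate_causes: List[str] = field(default_factory=list)
    root_causes: List[str] = field(default_factory=list)
    corrective_actions: List[CorrectiveAction] = field(default_factory=list)

    def add_immediate_cause(self, cause: str) -> None:
        self.immediate_causes.append(cause)

    def add_root_cause(self, cause: str) -> None:
        # A root cause cannot be determined until an immediate cause is defined.
        if not self.immediate_causes:
            raise RuntimeError("define the immediate cause(s) first")
        self.root_causes.append(cause)

    def add_corrective_action(self, action: CorrectiveAction) -> None:
        # Assumption: corrective actions come last, after root cause selection.
        if not self.root_causes:
            raise RuntimeError("select the root cause(s) first")
        self.corrective_actions.append(action)


if __name__ == "__main__":
    analysis = ScatAnalysis(
        incident="Multiple worker injuries due to vehicle collision",
        event="struck by",
    )
    analysis.add_immediate_cause("operating at improper speed")  # illustrative
    analysis.add_root_cause("lack of knowledge")
    analysis.add_corrective_action(
        CorrectiveAction(
            action="Provide additional training for all workers who perform this task",
            responsible="J. Doe",  # hypothetical name
            due=date(2030, 1, 31),  # hypothetical date
        )
    )
    print(len(analysis.corrective_actions), "corrective action(s) recorded")
```

Making the person and date mandatory fields is what turns a recommendation into something that can be tracked to completion.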
Results and future work planned
Requiring investigation of all incidents, even those considered low risk, has increased the level of involvement in the investigation process at the supervisor level and allowed corrective actions to be developed and implemented regardless of whether the incident resulted in loss or injury.
Corrective actions defined as a result of incident investigations are more effective because they address root cause. Changes made to the corporate systems used to manage safety at Manitoba Hydro have improved the entire safety system. In one case, the inspection of potential rental properties was enhanced to ensure that they were evaluated for the presence of asbestos before leasing them to Manitoba Hydro employees. This check was not embedded in the original process, and the gap would not have been corrected had an incident not occurred and its root cause been determined.
Incidents without injury, or “near misses” as they are sometimes called, are subjected to the SCAT. A near miss is just an injury waiting to happen, and determining root cause before an injury or loss occurs has obvious advantages. Small gaps in operating and switching procedures in the production group have been rectified as a result of this process.
There have also been challenges. Ensuring the root cause analysis process is understood and used as intended can be a tough haul. As with any new method, there is a learning curve. The people who must use the system need to be trained and given time to become accustomed to its use. Without adequate training and support, there is the possibility of misclassifying incidents and having a minor incident over-investigated (which can use up valuable time and resources) or a major event not investigated at all (which could lead to the possibility of similar incidents occurring).
Using the SCAT properly (in sequence) and applying it consistently throughout the entire corporation is challenging. This relates partly to training and experience with the system.
The intent of a safety program is to reduce the number of incidents in the workplace. Using a root cause analysis process to analyze safety incidents requires a level of familiarity with the system in use, but a typical supervisor may not investigate many incidents, depending on the number of employees and the type of work performed. On one hand, we want the supervisor to understand how to properly investigate and analyze an incident; on the other hand, we do not want him or her to get enough practice to become proficient at it, because that would mean incidents are occurring frequently.
Implementing the process for hydro
The Generation Business Unit of Manitoba Hydro implemented the root cause analysis technique and is using it with reasonable success.
In addition, improvements have been made to the process. At the beginning stages, supervisors were required to fill out two forms, which involved duplicate work. This has been streamlined by combining the Supervisors Incident Investigation form, which includes the SCAT analysis, and the Supervisors Report of Injury required by the Workers Compensation Board administrator at Manitoba Hydro.
The form now also calculates the risk assessment automatically. Once the loss severity, probability of recurrence, and frequency of exposure have been checked, the risk category is populated without further input.
In the beginning stages of this process, Manitoba Hydro corporate safety officers reviewed all of the completed incident investigation forms and provided constructive feedback using an evaluation form. This helped to improve the overall quality of information being gathered and tracked for incidents, which ultimately aided in trending common safety incidents and developing strategies for improvements. The process will continue to improve over time and will continue to provide Manitoba Hydro with an effective investigation and analysis tool designed to get to the root of the matter!
1 Bird, Frank E., and George L. Germain, Practical Loss Control Leadership, DNV, Oslo, Norway, 1996.
Ron Zimmer and Brent Robertson are corporate safety officers in the Workplace Safety Department at Manitoba Hydro.