Deception of Credible Scenario

Risk identification is a fundamental phase of a risk assessment. It begins after the system and objectives are defined. Personal experience, rationality and the concept of a credible scenario are important in the context of risk identification. Often, we hear that practitioners consider only credible scenarios in a risk identification exercise.

Today we explore why only considering credible scenarios is a mistake and could lead to disaster.

Credible and Incredible Scenarios

We believe that considering only credible scenarios is censorship. Risk analysis should prioritize the scenarios, but no scenario should be discounted at first. The prioritization resulting from the risk assessment will take care of eliminating far-fetched or meaningless scenarios. That is a result of the analysis, not an arbitrary decision made prior to analysis.
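One way to picture this is to keep every identified scenario in the register, score all of them, and only then apply a tolerance threshold. The sketch below is a minimal illustration, assuming risk is expressed as annual probability times consequence. The scenario names, the numbers and the threshold are hypothetical and do not come from any real assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    annual_probability: float   # hypothetical estimate, per year
    consequence: float          # hypothetical loss, arbitrary monetary units

    @property
    def risk(self) -> float:
        """Expected annual loss: probability times consequence."""
        return self.annual_probability * self.consequence


def prioritize(scenarios, tolerance):
    """Rank all scenarios by risk, then split them around the tolerance.

    Nothing is discarded before scoring: filtering is an output of the
    analysis, not an input to it.
    """
    ranked = sorted(scenarios, key=lambda s: s.risk, reverse=True)
    retained = [s for s in ranked if s.risk >= tolerance]
    screened_out = [s for s in ranked if s.risk < tolerance]
    return retained, screened_out


if __name__ == "__main__":
    # Hypothetical register, including a scenario most would call "incredible".
    register = [
        Scenario("overtopping in extreme storm", 1e-3, 5e8),
        Scenario("foundation liquefaction", 1e-4, 2e9),
        Scenario("meteorite impact", 1e-9, 2e9),
    ]
    retained, screened_out = prioritize(register, tolerance=1e3)
    for s in retained:
        print(f"retain     {s.name}: {s.risk:,.0f}")
    for s in screened_out:
        print(f"screen out {s.name}: {s.risk:,.0f}")
```

The far-fetched scenario still drops out, but it does so because its computed risk falls below the stated tolerance, which leaves a documented and repeatable reason for the decision.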

It has been said that it takes 10,000 hours of practice to reach expert status in a subject, a rule Malcolm Gladwell discusses in his bestseller Outliers. When an expert says that a failure can’t happen, or that a failure mode is beyond belief, it means that such a failure was unheard of during the expert’s 10,000 hours of observation or practice. Those 10,000 hours are not necessarily real hands-on experience, and they are not necessarily spent in the system’s real environment. Yet we all know that site-specific environmental conditions can considerably change, for example, the likelihood of failure.

As a side note, putting ten experts together will not considerably alter the final result. If their observation periods overlapped, the sum of their experience adds little, and they may very well all agree, erroneously, that a failure is far-fetched.

Tailings Dams Failures

Note that, in the context of tailings dams, those 10,000 hours are negligible compared to the record of every incident across the entire portfolio of dams. In fact, 10,000 hours is only slightly more than one year of operation of a single dam (8,760 hours), which in turn is negligible compared to the roughly 3×10⁷ collective operational hours accumulated each year by 3,500 dams. Of those 3,500 dams, perhaps 3−4 suffer a major failure each year.
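The arithmetic behind this comparison can be checked in a few lines. The sketch below is only illustrative. It takes 3.5 as the midpoint of the 3−4 failures per year, and it assumes failures arrive as a Poisson process with the same constant rate at every dam, a simplification the text above does not make. The function and constant names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8_760        # 365 days x 24 hours
DAMS = 3_500                  # portfolio size quoted in the text
FAILURES_PER_YEAR = 3.5       # assumed midpoint of the 3-4 range
EXPERT_HOURS = 10_000         # the "expert" threshold


def collective_hours_per_year(dams: int = DAMS) -> int:
    """Operational hours accumulated by the whole portfolio in one year."""
    return dams * HOURS_PER_YEAR


def prob_expert_saw_no_failure(expert_hours: float = EXPERT_HOURS) -> float:
    """Chance that an expert watching a single dam for expert_hours
    never witnessed a major failure (Poisson assumption)."""
    rate_per_dam_hour = FAILURES_PER_YEAR / collective_hours_per_year()
    expected_failures = rate_per_dam_hour * expert_hours
    return math.exp(-expected_failures)


if __name__ == "__main__":
    total = collective_hours_per_year()
    print(f"Collective hours per year: {total:,}")
    print(f"Expert hours as share of one collective year: {EXPERT_HOURS / total:.4%}")
    print(f"P(expert saw no major failure): {prob_expert_saw_no_failure():.4f}")
```

Under these assumptions the collective experience is 30,660,000 hours per year, and an expert who spent all 10,000 hours watching one dam would almost certainly never have seen a major failure. That is exactly why "I have never seen it" is weak evidence that a failure cannot happen.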

3−4 Major Tailings Dam Failures per Year

Those 3−4 major failures per year have been painstakingly collected and analyzed over the last hundred years. Publications like ICOLD (2001) (see figure below) attempted to define failure modes for dams of different types. The large number of failures attributed to unknown causes highlights the uncertainties embedded in the definition of the failure modes. Furthermore, consider the slope stability category, for example. How do we know that those slope stability accidents weren’t caused by erosion, seepage, foundation problems or perhaps a small earthquake?

[Figure: failure modes of tailings dams, ICOLD (2001)]

Proper Taxonomy

Developing a correct taxonomy of the failure modes would have required detailed and complex forensic analyses that were, unfortunately, not performed. Thus, the expert’s 10,000 hours of observation rely on scant reports, hearsay, and uncertain data. Even recent statistical studies and information collection efforts (Azam and Li, Church of England) have revealed information gaps that make censoring failure modes a very doubtful practice.

In recent years, however, in the aftermath of some of those 3−4 failures per year, expert panels have carried out detailed, scientific forensic analyses that generally identify a set of circumstances that led to failure. This highlights the tragic truth: it is not one single failure mode that causes the deterioration of a dam, but a combination of triggers. A perfect storm of concurrent minor failure modes can contribute to the failure of the dam.
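To see why a combination of individually minor deficiencies matters, consider a toy calculation. It is purely illustrative and is not a model taken from any forensic report. It assumes, hypothetically, that each deficiency multiplies the odds of failure by a fixed factor and that the factors act independently. The baseline probability, the deficiency names and the multipliers are made-up numbers.

```python
from typing import Sequence


def compounded_probability(baseline: float, multipliers: Sequence[float]) -> float:
    """Apply odds multipliers to a baseline annual probability of failure.

    Working on odds rather than probabilities keeps the result below 1
    no matter how many multipliers are stacked.
    """
    odds = baseline / (1.0 - baseline)
    for m in multipliers:
        odds *= m
    return odds / (1.0 + odds)


if __name__ == "__main__":
    baseline = 1e-4  # hypothetical annual probability for a sound dam
    # Hypothetical deficiencies and odds multipliers, for illustration only.
    deficiencies = {
        "poor drainage": 3.0,
        "steepened slope": 2.5,
        "gaps in monitoring": 2.0,
        "weak foundation layer": 4.0,
    }
    for name, m in deficiencies.items():
        p = compounded_probability(baseline, [m])
        print(f"{name} alone: {p:.2e}")
    combined = compounded_probability(baseline, list(deficiencies.values()))
    print(f"all together: {combined:.2e}")
```

Taken one at a time, each deficiency shifts the probability only modestly, and each might be dismissed as not credible on its own. Together, in this toy example, they multiply the baseline odds by sixty.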

Failure Modes and Causality

It becomes apparent that failure modes, and in particular “credible” failure modes alone, are insufficient. They do not explain why failures occur. They explain, and not even completely, how a failure could occur under the influence of a very limited set of triggers. Forensic studies have shown that failures occur because of a set of causes. A dam may fail in a slope instability failure mode, but the causes of that failure are far more complex. The question is: why do we keep trying to use a model we know is flawed? Is it because we put together groups of experts who easily concur, having similar backgrounds and using the same knowledge base?

ORE2_Tailings™ Causalities

ORE2_Tailings does not look at failure modes. It considers over 30 key performance indicators of the structure under study and predicts the causalities of a potential failure.

The algorithm follows the rule that failure is due to the compounded effect of various deficiencies. By doing so, ORE2_Tailings mimics the results of forensic investigations.

Closing Remarks

We cannot rely on our experience-based intuition; it is simply not enough. Nor can we put too much trust in knowledge gained from incomplete and often misleading information. Using censored and simplistic failure modes to determine whether a failure can happen within required probability limits is wrong. Every scenario must be considered credible prior to the risk assessment, allowing the risk assessment methodology to filter out low-risk scenarios. No arbitrary decisions should enter into this filtering, so that the assessment can provide a clear risk mitigation roadmap.