Under Pressure: Deepwater Horizon and Why Systems Fail (Part I)

When disaster hits your business, do you know where to start to make sure it doesn’t happen again? Part I of this week’s Deming Files introduces the value of applying Deming and Goldratt together, with special attention to Goldratt’s Theory of Constraints, to the Deepwater Horizon explosion to understand which issues to tackle first.

On 20 April 2010 a blowout caused an explosion on the Deepwater Horizon oil drilling rig at the Macondo Prospect well in the Gulf of Mexico. Eleven crewmen were killed, 17 were injured, and the blowout ignited a fireball visible from 35 miles (56 km) away. The resulting fire could not be extinguished, and on 22 April 2010 the Deepwater Horizon rig sank, leaving oil gushing from the well at the sea floor. This caused the largest offshore oil spill in United States history. In addition to the human tragedy and environmental toll, the economic costs were huge and mounted with each passing day. The oil leak from the ocean floor was not declared "effectively dead" until 19 September 2010, five months later.

The official report on the disaster prepared by the National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling (NC) states, "The explosive loss of the Macondo well could have been prevented." As one Commissioner explained, "The root causes are systemic and, absent significant reform in both industry practices and government policies, might well recur."

The detailed NC report indicates there are many, many things to be done. We observe that all may be important, but some are more important than others. That observation leads us to Deming and Goldratt because they help us determine the most important places to start the improvement process.

Dr. W. Edwards Deming, the eminent, insightful consultant who wrote on management and quality issues, and Eliyahu M. Goldratt, perhaps most famous for his pioneering work on the Theory of Constraints (ToC), provide a few "lenses." These lenses show us where to start and how to make the solutions to prevent a future disaster both more effective and more efficient.

In our experience the lessons and solutions Deming and Goldratt offer are universal; that is, they apply to organizations of all kinds and in all countries. Nor do you need to wait until after a disaster to implement and benefit from them.

The NC report makes very clear that the problem was systemic. To get a picture of what "systemic" means, please visualize thousands of dominos. Each individual domino is standing on its end in a line of other dominos. Many of the domino lines cross over to other lines of dominos. All the dominos are on a big platter, covered with a glass dome. Pressure can be added to the atmosphere under the dome. Undue pressure added to the system of dominos can cause one or more dominos to become unstable and to fall, setting off a chain reaction. Success means the dominos remain standing, because if enough dominos fall in enough lines the result can be an explosion, fire, deaths, rig failure, and oil flowing from the ocean floor. The dominos are policies, procedures, methods, management beliefs, maintenance, corporate culture, people, tools, equipment, materials, measures, etc.

Undue pressure can cause the dominos to fall

In terms of the Deepwater Horizon, please note, we do NOT presume to know what needs to be done to "fix" offshore drilling. Furthermore, we know that many people have strong feelings about offshore drilling, both for and against. It can be a highly political issue as well. We take no position except that we believe using the body of knowledge of Deming and Goldratt would be advantageous.

Our theory is that had a few Goldratt and Deming lenses been firmly in place, the pressure that caused the dominos to become unstable, fall, and create a disaster would have been greatly diminished. Similarly, putting a few lenses into place now will help prevent future disasters.

We have identified two lenses from Goldratt and Deming for your consideration.

Lens: Work on the biggest "system" constraint first

Goldratt provides several helpful insights. For example, don’t try to improve everything everywhere in the system at once. Sure, detailed improvements throughout the system will need to be put in place, but first, focus on the constraint that affects the entire system. Once that significant constraint is removed, go to the next most significant, then the next, then the next. Again, we ask you to visualize thousands of dominos on a platter under a dome. They are standing on end and in lines, the lines interweaving with other lines of dominos.

Here is the theory (real world examples will be discussed in Part II): a constraint that puts pressure on the entire system can cause a lot of dominos to wobble in a lot of places throughout the system. This is undesirable.

So where should we start our improvement process? Are we smarter if we try to stabilize dominos one at a time so they can withstand the pressure? (Examples of this might be rewriting safety policies and procedures, or adding more safety enforcement officers.) Or are we smarter if we first try to remove the most significant constraint that increases pressure on the entire system?

Following Goldratt’s advice, it is best to remove one constraint at a time, starting with the most powerful one that affects the entire system. Then go to the next most powerful and then the next, et cetera. By removing a constraint that affects the entire system, you remove some of the pressure that causes dominos to wobble under the dome. You have thus made the entire system (all those lines of dominos) safer, and you have saved the time and money that would have been needed to buttress one domino at a time.
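The reasoning above can be illustrated with a toy model. The Python sketch below is only an illustration under invented assumptions: the domino names, tolerance values, and pressure figures are hypothetical and are not drawn from the NC report or from Goldratt’s writings. It treats system pressure as the sum of a few constraints and compares how many dominos still wobble after buttressing a single domino versus after removing the largest constraint.

```python
from dataclasses import dataclass


@dataclass
class Domino:
    name: str
    tolerance: float  # hypothetical: pressure the domino withstands before wobbling


def unstable(dominos, constraints):
    """Return names of dominos whose tolerance is below total system pressure."""
    pressure = sum(constraints.values())
    return [d.name for d in dominos if d.tolerance < pressure]


def buttress(dominos, names, boost):
    """Return a copy of dominos with extra tolerance added to the named ones."""
    return [
        Domino(d.name, d.tolerance + boost) if d.name in names else d
        for d in dominos
    ]


def remove_biggest_constraint(constraints):
    """Return a copy of constraints without the largest pressure source."""
    biggest = max(constraints, key=constraints.get)
    return {k: v for k, v in constraints.items() if k != biggest}


# All values below are invented for illustration only.
dominos = [
    Domino("routine inspection", 4.0),
    Domino("safety cut-off maintenance", 5.0),
    Domino("process testing", 6.0),
    Domino("decision review", 7.0),
]
constraints = {
    "quota pressure": 5.0,
    "schedule pressure": 2.0,
    "cost pressure": 1.0,
}

# Baseline: total pressure is 8.0, so every domino wobbles.
baseline = unstable(dominos, constraints)

# Option A: buttress one domino; the system-wide pressure is unchanged.
after_buttress = unstable(
    buttress(dominos, {"routine inspection"}, boost=5.0), constraints
)

# Option B: remove the largest constraint; pressure drops for every domino.
after_removal = unstable(dominos, remove_biggest_constraint(constraints))

print(len(baseline), len(after_buttress), len(after_removal))
```

In this toy setup, buttressing one domino still leaves three unstable, while removing the largest constraint leaves none. Real systems are far less tidy, but the sketch shows why the order of improvement work matters.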

What are examples of pressures on the entire system?

Dr. Deming tells us that commonly accepted but harmful management practices, such as management by quotas and management by fear and pressure, destabilize people and processes. They affect the entire system (we will discuss them in greater detail in Part II), and they blind people to the consequences of decisions made under pressure. Those management practices can cause people to skip a routine inspection, to understate the potential impact of the deterioration of a safety cut-off, or to be tempted to put an untested process in place on the spur of the moment.

Another way to look at what Goldratt advises in ToC: if you start your improvement process by trying to shore up the stability of individual dominos, you are not affecting the constraint that is putting pressure on the overall system. You are also doing improvement work that might not need to be done at all IF you removed the constraint that is pressurizing the entire system of dominos. For example, the pressure to drill faster is likely to cause procedural workarounds, a need for faster and more on-the-fly decisions, and misdiagnosis of root issues. Individually such elements might not be significant, but a preponderance of them could create a critical mass of instability, with devastating outcomes.

Here is the tricky part: the temptation is to start working on improving individual dominos everywhere in the system. Why are we tempted to do that? Because of what Goldratt calls UDEs: Undesirable Effects.

Goldratt points out that to identify a constraint, it is useful to identify the UDEs that are related to the constraint. However, if we start by looking at 100 or 1,000 UDEs, we will want to try to get rid of all the UDEs everywhere, and we will have missed the point. Working to stabilize each of the individual dominos does not really help us remove the significant constraint that puts pressure on the entire system. By focusing on the dominos themselves, and not on the pressure that makes them unstable to begin with, we are assuming the root cause is within each domino. Thus, we will miss the fact that the pressure on the system of dominos is a big, systemic, root-cause constraint.

Fortunately, the NC report identified many of the biggest UDEs, including poor worker safety and poor rig reliability. Again, just because we might see lots of UDEs somewhere in the system doesn’t mean they are the first ones we need to address. Working on UDEs related to individual dominos is not only insufficient but often wasteful and even counter-productive. Working on localized UDEs to buttress localized unstable dominos is NOT going to remove the more systemic constraint or the UDEs related to worker safety and rig reliability.

This is very different from putting in checks and double checks on individual dominos that affect safety in domino lines (such as double checking for signatures on routine inspections). Sure, we need checks and, in some cases, double checks too, but only after we have removed the constraints that pressurize the system and affect the behavior of many people, causing them to take shortcuts because they want to achieve their quotas and perceive that doing so means they are doing a good job. If we put those checks and double checks in place without removing the big constraints (one after another) that pressurize the system, we get fooled into thinking we have error-proofed the system! But we haven’t error-proofed the system because we haven’t removed the pressure.

Even worse: not only have we not error-proofed the system at a high level by removing the source of pressure that destabilized the dominos, we will also have added the cost of checks and double checks on unstable dominos, dominos that would be just fine had we removed the pressures that destabilize them. Unfortunately, we will also have increased bureaucracy and slowed down the system, even clogged it with unneeded procedures.

Without working from the most systemic constraint to the least systemic constraints, we can actually make things worse, more expensive, and slower. We recognize that with offshore drilling there are multiple, large, and very complex systems involved. Therefore, it makes sense for each large system to work on its systemic constraints first and to coordinate and communicate across systems.

As a Commissioner explained, the mandate from the NC is, "We must begin by putting safety and reliability first." The UDEs of human death and threats to the environment are indeed of paramount and systemic importance. Focusing on the big constraints that put pressure on the system and compromise safety and reliability (making them unstable) is essential to error-proofing the entire system.

We turn to Deming for insights on the big constraints of management practices in Part II, which will be published on Thursday 21 April 2011.

Authors’ Note: for references and suggested readings from Deming and Goldratt, please send us an e-mail.

Editor’s Note: The columns published in THE DEMING FILES have been written under the Editorial Guidelines set by The W. Edwards Deming Institute. The Institute views these columns as opportunities to enhance, extend, and illustrate Dr. Deming’s theories. The authors have knowledge of Dr. Deming’s body of work, and the content of each column is the expression of each author’s interpretation of the subject matter.