Measuring the Unmeasurable: Gage R&R for Transactional Six Sigma Projects
How many times has this happened to you? You’re leading a Six Sigma project on a transactional process of some kind, something not directly tied to manufacturing or measurement of product quality. You get to the Measure phase of your Six Sigma project and struggle to figure out how to satisfy the requirement for a Gage R&R statistic to interpret. If that’s ever happened to you, read on for a solution to this sticky problem.
Where Gage R&R Fits into a Six Sigma Project
Before we get into the details, I want to spend a few words talking about where a gage study fits into a Six Sigma project and a little bit on the "spirit" of the Gage R&R requirement. Gage R&R is the second step in the Measure phase of the Six Sigma DMAIC process. Gage R&R comes after process mapping and building a data collection plan, and before we calculate the baseline capability of our process to be improved. Gage R&R also comes up again in the Control phase of the Six Sigma DMAIC process for the purpose of ensuring that we are able to measure the critical control parameters adequately to maintain the gains that we have achieved.
There are good reasons why Gage R&R is placed where it is in the process. A gage study follows process mapping because we must understand the process we are trying to improve and where the data about that process can be found before we can measure it. A gage study precedes calculating baseline capability because we need to be able to ensure that the data is good before we use it.
The reason we want to do a gage study boils down to confidence and good decision making. In Measure, we do a gage study of the data used to generate the Project Y or Critical To Quality (CTQ) measurement. This is the issue that is most important to the customer of the process. Why do we need to have confidence in this data? So that, as we carry the data forward to capability analysis and to root cause analysis in the Analyze phase, we can trust the conclusions we draw and the results we see. That’s it: confidence and good decision making.
To understand the importance of a gage study, imagine your automobile speedometer for a moment. Imagine you’re driving down the road and your speedometer indicates that you're traveling at 55 miles per hour just as you pass a parked police car. Imagine your surprise if that policeman pulled you over and wrote you a ticket for going 65 miles per hour! It would have been great to know that your speedometer was inaccurate by 10 miles per hour. You might have made a different decision while passing the parked police car.
Getting Back to our Initial Six Sigma Project Problem
We are leading a Six Sigma project with attribute data rather than continuous data measured on a device. What do we do to ensure that we can trust the data, decisions, and conclusions that will follow? Attribute agreement is the answer.
Attribute agreement is a method of comparing the responses made by "appraisers" when judging the characteristic of interest. In an attribute agreement study there are four possible levels of analysis of the responses: 1. appraiser against themselves; 2. appraiser against other appraisers; 3. appraiser against a standard (if one exists); and 4. overall appraiser capability.
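As a rough illustration of these four levels, the following Python sketch computes simple percent agreement from a small set of made-up ratings. The appraiser names, ratings, standard, and function names are all hypothetical and are not the case study data; a statistical package may define or report these percentages somewhat differently.

```python
# Hypothetical ratings: ratings[appraiser] = [trial 1 codes, trial 2 codes].
# These values are invented for illustration, NOT taken from the case study.
ratings = {
    "A": [["SL", "EP", "RT", "SL"], ["SL", "EP", "RT", "SL"]],
    "B": [["SL", "EP", "SL", "SL"], ["SL", "EP", "RT", "SL"]],
}
standard = ["SL", "EP", "RT", "SL"]  # assumed known-correct code per sample
n_samples = len(standard)


def within_appraiser(trials):
    """Level 1: share of samples one appraiser coded the same on every trial."""
    return sum(len({t[i] for t in trials}) == 1 for i in range(n_samples)) / n_samples


def between_appraisers(all_ratings):
    """Level 2: share of samples where all appraisers agree on all trials."""
    return sum(
        len({t[i] for trials in all_ratings.values() for t in trials}) == 1
        for i in range(n_samples)
    ) / n_samples


def versus_standard(trials):
    """Level 3: share of samples where every trial matches the standard."""
    return sum(
        all(t[i] == standard[i] for t in trials) for i in range(n_samples)
    ) / n_samples


def all_versus_standard(all_ratings):
    """Level 4: share of samples where every appraiser matches the standard on every trial."""
    return sum(
        all(t[i] == standard[i] for trials in all_ratings.values() for t in trials)
        for i in range(n_samples)
    ) / n_samples


for name, trials in ratings.items():
    print(name, within_appraiser(trials), versus_standard(trials))
print("between:", between_appraisers(ratings))
print("overall vs standard:", all_versus_standard(ratings))
```

In this invented data set, appraiser B changes their answer on the third sample between trials, so B's within-appraiser agreement and every measure that includes B drop to 75 percent.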
A case study helps explain the tool and how to interpret the results.

A Six Sigma project has been chartered to look into the high occurrence of off-quality product due to expired shelf life. This type of off-quality product typically accumulates to about $1 million annually.
 Our Data: Classification codes of off-quality reasons from ERP
 Our Problem: Determine whether we can trust that everything classified as shelf life is really a shelf life issue
 Possible Choices: SL=Shelf Life; EP=Experimental Product; RT=Retained Sample
 Each appraiser judged the samples twice
Our Data:
Figure 1
Figure 2
Once the proper selections have been made, go ahead and conduct the analysis and you’ll get results that look like this:
Figure 5
These graphs are interpreted as follows. The graph on the left shows how consistently each appraiser agrees with their own earlier decisions across successive trials. This graph indicates that we may have a training issue with appraiser number 3 regarding their understanding of the criteria for the decision. The graph on the right shows each appraiser's percentage of agreement with the standard, if one exists. (If no standard is chosen, this panel will be blank.) This graph indicates that appraiser number 2 agrees 100 percent with the standard, while appraisers number 1 and 3 appear to be somewhat confused about the standard.
Fleiss’ Kappa Statistic
Next we move on to interpret the session window statistics, but first, a brief explanation of the Kappa statistic.
The basis for the Kappa statistic is a comparison to random chance. Imagine flipping a coin to make a quality decision on a process; that’s random chance. Kappa compares the results gathered through the study with the possibility that those results could have been randomly generated, as if by flipping a coin or rolling a die.
Kappa ranges from -1 to +1, with a value of 0 indicating agreement no better than random chance. The closer the Kappa statistic gets to 1, the less likely it is that the results are due to random chance. Said a different way, the less chance-like the results, the more likely that the appraisers (getting back to the Six Sigma project case study) are actually able to discern differences between the categories.
Kappa values less than 0 indicate that the responses agree less often than random chance would generate. Compare that with the old test-taking advice of answering C when you don’t know the answer: even pure guessing is right some of the time. A negative Kappa indicates that the appraiser cannot distinguish the categories or is not willing to try.
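To make the comparison with chance concrete, Fleiss' Kappa can be written as (observed agreement minus chance agreement) divided by (1 minus chance agreement). The Python function below is a minimal sketch of that calculation, not the output of any particular statistical package. The function name, the input layout (one list of category codes per sample, with the same number of ratings for every sample), and the two tiny data sets are assumptions made for illustration.

```python
from collections import Counter


def fleiss_kappa(rows):
    """Fleiss' Kappa for rows of category codes.

    rows: one list per sample holding every rating given to that sample.
    Assumes each sample received the same number of ratings (at least 2).
    """
    n = len(rows[0])                      # ratings per sample
    n_samples = len(rows)
    categories = sorted({code for row in rows for code in row})
    counts = [Counter(row) for row in rows]

    # Observed agreement: average share of agreeing rating pairs per sample.
    per_sample = [
        (sum(c[k] ** 2 for k in categories) - n) / (n * (n - 1)) for c in counts
    ]
    p_observed = sum(per_sample) / n_samples

    # Chance agreement: based on how often each category was used overall.
    shares = [sum(c[k] for c in counts) / (n_samples * n) for k in categories]
    p_chance = sum(s ** 2 for s in shares)

    if p_chance == 1:
        raise ValueError("Kappa is undefined when only one category is ever used")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data, invented for illustration (two ratings per sample).
perfect = [["SL", "SL"], ["EP", "EP"], ["RT", "RT"]]   # ratings always agree
opposite = [["SL", "EP"], ["SL", "EP"]]                # ratings never agree

print(fleiss_kappa(perfect))    # 1.0
print(fleiss_kappa(opposite))   # -1.0
```

For a within-appraiser Kappa, one reasonable arrangement (again an assumption here) is to treat each of an appraiser's trials as a separate rating, so each row holds that appraiser's trial 1 and trial 2 codes for one sample.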
The hypotheses regarding Kappa are as follows:
 H0: The agreement within appraiser is due to chance
 H1: The agreement within appraiser is not due to chance
The way to relate the Kappa statistic to a typical Gage R&R result is to subtract Kappa from 1 to get an approximation of a Gage R&R value. So if Kappa is 0.9, subtract 0.9 from 1 and the remainder is 0.1, or 10 percent Gage R&R. This is just a way to translate the Kappa result into terms that Six Sigma Master Black Belts and Black Belts understand. The same rules of interpretation of a gage study result apply with attribute studies. Just to refresh, the AIAG guidelines for acceptability of gage studies are:
Gage R&R > 30 percent = unacceptable, measurement process needs improvement
Gage R&R between 10 percent and 30 percent = marginal, measurement system needs improvement
Gage R&R < 10 percent = acceptable
Interpret Attribute studies using the same rules.
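The translation and the acceptability rules above can be sketched in a few lines of Python. The function names are hypothetical, and treating exactly 10 percent and exactly 30 percent as marginal is an assumption, since the guidelines as listed do not say which band the boundary values fall into.

```python
def approx_gage_rr(kappa):
    """Approximate Gage R&R as 1 - Kappa (the rough translation described above)."""
    # Rounding avoids floating-point noise such as 1 - 0.9 = 0.09999999999999998.
    return round(1.0 - kappa, 6)


def classify(gage_rr):
    """Apply the acceptability bands listed above to a Gage R&R fraction."""
    if gage_rr > 0.30:
        return "unacceptable"
    if gage_rr >= 0.10:   # assumption: boundary values count as marginal
        return "marginal"
    return "acceptable"


for kappa in (0.95, 0.90, 0.80, 0.50):
    gage_rr = approx_gage_rr(kappa)
    print(f"Kappa {kappa:.2f} -> approx. Gage R&R {gage_rr:.0%} -> {classify(gage_rr)}")
```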
Below are the statistical results for the two-panel graph shown in Figure 5, along with the specific interpretation (Figures 6 and 7).
Figure 6
Figure 7
Figure 8
Six Sigma Project Case Study Conclusion
The final conclusion from this Six Sigma project case study was that the engineers making this decision needed to become better at categorizing scrap product. This one finding, when corrected, reduced the occurrence of the problem by nearly 50 percent and allowed the team to correctly interpret the magnitude of the problem originally stated. Failure to address the attribute agreement issues would have led to a vastly different set of solutions than those identified after this problem was corrected.
Use Attribute Agreement Analysis for Good Decision Making
Attribute agreement analysis is an effective method for delivering a statistical interpretation of subjective judgments made by people, allowing fact-based improvements to be identified, implemented and measured. It allows those leading Six Sigma projects without continuous data to measure the quality of that data and boosts confidence in the capability of the system and in the decisions made to improve it.