Statistical Analysis for Decision Making

Pocket Stats, Part 3: Sample Size for Pass-Fail Tests

Contributor: Andy Sleeper
Posted: 10/07/2009

The meeting is tense. After a splashy release, users have complained about the uPod music player. Vibration from jogging or dancing tends to lock up the uPod, requiring the user to remove and replace the battery. Bloggers have been ruthless, and this product is fast becoming a stinker for Durian Computer Company.

The engineering team has duplicated the problem on a vibration table, and now they propose a software patch to fix it. Before releasing the patch to the world, the company must verify that the problem is solved. How many uPods need to be tested on the vibration table to provide high confidence that the problem is fixed? All eyes turn to you. Since you are fresh from Six Sigma Black Belt training, the team expects you to have the answer to this question.

"First, I have a question for you," you say. "We can’t prove that the problem is completely solved without testing every unit we sell, but we can come close. What percentage of uPods would be an acceptably small number to still have this vibration problem?" After some discussion, 1 percent seems reasonable, and this would represent a huge improvement over the estimated 10 percent that now have the problem.

Without touching your calculator, you announce: "If we test 300 on the vibration table, and all pass, this will give us 95 percent confidence that the failure rate is less than 1 percent."

Little do they know that you gained this remarkable ability not from your expensive Black Belt training, but from an article you read on the Internet!

First, I’ll give you the exact formula, and then the Pocket Stats version you can memorize.

Exact Version of the Formula

Here is the formula to calculate the required sample size for pass-fail tests, assuming zero failures:

    n = ln(1 - C%/100%) / ln(1 - p)

In this formula, C% is the confidence level, expressed as a percentage. Dividing this by 100 percent converts the confidence into a number between 0 and 1. Also, p is the probability of defective units that you want high confidence of detecting, expressed as a number between 0 and 1. The sample size n is the minimum number of units that must be tested with zero failures. Since n is usually not an integer, round up the results of the calculation. Using the example where p = 0.01 and C% = 95 percent, n = 298.07, which rounds up to 299.
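The calculation is easy to script. Here is a minimal Python sketch, assuming the zero-failure formula n = ln(1 - C%/100%) / ln(1 - p), which reproduces the n = 298.07 quoted in the example. The formula comes from requiring that the chance of n straight passes at defect rate p, (1 - p)^n, be no larger than 1 - C%/100%. The function name and the choice to pass confidence as a fraction rather than a percentage are illustrative assumptions, not part of the original article.

```python
import math


def zero_failure_sample_size(p: float, confidence: float) -> int:
    """Minimum units to test, all passing, for the stated confidence.

    p          -- defect rate to rule out, strictly between 0 and 1
    confidence -- confidence level as a fraction (0.95 means 95 percent)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # n = ln(1 - C) / ln(1 - p), rounded up to a whole unit
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # The uPod example: p = 0.01 at 95 percent confidence
    print(zero_failure_sample_size(0.01, 0.95))  # 299
```

Passing confidence as a fraction keeps the code free of the divide-by-100 step; a percentage-based wrapper would work just as well.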

The original article included a table listing the sample size required for several situations; the table image is not reproduced here.
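A table of this kind can be regenerated from the formula. The sketch below is one way to do it in Python. The grid of defect rates and confidence levels is a hypothetical choice for illustration and is not taken from the article's table, and the helper names are likewise assumptions.

```python
import math

# Hypothetical grid; these values are illustrative, not the article's table.
DEFECT_RATES = (0.10, 0.05, 0.02, 0.01, 0.001)
CONFIDENCE_LEVELS = (0.90, 0.95, 0.99)


def sample_size(p, confidence):
    """Zero-failure sample size: ln(1 - C) / ln(1 - p), rounded up."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


def build_table():
    """Map each (defect rate, confidence) pair to its required sample size."""
    return {
        (p, c): sample_size(p, c)
        for p in DEFECT_RATES
        for c in CONFIDENCE_LEVELS
    }


def format_table(table):
    """Render the table as plain text, one row per defect rate."""
    header = "p".ljust(8) + "".join(f"{c:>8.0%}" for c in CONFIDENCE_LEVELS)
    rows = [header]
    for p in DEFECT_RATES:
        cells = "".join(f"{table[(p, c)]:>8d}" for c in CONFIDENCE_LEVELS)
        rows.append(f"{p:<8g}" + cells)
    return "\n".join(rows)


if __name__ == "__main__":
    print(format_table(build_table()))
```

Swap in whatever defect rates and confidence levels matter for your own product; the only number here that can be checked against the article is the 299 for p = 0.01 at 95 percent.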

A common-sense solution would be to test 100 units to look for a 1 percent defective rate. Since 1 percent = 0.01, and 1/0.01 = 100, it makes some sense that testing 100 might be enough. But according to the formula, this common-sense solution provides only 63.4 percent confidence. If you test 100 units and have zero failures, you still have a 36.6 percent probability that the failure rate could be larger than 1 percent.
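To check a proposed test plan, the formula can be turned around to give the confidence delivered by n passing units. This is a small Python sketch under the same zero-failure assumption; the function name is hypothetical.

```python
def zero_failure_confidence(n: int, p: float) -> float:
    """Confidence that the defect rate is below p after n units pass with zero failures.

    If the true defect rate were p, all n units would pass with probability
    (1 - p) ** n; the confidence is the complement of that probability.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # The "common sense" plan: 100 units for a 1 percent defect rate
    print(f"{zero_failure_confidence(100, 0.01):.1%}")
    # The plan from the formula: 299 units
    print(f"{zero_failure_confidence(299, 0.01):.1%}")
```

With n = 100 and p = 0.01 the result is about 63 percent, and 299 is the first sample size that reaches 95 percent.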

Pocket Stats Version of the Formula

Here is the Pocket Stats version of this formula that you can memorize:

First calculate 1/p in your head. For round numbers, this is often easy.

  • If you test 1/p units with zero failures, you have 63 percent confidence.
  • If you test 2 × (1/p) units with zero failures, you have 86 percent confidence.
  • If you test 3 × (1/p) units with zero failures, you have 95 percent confidence.
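Why do the multipliers 1, 2, and 3 map to 63, 86, and 95 percent? The article does not spell this out, but a short derivation shows it, assuming p is small enough that ln(1 - p) is close to -p. Here k is a symbol introduced for the multiplier and C is the confidence as a fraction.

```latex
\begin{align*}
C &= 1 - (1 - p)^{n}
  && \text{confidence after } n \text{ passes with zero failures} \\
(1 - p)^{k/p} &= e^{(k/p)\,\ln(1 - p)} \approx e^{-k}
  && \text{using } \ln(1 - p) \approx -p \text{ for small } p \\
C &\approx 1 - e^{-k}
  && \text{when } n = k \times (1/p)
\end{align*}

\begin{align*}
k = 1 &: \quad 1 - e^{-1} \approx 0.632 \\
k = 2 &: \quad 1 - e^{-2} \approx 0.865 \\
k = 3 &: \quad 1 - e^{-3} \approx 0.950
\end{align*}
```

Because ln(1 - p) is always slightly more negative than -p, the true confidence at n = k × (1/p) is a little higher than 1 - e^{-k}, so the rule errs on the safe side.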

The Pocket Stats version will often require one more unit than the exact formula. In the example, Pocket Stats indicates 300, when the exact answer is 299. This is a conservative or safe approximation.
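To see how close the shortcut stays to the exact answer at other defect rates, the two can be computed side by side. This Python sketch compares the 3 × (1/p) rule with the exact formula at 95 percent confidence. The list of defect rates and the function names are hypothetical choices for illustration.

```python
import math


def exact_sample_size(p, confidence=0.95):
    """Exact zero-failure sample size: ln(1 - C) / ln(1 - p), rounded up."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


def pocket_sample_size(p, multiplier=3):
    """Pocket rule: multiplier x (1/p), rounded up to a whole unit.

    The tiny offset keeps floating-point noise from pushing an exact
    whole number (such as 300) up to the next integer.
    """
    return math.ceil(multiplier / p - 1e-9)


def compare(defect_rates=(0.10, 0.05, 0.02, 0.01, 0.001)):
    """Return (p, exact n, pocket n) for each defect rate."""
    return [(p, exact_sample_size(p), pocket_sample_size(p)) for p in defect_rates]


if __name__ == "__main__":
    for p, exact, pocket in compare():
        print(f"p={p:<6g} exact={exact:<6d} pocket={pocket:<6d} extra={pocket - exact}")
```

For p = 0.01 this gives the 299 versus 300 from the example. The pocket figure is never smaller than the exact one, and the gap widens somewhat as p gets smaller.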

This formula also applies to incoming inspection. In this field, Acceptable Quality Limit (AQL) refers to the highest probability of defective units considered to be acceptable. This is p in the above formula. If AQL is 2 percent, 1/AQL = 50. Testing a sample of n = 2 × 50 = 100 units with zero failures provides approximately 86 percent confidence that the AQL is satisfied, with a 14 percent risk that the defective rate is greater than 2 percent.

In some circles, 1-p has been called "reliability." Using the initial example, testing 299 with zero failures provides 95 percent confidence of 99 percent reliability. I find this terminology confusing, because "reliability" means different things in different situations. But I admit that "reliability" is simpler than "one minus the probability of defective units."

Pocket Stats Makes Calculating Sample Size for Pass-Fail Tests Easy

Sample size problems are often very difficult. But in this case, with pass-fail tests, a relatively simple formula is available, and the Pocket Stats version is easy enough to memorize. In my career, few formulas have been as useful as this one.

Now go be the hero in your next meeting!
