Sampling

Statistical methods are used to describe populations by using samples. If the population of interest is small enough, statistical methods are not required; every member of the population can be measured. If every member of the population has been measured, a confidence interval for the mean is not necessary, because the mean is known with certainty (ignoring measurement error).

For statistical methods to be valid, samples must be chosen randomly. Suppose the time to fail for 60-watt light bulbs is required to be greater than 100 hours, and the customer verifies this each month by testing 20 light bulbs. If the first 20 light bulbs produced each month are tested, any inference about the reliability of the light bulbs produced during the entire month would be invalid, because the 20 light bulbs tested do not represent the entire month of production. For a sample to be random, every member of the population must have an equal chance of being selected for testing.
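A minimal sketch of drawing such a sample, assuming each light bulb produced in the month carries a serial number. The function name draw_random_sample and the month's output of 5,000 bulbs are hypothetical, chosen only for illustration.

```python
import random

def draw_random_sample(serial_numbers, sample_size, seed=None):
    """Return sample_size serial numbers chosen without replacement.

    random.Random.sample gives every member of the population the same
    chance of being selected.
    """
    rng = random.Random(seed)
    return rng.sample(list(serial_numbers), sample_size)

# Hypothetical month of production: bulbs numbered 1 through 5000.
month_production = range(1, 5001)

# Valid scheme: 20 bulbs drawn at random from the whole month.
test_units = draw_random_sample(month_production, 20, seed=1)

# Invalid scheme from the text: the first 20 bulbs produced.
first_twenty = list(month_production)[:20]
```

The seed argument is there only to make a particular draw repeatable for auditing; leaving it as None gives a different sample on each run.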

Now suppose that 20 light bulbs, randomly selected from an entire month's production, are placed on a test stand, and the test is ended after 10 of the light bulbs fail. An example of what these test data might look like is given in the table below.

23 hours	63 hours	90 hours
39 hours	72 hours	96 hours
41 hours	79 hours
58 hours	83 hours

Ten units survived for 96 hours without failing.

What is the average time to fail for the light bulbs? It is not the average time to fail of the 10 failed light bulbs: if testing were continued until all 20 light bulbs failed, ten data points would be added to the data set, none of them smaller than any of the ten already recorded. The initial sample of 20 items is a random sample representative of the population, but the ten items that failed are NOT. Ending the test after 10 failures destroys the randomness of the sample, because it systematically selects the 10 items with the smallest times to fail.
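The size of the bias can be seen with a short calculation on the tabulated data. This is only a sketch: it compares the average of the 10 observed failures with a lower bound on the average of all 20 units, obtained by treating each survivor as if it had failed at the moment the test ended.

```python
# Failure times (hours) of the 10 light bulbs that failed, from the table above.
failure_times = [23, 39, 41, 58, 63, 72, 79, 83, 90, 96]
n_survivors = 10    # units still running when the test was ended
censor_time = 96    # hours accumulated by each survivor

# Average of the failed units only (the biased figure).
naive_mean = sum(failure_times) / len(failure_times)

# Every survivor would have failed at 96 hours or later, so substituting
# 96 for each unknown failure time gives a lower bound on the sample mean.
lower_bound = (sum(failure_times) + n_survivors * censor_time) / (
    len(failure_times) + n_survivors
)

print(f"mean of failed units only:      {naive_mean:.1f} hours")   # 64.4
print(f"lower bound, all 20 test units: {lower_bound:.1f} hours")  # 80.2
```

Even this lower bound exceeds the average of the failed units, so the true average of the 20 sampled light bulbs must be higher still.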

The situation described above is called censoring. Statistical inferences can be made using censored data, but special techniques are required. More specifically, it is right censoring: the time to fail for a portion of the data is not known, but it is known that the time to fail is greater than a given value. Right censored data may be either time censored or failure censored. If testing is ended after a predetermined amount of time, the data are time censored. If testing is ended after a predetermined number of failures, the data are failure censored. The data in the table above are failure censored. Time censoring is also known as Type I censoring, and failure censoring is also known as Type II censoring.
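As one illustration of such a technique, the sketch below assumes that times to fail follow an exponential distribution. That assumption is made here for illustration only and is not stated in the text. Under it, the maximum likelihood estimate of the mean time to fail is the total time accumulated on test by all units, failed or not, divided by the number of failures. The function name is hypothetical.

```python
def exponential_mean_estimate(failure_times, censor_times):
    """Estimate the mean time to fail from right censored data.

    Assumes exponentially distributed times to fail (an illustrative
    assumption). The estimate is total time on test / number of failures.
    """
    if not failure_times:
        raise ValueError("at least one failure is required")
    total_time_on_test = sum(failure_times) + sum(censor_times)
    return total_time_on_test / len(failure_times)

# Light bulb data from the table above: 10 failures, 10 survivors at 96 hours.
failure_times = [23, 39, 41, 58, 63, 72, 79, 83, 90, 96]
censor_times = [96] * 10

estimate = exponential_mean_estimate(failure_times, censor_times)
print(f"estimated mean time to fail: {estimate:.1f} hours")  # 160.4
```

The survivors contribute their 96 hours of running time to the numerator but nothing to the denominator, which is how the censored units pull the estimate above the average of the failed units.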

The opposite of right censoring is left censoring: for a portion of the data the exact value is not known, but it is known that the value is less than a given value. An example of this is an instrument used to measure the concentration of chemicals in solution. If the instrument cannot detect the chemical below a certain concentration, a reading of zero does not mean that no chemical is present, only that the level of chemical is below the detectable level of the instrument.
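A minimal sketch of how a left censored reading might be recorded. The detection limit of 0.5 and the function name record_reading are hypothetical, chosen only for illustration.

```python
DETECTION_LIMIT = 0.5  # hypothetical instrument limit, arbitrary units

def record_reading(true_concentration, limit=DETECTION_LIMIT):
    """Return (value, left_censored) for one measurement.

    Below the detection limit the instrument cannot report the value,
    so the limit itself is stored together with a flag meaning
    "the true value is somewhere below this number".
    """
    if true_concentration < limit:
        return (limit, True)
    return (true_concentration, False)

print(record_reading(0.2))  # (0.5, True): known only to be below 0.5
print(record_reading(1.3))  # (1.3, False): measured exactly
```

Storing the flag alongside the value keeps a censored reading from being mistaken for a measured concentration of 0.5, or for zero.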

Another type of right censoring is multiple censoring. Multiple censoring occurs when items are removed from testing at more than one point in time. Field data are often multiply censored. An example of multiple censoring is given in the table below. A "+" next to a value indicates that the item was removed from testing at that time without failing.

112 hours	172 hours	220 hours
145 + hours	183 hours	225 + hours
151 hours	184 + hours	225 + hours
160 hours	191 hours	225 + hours
166 + hours	199 hours	225 + hours
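One way to hold multiply censored data for analysis is to pair each time with a flag that says whether the item failed. The sketch below parses the entries of the table above into such pairs; the names Observation and parse_observation are hypothetical.

```python
from typing import NamedTuple

class Observation(NamedTuple):
    hours: float
    failed: bool  # False: removed from test without failing ("+")

def parse_observation(text):
    """Parse a table entry such as '151 hours' or '145 + hours'."""
    tokens = text.split()
    return Observation(hours=float(tokens[0]), failed="+" not in tokens)

# Entries from the table above, read column by column.
raw = [
    "112 hours", "145 + hours", "151 hours", "160 hours", "166 + hours",
    "172 hours", "183 hours", "184 + hours", "191 hours", "199 hours",
    "220 hours", "225 + hours", "225 + hours", "225 + hours", "225 + hours",
]
observations = [parse_observation(entry) for entry in raw]

failures = [o.hours for o in observations if o.failed]
removals = [o.hours for o in observations if not o.failed]

print(len(failures), "failures;", len(removals), "items removed without failing")
# 8 failures; 7 items removed without failing
print("distinct removal times:", sorted(set(removals)))
# distinct removal times: [145.0, 166.0, 184.0, 225.0]
```

The four distinct removal times are what make this data set multiply censored rather than censored at a single point.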

There are two types of data commonly used in reliability engineering: continuous and discrete. Continuous variables are unlimited in their degree of precision. For example, a rod may be 5 inches long, or 5.01 inches long, or 5.001 inches long. It is impossible to state that a rod is exactly 5 inches long; it can only be stated that the length of the rod falls within a specific interval. Discrete variables are limited to specific values. For example, if a die is rolled, the result is either 1, 2, 3, 4, 5, or 6. There is no possibility of obtaining any value other than these six.