On Saturday, March 16, 2013 10:27:21 PM UTC-7, Paul wrote:
> I've found conflicting information about the degrees of freedom to use in the chi-square distribution when estimating failure rate from the number of failures seen over a specified period of time. To be sure, the lower MTBF (upper failure rate) always uses 2n+2, where n is the number of failures. However, the upper MTBF (lower failure rate) is shown as using both 2n and 2n+2, depending on the source. I haven't found an online explanation of exactly how the chi-square distribution enters into the calculation (other than http://www.weibull.com/hotwire/issue116/relbasics116.htm, which I'm still chewing on). So I haven't been able to determine whether 2n or 2n+2 is correct from first principles at this point. Based on the reasoning in the above weibull.com page, however, I am inclined to believe that the degrees of freedom should be 2n because we're talking about the two tails of the *same* distribution for upper and lower limits. But this leaves the mystery of why 2n+2 shows up frequently. Is the reason for this straightforward enough to explain via this newsgroup?
On Mar 17, 7:50 am, "David Jones" wrote:
> The use of 2n+2 appears wrong at first sight.
>
> At first sight, there seem to be two valid approaches:
>
> (1) treat the problem as having observed n observations from an exponential distribution. The last interval (after the last failure) also has an exponential distribution, but is not independent of the rest .... since the sum of the intervals must be the total "specified period of time".
>
> (2) treat the problem as having observed n, where n is a realisation of a Poisson random variable.
>
> These lead to two different estimates of the rate ... the first based on the time to the last failure, the second on the total observation period. There are then two different confidence intervals... the first leads to using a chi-squared distribution with 2n degrees of freedom because you have something proportional to the sum of n realisations from a chi-squared distribution with 2 degrees of freedom ... the second can also lead to using a chi-squared distribution but for an indirect reason resulting from the relation between the cumulative distribution functions of the Poisson and chi-squared distributions ... it may be that this gives a result using 2n+2 degrees of freedom. However, http://en.wikipedia.org/wiki/Poisson_distribution#Confidence_interval indicates that the upper and lower limits in this approach would use different degrees of freedom, 2n and 2n+2.
>
> It seems likely that approach (2) is using more statistical information than approach (1) and hence that the basic estimate in (2) is to be preferred. The fact that the result in approach (2) apparently uses 2 different degrees of freedom might be thought of as just a mathematical artefact, or as a result of using n+1 exponential observations (in the form of the total interval length) that are not independent (as an adjustment from the result for n independent observations).
>
> However, approach (1) is clearly an invalid representation of the experiment, as n is not fixed, but might give a valid inference arguing conditional on the observed n. Approach (2) is able to give a reasonable answer for the case where n=0. There are presumably alternative ways of treating the problem that are applicable when the failure time distribution is not exponential, but these must lead to different results as n would not be Poisson.
On Mar 17, 4:44 pm, Ray Vickson wrote:
> I wanted to reply to David Jones's response below, but my browser is forcing me to reply instead to you.
>
> With regard to David's point (1): _given_ n observed outcomes in (0,T) the outcome times are UNIFORMLY distributed in (0,T); that is, the individual arrival times are the n order statistics of the distribution U(0,T). Therefore, the observed inter-arrival times are NOT exponential and are NOT independent. (This is a fundamental and well-known property of Poisson processes.)
Dave, Ray, thanks for that.
I am indeed dealing with a fixed observation time, so it is a Poisson process. Actually, since I'm only looking at the number of events in that time interval and I don't care when they occur, I like the wording of calling it a Poisson Random Variable (RV).
I guess I will have to find a textbook that derives this explicitly in order to understand the reason for 2n in the lower bound of lambda versus 2n+2 in the upper bound. I suspect that the reason is intimately related to an earlier question that I posted about the reasoning behind summing up the Poisson probabilities for 0 to r events (where r is the number of events observed) as a basis for calculating the upper bound on lambda. But the observation that the total observation interval serves as an additional item of independent information is intriguing.
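For reference, the 2n versus 2n+2 split can be sketched directly from the Poisson sums mentioned above. This is the standard "exact" construction, written out under some assumptions that are not in the thread: N is Poisson with mean mu = lambda*T, n is the observed count, alpha is the total two-sided error probability, and the quantile notation is introduced here only for illustration.

```latex
% Assumed notation (not from the thread):
%   N ~ Poisson(\mu), \mu = \lambda T, n = observed count,
%   X_\nu = a chi-square random variable with \nu degrees of freedom,
%   q_p(\nu) = the lower-tail p-quantile of X_\nu, i.e. P(X_\nu \le q_p(\nu)) = p.
\begin{gather*}
% Poisson / chi-square identity (valid for integer k >= 0):
P(N \le k;\ \mu) \;=\; \sum_{j=0}^{k} \frac{e^{-\mu}\mu^{j}}{j!} \;=\; P\bigl(X_{2(k+1)} > 2\mu\bigr) \\[6pt]
% Upper limit: the sum runs over 0..n, i.e. n+1 terms, so k = n.
P(N \le n;\ \mu_U) = \tfrac{\alpha}{2}
  \;\Longrightarrow\; 2\mu_U = q_{1-\alpha/2}(2n+2) \\[6pt]
% Lower limit: the tail n..infinity is the complement of 0..n-1, so k = n-1.
P(N \ge n;\ \mu_L) = \tfrac{\alpha}{2}
  \;\Longleftrightarrow\; P(N \le n-1;\ \mu_L) = 1-\tfrac{\alpha}{2}
  \;\Longrightarrow\; 2\mu_L = q_{\alpha/2}(2n) \\[6pt]
\lambda_L = \frac{\mu_L}{T}, \qquad \lambda_U = \frac{\mu_U}{T}
\end{gather*}
```

Read this way, the two degrees-of-freedom values come from the same Poisson distribution: the upper limit sums the n+1 terms for 0 to n events, while the lower limit uses the complement of the n terms for 0 to n-1 events. For n = 0 the lower-limit equation has no solution other than mu_L = 0, which matches the remark that only an upper bound is available with zero failures.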
This matters (at least in my mind) because of the special case of zero failures. The white paper http://microblog.routed.net/wp-content/uploads/2006/08/calculating_mttf_with_zero_failures.pdf says that one can only calculate the upper bound. That is, you can determine an upper bound on lambda with a confidence of (say) 95%, meaning that in the chi-square distribution with 2n+2 degrees of freedom (2 degrees of freedom for zero failures), the area under the right tail is only 5%. The white paper (and other sources I've seen) says that we can abuse this process and seek an upper bound with 50% confidence as a median of sorts, since it's not easy to calculate the mean in the case of zero failures.
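As a small illustration of the zero-failure case, the sketch below computes the one-sided upper bound on lambda in closed form. It assumes the exponential/Poisson model discussed in this thread; the function name `zero_failure_upper_rate` and the 1000-hour test duration are made up for the example. It relies on the fact that a chi-square distribution with 2 degrees of freedom has CDF 1 - exp(-x/2), so its quantile can be written without any statistics library.

```python
import math


def zero_failure_upper_rate(confidence, total_time):
    """One-sided upper bound on the failure rate after zero failures.

    With n = 0 the bound uses a chi-square with 2n + 2 = 2 degrees of
    freedom, whose p-quantile is -2 * ln(1 - p). Dividing by 2 * T gives
    lambda_U = -ln(1 - confidence) / T, the same value obtained by
    solving P(N = 0) = exp(-lambda * T) = 1 - confidence.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if total_time <= 0.0:
        raise ValueError("total_time must be positive")
    chi2_quantile_2df = -2.0 * math.log(1.0 - confidence)
    return chi2_quantile_2df / (2.0 * total_time)


# Hypothetical test: 1000 unit-hours accumulated with no failures.
hours = 1000.0
for conf in (0.95, 0.50):
    lam = zero_failure_upper_rate(conf, hours)
    print(f"{conf:.0%} upper bound: lambda <= {lam:.6f} per hour "
          f"(MTBF >= {1.0 / lam:.1f} hours)")
```

At 50% confidence the bound reduces to ln(2)/T, which is the "median of sorts" the white paper describes.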
But if we really can do that, then it should be equally valid in the case with nonzero failures, which means both upper and lower bounds exist, expressed using the Gamma function. This seems to be a contradiction, because the upper and lower bounds use different Gamma functions. Hence, the 50%-confidence lower bound will not coincide with the 50%-confidence upper bound. In fact, it seems logical for the two 50% confidence bounds to be equal regardless of whether they were going to be used as a replacement for the median.
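The mismatch between the two 50% bounds can be checked numerically. The sketch below is only an illustration under the Poisson model from this thread; the helper names (`poisson_cdf`, `solve_mean`, `one_sided_bounds`) and the example of n = 3 failures in unit time are assumptions for the demonstration. It solves the Poisson tail equations by bisection instead of calling a chi-square quantile function, which is equivalent for the even degrees of freedom involved here.

```python
import math


def poisson_cdf(k, mu):
    """Return P(N <= k) for N ~ Poisson(mu); 0.0 when k is negative."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for j in range(1, k + 1):
        term *= mu / j
        total += term
    return total


def solve_mean(k, target):
    """Find mu such that P(N <= k; mu) = target, using bisection.

    For fixed k >= 0 the Poisson CDF decreases as mu grows, so the
    root is bracketed between 0 and a generously large upper limit.
    """
    lo, hi = 0.0, 10.0 * (k + 1) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def one_sided_bounds(n, total_time, confidence):
    """Return (lower, upper) one-sided bounds on lambda, each at `confidence`.

    upper: P(N <= n;     mu_U) = 1 - confidence  (2n + 2 degrees of freedom)
    lower: P(N <= n - 1; mu_L) = confidence      (2n degrees of freedom)
    """
    upper = solve_mean(n, 1.0 - confidence) / total_time
    lower = solve_mean(n - 1, confidence) / total_time if n > 0 else 0.0
    return lower, upper


# Hypothetical example: 3 failures observed in 1 unit of time.
lower, upper = one_sided_bounds(3, 1.0, 0.5)
print(f"50% lower bound: {lower:.3f}, 50% upper bound: {upper:.3f}")
```

For this example the two 50% bounds come out at roughly 2.67 and 3.67, on either side of the point estimate n/T = 3. One way to read the gap is that both tail sums include the probability of exactly n events, so the discreteness of the count keeps the two bounds from meeting in a single median-like value.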