Topic: Estimate failure rate: Variable degree of freedom in chi-square
Replies: 1   Last Post: Mar 17, 2013 7:50 AM

David Jones

Re: Estimate failure rate: Variable degree of freedom in chi-square
Posted: Mar 17, 2013 7:50 AM



"Paul" wrote in message
news:c1b90f4f-56b8-43e1-8327-523d3d0c50d7@m12g2000yqp.googlegroups.com...

I've found conflicting information about the degrees of freedom to use
in the chi-square distribution when estimating failure rate from the
number of failures seen over a specified period of time. To be sure,
the lower MTBF (upper failure rate) always uses 2n+2, where n is the
number of failures. However, the upper MTBF (lower failure rate) is
shown as using both 2n and 2n+2, depending on the source. I haven't
found an online explanation of exactly how the chi-square distribution
enters into the calculation (other than
http://www.weibull.com/hotwire/issue116/relbasics116.htm,
which I'm still chewing on). So I haven't been able to determine
whether 2n or 2n+2 is correct from first principles at this point.
Based on the reasoning in the above weibull.com page, however, I am
inclined to believe that the degrees of freedom should be 2n because
we're talking about the two tails of the *same* distribution for upper
and lower limits. But this leaves the mystery of why 2n+2 shows up
frequently. Is the reason for this straightforward enough to explain
via this newsgroup?

==============================================================

At first sight the use of 2n+2 appears wrong. However, there seem to be two
valid approaches:

(1) treat the problem as having observed n inter-failure times from an
exponential distribution. The last interval (after the last failure) also
has an exponential distribution, but is not independent of the rest, since
the sum of the intervals must be the total "specified period of time".

(2) treat the problem as having observed n, where n is a realisation of a
Poisson random variable.

These lead to two different estimates of the rate: the first based on the
time to the last failure, the second on the total observation period. There
are then two different confidence intervals.

The first leads to using a chi-squared distribution with 2n degrees of
freedom, because you have something proportional to the sum of n
realisations from a chi-squared distribution with 2 degrees of freedom.

The second can also lead to using a chi-squared distribution, but for an
indirect reason resulting from the relation between the cumulative
distribution functions of the Poisson and chi-squared distributions. It may
be that this gives a result using 2n+2 degrees of freedom. However
http://en.wikipedia.org/wiki/Poisson_distribution#Confidence_interval
indicates that the upper and lower limits in this approach would use
different degrees of freedom: 2n for the lower limit on the rate and 2n+2
for the upper limit.
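
As a rough illustration of approach (2), the sketch below computes the
two-sided limits on the failure rate directly from the Poisson CDF, using
the identity P(chi2 with 2k d.f. > x) = P(Poisson(x/2) <= k-1). The function
names, the bisection search and the default alpha = 0.05 are my own
assumptions for the example, not anything taken from the pages cited above.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu), summed term by term."""
    if n < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for j in range(1, n + 1):
        term *= mu / j
        total += term
    return total

def chi2_sf_even(x, dof):
    """P(chi2_dof > x) for even dof, via the Poisson/chi-squared identity."""
    if dof < 2 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    return poisson_cdf(dof // 2 - 1, x / 2.0)

def chi2_ppf_even(p, dof):
    """Quantile x with P(chi2_dof <= x) = p, found by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_sf_even(hi, dof) < p:  # expand until hi brackets p
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_sf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rate_interval(n, T, alpha=0.05):
    """Two-sided limits on the failure rate: n failures in total time T.

    Lower limit uses 2n d.f., upper limit uses 2n+2 d.f.
    With n = 0 the lower limit is taken as 0.
    """
    lower = 0.0 if n == 0 else chi2_ppf_even(alpha / 2.0, 2 * n) / (2.0 * T)
    upper = chi2_ppf_even(1.0 - alpha / 2.0, 2 * n + 2) / (2.0 * T)
    return lower, upper

if __name__ == "__main__":
    lo, hi = rate_interval(5, 100.0)
    print("rate limits:", lo, hi)
    print("MTBF limits:", 1.0 / hi, 1.0 / lo)
```

Written this way, the 2n and 2n+2 come from the two tail conditions
P(N >= n) = alpha/2 and P(N <= n) = alpha/2: the first involves Poisson
terms up to n-1, the second terms up to n.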

It seems likely that approach (2) is using more statistical information than
approach (1) and hence that the basic estimate in (2) is to be preferred.
The fact that the result in approach (2) apparently uses two different
degrees of freedom might be thought of as just a mathematical artefact, or
as a result of using n+1 exponential observations (in the form of the total
interval length) that are not independent, as an adjustment from the result
for n independent observations.

However, approach (1) is clearly an invalid representation of the
experiment, as n is not fixed, although it might still give a valid
inference if one argues conditionally on the observed n. Approach (2) is
able to give a reasonable answer for the case where n=0. There are
presumably alternative ways of treating the problem that are applicable when
the failure time distribution is not exponential, but these must lead to
different results, as n would then not be Poisson.
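
To make the n=0 remark concrete, here is a minimal sketch under approach
(2). With no failures in time T, the upper limit on the rate solves
exp(-rate*T) = alpha/2, which is the 2n+2 = 2 degrees of freedom case in
closed form. The function names and the two-sided convention (alpha/2 in
the tail) are assumptions made for the example; a one-sided limit would use
alpha instead.

```python
import math

def zero_failure_upper_rate(T, alpha=0.05):
    """Upper limit on the failure rate when no failures are seen in time T.

    Assumes a two-sided interval, so alpha/2 goes in the upper tail.
    """
    return -math.log(alpha / 2.0) / T

def zero_failure_lower_mtbf(T, alpha=0.05):
    """Corresponding lower limit on the MTBF."""
    return 1.0 / zero_failure_upper_rate(T, alpha)

if __name__ == "__main__":
    T = 1000.0  # hypothetical total observation time
    print("upper rate limit:", zero_failure_upper_rate(T))
    print("lower MTBF limit:", zero_failure_lower_mtbf(T))
```

Approach (1) has nothing comparable to offer here, since with n=0 there is
no "time to the last failure" to base an estimate on.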

David Jones
