Search All of the Math Forum:

Views expressed in these public forums are not endorsed by NCTM or The Math Forum.

Topic: Computing tables for use with Peirce's Criterion - in 1855 and 2008
Replies: 3   Last Post: Nov 21, 2012 9:28 AM

 Messages: [ Previous | Next ]
 sales@kt-algorithms.com Posts: 78 Registered: 1/29/05
Computing tables for use with Peirce's Criterion - in 1855 and 2008
Posted: Oct 5, 2008 9:41 AM

In 1852 Peirce described his criterion for identifying outliers in a
sample from an otherwise normal distribution (1).
In 1855 Gould, having being applying the criterion professionally,
published a set of tables of critical deviations together with a
description of the algorithm he had used to calculate them (2). The
calculations were done manually, aided only by log-tables and some
special function tables he managed to get his hands on!

Gould's first two tables list the (squares of the) critical deviation
x as a function of N, the number of values in the sample, and k, the
number of outliers being tested for. These tables differ with respect
the value for m, the 'number of unknown quantities', set to 1 and 2,
respectively.

Below I have tried to be consistent with Gould's notation.
Here are the exceptions:

Gould Here
--------------------------------
Lower case n k
psi x erfc(x/sqrt(2))
--------------------------------

Next, two equations are formulated from which the critical deviation
can be found (x>0).

Gould's equation D can be stated:

R = R1(x) = exp((x^2-1)/2) erfc(x/sqrt(2))

Defining (Gould's equation B):

Q^N = k^k (N-k)^(N-k) / N^N

Gould's equation A' can be combined with the unlabeled one just above
equation C into:

R = R2(x) = Q^(N/k) [(N-m-k x^2)/(N-m-k)]^(-(N-k)/(2k))

Now a numerical solution for x can be found after equating R1 and R2.

Gould's algorithm was based on the following pseudoconstant strategy
(3), the initial value for R being my own suggestion:

R = 0.25
<Find x from: R2(x)=R>
repeat
R = R1(x)
oldx = x
<Find x from: R2(x)=R>
until abs(x-oldx)<Eps

Although this works quite well on a computer (with some extra
termination logistics), much faster convergence can be achieved
through root-finding by Newton's Method.
I currently use the algorithm below (suitable for double precision
floating point). Some loop-invariant code have been kept explicit for

----------------------------------------------------------------------
function PeirceDeviation(N,k,m)
// Peirce's critical deviation for N samples, k potential outliers and
m
// 'unknown quantities' (usually 1).

if N-m-k<=0 then return(0) // result undefined

LnQN = k*Ln(k) + (N-k)*Ln(N-k) - N*Ln(N)

x = 1

repeat
x = min(x, sqrt((N-m)/k) - 1e-10)

// R1(x) and R2(x)
R1 = exp((x*x-1)/2)*erfc(x/sqrt(2))
R2 = exp( (LnQN - 0.5*(N-k)*ln((N-m-k*x*x)/(N-m-k)))/k )

// R1'(x) and R2'(x)
R1d = x*R1 - sqrt(2/pi/exp(1))
R2d = x*(N-k)/(N-m-k*x*x)*R2

// Newton
oldx = x
x = oldx - (R1-R2)/(R1d-R2d)
until abs(x-oldx) < N * 2E-16

return(x)
----------------------------------------------------------------------

Gould did not list x-values below 1, and when glancing down the tables
along the edge of the filled-in cells, one is tempted to ask if
something mathematically interesting can be said about the special
case x=1.
For that purpose we shall accept non-integer values for N and/or k.

Setting R1(1)=R2(1) we get:

erfc(1/sqrt(2)) = Q^(N/k) = k [(N-k)^(N-k) / N^N]^(1/k)

or, defining the ratio C = k/N:

erfc(1/sqrt(2)) = C (1-C)^(1/C-1)

So, x=1 implies a fixed k/N ratio, which does not depend on m.
C must be estimated numerically; here it is to 30 decimal places:

C = 0.58974 36745 10713 07248 06181 37574 ...

which just happens to be rather close to the simple fraction:

23/39 = 0.58974 35897 43589 ...

1. Peirce, B.: Criterion for the rejection of doubtful observations.
Astronomical Journal, vol. 2, iss. 45, p. 161-163 (1852).

2. Gould, B.A.: On Peirce's Criterion for the Rejection of Doubtful
Observations,
with tables for facilitating its application.
Astronomical Journal, vol. 4, iss. 83, p. 81-87 (1855).

3. Acton, F.S.: Real Computing Made Real: Preventing Errors in
Scientific and
Engineering Calculations. Princeton University Press (1996).

Best regards,

Knud Thomsen
Geologist, Denmark

Date Subject Author
10/5/08 sales@kt-algorithms.com
7/31/12 chris dardis
11/21/12 ninovanhooff@gmail.com
11/21/12 ninovanhooff@gmail.com