The Math Forum

Ask Dr. Math - Questions and Answers from our Archives

Probability Density Functions

Date: 07/23/2003 at 14:13:15
From: Kevin
Subject: Probability Density Functions

My boss has given me the responsibility of taking a set of data, 
finding the correct distribution that goes with it, and then drawing 
the PDF onto the histogram chart. Unfortunately I'm not sure what 
steps are needed to go from the data to a known distribution. Since 
the data sets change every time, I can't send in an example.

The most frustrating part is that since I don't have a statistics 
background I don't know where to begin. I've heard phrases like 
shape parameter, scale parameter, Pearson Distribution, Goodness of 
Fit test and Method of Moments, but they don't make any sense to me.  
I've done extensive research within textbooks, the Internet, and 
through Profs at the local university; however, I still don't have a 
clue where to begin.

Below are the PDFs for the various distributions.

Exponential: lambda * Exp(-1 * lambda * x)
Gamma: [x^(a - 1) * Exp(-(x/B))] / [Gamma(a) * B^a]
Normal: Exp[(-(x-u)^2) / (2 * (a^2))]/[a * (2 * PI)^0.5]
Rayleigh: [x * Exp(-0.5 * (x / a)^2)] / a^2
Weibull: [a * (x^(a-1)) * Exp(-(x/B)^a)] / (B^a)

For the Normal curve I've learned the various parameters that go into 
the function; however, none of the other curves is as well documented.

What I need is a guiding voice to tell me where to begin and how to 
go about finding out which distribution the data falls under.  
Finding out how to calculate the shape and scale variables where 
applicable would be nice as well.

Date: 07/23/2003 at 15:07:36
From: Doctor George
Subject: Re: Probability Density Functions

Hi Kevin,

Thanks for writing to Doctor Math.

As you suspect, you have a learning curve to climb. The most common 
thing to do is to fit a Pearson distribution to the data. The basic 
method of moments technique compares the first four moments of your 
data with the moments of the curves in the Pearson family.

The third and fourth moments, related to skewness and kurtosis, are 
the basis for determining which member of the Pearson family to use. 
The first and second moments (the mean and variance) tell you how to 
shift and scale the distribution to fit your data.
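
To make the moment comparison concrete, here is a minimal Python 
sketch. It is not part of the original answer. It computes the first 
four sample moments, the measures beta1 (squared skewness) and beta2 
(kurtosis), and Pearson's kappa criterion, which textbooks commonly 
use to choose a member of the family. The function names and the 
tolerance `tol` are illustrative assumptions, and real data will 
rarely land exactly on the boundary types.

```python
import math

def sample_moments(data):
    """Return (mean, variance, beta1, beta2) using 1/n moments.

    Assumes the data are not all identical (variance > 0).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    beta1 = m3 ** 2 / m2 ** 3   # squared skewness
    beta2 = m4 / m2 ** 2        # kurtosis (3 for a normal curve)
    return mean, m2, beta1, beta2

def pearson_kappa(beta1, beta2):
    """Pearson's criterion; infinite on the Type III (gamma) line."""
    denom = 4 * (4 * beta2 - 3 * beta1) * (2 * beta2 - 3 * beta1 - 6)
    if denom == 0:
        return math.inf
    return beta1 * (beta2 + 3) ** 2 / denom

def pearson_type(beta1, beta2, tol=1e-6):
    """Name the Pearson type suggested by beta1 and beta2.

    tol is a hypothetical tolerance for the boundary cases.
    """
    if beta1 < tol:                      # symmetric curves
        if abs(beta2 - 3) < tol:
            return "normal"
        return "II" if beta2 < 3 else "VII"
    if abs(2 * beta2 - 3 * beta1 - 6) < tol:
        return "III"                     # gamma-type curve
    kappa = pearson_kappa(beta1, beta2)
    if kappa < 0:
        return "I"                       # beta-type curve
    if abs(kappa - 1) < tol:
        return "V"
    return "IV" if kappa < 1 else "VI"
```

For example, `pearson_type(*sample_moments(data)[2:])` gives a first 
guess at the type. An exponential curve has beta1 = 4 and beta2 = 9, 
which falls on the Type III line.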

I strongly recommend that you get your hands on the book _Systems of 
Frequency Curves_ by W. P. Elderton and Norman Johnson.

The Rayleigh distribution is a special case of the Weibull (shape 
parameter 2), and neither of them is in the Pearson family. If you 
have a reason to fit those particular distributions, they must be 
fitted separately.
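
The relationship between the two curves can be checked numerically. 
The sketch below, which is an illustration and not part of the 
original answer, codes the Rayleigh and Weibull formulas exactly as 
Kevin wrote them and shows that Rayleigh with parameter a matches 
Weibull with shape 2 and scale a * sqrt(2). It also includes the 
standard maximum-likelihood estimate of the Rayleigh parameter as one 
possible way to fit it directly. The function names are my own.

```python
import math

def rayleigh_pdf(x, a):
    # [x * Exp(-0.5 * (x / a)^2)] / a^2
    return x * math.exp(-0.5 * (x / a) ** 2) / a ** 2

def weibull_pdf(x, a, B):
    # [a * (x^(a-1)) * Exp(-(x/B)^a)] / (B^a)
    return a * x ** (a - 1) * math.exp(-((x / B) ** a)) / B ** a

def rayleigh_scale_mle(data):
    # Maximum-likelihood estimate: a^2 = sum(x^2) / (2 n)
    return math.sqrt(sum(x * x for x in data) / (2 * len(data)))

def same_curve(a, xs=(0.5, 1.0, 2.5)):
    # Compare the two densities at a few sample points.
    return all(
        math.isclose(rayleigh_pdf(x, a), weibull_pdf(x, 2, a * math.sqrt(2)))
        for x in xs
    )
```

Calling `same_curve(1.5)` should return True for any positive 
parameter value.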

The Pearson Type IV is numerically difficult to handle. It is common 
to substitute the Johnson Su distribution for it. You will find it 
in Johnson's book as well.

A key question is what information you hope to gain from the fitted 
distribution. Drawing inferences from the fitted distribution is not 
always as good an idea as it initially seems (though it may make for 
impressive presentations). It may be that the distribution will have 
little to do with your actual data in some critical respect.

There is a substantial amount of code that you will have to write. 
Many of the building blocks are available for free through the NIST 
website. You may also want to construct random number generators for 
the various members of the Pearson family to help when testing.
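
As a small illustration of testing with generated data (a sketch 
under my own assumptions, not the code the answer refers to), the 
Python below draws samples from a Gamma distribution, which is the 
Pearson Type III curve, and then recovers the shape a and scale B 
by the method of moments using a = mean^2 / variance and 
B = variance / mean. It relies on `random.gammavariate` from the 
Python standard library, whose parameters follow the same form as 
Kevin's Gamma formula. The function names, seed, and sample size are 
arbitrary choices.

```python
import random

def gamma_method_of_moments(data):
    """Estimate Gamma (shape, scale) from the sample mean and variance."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    shape = mean ** 2 / var   # a
    scale = var / mean        # B
    return shape, scale

def simulate_gamma(shape, scale, n, seed=0):
    """Draw n Gamma samples with a fixed seed so runs are repeatable."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) for _ in range(n)]

if __name__ == "__main__":
    sample = simulate_gamma(shape=2.0, scale=3.0, n=20000)
    print(gamma_method_of_moments(sample))  # should be near (2.0, 3.0)
```

If the estimates drift far from the values used to generate the 
sample, that points to a bug in the fitting code rather than in the 
data.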

If need be, there are commercial software packages that do this kind 
of task. Whether or not you can integrate with them effectively is 
another issue.

Write again if I can give you more help.

- Doctor George, The Math Forum 
Associated Topics:
College Statistics

Ask Dr. Math™
© 1994- The Math Forum at NCTM. All rights reserved.