Ask Dr. Math - Questions and Answers from our Archives

Calculating Percentile Rank

Date: 04/30/2009 at 20:18:30
From: Susan
Subject: Percentile rank 

Percentile rank means the percentage of scores that fall "at or below"
a certain number.  If more than one data value matches the number, why
do we count only half of the matching values when calculating the
percentile rank?  For example:  10, 11, 12, 12, 12, 12, 15, 18, 19, 20.
Why is the percentile rank of 12 calculated as 4/10 instead of 6/10,
since there are 6 data values that fall "at or below" 12?



Date: 05/01/2009 at 13:03:44
From: Doctor Peterson
Subject: Re: Percentile rank

Hi, Susan.

Percentile is not always defined exactly the same way; there are 
some tricky details, especially when you want to apply the concept 
to a small "toy" data set like this one.  In real life, you would 
apply it to, say, 30,000 scores on a standardized test, and this 
sort of problem goes away.

I'm not familiar with the specific rule you are using, but I did find
it online.  There are actually two different concepts to think about.
First, consider the following article:

  Wikipedia: Percentile
    http://en.wikipedia.org/wiki/Percentile 

That discusses percentile in the sense of "what value is at the nth 
percentile (where n is a whole number)?"  This gives 99 points that 
divide a large data set into 100 equal parts, so that any value 
between the pth and the (p+1)th of those points is considered to be 
"in" the pth percentile.  The adjustments in the definitions are 
needed to deal with cases where N, the number of data values, is not 
a multiple of 100, so that the calculations do not point to 
individual values.
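
As a rough illustration of those 99 dividing points, here is a small 
Python sketch.  The data set is made up for the example, and 
statistics.quantiles uses just one of the several conventions the 
article describes (its default "exclusive" method), so other tools may 
give slightly different cut points.

```python
import statistics

# Hypothetical data set for illustration: 1000 "scores" from 1 to 1000.
scores = list(range(1, 1001))

# n=100 asks for the cut points that split the data into 100 equal
# groups; there are 99 of them.
cut_points = statistics.quantiles(scores, n=100)

print(len(cut_points))   # number of dividing points
print(cut_points[0])     # the value at the 1st percentile
print(cut_points[49])    # the value at the 50th percentile
```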

What you are asking about is percentile rank, which is somewhat 
different from that; it asks "at what percentile (again, a whole 
number) is this value?"  Here the problem with a small data set (or a 
large set with few possible values) is that the same value may 
appear in more than one "percentile" in the above sense.  We have to 
decide which one we should use--the first? the last? the middle?

The following article gives your definition in symbolic form without 
further explanation, and contrary to its earlier definition in words:

  Wikipedia: Percentile Rank
    http://en.wikipedia.org/wiki/Percentile_rank 

  cf_l + 0.5 f_i
  -------------- * 100%
         N

There cf_l is the number of scores lower than the score of interest, 
f_i is the number of scores equal to the score of interest, and N is 
the total number of scores.  So you are counting all scores below, 
and half the scores at, the given value in finding the percentage.
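
To make the formula concrete, here is a minimal Python sketch of it.  
The function name percentile_rank and the variable names are my own 
choices for illustration, not part of any standard library; the data 
is the list from your question.

```python
def percentile_rank(scores, value):
    """Percentile rank with the 'count half of the ties' rule."""
    n = len(scores)                               # N
    if n == 0:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < value)   # cf_l
    equal = sum(1 for s in scores if s == value)  # f_i
    return 100 * (below + 0.5 * equal) / n


scores = [10, 11, 12, 12, 12, 12, 15, 18, 19, 20]
print(percentile_rank(scores, 12))   # (2 + 0.5 * 4) / 10 -> 40.0
```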

This definition makes good sense to me.  Basically, they don't want 
to be biased toward either the first data point with the given value
(the fraction of values BELOW 12, namely 2/10 = 20%) or the last (the 
fraction of values AT OR BELOW 12, namely 6/10 = 60%; this can also be 
taken as 100% minus the fraction of values ABOVE 12, which gives 
100% - 40% = 60%).  So they essentially take the average of the two, 
(20% + 60%)/2 = 40%, which is your 4/10.  They are splitting the 
difference between the two possible definitions.

In other words, the MIDDLE of the 12's best represents where the 
12's as a group are "at", better than either the first or the last 
of them.
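
The three candidate definitions can be compared side by side with a 
short Python sketch.  The variable names are only illustrative; the 
data is again the list from your question.

```python
scores = [10, 11, 12, 12, 12, 12, 15, 18, 19, 20]
value = 12
n = len(scores)

below = sum(1 for s in scores if s < value)         # 2 values below 12
at_or_below = sum(1 for s in scores if s <= value)  # 6 values at or below 12

pct_below = 100 * below / n              # "first 12" definition: 20.0
pct_at_or_below = 100 * at_or_below / n  # "last 12" definition: 60.0

# Averaging the two extremes lands in the middle of the 12's.
midpoint = (pct_below + pct_at_or_below) / 2

print(pct_below, pct_at_or_below, midpoint)   # 20.0 60.0 40.0
```

Averaging the two extremes gives the same number as counting all the 
scores below plus half of the scores at the value.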

If you have any further questions, feel free to write back.


- Doctor Peterson, The Math Forum
  http://mathforum.org/dr.math/ 



Date: 05/01/2009 at 16:05:15
From: Susan
Subject: Thank you (Percentile rank)

Dear Dr. Peterson,

Thank you for your very detailed answer to my question regarding
percentile rank.  I have consulted many textbooks regarding
percentile rank, but none of them have explained "why" half of the
repeating values are counted; they simply tell you to count only half
of them.  I am a 9th grade algebra teacher and I like to tell my
students the "why" behind formulas, definitions, etc. because I 
think they are more apt to remember if they understand the "why." I
whole-heartedly appreciate the time and effort you put into responding
to my question (a question that has taunted me and my colleagues for a
long time).

Thank you,

Susan
Associated Topics:
High School Statistics

Ask Dr. Math(TM)
© 1994-2013 The Math Forum
http://mathforum.org/dr.math/