
Topic: Is there a way to calculate an average ranking from uneven lists?
Replies: 15   Last Post: Oct 30, 2013 12:18 PM

 Jennifer Murphy Posts: 24 Registered: 2/23/12
Re: Is there a way to calculate an average ranking from uneven lists?
Posted: Oct 30, 2013 12:18 PM

On Sun, 27 Oct 2013 23:46:59 +0000, Ben Bacarisse <ben.usenet@bsb.me.uk>
wrote:

>Jennifer Murphy <JenMurphy@jm.invalid> writes:
>

>> On Sun, 27 Oct 2013 13:36:29 -0600, Virgil <virgil@ligriv.com> wrote:
>>

>>>In article <chpq69prq63kh364qqmphkmqedhgm5ti6h@4ax.com>,
>>> Jennifer Murphy <JenMurphy@jm.invalid> wrote:
>>>

>>>> There are many lists containing rankings of great books. Some are
>>>> limited to a particular genre (historical novels, biographies, science
>>>> fiction). Others are more general. Some are fairly short (50-100 books).
>>>> Others are much longer (1,001 books).
>>>>
>>>> Is there a way to "average" the data from as many of these lists as
>>>> possible to get some sort of composite ranking of all of the books that
>>>> appear in any of the lists?

><snip>
>>>One way to compare rankings when there are different numbers of objects
>>>ranked in different rankings is to scale them all over the same range,
>>>such as from 0% to 100%.
>>>
>>>Thus in all rankings a lowest rank would rank 0% and the highest 100%,
>>>and the middle one, if there were one, would rank 50%.
>>>Four items with no ties would rank 0%, 33 1/3%, 66 2/3% and 100%,
>>>and so on.
>>>
>>>For something of rank r out of n ranks use (r-1)/(n-1) times 100%.

>>
>> In the lists I have, the highest ranking entity is R=1, the lowest is
>> R=N. For that, I think the formula is (N-R)/(N-1). No?

>
>Here's another idea to add to the mix. Some rankings (and I think this
>is one) have the property that the top is more significant than the bottom.
>Anyone who picks a book to be no. 1 should have carefully weighed it up
>against no. 2 and no. 3. But what about no. 1001? How likely is it
>that some tiny alteration in the assessment might make it no. 995 or
>998? And this effect is related to the absolute length of the list, not
>just the relative position within it.
>
>Put it another way: outstanding things stand out. Once you are into the
>more run-of-the-mill, the distinctions become less significant.
>
>As a result, you might consider a negative exponential weighting -- a
>ranking of R is given a value of w^(R-1) with 0 < w < 1. Thus all first
>positions are "worth" 1, all second positions are worth w, and all third
>positions w^2 and so on.
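[A tiny sketch of this weighting scheme; the function name is mine, not from the thread.]

```python
def geometric_weight(R, w=0.96):
    """Negative-exponential weighting: rank R is worth w**(R - 1), with 0 < w < 1."""
    return w ** (R - 1)

# First place is worth 1, second w, third w**2, regardless of list length:
print([round(geometric_weight(R), 4) for R in (1, 2, 3)])

# Deep ranks are worth almost nothing, so the noisy tail of a long list
# contributes very little to a composite score:
print(geometric_weight(1001))
```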

This suggestion, in combination with James Waldby's suggestion to add up
the scores rather than averaging them, looks to be a very good solution.
I've run some preliminary simulations that look great.

The only question I have is whether to use a geometric or an arithmetic
progression for discounting lower-ranking books. I think I agree with
you that the geometric (or exponential) progression does a better job of
emphasizing the greater significance of the higher rankings.
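[Here is a minimal sketch of that combined approach: sum the geometrically discounted scores for a book over every list it appears in. The book lists are made up for illustration; only the scoring rule comes from the thread.]

```python
from collections import defaultdict

def composite_scores(lists, w=0.96):
    """Sum w**(rank - 1) over every list a book appears in (ranks start at 1)."""
    totals = defaultdict(float)
    for ranking in lists:
        for rank, book in enumerate(ranking, start=1):
            totals[book] += w ** (rank - 1)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical lists of different lengths:
lists = [
    ["Middlemarch", "Ulysses", "Moby-Dick"],
    ["Moby-Dick", "Middlemarch"],
    ["Ulysses", "Middlemarch", "Moby-Dick", "Emma"],
]
for book, score in composite_scores(lists):
    print(f"{book}: {score:.3f}")
```

[Summing rather than averaging rewards a book for appearing on many lists, while the discount keeps a string of low rankings from outweighing one high one.]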

Here's some sample data for ranks 1 through 26. The first column is the
raw rank. The next three columns show geometric discounting with F = 0.96.
The last three show arithmetic discounting with the scores reaching zero
at N+1 (here N = 25). I set the geometric discounting factor to 0.96 so
that Book #2 gets the same score under both progressions.

The geometric progression takes its largest absolute discounts at the
top but a constant relative discount. The arithmetic progression takes a
constant absolute discount, which results in an increasing relative
discount, since it comes off a smaller and smaller base.

        Geometric Discount            Arithmetic Discount
Rank   F = 0.96   Diff      %       N = 25    Diff      %
  1     1.000    -.----   -.----    1.000    -.----   -.----
  2     0.960    0.0400   96.00%    0.960    0.0400   96.00%
  3     0.922    0.0384   96.00%    0.920    0.0400   95.83%
  4     0.885    0.0369   96.00%    0.880    0.0400   95.65%
  5     0.849    0.0354   96.00%    0.840    0.0400   95.45%
  6     0.815    0.0340   96.00%    0.800    0.0400   95.24%
  7     0.783    0.0326   96.00%    0.760    0.0400   95.00%
  8     0.751    0.0313   96.00%    0.720    0.0400   94.74%
  9     0.721    0.0301   96.00%    0.680    0.0400   94.44%
 10     0.693    0.0289   96.00%    0.640    0.0400   94.12%
 11     0.665    0.0277   96.00%    0.600    0.0400   93.75%
 12     0.638    0.0266   96.00%    0.560    0.0400   93.33%
 13     0.613    0.0255   96.00%    0.520    0.0400   92.86%
 14     0.588    0.0245   96.00%    0.480    0.0400   92.31%
 15     0.565    0.0235   96.00%    0.440    0.0400   91.67%
 16     0.542    0.0226   96.00%    0.400    0.0400   90.91%
 17     0.520    0.0217   96.00%    0.360    0.0400   90.00%
 18     0.500    0.0208   96.00%    0.320    0.0400   88.89%
 19     0.480    0.0200   96.00%    0.280    0.0400   87.50%
 20     0.460    0.0192   96.00%    0.240    0.0400   85.71%
 21     0.442    0.0184   96.00%    0.200    0.0400   83.33%
 22     0.424    0.0177   96.00%    0.160    0.0400   80.00%
 23     0.407    0.0170   96.00%    0.120    0.0400   75.00%
 24     0.391    0.0163   96.00%    0.080    0.0400   66.67%
 25     0.375    0.0156   96.00%    0.040    0.0400   50.00%
 26     0.360    0.0150   96.00%    0.000    0.0400    0.00%
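[The two score columns above can be reproduced with a short sketch; the function names are mine.]

```python
def geometric_score(rank, F=0.96):
    """Geometric discounting: each rank keeps fraction F of the previous score."""
    return F ** (rank - 1)

def arithmetic_score(rank, N=25):
    """Arithmetic discounting: scores fall linearly, reaching zero at rank N + 1."""
    return max(0.0, 1.0 - (rank - 1) / N)

# Spot-check a few rows of the table (values match after rounding to 3 places):
for rank in (1, 2, 13, 25, 26):
    print(rank, round(geometric_score(rank), 3), round(arithmetic_score(rank), 3))
```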

I think the geometric progression is probably better.

Thanks for the great suggestion.

Date Subject Author
10/27/13 Ben Bacarisse
10/28/13 Jennifer Murphy
10/30/13 Jennifer Murphy
10/27/13 James Waldby
10/28/13 Jennifer Murphy
10/28/13 Graham Cooper
10/28/13 Graham Cooper
10/28/13 Graham Cooper
10/28/13 Graham Cooper
10/28/13 Graham Cooper
10/28/13 Jennifer Murphy
10/28/13 JohnF
10/28/13 Jennifer Murphy
10/29/13 JohnF
10/30/13 Jennifer Murphy