Math Forum » Discussions » sci.math.* » sci.math.independent

Topic: Is there a way to calculate an average ranking from uneven lists?
Replies: 12   Last Post: Nov 2, 2013 12:55 PM

Graham Cooper

Posts: 4,280
Registered: 5/20/10
Re: Is there a way to calculate an average ranking from uneven lists?
Posted: Oct 28, 2013 8:37 PM

On Monday, October 28, 2013 2:36:13 PM UTC-7, David Bernier wrote:
> On 10/27/2013 03:20 PM, Jennifer Murphy wrote:
>
> > There are many lists containing rankings of great books. Some are
> > limited to a particular genre (historical novels, biographies, science
> > fiction). Others are more general. Some are fairly short (50-100 books).
> > Others are much longer (1,001 books).
> >
> > Is there a way to "average" the data from as many of these lists as
> > possible to get some sort of composite ranking of all of the books that
> > appear in any of the lists?
> >
> > I took a crack at it with a spreadsheet, but ran into problems. I will
> > explain it briefly here.
> >
> > If the lists are all the same length and include exactly the same
> > books, the solution is relatively simple (I think). I can just average
> > the ranks. I can even add a weighting factor to each list to adjust the
> > influence on the composite ranking up or down.
> >
> > I ran into problems when the lists are of different lengths and contain
> > different books. I could not think of a way to calculate a composite
> > ranking (or rating) when the lists do not all contain the same books.
> >
> > Another complication is that at least one of the lists is unranked (The
> > Time 100). Is there any way to make use of that list?
> >
> > I created a PDF document with some tables illustrating what I have
> > tried. Here's the link to the DropBox folder:
> >
> > https://www.dropbox.com/sh/yrckul6tsrbp23p/zNHXxSdeOH
>
> I have a couple of ideas...
>
> (1) The different lists have different criteria for
> inclusion or exclusion. They may not be explicit,
> but let's assume they are made explicit.
> An exclusion criterion "not poetry" can in principle
> be turned into a combination of "ors" and "inclusion factors", as
>
> "not poetry" = "is novel" or "is non-fiction" or "is historical
> novel".
>
> These selectors matter because Tolstoy's "War and Peace"
> would not appear in a list of "English literature" works ...
> yet, it's Russian literature, has been translated into English,
> and has received wide acclaim.
>
> The idea would be to find all lists which, according to
> their explicit selection criteria, may include say
> "War and Peace" if all books in said category were ranked.
> But different lists which may include "War and Peace" will
> probably sometimes have different criteria.
>
> (2) To consider calibrating between lists, say if
> 10 out of 20 lists all included the novel
> "Moby Dick", then to sort of use "Moby Dick" as
> a benchmark.
>
> (3) My own observation with movies and books is
> that some books and movies seem designed to
> maximize sales, or to "target" a specific segment
> of readers & tastes, e.g. Harlequin series, which
> while "good reading for entertainment", can be
> more easily read than "Remembrance of Things Past",
> a multi-volume novel by French author Marcel Proust,
> < http://en.wikipedia.org/wiki/In_Search_of_Lost_Time > .
>
> David Bernier




It's an error minimization problem. Give each list a weight and
start with all the weights equal.

START: LIST1=1 LIST2=1 LIST3=1


Rank   List 1   List 2   List 3
 1       A        B        F
 2       B        A        H
 3       C        E        C
 4       D        G        D
 5       E        D        A
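
The numbers used in the averages (100, 75, 50, 0) look like a linear rank-to-score scale, with rank 1 = 100 and the last rank = 0. A rough Python sketch of that conversion follows. The function name rank_scores and the linear rule for lists of other lengths are my assumptions, not something stated in the post.

```python
# Lists from the table; position in each list is the rank (first = rank 1).
lists = {
    "LIST1": ["A", "B", "C", "D", "E"],
    "LIST2": ["B", "A", "E", "G", "D"],
    "LIST3": ["F", "H", "C", "D", "A"],
}

def rank_scores(ranking):
    """Map rank 1 to 100 and the last rank to 0, linear in between.

    The linear scale is an assumption inferred from the 100/75/50/0
    values in the post; it also works for lists of other lengths.
    """
    n = len(ranking)
    if n == 1:
        return {ranking[0]: 100.0}
    return {book: 100.0 * (n - 1 - i) / (n - 1) for i, book in enumerate(ranking)}

# One score table per list, e.g. scores["LIST2"]["A"] is A's score in List 2.
scores = {name: rank_scores(ranking) for name, ranking in lists.items()}

for name, by_book in scores.items():
    print(name, by_book)
```

Scaling by list length is one simple way to put a 50-book list and a 1,001-book list on the same 0-100 footing, though it treats "last place on a short list" the same as "last place on a long list", which may not be what you want.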



CALC WEIGHTED AVERAGES
(score by rank: 1st = 100, 2nd = 75, 3rd = 50, 4th = 25, 5th = 0)

A = (( 100*LIST1) + (75*LIST2) + (0*LIST3) ) / 3
B = (( 75*LIST1) + (100*LIST2) ) / 2
C = (( 50*LIST1) + (50*LIST3) ) / 2
...


CALC ERROR
= |A-100| + |A-75| + |A-0|
+ |B-75| + |B-100|
+ |C-50| + |C-50|
+ ...
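
Here is a small Python sketch of the two calculations, the weighted averages and the error, for anyone who wants to try them. The score values assume the linear scale rank 1 = 100 down to rank 5 = 0, and the function names are made up for illustration. Each average divides by the number of lists the book appears in, as in the formulas for A, B and C.

```python
# Scores read off the table (assumed scale: rank 1 = 100 ... rank 5 = 0).
scores = {
    "LIST1": {"A": 100, "B": 75, "C": 50, "D": 25, "E": 0},
    "LIST2": {"B": 100, "A": 75, "E": 50, "G": 25, "D": 0},
    "LIST3": {"F": 100, "H": 75, "C": 50, "D": 25, "A": 0},
}

def weighted_averages(scores, weights):
    """Average of weight * score over the lists that contain each book."""
    totals, counts = {}, {}
    for name, by_book in scores.items():
        for book, s in by_book.items():
            totals[book] = totals.get(book, 0.0) + weights[name] * s
            counts[book] = counts.get(book, 0) + 1
    return {book: totals[book] / counts[book] for book in totals}

def total_error(scores, weights):
    """Sum of |average - raw score| over every (list, book) entry."""
    avg = weighted_averages(scores, weights)
    return sum(
        abs(avg[book] - s)
        for by_book in scores.values()
        for book, s in by_book.items()
    )

weights = {"LIST1": 1.0, "LIST2": 1.0, "LIST3": 1.0}  # the START values
avg = weighted_averages(scores, weights)
err = total_error(scores, weights)
print(sorted(avg.items(), key=lambda kv: -kv[1]))
print(err)
```

Books that appear in only one list (F, G, H here) contribute no error when that list's weight is 1, since their average equals their single score.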


Randomly adjust LIST1, LIST2 & LIST3
to minimize the error.
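
The "randomly adjust" step could be sketched as a simple random search: nudge the weights, keep the change only if the error drops. This is one possible reading, not a definitive implementation. The step size, step count, seed, and the clamp that keeps weights non-negative are all my assumptions, as are the function names.

```python
import random

# Scores read off the table (assumed scale: rank 1 = 100 ... rank 5 = 0).
scores = {
    "LIST1": {"A": 100, "B": 75, "C": 50, "D": 25, "E": 0},
    "LIST2": {"B": 100, "A": 75, "E": 50, "G": 25, "D": 0},
    "LIST3": {"F": 100, "H": 75, "C": 50, "D": 25, "A": 0},
}

def total_error(weights):
    """Sum of |weighted average - raw score| over every (list, book) entry."""
    totals, counts = {}, {}
    for name, by_book in scores.items():
        for book, s in by_book.items():
            totals[book] = totals.get(book, 0.0) + weights[name] * s
            counts[book] = counts.get(book, 0) + 1
    avg = {book: totals[book] / counts[book] for book in totals}
    return sum(
        abs(avg[book] - s)
        for by_book in scores.values()
        for book, s in by_book.items()
    )

def random_search(weights, steps=2000, step_size=0.05, seed=0):
    """Perturb all weights at random; accept a trial only if the error decreases."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best = dict(weights)
    best_err = total_error(best)
    for _ in range(steps):
        trial = {
            name: max(0.0, w + rng.uniform(-step_size, step_size))  # keep weights >= 0
            for name, w in best.items()
        }
        err = total_error(trial)
        if err < best_err:
            best, best_err = trial, err
    return best, best_err

start = {"LIST1": 1.0, "LIST2": 1.0, "LIST3": 1.0}
start_err = total_error(start)
best, best_err = random_search(start)
print(start_err, best_err, best)
```

Because a trial is accepted only when it lowers the error, the result is never worse than the starting weights, but it may be a local minimum rather than the best possible set.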



This does not take into account that some lists will be best sellers
or poor sellers, or that some will have a larger spread of scores...
but handling that is a lot more complicated.



Herc
--
www.PrologDatabase.com



