
Re: Is there a way to calculate an average ranking from uneven lists?
Posted: Oct 29, 2013 1:39 AM


On Mon, 28 Oct 2013 17:37:04 -0700 (PDT), grahamcooper7@gmail.com wrote:
>On Monday, October 28, 2013 2:36:13 PM UTC-7, David Bernier wrote:
>> On 10/27/2013 03:20 PM, Jennifer Murphy wrote:
>> > There are many lists containing rankings of great books. Some are
>> > limited to a particular genre (historical novels, biographies, science
>> > fiction). Others are more general. Some are fairly short (50-100 books).
>> > Others are much longer (1,001 books).
>> >
>> > Is there a way to "average" the data from as many of these lists as
>> > possible to get some sort of composite ranking of all of the books that
>> > appear in any of the lists?
>> >
>> > I took a crack at it with a spreadsheet, but ran into problems. I will
>> > explain it briefly here.
>> >
>> > If the lists are all the same length and include exactly the same
>> > books, the solution is relatively simple (I think). I can just average
>> > the ranks. I can even add a weighting factor to each list to adjust the
>> > influence on the composite ranking up or down.
>> >
>> > I ran into problems when the lists are of different lengths and contain
>> > different books. I could not think of a way to calculate a composite
>> > ranking (or rating) when the lists do not all contain the same books.
>> >
>> > Another complication is that at least one of the lists is unranked (The
>> > Time 100). Is there any way to make use of that list?
>> >
>> > I created a PDF document with some tables illustrating what I have
>> > tried. Here's the link to the DropBox folder:
>> >
>> > https://www.dropbox.com/sh/yrckul6tsrbp23p/zNHXxSdeOH
>> >
>>
>> I have a couple of ideas...
>>
>> (1) The different lists have different criteria for
>> inclusion or exclusion. They may not be explicit,
>> but let's assume they are made explicit.
>> An exclusion criterion "not poetry" can in principle
>> be turned into a combination of "ors" and "inclusion factors", as
>>
>> "not poetry" = "is novel" or "is nonfiction" or "is historical
>> novel".
>>
>> These selectors matter because Tolstoy's "War and Peace"
>> would not appear in a list "English literature" works ...
>> yet, it's Russian literature, has been translated in English,
>> and has received wide acclaim.
>>
>> The idea would be to find all lists which, according to
>> their explicit selection criteria, may include say
>> "War and Peace" if all books in said category were ranked.
>> But different lists which may include "War and Peace" will
>> probably sometimes have different criteria.
>>
>> (2) To consider calibrating between lists, say if
>> 10 out of 20 lists all included the novel
>> "Moby Dick", then to sort of use "Moby Dick" as
>> a benchmark.
>>
>> (3) My own observation with movies and books is
>> that some books and movies seem designed to
>> maximize sales, or to "target" a specific segment
>> of readers & tastes, e.g. Harlequin series, which
>> while "good reading for entertainment", can be
>> more easily read than "Remembrance of Things Past",
>> a multivolume novel by French author Marcel Proust,
>> < http://en.wikipedia.org/wiki/In_Search_of_Lost_Time > .
>>
>> David Bernier
>>
>
>It's an error minimization problem.
>
>START: LIST1=1 LIST2=1 LIST3=1
>
>Rank  List 1  List 2  List 3
> 1    A       B       F
> 2    B       A       H
> 3    C       E       C
> 4    D       G       D
> 5    E       D       A
>
>CALC WEIGHTED AVERAGES
>
>A = ( (100*LIST1) + (75*LIST2) + (0*LIST3) ) / 3
>B = ( (75*LIST1) + (100*LIST2) ) / 2
>C = ( (50*LIST1) + (50*LIST3) ) / 2
>...
>
>CALC ERROR
> = A-100 + A-75 + A-0
> + B-75 + B-100
> + C-50 + C-50
> + ...
>
>Randomly adjust LIST1, LIST2 & LIST3
>to minimize the error.
>
>This does not take into account some lists will be best sellers
>or poor sellers, and some will have a larger spread... but that's
>a lot more complicated.

You make some interesting suggestions, but the principles are foreign to me. I'll have to study them a bit to see if I can make sense of them. They may be beyond my meager skills. :(
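
For anyone else trying to follow the quoted procedure, here is a rough Python sketch of one way to read it, using the three example lists from the quote. Several things here are my own assumptions rather than anything stated in the thread: the function names (position_scores, composite, total_error) are made up, positions are scaled to 100 (top) down to 0 (bottom) over each list's own length so that lists of different lengths can be mixed, and the error is taken as the sum of absolute differences between a book's composite score and its score on each list that contains it. The "randomly adjust the weights" step is left out.

```python
# Example lists from the quoted post, best book first.
lists = {
    "LIST1": ["A", "B", "C", "D", "E"],
    "LIST2": ["B", "A", "E", "G", "D"],
    "LIST3": ["F", "H", "C", "D", "A"],
}

def position_scores(ranked):
    """Scale positions to 100 (top) .. 0 (bottom) for a list of any length."""
    n = len(ranked)
    if n == 1:
        return {ranked[0]: 100.0}
    return {book: 100.0 * (n - 1 - i) / (n - 1) for i, book in enumerate(ranked)}

def composite(lists, weights):
    """Average each book's weighted scores over the lists that contain it."""
    totals, counts = {}, {}
    for name, ranked in lists.items():
        for book, score in position_scores(ranked).items():
            totals[book] = totals.get(book, 0.0) + weights[name] * score
            counts[book] = counts.get(book, 0) + 1
    return {book: totals[book] / counts[book] for book in totals}

def total_error(lists, weights):
    """Assumed error: sum of |composite - list score| over every appearance."""
    comp = composite(lists, weights)
    return sum(
        abs(comp[book] - score)
        for ranked in lists.values()
        for book, score in position_scores(ranked).items()
    )

# Starting point from the post: every list weight set to 1.
weights = {"LIST1": 1.0, "LIST2": 1.0, "LIST3": 1.0}
scores = composite(lists, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
print(total_error(lists, weights))
```

A book that appears on only one list simply keeps that list's score, which is why F ends up ahead of A in this example even though A appears on all three lists. If the weights are later adjusted to reduce the error, they probably need some constraint (a fixed sum, say), since with the averaging formula as written, shrinking every weight toward zero pulls all the composites toward zero as well.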

