On Monday, October 28, 2013 2:36:13 PM UTC-7, David Bernier wrote:
> On 10/27/2013 03:20 PM, Jennifer Murphy wrote:
> > > There are many lists containing rankings of great books. Some are
> > > limited to a particular genre (historical novels, biographies, science
> > > fiction). Others are more general. Some are fairly short (50-100 books).
> > > Others are much longer (1,001 books).
> > >
> > > Is there a way to "average" the data from as many of these lists as
> > > possible to get some sort of composite ranking of all of the books that
> > > appear in any of the lists?
> > >
> > > I took a crack at it with a spreadsheet, but ran into problems. I will
> > > explain it briefly here.
> > >
> > > If the lists are all the same length and include exactly the same
> > > books, the solution is relatively simple (I think). I can just average
> > > the ranks. I can even add a weighting factor to each list to adjust the
> > > influence on the composite ranking up or down.
> > >
> > > I ran into problems when the lists are of different lengths and contain
> > > different books. I could not think of a way to calculate a composite
> > > ranking (or rating) when the lists do not all contain the same books.
> > >
> > > Another complication is that at least one of the lists is unranked (The
> > > Time 100). Is there any way to make use of that list?
> > >
> > > I created a PDF document with some tables illustrating what I have
> > > tried. Here's the link to the DropBox folder:
> > >
> > > https://www.dropbox.com/sh/yrckul6tsrbp23p/zNHXxSdeOH
> > >
> >
> > I have a couple of ideas...
> >
> > (1) The different lists have different criteria for
> > inclusion or exclusion. They may not be explicit,
> > but let's assume they are made explicit.
> > An exclusion criterion "not poetry" can in principle
> > be turned into a combination of "ors" and "inclusion factors", as
> >
> > "not poetry" = "is novel" or "is non-fiction" or "is historical
> > novel".
> >
> > These selectors matter because Tolstoy's "War and Peace"
> > would not appear in a list "English literature" works ...
> > yet, it's Russian literature, has been translated in English,
> > and has received wide acclaim.
> >
> > The idea would be to find all lists which, according to
> > their explicit selection criteria, may include say
> > "War and Peace" if all books in said category were ranked.
> > But different lists which may include "War and Peace" will
> > probably sometimes have different criteria.
> >
> > (2) To consider calibrating between lists, say if
> > 10 out of 20 lists all included the novel
> > "Moby Dick", then to sort of use "Moby Dick" as
> > a benchmark.
> >
> > (3) My own observation with movies and books is
> > that some books and movies seem designed to
> > maximize sales, or to "target" a specific segment
> > of readers & tastes, e.g. Harlequin series, which
> > while "good reading for entertainment", can be
> > more easily read than "Remembrance of Things Past",
> > a multi-volume novel by French author Marcel Proust,
> > < http://en.wikipedia.org/wiki/In_Search_of_Lost_Time > .
> >
> > David Bernier

It's an error minimization problem.
START: LIST1=1 LIST2=1 LIST3=1

Rank  List 1  List 2  List 3
1     A       B       F
2     B       A       H
3     C       E       C
4     D       G       D
5     E       D       A

CALC WEIGHTED AVERAGES
A = ((100*LIST1) + (75*LIST2) + (0*LIST3)) / 3
B = ((75*LIST1) + (100*LIST2)) / 2
C = ((50*LIST1) + (50*LIST3)) / 2
...
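
A rough Python sketch of the calculation above, for anyone who wants to try it outside a spreadsheet. The rank-to-score mapping (100 for rank 1 down to 0 for the last rank, in equal steps) is only inferred from the 100/75/50/0 values in the formulas, and the function names are made up for illustration. It divides by the number of lists that contain the book, as the formulas do, and it does not attempt the weight-adjustment step.

```python
# Rankings from the table above; index 0 is rank 1.
LISTS = {
    "LIST1": ["A", "B", "C", "D", "E"],
    "LIST2": ["B", "A", "E", "G", "D"],
    "LIST3": ["F", "H", "C", "D", "A"],
}

# Starting weights, as in the START line.
WEIGHTS = {"LIST1": 1.0, "LIST2": 1.0, "LIST3": 1.0}


def rank_score(position, length):
    """Map a 0-based position to a score from 100 (top) to 0 (bottom).

    Assumption: a linear scale, inferred from the values in the formulas.
    """
    if length == 1:
        return 100.0
    return 100.0 * (length - 1 - position) / (length - 1)


def composite_scores(lists, weights):
    """Average each book's weighted scores over the lists that contain it."""
    totals = {}
    counts = {}
    for name, ranking in lists.items():
        for position, book in enumerate(ranking):
            score = rank_score(position, len(ranking)) * weights[name]
            totals[book] = totals.get(book, 0.0) + score
            counts[book] = counts.get(book, 0) + 1
    return {book: totals[book] / counts[book] for book in totals}


scores = composite_scores(LISTS, WEIGHTS)

# Print books from highest to lowest composite score.
for book, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{book}: {score:.2f}")
```

With all weights at 1, a book that appears on only one list (F, for example) keeps its single score undiluted, so it can outrank books that appear on several lists. That is one of the things the weights would have to compensate for.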