On 10/28/2013 05:36 PM, David Bernier wrote:
> On 10/27/2013 03:20 PM, Jennifer Murphy wrote:
>> There are many lists containing rankings of great books. Some are
>> limited to a particular genre (historical novels, biographies, science
>> fiction). Others are more general. Some are fairly short (50-100 books).
>> Others are much longer (1,001 books).
>>
>> Is there a way to "average" the data from as many of these lists as
>> possible to get some sort of composite ranking of all of the books that
>> appear in any of the lists?
>>
>> I took a crack at it with a spreadsheet, but ran into problems. I will
>> explain it briefly here.
>>
>> If the lists are all the same length and include exactly the same
>> books, the solution is relatively simple (I think). I can just average
>> the ranks. I can even add a weighting factor to each list to adjust the
>> influence on the composite ranking up or down.
>>
>> I ran into problems when the lists are of different lengths and contain
>> different books. I could not think of a way to calculate a composite
>> ranking (or rating) when the lists do not all contain the same books.
>>
>> Another complication is that at least one of the lists is unranked (The
>> Time 100). Is there any way to make use of that list?
>>
>> I created a PDF document with some tables illustrating what I have
>> tried. Here's the link to the DropBox folder:
>>
>> https://www.dropbox.com/sh/yrckul6tsrbp23p/zNHXxSdeOH
>>
>
> I have a couple of ideas...
>
> (1) The different lists have different criteria for
> inclusion or exclusion. They may not be explicit,
> but let's assume they are made explicit.
> An exclusion criterion "not poetry" can in principle
> be turned into a combination of "ors" and "inclusion factors", as
>
> "not poetry" = "is novel" or "is non-fiction" or "is historical
> novel".
>
> These selectors matter because Tolstoy's "War and Peace"
> would not appear in a list of "English literature" works ...
> yet, it's Russian literature, has been translated into English,
> and has received wide acclaim.
>
> The idea would be to find all lists which, according to
> their explicit selection criteria, may include say
> "War and Peace" if all books in said category were ranked.
> But different lists which may include "War and Peace" will
> probably sometimes have different criteria.
>
> (2) To consider calibrating between lists, say if
> 10 out of 20 lists all included the novel
> "Moby Dick", then to sort of use "Moby Dick" as
> a benchmark.
>
> (3) My own observation with movies and books is
> that some books and movies seem designed to
> maximize sales, or to "target" a specific segment
> of readers & tastes, e.g. Harlequin series, which
> while "good reading for entertainment", can be
> more easily read than "Remembrance of Things Past",
> a multi-volume novel by French author Marcel Proust,
> < http://en.wikipedia.org/wiki/In_Search_of_Lost_Time > .

Upon further thought: with the Top 50 from each of 10 lists, I'd prefer to just "merge" the 10 lists, one book per line, with each line carrying annotations such as:

"#2:Ti" for Time's #2 novel,
"#10:BOMC" for Book of the Month Club's #10 novel,
"#42:EBrit." for Encyclopaedia Britannica's #42 novel, etc.
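
To make the merge step concrete, here is a minimal Python sketch. It assumes each list is available as a sequence of titles in rank order, keyed by an abbreviation such as "Ti" or "BOMC". The function name merge_lists and the "Book A" style titles are hypothetical, and the rankings are made up purely for illustration.

```python
from collections import defaultdict


def merge_lists(ranked_lists):
    """Merge {abbrev: [title at rank 1, title at rank 2, ...]} into
    {title: ["#rank:abbrev", ...]}, one entry per book."""
    merged = defaultdict(list)
    for abbrev, titles in ranked_lists.items():
        for rank, title in enumerate(titles, start=1):
            merged[title].append(f"#{rank}:{abbrev}")
    return dict(merged)


# Made-up rankings, for illustration only (not the real lists).
example = {
    "Ti": ["Book A", "Book B", "Book C"],
    "BOMC": ["Book B", "Book D", "Book A"],
    "EBrit.": ["Book C", "Book B"],
}

merged = merge_lists(example)
# One book per line, followed by its annotations.
for title in sorted(merged):
    print(title, " ".join(merged[title]))
```

Nothing is averaged or thrown away at this stage: a book that appears on only one list simply gets a single annotation.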
It remains to give some rough ordering to the merged and annotated list, which, through the annotations attached to each book title, contains all the data about the rankings of the Top 50 novels from the 10 un-merged lists ...
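
One possible rough ordering, offered only as an assumption and not as a settled answer, is a Borda-style point count: a book ranked r on a Top 50 list earns 51 - r points from that list and nothing from lists that omit it, and books are sorted by total points. The sketch below reads annotations in the "#rank:abbrev" format; the function name borda_order and the sample data are hypothetical.

```python
def borda_order(merged, list_length=50):
    """Order titles by total points: rank r on a list earns
    (list_length + 1 - r) points; lists that omit the book add nothing."""
    scores = {}
    for title, annotations in merged.items():
        total = 0
        for note in annotations:
            # "#10:BOMC" -> rank_text "10", abbreviation "BOMC"
            rank_text, _abbrev = note.lstrip("#").split(":", 1)
            total += list_length + 1 - int(rank_text)
        scores[title] = total
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Made-up annotations, for illustration only.
merged = {
    "Book A": ["#1:Ti", "#3:BOMC"],
    "Book B": ["#2:Ti", "#1:BOMC", "#2:EBrit."],
    "Book C": ["#3:Ti", "#1:EBrit."],
    "Book D": ["#2:BOMC"],
}

order = borda_order(merged)
print(order)
```

This scheme rewards books that appear on many lists, which may or may not be what is wanted; averaging only over the lists that include a book would be a different choice with different biases.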