
Re: Is there a way to calculate an average ranking from uneven lists?
Posted: Oct 28, 2013 2:27 AM


On Mon, 28 Oct 2013 00:23:32 +0000 (UTC), James Waldby <not@valid.invalid> wrote:
>On Sun, 27 Oct 2013 14:06:56 -0700, Jennifer Murphy wrote:
>> On Sun, 27 Oct 2013 13:36:29 -0600, Virgil wrote:
>>> Jennifer Murphy <JenMurphy@jm.invalid> wrote:
>>>
>>>> There are many lists containing rankings of great books. Some are
>>>> limited to a particular genre (historical novels, biographies, science
>>>> fiction). Others are more general. Some are fairly short (50-100 books).
>>>> Others are much longer (1,001 books).
>>>>
>>>> Is there a way to "average" the data from as many of these lists as
>>>> possible to get some sort of composite ranking of all of the books that
>>>> appear in any of the lists?
>[snip]
>>>> I ran into problems when the lists are of different lengths and contain
>>>> different books. I could not think of a way to calculate a composite
>>>> ranking (or rating) when the lists do not all contain the same books.
>>>>
>>>> Another complication is that at least one of the lists is unranked (The
>>>> Time 100). Is there any way to make use of that list?
>>>>
>>>> I created a PDF document with some tables illustrating what I have
>>>> tried. Here's the link to the DropBox folder:
>>>> https://www.dropbox.com/sh/yrckul6tsrbp23p/zNHXxSdeOH
>>>
>>>One way to compare rankings when there are different numbers of objects
>>>ranked in different rankings is to scale them all over the same range,
>>>such as from 0% to 100%.
>>>
>>>Thus in all rankings a lowest rank would rank 0% and the highest 100%,
>>>and the middle one, if there were one, would rank 50%.
>>>Four items with no ties would rank 0%, 33 1/3%, 66 2/3% and 100%,
>>>and so on.
>>>
>>>For something of rank r out of n ranks use (r-1)/(n-1) times 100%.
>>
>> In the lists I have, the highest ranking entity is R=1, the lowest is
>> R=N. For that, I think the formula is (N-R)/(N-1). No?
>>
>> Two questions:
>>
>> 1. Do I then just average the ranks across the lists?
>>
>> 2. What scaled rank do I use for a book that is not ranked in a list?
>
>For the given problem, averages of ranks probably aren't a statistically
>sound approach. For example, see the "Qualitative description" section
>of article <http://en.wikipedia.org/wiki/Rating_scale>, which says:
>"User ratings are at best ordinal categorizations. While it is not
>uncommon to calculate averages or means for such data, doing so
>cannot be justified because in calculating averages, equal intervals
>are required to represent the same difference between levels of perceived
>quality. The key issues with aggregate data based on the kinds of rating
>scales commonly used online are as follow: Averages should not be
>calculated for data of the kind collected." (etc.)

Yes, I did feel a little uneasy about averaging numbers that are not really numerical in the usual sense.
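
For reference, the scaling discussed earlier in the thread, with R=1 as the best rank and R=N as the worst, can be sketched in Python as below. The function name `scaled_rank` and the handling of a one-item list are assumptions made for illustration. The sketch only rescales a rank; it does not settle whether averaging the scaled values is statistically sound.

```python
def scaled_rank(rank, n):
    """Scale a rank (1 = best, n = worst) to the range 0.0 to 1.0.

    Implements (N - R) / (N - 1): the best entry maps to 1.0 and the
    worst entry maps to 0.0.
    """
    if n < 1 or not 1 <= rank <= n:
        raise ValueError("rank must be between 1 and n")
    if n == 1:
        # Assumption: the formula is undefined for a one-item list,
        # so the single entry is given full credit.
        return 1.0
    return (n - rank) / (n - 1)


# Four items with no ties, best first: 100%, 66 2/3%, 33 1/3%, 0%.
percentages = [100 * scaled_rank(r, 4) for r in range(1, 5)]
```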

>Also see <http://en.wikipedia.org/wiki/Polytomous_Rasch_model> which in
>its "The model" section has some statistical analysis that might (or might
>not) apply. Also see <http://en.wikipedia.org/wiki/Likert_scale> and
>some pages listed at <http://en.wikipedia.org/wiki/Category:Psychometrics>.
>
>Here's an approach to consider: Set up some criteria for giving points
>to various books, and give each book a total score based on the number of
>criteria it meets when all the lists are considered. For each list, each
>book gets 1 point for each criterion that it meets. Sort the resulting
>scores from large to small.
>
>Here's an example of a possible set of criteria: { in first place; in top 2;
>in top 5; in top 10; in top 20; in top 40; in top 80; on list}.
>
>For example, if list 1 is { #1 Emma; #2 Mrs. Dalloway; #3 Anna Karenina;
>#4 Lolita; #5 Salome; #6 Vera} and list 2 is { #1 Emma; #2 Persuasion;
>#3 Northanger Abbey}, then Emma scores 16;

Do you give a book a score for being in the top 80 even if the list only has 50 or 10 entries?

>Mrs. Dalloway and Persuasion
>score 7; Anna Karenina, Northanger Abbey, Lolita, and Salome score 6;
>Vera scores 5. Perhaps it would work better with more and larger lists.

This is a very creative solution. I like that it is additive. This completely eliminates the problem of what to do with books that are missing from the list.
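
A rough Python sketch of the criteria scoring described above, run against the two example lists. The names `THRESHOLDS` and `criteria_score` are made up for illustration. The sketch credits a book for every "top K" criterion its position satisfies regardless of how long the list is, which is what the example totals imply (Northanger Abbey, #3 on a three-book list, scores 6).

```python
from collections import Counter

# "In first place", "in top 2", ..., "in top 80"; "on list" is handled separately.
THRESHOLDS = [1, 2, 5, 10, 20, 40, 80]


def criteria_score(ranked_lists, thresholds=THRESHOLDS):
    """Return (title, score) pairs sorted from highest score to lowest.

    Each list is a sequence of titles ordered best first. For each list,
    a book earns 1 point for being on the list plus 1 point for every
    threshold K with position <= K.
    """
    scores = Counter()
    for books in ranked_lists:
        for position, title in enumerate(books, start=1):
            scores[title] += 1  # the "on list" criterion
            scores[title] += sum(1 for k in thresholds if position <= k)
    # Ties are broken alphabetically so the output order is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


list1 = ["Emma", "Mrs. Dalloway", "Anna Karenina", "Lolita", "Salome", "Vera"]
list2 = ["Emma", "Persuasion", "Northanger Abbey"]

for title, score in criteria_score([list1, list2]):
    print(f"{score:3d}  {title}")
```

This reproduces the totals quoted above: Emma 16; Mrs. Dalloway and Persuasion 7; Anna Karenina, Lolita, Northanger Abbey, and Salome 6; Vera 5.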
What would you say to combining your idea with Ben's? Give each #1 book a score of 1. Give each lower-ranked book on each list a discounted score (geometric or arithmetic). Then just add them up.
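
A minimal sketch of that additive, discounted scoring, in Python. Everything specific here is an assumption chosen for illustration: the function names, the geometric ratio of 0.9, and the arithmetic step of 0.01 per position. The thread does not fix any of these values.

```python
from collections import defaultdict


def geometric_discount(rank, ratio=0.9):
    """Score 1 for rank 1, multiplied by `ratio` for each position below it."""
    return ratio ** (rank - 1)


def arithmetic_discount(rank, step=0.01):
    """Score 1 for rank 1, minus `step` per position below it, floored at 0."""
    return max(0.0, 1.0 - step * (rank - 1))


def discounted_totals(ranked_lists, discount=geometric_discount):
    """Sum each book's discounted score over every list it appears on.

    Books missing from a list simply contribute nothing for that list.
    """
    totals = defaultdict(float)
    for books in ranked_lists:
        for rank, title in enumerate(books, start=1):
            totals[title] += discount(rank)
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


list1 = ["Emma", "Mrs. Dalloway", "Anna Karenina", "Lolita", "Salome", "Vera"]
list2 = ["Emma", "Persuasion", "Northanger Abbey"]

geometric = discounted_totals([list1, list2])
arithmetic = discounted_totals([list1, list2], discount=arithmetic_discount)
```

The choice of discount matters: a steep geometric ratio makes the top few places dominate, while a small arithmetic step makes simply appearing on many lists count for more.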

>Anyhow, make up a set of criteria, run all your lists against it, and
>if the results aren't right, change the criteria until they are.

I think I'll do just that. :)

