
Re: Combinations with variable probability?
Posted: Jan 17, 2009 1:40 AM


On 16 Jan, 21:47, Matt <matt271829n...@yahoo.co.uk> wrote:
> On Jan 16, 7:47 pm, petertwocakes <petertwoca...@googlemail.com>
> wrote:
>
> > On 16 Jan, 19:00, Matt <matt271829n...@yahoo.co.uk> wrote:
> >
> > > On Jan 16, 5:51 pm, petertwocakes <petertwoca...@googlemail.com>
> > > wrote:
> > >
> > > > A population of N=1000 listen to a particular radio show for one
> > > > month. Each person listens for a different amount of time, t,
> > > > expressed as a proportion of the total time the show occupied over the
> > > > month from, and the times are normally distributed between 0.0 to 1.0.
> > >
> > > This is not possible. The normal distribution always ranges from -oo
> > > to +oo. You may have in mind some sort of "truncated" normal
> > > distribution, or you may need to come up with a different model for
> > > the listening times.
> > >
> > > > Each person can tell us exactly how much time, t, they listened for.
> > > >
> > > > The show's output comprises a playlist of 100 songs, all played a
> > > > different amount of times, but we know the proportion of airtime, a,
> > > > occupied by each song. "Song A" accounted for a = 0.15 total air
> > > > time.
> > > >
> > > > Given a sample of n= 20 people at random, can we estimate the
> > > > probability that exactly k of them heard the song?
> > >
> > > You need more information. For starters, what does "heard" mean? Heard
> > > the whole song? Heard any part of it?
> > >
> > > > Is this even possible without exhaustively calculating p for each
> > > > possible combination?
> > > >
> > > > If instead we ask a random sample of 20 people if they heard the song,
> > > > and 5 of them have, what is the probability that that particular
> > > > outcome occured, given that we know a =0.15, and the value of t for
> > > > each person.
> > > >
> > > > Although at first it looks like a variation on the hypergeometric
> > > > distribution, I'm guessing that's no use because of the variable
> > > > probabilities?
> > > >
> > > > What sort of things should I be studying to figure this one out?
> >
> > Yes, approximately normal, truncated at zero and 1.0, with a mean of
> > 0.5, SD 0.34
> > "Heard" means heard any part of it. There is no unmentioned bias in
> > any of the conditions
> > If there's any other missing information, please infer ideal default
> > values that would make either part solveable.
>
> Well, the next thing to figure out is what assumptions you want to
> make about the fragmentation of the total song time and total
> listening time. Just knowing the total proportions isn't enough. For
> example, all other things being equal, a short song played frequently
> is likely to be heard by more people tuning in (for just some of the
> show) than a long song played less frequently. Similarly, all other
> things being equal, someone tuning in often for short periods is more
> likely to hear a given song than someone tuning in less often for
> longer periods.

I would like to return to that, but before I can I need to find an even simpler representation, because the amount of detail seems too much for a general class of problems where, instead of binary success/failure, we have a known probability of success.
Given a population where each member has an individual known probability p(event) that a certain event has happened to them, can we either:
(a) take a sample of n of them and estimate the probability of k successes, or
(b) take a real sample of n of them, determine the true value of k, and calculate the probability that this outcome occurred?
I'm assuming (b) is easier because we know p(event) for each of the actual members.
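For case (b), if the individual outcomes can be treated as independent (an assumption, nothing above establishes it), the number of successes follows what is usually called the Poisson binomial distribution. The probability of exactly k successes can then be built up one person at a time, without enumerating every combination. A minimal Python sketch; the function name and the example probabilities are made up for illustration:

```python
def success_count_pmf(probs):
    """Return [P(0 successes), ..., P(n successes)] for n independent events.

    probs[i] is the known success probability of sampled member i.
    Independence between members is an assumption of this sketch.
    """
    pmf = [1.0]  # with nobody considered yet, P(0 successes) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this member fails
            nxt[k + 1] += mass * p      # this member succeeds
        pmf = nxt
    return pmf


# Hypothetical sample of n = 5 members with known individual probabilities.
sample_probs = [0.1, 0.5, 0.5, 0.8, 0.3]
pmf = success_count_pmf(sample_probs)
observed_k = 2
print(f"P(k = {observed_k}) = {pmf[observed_k]:.4f}")
```

The work grows roughly as n squared, so a sample of 20 is trivial. If every probability is the same, the result reduces to the ordinary binomial distribution.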
(I'm worried about inventing examples, in case I pick one that can't work without detail, but anyway, let's say: beating a score of 72 on a golf course, for which we assume the probability is purely a function of each player's past history of 100 games. 0.0 = never beaten that score, 1.0 = always beaten that score, mean = 0.5, and the distribution is, ahem, bell-shaped. [From my other post I thought 'Has rented DVD x' might work as an example too, where we pretend p is a simple function of that movie's popularity and the number of rentals that person has made?])
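For case (a) with something like the golf example, one possible approach is to average over every sample of n that could be drawn, which can be done with a small table instead of listing the samples. This is only a sketch under stated assumptions: outcomes are independent between players, the sample is drawn without replacement, and the population below is invented (a normal shape around 0.5 clamped to [0, 1], with an SD of 0.2 picked arbitrarily). The function name is hypothetical too.

```python
import random
from math import comb


def sample_success_pmf(population_probs, n):
    """P(k successes) for a simple random sample of n members, k = 0..n.

    Averages over every possible sample drawn without replacement and
    assumes members' outcomes are independent of each other.
    """
    # ways[j][k]: summed over all groups of j members seen so far, the
    # probability that exactly k of those j succeed.
    ways = [[0.0] * (n + 1) for _ in range(n + 1)]
    ways[0][0] = 1.0
    for p in population_probs:
        # Go downwards in j so each member joins a group at most once.
        for j in range(n, 0, -1):
            for k in range(j, -1, -1):
                stay_out = ways[j][k]
                join_fail = ways[j - 1][k] * (1.0 - p)
                join_success = ways[j - 1][k - 1] * p if k > 0 else 0.0
                ways[j][k] = stay_out + join_fail + join_success
    total_samples = comb(len(population_probs), n)
    return [w / total_samples for w in ways[n]]


# Made-up population of 1000 golfers (assumed shape: mean 0.5, SD 0.2,
# clamped to the range 0.0 to 1.0).
rng = random.Random(0)
population = [min(1.0, max(0.0, rng.gauss(0.5, 0.2))) for _ in range(1000)]

pmf = sample_success_pmf(population, n=20)
for k, prob in enumerate(pmf):
    print(k, round(prob, 4))
```

When the population is much larger than the sample, the result should be close to a plain binomial with the population's mean probability, so the individual probabilities matter much more in case (b) than in case (a).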
Steve

