Poisson Distribution Applied to Web Site Page Demand
Date: 07/14/2004 at 10:41:12
From: Graham
Subject: Converting a per hour value to a PEAK per second value

A web site delivers 150,000 pages in one hour. The web site will not deliver the pages evenly throughout the hour: some minutes and some seconds will be busier than others. The web site must be able to cope with the peak demand on a second-by-second basis.

How do I convert the 150,000 pages per hour figure to an accurate PEAK per second figure? This is not a real web site; rather, I'm asking a theoretical question about converting a per hour rate to a per second rate when the distribution over the hour isn't uniform.

I suppose in theory (if distribution within the hour were completely random) the worst case is that all the transactions could be delivered in a single second within the hour, and the best case is that they are delivered evenly throughout the 3600 seconds within the hour. I am after a formula that gives me a realistic figure that is statistically likely, somewhere between these two extremes.

The simple route is to divide 150,000 by 60 to get the per minute rate, then divide by 60 again to get the per second rate. In the 150,000 per hour example this makes 41.67 per second. I could then apply an arbitrary uplift to cater for the variable demand, e.g. double it to about 84 pages per second. But is there a more accurate solution?
Date: 07/14/2004 at 21:55:24
From: Doctor Mitteldorf
Subject: Re: Converting a per hour value to a PEAK per second value

Hi Graham,

As long as you understand that this is about finding a model that fits pretty well with reality, rather than about solving an equation to get an answer, then we can go forward with suggestions for a model.

The mathematics of the Poisson distribution is relevant here. The catch is that you'll have to know something about the size of individual calls for data.

The Poisson distribution results whenever random, identical events come along at a constant average rate from a large number of independent sources. How good are these assumptions in your case? The "independent" part is fairly good, except that there are likely to be times of day when there are a lot more people online than at other times. I'd say the same of "constant average rate". The size of page requests from different users is identical for some sites, say where everyone is downloading the same file. If it is variable, that might not be too great an obstacle, and you can use an average number.

Here's how you'd do a sample Poisson calculation: Let's say you have an average demand of 42 pages per second, in lumps of 500 pages. In other words, the average user downloads 500 pages and there are 42/500 users per second, on average. (This number, 500, is crucial, and you'll want to think carefully about how best to determine it.)

In our example, the average number of calls per minute is 42*60/500 = 5.04 users, each demanding 500 pages. What the Poisson distribution can tell you is the way that number, whose average is 5.04, varies from minute to minute. The Poisson formula for the probability of n calls in a minute is

  P(n) = x^n * e^(-x) / n!

where x is our number 5.04. For example, you can evaluate this formula for all numbers above 10 (that is, 11 and up) and find that the sum of those probabilities is about .015.

This tells you that in only about 1.5% of all minutes does the demand come from more than 10 users. Given our assumption of 500 pages per user, this means that if your server can meet a peak demand of 5000 pages per minute, it will keep up with demand 98.5% of the time, and be overloaded 1.5% of the time.

You should experiment with this number 500 that I introduced arbitrarily to describe the "lumpiness" of the demand, so that you see the way in which our conclusion is very sensitive to this assumption. For example, if you assumed that the size of the average page request was 100 instead of 500, the average would be 42*60/100 = 25.2 users per minute, a capacity of 5000 pages per minute would correspond to 50 users, and you'd find that the probability of a demand of more than 5000 pages per minute would be not 1.5% but only a few ten-thousandths of one percent.

Is this the kind of model you were looking for?

- Doctor Mitteldorf, The Math Forum
  http://mathforum.org/dr.math/
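For readers who want to experiment with the "lumpiness" figure as suggested in the reply above, here is a minimal Python sketch of the same Poisson calculation. The function names (poisson_pmf, prob_more_than, overload_probability) are made up for this illustration and do not come from the original exchange. The sketch also assumes, as the reply does, that every user requests the same number of pages and that capacity is counted in whole users per minute.

```python
import math


def poisson_pmf(n: int, mean: float) -> float:
    """P(n) = mean^n * e^(-mean) / n!, computed via logs to avoid overflow."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def prob_more_than(k: int, mean: float) -> float:
    """Probability that a Poisson count with the given mean exceeds k."""
    return 1.0 - sum(poisson_pmf(n, mean) for n in range(k + 1))


def overload_probability(pages_per_second: float,
                         pages_per_user: int,
                         capacity_pages_per_minute: int) -> float:
    """Fraction of minutes in which demand exceeds capacity.

    Assumption (hypothetical model, following the reply): demand arrives
    as whole users, each asking for exactly pages_per_user pages.
    """
    mean_users_per_minute = pages_per_second * 60 / pages_per_user
    users_served_per_minute = capacity_pages_per_minute // pages_per_user
    return prob_more_than(users_served_per_minute, mean_users_per_minute)


if __name__ == "__main__":
    # Try several lump sizes against the same 5000 pages/minute capacity.
    for lump in (500, 250, 100):
        p = overload_probability(42, lump, 5000)
        print(f"{lump} pages per user: overloaded in {p:.6%} of minutes")
```

With 500 pages per user this should land close to the 1.5% quoted in the reply, and with 100 pages per user it should drop to a few in a million, which shows how strongly the answer depends on that one assumption.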
Date: 07/15/2004 at 13:19:53
From: Graham
Subject: Thank you (Converting a per hour value to a PEAK per second value)

Yes, that is great. I will play about with the figures, but the formula is perfect. Thanks for your help.
© 1994- The Math Forum at NCTM. All rights reserved.