Ask Dr. Math - Questions and Answers from our Archives

Poisson Distribution Applied to Web Site Page Demand

Date: 07/14/2004 at 10:41:12
From: Graham
Subject: Converting a per hour value to a PEAK per second value

A web site delivers 150,000 pages in one hour.  The web site will not
deliver the pages evenly throughout the hour--some minutes and some
seconds will be busier than others.  The web site must be able to cope
with the peak demand on a second by second basis.  How do I convert
the 150,000 pages per hour figure to an accurate PEAK per second 
figure?  

This is not a real web site, rather I'm asking a theoretical question
about converting a per hour rate to a per second rate when the
distribution over the hour isn't uniform.  I suppose in theory (if
distribution within the hour were completely random) the worst case is
that all the transactions could be delivered in a single second within
the hour and the best case is that they are delivered equally
throughout the 3600 seconds within the hour.  I am after a formula
that gives me a realistic figure that is statistically likely--between 
these two extremes.

The simple route is to divide 150,000 by 60 to get the per minute 
rate, then divide by 60 again to get the per second rate.  In the 
150,000 per hour example this makes 41.66 per second.  I could then 
apply an arbitrary uplift to cater for the variable demand--e.g. 
double it to 84 pages per second.
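
In code, that simple route is just the following (a sketch of the
arithmetic only; Python here is purely for illustration):

  # Naive conversion: spread the hourly total evenly over 3600 seconds,
  # then apply an arbitrary safety factor for the uneven demand.
  pages_per_hour = 150000
  average_per_second = pages_per_hour / 3600   # about 41.7 pages per second
  peak_guess = 2 * average_per_second          # arbitrary uplift: about 83.3
  print(average_per_second, peak_guess)        # roughly the 41.66 and 84 above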

But is there a more accurate solution?



Date: 07/14/2004 at 21:55:24
From: Doctor Mitteldorf
Subject: Re: Converting a per hour value to a PEAK per second value

Hi Graham--

As long as you understand that this is about finding a model that
fits reality reasonably well, rather than about solving an equation to
get an exact answer, we can go forward with suggestions for a model.

The mathematics of the Poisson distribution is relevant here.  The
catch is that you'll have to know something about the size of
individual calls for data.

The Poisson distribution results whenever random, identical events
come along at a constant average rate from a large number of 
independent sources.  How good are these assumptions in your case? 
The "independent" part is fairly good, except that there are likely to
be times of day when there are a lot more people online than other
times.  I'd say the same of "constant average rate".  The size of page
requests from different users is identical for some sites, say where
everyone is downloading the same file.  If it is variable, that might
not be too great an obstacle, and you can use an average number.

Here's how you'd do a sample Poisson calculation:  Let's say you have
an average demand of 42 pages per second, in lumps of 500 pages.  In
other words, the average user downloads 500 pages and there are 42/500
users per second, on average.  (This number, 500, is crucial, and
you'll want to think carefully about how best to determine it.)  In
our example, the average number of calls per minute is 42*60/500=5.04
users, each demanding 500 pages.

What the Poisson distribution can tell you is how that number, whose
average is 5.04, varies from minute to minute.  The
Poisson formula for the probability of n calls in a minute is 

  P(n) = x^n * e^(-x) / n!

where x is our number 5.04.  For example, if you evaluate this
formula for every number above 10 and add up those probabilities, the
sum is about .015.  This tells you that in only 1.5% of all minutes
does demand come from more than 10 users.  Given our assumption of
500 pages per user, this means that if your server can meet a peak
demand of 5000 pages per minute, it will keep up with demand 98.5% of
the time, and be overloaded 1.5% of the time. 
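
If you want to check these figures numerically, here is a minimal
Python sketch of the same calculation, assuming the 42 pages per
second and 500-page lumps used above (the helper name poisson_tail is
just for illustration):

  from math import exp, factorial

  def poisson_tail(x, n_min):
      # Probability of seeing n_min or more calls in a window whose
      # average (Poisson mean) is x.
      return 1.0 - sum(x**n * exp(-x) / factorial(n) for n in range(n_min))

  x = 42 * 60 / 500            # average calls per minute = 5.04
  print(poisson_tail(x, 11))   # more than 10 users: about 0.0145, i.e. ~1.5%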

You should experiment with this number 500 that I introduced
arbitrarily to describe the "lumpiness" of the demand, so that you can
see how sensitive the conclusion is to this assumption.  For example,
if you assumed that the average user requested 100 pages instead of
500, you'd find that the probability of a demand of more than 5000
pages per minute would drop from 1.5% to just a few ten-thousandths
of one percent.
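
Re-using the poisson_tail helper from the sketch above, and reading
"more than 5000 pages per minute" as more than 50 users of 100 pages
each (my reading of the cutoff), the 100-page case looks like this:

  x_small_lumps = 42 * 60 / 100            # average calls per minute = 25.2
  print(poisson_tail(x_small_lumps, 51))   # about 4e-6, a few parts per million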

Is this the kind of model you were looking for?

- Doctor Mitteldorf, The Math Forum
  http://mathforum.org/dr.math/ 



Date: 07/15/2004 at 13:19:53
From: Graham
Subject: Thank you (Converting a per hour value to a PEAK per second
value)

Yes, that is great.  I will play about with the figures, but the 
formula is perfect.  Thanks for your help.


Ask Dr. Math™
© 1994-2013 The Math Forum
http://mathforum.org/dr.math/