One of the standard problems in introductory calculus courses is to find the average distance between two randomly selected points inside a unit sphere. (This problem also comes up now and then in sci.math and rec.puzzles.) The popularity of this particular problem is probably due to the fact that it happens to lead to an integral that can be evaluated in "closed form" to give a nice explicit answer (36/35). However, for shapes other than a sphere the solution is not always so simple, although sometimes a parametric formulation of the distance density can be found to simplify the analysis.
Anyway, I've been trying to compile a catalog of the average distances (and powers of distances) within various shapes of various dimensions, and also the "distance densities" for these shapes. I'd appreciate any information that anyone can provide on this subject. So far I've looked mainly at circles (spheres) and squares (cubes) of different dimensions, starting with the classic problem of the solid unit sphere in 3D space. For completeness, I'll start with this well-known case:
Let R and r be the radial distances from the origin to two randomly chosen points in a unit sphere, and let w be the angle between these vectors. The distance between the two points is
sqrt[ R^2 - 2Rrcos(w) + r^2 ]
Covering just the case R>r, we need to triple-integrate this quantity over r, R, and w, and we need to weight the quantity in proportion to the fraction of the two-point state-space corresponding to each set of parameters.
For given values of r and R, the angle w defines a circle whose circumference is proportional to sin(w), so this is the weight for the w integration. Similarly each value of r and R defines a sphere of surface area proportional to r^2 and R^2, respectively, so these are the weights for the r and R integrations.
We can restrict our analysis to just the case R>r because the other case (R<r) is symmetrical and has the same distribution of distances. Therefore, we just need to integrate the distance function with the appropriate weights for the ranges r=[0,1], R=[r,1], and w=[0,pi]. Then we divide the result by the triple integral of just the weights (rR)^2 sin(w), which is 1/9. Thus, the problem reduces to the triple integral
$$ 9 \int_0^1 r^2 \int_r^1 R^2 \int_0^{\pi} \sin(w)\,\sqrt{R^2 + r^2 - 2Rr\cos(w)}\;\, dw\, dR\, dr \tag{1} $$
which is easily evaluated to give the familiar result 36/35. In fact, we can evaluate this integral with the distance function raised to any integer power n, to give the average of the nth powers of the distances as (72*2^n)/[(n+3)(n+4)(n+6)].
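As a cross-check on the closed form just quoted, here is a minimal sketch that evaluates integral (1) exactly, with the distance raised to the nth power. It assumes the substitution t = R^2 + r^2 - 2Rrcos(w) for the w integration, followed by a binomial expansion of (R+r)^(n+2) - (R-r)^(n+2). That route, and the names `sphere_moment` and `claimed`, are illustrative choices and not necessarily how the integral was originally evaluated.

```python
from fractions import Fraction
from math import comb


def sphere_moment(n):
    """Exact mean of distance**n for two uniform points in the unit ball.

    After the w integration (valid for R > r) the integrand becomes
    r*R*[(R+r)**m - (R-r)**m] / m with m = n + 2, and only odd powers
    of r survive in the bracket.
    """
    m = n + 2
    total = Fraction(0)
    for j in range(1, m + 1, 2):  # odd j only
        # integral of r**(j+1) * R**(m-j+1) over 0 <= r <= R <= 1
        cell = (Fraction(1, j + 2) - Fraction(1, m + 4)) / (m - j + 2)
        total += comb(m, j) * cell
    # factor 9 from the weights, 2 from the doubled odd terms, 1/m from above
    return Fraction(18, m) * total


def claimed(n):
    """The closed form quoted in the text."""
    return Fraction(72 * 2**n, (n + 3) * (n + 4) * (n + 6))


for n in range(1, 7):
    print(n, sphere_moment(n), claimed(n))
```

The first line printed is `1 36/35 36/35`; the remaining lines let the two columns be compared for higher powers.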
A couple of comments can be made here. First, notice that the weight for w is very fortuitous, because the factor of sin(w) enables us to evaluate the integral in closed-form (using the easy integral for sin(x)*sqrt(A-Bcos(x)) ). In contrast, the seemingly simpler case of a unit DISK is actually more difficult because it lacks this convenient weight factor.
Second, you might notice that if you try to cover both the cases R>r and R<r at once by integrating over r=[0,1] and R=[0,1] you may get a result like 21/20 instead of 36/35. The problem is that the integral of sin(w) * sqrt[R^2 - 2Rrcos(w) + r^2] over the range w=[0,pi] (divided by the integral of the weight) is

$$ \frac{\left[(R+r)^2\right]^{3/2} - \left[(R-r)^2\right]^{3/2}}{6Rr} \tag{2} $$
which is formally symmetrical in R and r. Now, since R+r is always non-negative there's not much ambiguity in evaluating the left hand term in the numerator; we just take the positive value (R+r)^3, glossing over the fact that there's really a square root there in the 3/2 power, so we could have taken the negative root. However, the right hand term is a bit tricky: do we evaluate this as (R-r)^3 or (r-R)^3 ? The answer depends on whether R>r or R<r. In these two cases the above expression reduces to
R + r^2/(3R) if R > r
r + R^2/(3r) if R < r
so it's necessary (not just convenient) to treat the cases separately. Fortunately, due to symmetry, the two cases give the same distribution of distances, so we only need to treat one of them.
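The pitfall described above can be reproduced numerically. The following is a rough sketch, assuming a plain midpoint rule on a 400 x 400 grid (an arbitrary choice) with the weights (rR)^2. The function names `naive` and `branched` are made up for this illustration.

```python
def mean_over_square(avg_given_radii, steps=400):
    """Midpoint-rule average over r, R in [0,1] with weight (r*R)**2."""
    h = 1.0 / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        for j in range(steps):
            R = (j + 0.5) * h
            wgt = (r * R) ** 2
            num += wgt * avg_given_radii(R, r)
            den += wgt
    return num / den


def naive(R, r):
    # Uses the R > r branch everywhere, even where R < r.
    return R + r * r / (3 * R)


def branched(R, r):
    # Picks the branch according to which radius is larger.
    big, small = max(R, r), min(R, r)
    return big + small * small / (3 * big)


print(mean_over_square(naive))     # close to 21/20 = 1.05
print(mean_over_square(branched))  # close to 36/35 = 1.02857...
```

The only difference between the two runs is which root is taken for the (R-r) term, which is the whole source of the discrepancy.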
So much for the easy case. Now let's consider the distances on a unit disk. This case can be formulated in essentially the same way as with the unit sphere, except that the weights are different. Each angle w now represents only a single point, so its weight is just 1. Each radius r and R now represents a circle with length proportional to r and R respectively, so these are the weights. The triple integral of these weights is pi/8, so the average distance on a unit disk can be expressed as
$$ \frac{8}{\pi} \int_0^1 r \int_r^1 R \int_0^{\pi} \sqrt{R^2 + r^2 - 2Rr\cos(w)}\;\, dw\, dR\, dr \tag{3} $$
Unfortunately, this integral isn't as easy to evaluate as the one for the sphere, because it lacks the sin(w) factor. However, by messing around with some gamma functions we can show that the result is 128/(45pi). We can also evaluate the above integral for any integer power of the distance function. For odd powers of the form 2k-1 the general result is
Incidentally, this last formula shows that if c[n] denotes the nth Catalan number, then the average (2k)th power of distances on a unit disk is just c[k+1]/(k+1).
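For readers who want a sanity check on the disk results, here is a small Monte Carlo sketch. It is not the gamma-function derivation mentioned above, just random sampling, so it only agrees to a couple of decimal places. It assumes the Catalan numbers are indexed so that c[0] = 1, c[1] = 1, c[2] = 2, c[3] = 5; the sample size and seed are arbitrary, and the helper names are invented for the example.

```python
import math
import random


def catalan(n):
    """nth Catalan number, with catalan(0) = 1 (assumed indexing)."""
    return math.comb(2 * n, n) // (n + 1)


def random_point_in_disk(rng):
    # sqrt of a uniform variate makes the point uniform over the disk's area
    radius = math.sqrt(rng.random())
    angle = 2 * math.pi * rng.random()
    return radius * math.cos(angle), radius * math.sin(angle)


def disk_moments(powers, samples=200_000, seed=1):
    """Estimate the mean of distance**p for each p in powers."""
    rng = random.Random(seed)
    sums = [0.0] * len(powers)
    for _ in range(samples):
        x, y = random_point_in_disk(rng)
        X, Y = random_point_in_disk(rng)
        d = math.hypot(x - X, y - Y)
        for i, p in enumerate(powers):
            sums[i] += d ** p
    return [s / samples for s in sums]


m1, m2, m4 = disk_moments([1, 2, 4])
print(m1, 128 / (45 * math.pi))  # mean distance vs 128/(45 pi)
print(m2, catalan(2) / 2)        # k = 1: c[2]/2
print(m4, catalan(3) / 3)        # k = 2: c[3]/3
```

Each printed pair should agree to roughly two decimal places at this sample size.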
Of course, as an alternative to (3) we could just integrate over the unit disk by scanning two points (x,y) and (X,Y) orthogonally across the disk. The weight factors are all 1 in this case, and they integrate to the squared area of the disk (pi^2), so we have the quadruple integral

$$ \frac{1}{\pi^2} \int_{-1}^{1} \int_{-a}^{a} \int_{-1}^{1} \int_{-b}^{b} \sqrt{(x-X)^2 + (y-Y)^2}\;\, dY\, dX\, dy\, dx \tag{4} $$
where a = sqrt(1-x^2) and b = sqrt(1-X^2). This gives the same results as (3), but it's slightly more laborious to integrate.
Now let's consider the distribution of distances on a 1x1 unit square. We could use the approach of equation (4) and just integrate the distances between two points (x,y) and (X,Y) as each parameter ranges from 0 to 1, but the resulting quadruple integral is not very easy to evaluate. A much more efficient approach is to notice that the distances on a unit square are distributed according to the parametric formulas

$$ s(u,v) = \sqrt{u^2 + v^2} \qquad\quad \mathrm{dens}(u,v) = 4(1-u)(1-v) \tag{5} $$
Therefore, we just need to integrate dens(u,v)*s(u,v) as u and v range from 0 to 1. The orthogonal way of formulating this double integral is

$$ \int_0^1 \int_0^1 4(1-u)(1-v)\,\sqrt{u^2 + v^2}\;\, dv\, du \tag{6} $$
This is a big improvement over the quadruple integral, but it still is not easily evaluated in closed form. Let's try polar coordinates in this parametric space by setting u = r cos(w) and v = r sin(w) and integrating over the region v<u by letting r range from 0 to 1/cos(w) at each w from 0 to pi/4. For any incremental slice of w the weight at r is proportional to r, and of course the integral of r over this region v<u is just 1/2, so we have the integral

$$ 2 \int_0^{\pi/4} \int_0^{1/\cos(w)} 4\,[1 - r\cos(w)]\,[1 - r\sin(w)]\; r^2 \; dr\, dw \tag{7} $$

where one factor of r is the distance s and the other is the polar weight.
By incrementing the exponent of r in (7) we can evaluate the average of the nth powers of distances on the unit square. In general the results are of the form
$$ \frac{A + B\sqrt{2} + C\ln(1+\sqrt{2})}{D} \tag{8} $$
where the values of A,B,C,D are as shown below
  n      A       B       C        D
 ---    ---    ----    ----    -----
  1       2       1       5       15
  2       1       0       0        3
  3       8      17      21      210
  4      17       0       0       90
  5      16      73      45     1008
  6      29       0       0      210
  7     384    3239    1155    47520
  8     187       0       0     1575
  etc.
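The tabulated values can be compared against a direct numerical evaluation of the integral of dens(u,v)*s(u,v)^n over the unit square. The sketch below assumes a simple midpoint rule on a 300 x 300 grid (an arbitrary resolution, good to a few decimal places). `TABLE`, `closed_form` and `square_moments` are names introduced only for this example.

```python
import math

# n: (A, B, C, D), copied from the table of results
TABLE = {
    1: (2, 1, 5, 15),
    2: (1, 0, 0, 3),
    3: (8, 17, 21, 210),
    4: (17, 0, 0, 90),
    5: (16, 73, 45, 1008),
    6: (29, 0, 0, 210),
    7: (384, 3239, 1155, 47520),
    8: (187, 0, 0, 1575),
}


def closed_form(n):
    """Evaluate (A + B sqrt(2) + C ln(1 + sqrt(2))) / D for row n."""
    A, B, C, D = TABLE[n]
    return (A + B * math.sqrt(2) + C * math.log(1 + math.sqrt(2))) / D


def square_moments(max_n=8, steps=300):
    """Midpoint-rule estimates of the mean of s**n, for n = 1..max_n."""
    h = 1.0 / steps
    sums = [0.0] * (max_n + 1)  # index 0 is unused
    for i in range(steps):
        u = (i + 0.5) * h
        for j in range(steps):
            v = (j + 0.5) * h
            term = 4 * (1 - u) * (1 - v)  # dens(u, v)
            s = math.hypot(u, v)          # s(u, v)
            for n in range(1, max_n + 1):
                term *= s
                sums[n] += term
    return [total * h * h for total in sums]


numeric = square_moments()
for n in sorted(TABLE):
    print(n, round(numeric[n], 5), round(closed_form(n), 5))
```

For n = 1 both columns should come out near 0.5214, the mean distance on the unit square.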
Now, what about the unit cube, or the unit 4D hyper-cube, etc.? The nice thing about the parametric distance density equations (5) is that they immediately generalize to higher dimensions. In general the parametric equations for the distance density of a d-dimensional unit cube are
s(x1,x2,..,xd) = sqrt[ x1^2 + x2^2 + ... + xd^2 ]
dens(x1,x2,..,xd) = 2^d (1-x1)(1-x2)...(1-xd)
For even powers of the distance we can immediately evaluate the d-dimensional analogs of equation (6). We find that the average squared distance in a d-dimensional unit cube is simply d/6. (This gives a nice trivia question: In what dimensional space is the average squared distance in a unit cube equal to unity?) The average nth powers of distances in a d-dimensional unit cube are given by the following formulas for the first few even values of n:
  n     average nth power of distance in d-cube
 ---    ----------------------------------------------
  2     (d)/(1*6)
  4     (5d^2 + 7d)/(2*90)
  6     (35d^3 + 147d^2 + 88d)/(6*1260)
  8     (175d^4 + 1470d^3 + 2789d^2 + 606d)/(24*9450)
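Because the density factors into one term 2(1-x) per coordinate, the even moments can be computed exactly with rational arithmetic. This is a minimal sketch under that assumption: it uses E[x^(2j)] = 1/((2j+1)(j+1)) for a single coordinate and raises the power series of E[exp(t x^2)] to the dth power. The method and the function names are illustrative choices, not something taken from the original derivation.

```python
from fractions import Fraction
from math import factorial


def axis_moment(j):
    """E[x**(2j)] for one coordinate with density 2(1-x) on [0,1]."""
    return Fraction(1, (2 * j + 1) * (j + 1))


def cube_even_moment(d, n):
    """Exact mean of distance**n in a unit d-cube, for even n."""
    m = n // 2
    # coefficients of E[exp(t * x**2)] up to t**m
    series = [axis_moment(j) / factorial(j) for j in range(m + 1)]
    power = [Fraction(1)] + [Fraction(0)] * m  # series ** 0
    for _ in range(d):  # multiply in one coordinate at a time
        power = [
            sum(power[k] * series[i - k] for k in range(i + 1))
            for i in range(m + 1)
        ]
    return factorial(m) * power[m]


def table_formula(d, n):
    """The polynomial formulas from the table, for n = 2, 4, 6, 8."""
    d = Fraction(d)
    return {
        2: d / 6,
        4: (5 * d**2 + 7 * d) / 180,
        6: (35 * d**3 + 147 * d**2 + 88 * d) / 7560,
        8: (175 * d**4 + 1470 * d**3 + 2789 * d**2 + 606 * d) / 226800,
    }[n]


for d in range(1, 5):
    print(d, [cube_even_moment(d, n) == table_formula(d, n) for n in (2, 4, 6, 8)])
```

Calling `cube_even_moment(2, 4)` returns `Fraction(17, 90)`, matching the n = 4 row of the unit-square table.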
What most intrigues me about the above is the parametric density formulas for the distances inside a d-dimensional cube. Notice that in each case the density of the vector V = [x1,x2,..,xd] within a bounded region B is proportional to the volume of the intersection of B with a copy of B shifted by the vector V. It seems plausible to me that this may be true in general. For example, take an arbitrary tetrahedron T and a vector V. Is it true that the density of V in T is proportional to the volume of the intersection of T with a copy of T offset (without rotation) by V?
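For what it's worth, a short calculation suggests the answer is yes. The sketch below assumes the two points X and Y are chosen independently and uniformly in B, writes |B| for the volume of B and 1_B for its indicator function (notation introduced here, not used elsewhere in this note), and takes V = X - Y.

```latex
\begin{align*}
f(V) &= \int \frac{1_B(Y)}{|B|}\,\frac{1_B(Y+V)}{|B|}\; dY
      = \frac{1}{|B|^2}\,\bigl|\{\,Y : Y \in B,\; Y+V \in B\,\}\bigr| \\
     &= \frac{|B \cap (B - V)|}{|B|^2}
      = \frac{|B \cap (B + V)|}{|B|^2}
\end{align*}
```

Nothing in these steps appears to depend on the shape of B. For the unit d-cube the overlap volume is (1-|x1|)(1-|x2|)...(1-|xd|), and folding the 2^d sign choices of V onto the positive orthant gives back the factor 2^d in dens above.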
================================================================== MathPages at --> http://www.seanet.com/~ksbrown/ ==================================================================