HEURISTIC PROOFS For many of us non-mathematicians (my background is in physics), a real mathematician's proof of the Prime Number Theorem (PNT) is beyond our understanding. As a result, many heuristic proofs can be found on the internet, meaning that the techniques employ intuitive judgements that are not necessarily correct. I developed my heuristic proof for the PNT in 1977, thinking at the time that it was legitimate.
CONSECUTIVE INTEGER HEURISTIC PROOF The heuristic proofs that I have found online are probabilistic arguments that treat the prime numbers as if they were randomly distributed. One type of proof begins with the probability P(x) that x is a prime number and consider the probability P(x+1) that the next integer x+1 is a prime [1, 2, 3]. The proof therefore depends on the possibility of having two consecutive integers that are both prime numbers, which is not realistic, except for the integers 2 and 3. The resulting differential equation has the solution P = 1/ln(cx), where the constant c is undetermined.
CONSECUTIVE PRIME HEURISTIC PROOF Another heuristic proof considers two large consecutive primes . A series of heuristic arguments leads to the differential equation dP/dx = -P(x) P(sqrt(x))/x and arrives at the solution P(x) = 0.5 / ln(x). (The numerator should be 1.)
MY HEURISTIC PROOF BASED ON THE SQUARES OF CONSECUTIVE PRIMES I developed this proof in 1977 when I was a mailman, searching for a job in physics. At the time, I thought that it was a legitimate proof. The proof follows the Sieve of Eratosthenes, but applied to two consecutive regions between the squares of primes.
Consider two ranges of integers: Range A: R_A = (p_n-1)^2 - (p_n)^2 the number of integers from from (p_n-1)^2 to (p_n)^2 -1 Range B: R_B = (p_n)^2 - (p_n+1)^2 the number of integers from (p_n)^2 to (p_n+1)^2 -1
Range A has been sieved by prime divisors up to p_n-1 so that only primes remain. The average density of primes in the range is rho_A where:
rho_A = (#primes in the range) / R_A
Now use the same sieve process that was applied to Range A and apply it to Range B, using the same prime divisors up to p_n-1.
ASSUMPTION 1: Assume that the sieve for Range A, removes the same proportion of integers from Range B as it did from Range A. This means that Range B now has a proportion rho_A of its original R_B integers remaining. The difference is that, in Range A, all the remaining integers are primes but, in Range B, there are still a few remaining non-primes, which will be removed by the final pass of the sieve, using the new prime divisor p_n.
ASSUMPTION 2: Assume that the final pass of the sieve, using prime divisor p_n, leaves a proportion (1- 1/p_n) of the remaining integers unscathed and they are all primes, with the result that:
rho_B = (1-1/p_n) rho_A [Eq'n 1]
Let y = p_n and rewrite Eq'n 1 so that the change in density (rho_B - rho_A) is given by:
delta rho(y^2) ~ - rho(y^2) / y [Eq'n 2]
The prime p_n+1 ~ p_n + 1/rho(p_n) and so the range R_n is given by:
My Consecutive-Squares-of-Primes Heuristic appears to correct the problems found in some other heuristic proofs, and yet it cannot be considered a legitimate mathematical proof. What is wrong with it? Let's examine the first equation. Mertens' Third Theorem says that:
1/ln(p_n) = 1.781 Prod_n for large n [Eq'n 8]
where Prod_n = (1-1/2) (1-1/3) (1-1/5) .... (1-1/p_n)
Knowing that the the solution of the PNT is 1/ln(x), that tells us that
rho((p_n)^2) ~ (1.781/2) Prod_n and rho((p_n-1)^2) ~ (1.781/2) Prod_n-1
rho((p_n)^2) / rho((p_n-1)^2) = (1-1/p_n)
This means that Equation 1 is correct. Equations 2 to 7, even though not mathematically very rigorous, are also essentially correct. So, what is wrong? The problem is that Equation 1, even though correct, was rationalized on the basis of two incorrect assumptions. That is deadly in mathematics. Two wrongs do not make a right, even if they give you the right answer.
THE FINAL PASS OF THE SIEVE Looking at Equation 1, it seems obvious that the largest prime divisor p_n, on the final pass of the sieve, is removing a fraction 1/p_n of the integers that remain after previous passes of the sieve, which used smaller prime divisors. This is Assumption 2, one of the two assumptions used to derive the equation.
Consider the range of 72 integers from 17^2 to 19^2 -1 (289 to 360). There are 11 primes in this range. In the final pass of the Sieve of Eratosthenes, 17 removes 2 integers: 17x17 & 17x19. Therefore there must have been 11 + 2 = 13 integers left in the range before the final pass. We have assumed that the final pass removes a proportion 1/17, but the actual proportion is 2/13 = 2.615/17. So, in this example, the final pass removes approximately 2.6 times as many integers as assumed.
I checked the 18 ranges above the squares of p_481 to p_498 in the same manner as the previous example (p_481 = 3433; p_481^2 = 11,785,489). For these 18 cases, p_n removed either 2 or 3 non-primes in every case, with an average of 2.5. The proportion removed was between 1.5 and 8.4 times the expected ratio of 1/p_n, with a weighted average of 2.9 times the expected ratio. The actual number of integers removed from the range by the final pass is always at least two. From Eq'ns 3 and 7, for large n, the number of primes in the range is approximately p_n so, if k integers are removed on the final pass, the proportion is k/p_n. k must be 2 or more and, on average, seems to be approximately 2.5 or a little higher.
In the range from (p_n)^2 to (p_n+1)^2 -1, the prime divisor p_n on the final pass always removes (p_n)^2 and always removes (p_n)(p_n+1). Using reasoning like Eq'n 3, it can be shown that (p_n)(p_n+2) is approximately the same size as (p_n+1)^2, which is the end of the range. So, about half the time, (p_n)(p_n+2) is in the range and half the time it is in the next range. The average of 2.5 integers that was seen in the 18 cases that were checked is therefore quite reasonable. From this point of view, it now seems unreasonable to expect the final pass to remove a proportion 1/p_n of the integers that remained after previous passes of the sieve, which used prime divisors smaller than p_n and have nothing to do with removing integers like (p_n)^2, (p_n)(p_n+1) or (p_n)(p_n+2), which are the integers that p_n removes. So, here is the situation: Equation 1 is correct, but 1/p_n is not the proportion of integers removed on the final pass, which was Assumption 2. The conclusion is correct, but one of the two premises is incorrect. That means the other premise must also be incorrect. Both Assumption 1 and Assumption 2 are wrong, but they were used to derive Equation 1, which is correct. This demonstrates how dangerous heuristic reasoning can be, and nowhere more so than in dealing with prime numbers.
CONCLUSION Equation 1 looks deceptively simple and yet, I have no way to rationalize it. As demonstrated earlier, if we already know that the PNT is true and if we use Mertens' Third Theorem, it demonstrates that Equation 1 is correct. But, if the PNT is used to prove Equation 1, then Equation 1 cannot be used to prove the PNT. Before, I rationalized Equation 1, based on two assumptions that turned out to be incorrect. I would dearly like to know a legitimate rationalization of Equation 1, but fear that the explanation is buried deep in complicated number theory that I won't understand. It is a humorous situation to have spent considerable effort deriving a fine-looking proof, only to find out that the satisfying understanding is an illusion, the result of wrong assumptions conspiring together to tell the truth. I am not alone. Many of you other heuristic-proofers out there are in the same boat!