The Sieve of Erathostene


Some history

One of the oldest and simplest yet interesting algorithm that involves arbitrary large data is the sieve of Erathostene.

Erathostene (?-?) was a mathematician of ancient Greece [more info sought here; to be added later].

This algorithm was once used as a speed "benchmark" for languages and compilers (which is lame in the first case, because by inlining arbitrary many steps of the sieve, you could speed the whole thing arbitrarily anyway, and in the second, as the sieve would eventually be specifically recognized by compilers to speed up benchmarks).
Here we use it not to benchmark speed, but to illustrate language expressiveness (well, lack of expressiveness, as for existing languages).


The mathematical idea behind the sieve

Here is what Erathostene found centuries BC.
  • A natural number m is a multiple of some natural number n if and only if there is a natural number k such that m=k n. We then say that n divides m, or that m is divisible by n, or that n is a divisor of m. We also call k the multiplicity factor.
  • A natural number is said to be prime if and only if it is divisible only by itself and 1. Thus a natural number m is not a prime if and only if it is divisible by a natural number strictly greater than 1 and strictly lesser than m.
  • We also know that any number not prime will have a prime divisor: take the least divisor strictly greater than 1.
  • Finally, we know that the square of this least prime divisor is not greater than the number: when you divide the number by the prime divisor, you obtain a new number stricly lesser than the original whose divisors will also divide the original one; hence the least one will be greater than the least of the original one.
  • Then, to find all primes lesser than some large (well, not too large either) number N, just consider every number below N, and eliminate all the multiples each considered number; the remaining will be the primes.
  • Now, we see that every time a marked number, its multiples will also be multiples of its least prime divisor; hence you need not mark them a second time.
  • We also see that when a prime p is reached, all multiples below its square are already marked, as the multiplicity factork being lesser than p, it will already be marked as a multiple of k's least prime divisor.
  • Thus, we can also stop marking when the square of the considered number is greater than N.


    Programming the sieve

    The idea of the sieve is simple; but then, you can deduce infinitely many programs based on this idea. Existing computer languages would require you to rewrite each from scratch. But in the same way as a human doesn't re-learn the sieve from scratch each time he sees it, we can and should have computers understand it once and for all. Here is how we could do:

    The sieve program in various programming languages and paradigms


    Variations on the sieve

    Here is the things we could tell the computer so he can enhance the implementation of the sieve:
    • Efficiency annotations
      • Eliminate multiples only from the square of the prime on.
      • Thus stop when the square exceeds the numbers to sieve.
      • Use a well-known egregious identity to compute the square together with the
      • Statically or dynamically limit (and change) the precision of integer computations as needed.
      • Use an array to implement the variable sieving function if a low enough bound is (locally) known to the sieve.
      • Else use the list of previously known primes as a parameter for the sieving function instead of dynamically producing (pseudo- or real) code to test each new prime.
      • Arbitrary dynamic combinations of the former methods may be used !
      • Expand the first steps of the sieve (i.e. inline primes up to 2,3,5, or 7, etc) to speed up the whole thing: typical programs only inline the first step to gain a factor two on space (when using arrays) and time; another factor two may be gained by inlining the four first steps instead; for growing zones, this factor can be dynamically increased, etc.

    • Input/Output annotations
      • The out() (or "say") input procedure defines what to do with a number known to be prime
      • The function may be executed step-by-step, or prime-by-prime, or continuously as needed.
      • The sieve results may be buffered or not; the buffer may be shared or not between smaller or larger sets of programs and running instanciations of a same persistent interrupted program; if needed results are not available, a sieve would of course be re-run.
      • A proof that the sieve actually yields the prime numbers shall be included, exported, and this fact pointed to in the standard list of algorithms, so that this algorithm will be automatically used by whoever wants prime numbers without explicitly programming.


    To Do on this page

  • Add more annotations, particularly about generating all primes below N or factorizing N.
  • Show how the real program would look like on our HLL.


  • Back to the list of HLL Examples.
  • Up to the HLL subproject.


    Page Maintainer:
    Faré -- rideau@clipper.ens.fr