The sieve program in various programming languages and paradigms
lame "C" version
char * sieve (int max)
{
int n, n2, m;
char * sieved ;
/* malloc'ing */
sieved = malloc(max) ;
assert(sieved) ;
memset(sieved,0,max) ;
/* loop */
for (n = 2, n2 = 4; n2<=max ; n2+=(n++<<1)+1 ) {
if (!sieved[n]) {
out(n) ;
for (m=n2; m <= max ;m+=n) sieved[m] = 1 ;
}
}
return sieved ;
}
Here are many reasons why C sucks, that we can see in this program:
- In this example, the only parameter is the upper bound of a range to sieve
to find which integers inside are primes.
Instead, we could have used the very same algorithm to compute
the indefinite list of prime numbers,
a function that associates its lowest prime factor to each integer in a range,
or an indefinite sieving function.
But in C, these mean complete rewrite of the function.
- The caller must know the low-level implementation of the result
structure to use it.
- Every type or function definition being global in C, object-oriented
style is made difficult, as the programmer must manually resolve name clash,
and/or add indirection levels; this wastes the programmer's time and leads
to inefficient code. In C++, the problem is only partially solved,
as classes are still global, statically bound, and never value-dependent.
- the preprocessor could be used to increase the genericity of the function,
but its use is heavy, very unpowerful, static, and especially horrible if
multiple instances of preprocessed the function are to be used.
- If you want time or space efficiency, you must explicitly inline
the sieving function for a right quantity of small prime numbers needed to
achieve efficiency, and pack/unpack data from the array. Yeech !
- allocating memory is painful; either the caller or the callee must do it.
If the caller does it, it can sometime be optimized, but is painful everytime.
If the callee does it, it's painful only once (unless there's an exception
mechanism, but this would be extremely painful), but it can't be
optimized, and the caller must know the internals anyway, to have a consistent
deallocation policy.
- You must explicitly give the low-level implementation of arrays,
lists, inlined code, allocation procedure. You can't let the system optimize
them according to the way the function will actually be used.
- The caller must In this example, we assumed that out() was a function
to call whenever a prime number is detected. Using data-driven style in the
program that uses this information would require either complete rewrite
of the program, or explicit forking (or use of threads).
To conclude, it's impossible to express efficiently the generic algorithm
in a one C program.
Side-effect message-passing version
var n : int = 2 ;
sieved : int->bool = function x|->false ;
fun step () =
if not sieved (n) then
begin
out(n) ;
sieved := function x|-> old sieved x or n divides x
end ;
loop
begin
step () ;
increment(n)
end ;
Pure lazy functional version
let Eratosthene's_Sieve =
let step n sieved =
let next_num = n+1 in
if not (sieved num)
then
let new_sieved m = (sieved m) or (n divides m) in
cons_stream n (step next_num new_sieved)
else step next_num sieved in
prime_stream = step 2 (fun x->false)
"Natural language" version
Consider knowing a natural number n to not be a prime as being false.
Consider number two.
A sieve step is
if the considered natural number is not known to not be a prime,
then
say that the considered natural number is prime, and
consider knowing a natural number to not be a prime as
previously knowing that and the considered natural number
not dividing this one.
To sieve is to loop on the sieve step, each time considering the next number.
Recursively export the verb to sieve.
Lesser natural language interpreters may require a more thorough quoting,
parenthesing, annotating and explicit typing than a greater one.
Variations on the sieve
Here is the things we could tell the computer so he can enhance the
implementation of the sieve:
Efficiency annotations
- Eliminate multiples only from the square of the prime on.
- Thus stop when the square exceeds the numbers to sieve.
- Use a well-known egregious identity to compute the square
together with the
- Statically or dynamically limit (and change) the precision
of integer computations as needed.
- Use an array to implement the variable sieving function
if a low enough bound is (locally) known to the sieve.
- Else use the list of previously known primes as a parameter
for the sieving function instead of dynamically
producing (pseudo- or real) code to test each new prime.
- Arbitrary dynamic combinations of the former methods may be used !
- Expand the first steps of the sieve (i.e. inline primes
up to 2,3,5, or 7, etc) to speed up the whole thing:
typical programs only inline the first step to gain a factor
two on space (when using arrays) and time;
another factor two may be gained by inlining the four first
steps instead;
for growing zones, this factor can be dynamically increased,
etc.
Input/Output annotations
- The out() (or "say") input procedure defines what to do with
a number known to be prime
- The function may be executed step-by-step, or prime-by-prime,
or continuously as needed.
- The sieve results may be buffered or not;
the buffer may be shared or not
between smaller or larger sets of programs and
running instanciations
of a same persistent interrupted program;
if needed results are not available, a sieve would of course
be re-run.
- A proof that the sieve actually yields the prime numbers shall
be included, exported, and this fact pointed to in the
standard list of algorithms, so that this algorithm will be
automatically used by whoever wants prime numbers without
explicitly programming.
To Do on this page
Add more annotations, particularly about generating all primes below N or
factorizing N.
Show how the real program would look like on our HLL.
Back to the list of HLL Examples.
Up to the HLL subproject.
Page Maintainer:
Faré
-- rideau@clipper.ens.fr