Prime Number Algorithm

Dijkstra [Dahl, Dijkstra, Hoare page 27-49] describes a prime number generating algorithm. On first reading it appears to be totally different to the Sieve of Erosthenes (SoE). However as I explain here it is just a more memory optimised version of SoE.

Sieve of Erosthenes

Let us start with the SoE. The problem is to compute the list of all prime numbers below a certain value say 100.  Write out all numbers sequentially as in

2, 3, 4, 5, 6, 7, 8, 9, 10…..100

Starting from 2 strike out all numbers that are a multiple of 2 and hence we have

2,3, ,5, ,7, ,9, , 11….

Next we use ‘3’ and strike out all numbers that are a multiple of 3 leaving us with

2,3,,5,,7,,,,11,…..

We then proceed with 5 and so on until we have reached the end of our list. This can be easily implemented on a computer as long the maximum number is small. The amount of memory required is proportional to the maximum.

Optimisation

The above solution does not scale. The memory requirements can be reduced significantly. To do that we could break the problem so that we consider, say, the first ten numbers then the next ten and so on.
Thus

[2,3,4,5,6,7,8,9,10]  -> {2,3,5,7}
{2,3,5,7} [11,12,13,14,...20] -> {2,3,5,7,11,13,17,19}
{2,3,5,7,11,13,17,19} [21,22,....30] -> {2,3,5,7,11,13,17,19, 23,29}

…and so on. Thus we consider only ten new numbers at every iteration. Hence the amount of memory required will be just what is required to store the prime numbers computed so far plus a small overhead, thus reducing significantly the memory requirements compared to a direct implementation of SoE.

Dijkstra’s algorithm

Now why do we need to proceed with ten candidates at a time? Why not  just use one? No particular reason… and that leads us to Dijkstra’s algorithm or that is  Dijkstra’s algorithm: candidate set contains only one element.

The following program in C++ is an implementation of the algorithm.

    int main()
    {
      using namespace std;
      vector<unsigned int> primes = { 2 };
      unsigned int k = 2;
      for (k = k + 1; k < MAX_VAL; k += 2)
      {
        bool isPrime = true;
        auto n = primes.size();
        for (auto i = 0; i < n; ++i)
        {
          if (k % primes[i] == 0)
          {
            isPrime = false;
            break;
          }
        }
        if (isPrime)
          primes.push_back(k);
      }
      for (auto k : primes)
      {
        printf("%d\n", k);
      }
    }

Notice that in Line 6, we increment k by 2, as we need to check only for odd numbers.

Other Optimisations

A few more optimisations are possible. If a number N is not prime then there exists at least one divisor X that is prime and less than or equal to \sqrt{N}.
Thus to test if a number N is prime we need to consider only all primes X <= \sqrt{N}. Now we don’t need to compute \sqrt{N}; all we need is to compute the square. For example consider a prime number say 7. All non-prime numbers less than 49 (or 7*7) are divisible by at least one prime number below 7. The next prime after 7 is 11. Hence all non-primes below 121 (11*11) are divisible by at at least one prime less than 11.

Thus we have


    int main()
    {
      using namespace std;
      vector<unsigned int> primes = { 2 };
      vector<unsigned int> q = primes;
      unsigned int index = 0;
      unsigned int square = primes[index] * primes[index];
      for (unsigned int k = 2 + 1; k < MAX_VAL; k += 2)
      {
        if (k >= square)
        {
          ++index;
          square = primes[index] * primes[index];
        }
        bool isPrime = true;
        for (unsigned int i = 0; i < index; ++i)
        {
          if (k % primes[i] == 0)
          {
            isPrime = false;
            break;
          }
       }
        if (isPrime)
        {
          primes.push_back(k);
        }
      }
      for (auto k : primes)
      {
        printf("%d\n", k);
      }
    }

One other optimisation that Dijkstra came up with is to avoid computing k % prime[i] ==0 as this involves a division. We want to avoid multiplication and division. This is done by maintaining a tracking vector q such that q[i] = n *prime[i] for some positive number n.

This modification is accomplished in the following code:


    int main()
    {
      using namespace std;
      vector<unsigned int> primes = { 2 };
      vector<unsigned int> q = primes;
      unsigned int index = 0;
      unsigned int square = primes[index] * primes[index];
      for (unsigned int k = 2 + 1; k < MAX_VAL; k += 2)
      {
        if (k >= square)
        {
          ++index;
          square = primes[index] * primes[index];
        }
        bool isPrime = true;
        for (unsigned int i = 0; i < index; ++i)
        {
          while (k > q[i])
            q[i] += primes[i];
          if (k == q[i])
          {
            isPrime = false;
            break;
          }
        }
        if (isPrime)
        {
          primes.push_back(k);
          q.push_back(k);
        }
      }
      for (auto k : primes)
      {
        printf("%d\n", k);
      }
    }

Notice that q[i] is intialised to prime[i] and it is modified only by adding prime[i] , as many times as required, ensuring that q[i] remains a multiple of prime[i] . At Line 20 we know that k <= q[i]. In addition we can state that q[i] is the smallest such value. Hence if k = q[i] it cannot be a prime. Thus we have replaced division by addition. Of course it does not make sense to replace n*q[i] with n additions of q[i]. Consider a case that for some value of k, k <= q[i]. Then in the next iteration when we consider k+1, it is likely that k+1 will either be <= q[i] or require just one more addition of prime[i]. The reason we use a while clause rather than an if clause at Line 18 is because we may break out of the loop if k is not prime and hence q[i] for larger values of i may not have the relationship k <= q[i].

What is the point of this exercise

While this algorithm is very elegant, you might wonder of what use it is. A good software engineer is after all a humble programmer and would rarely implement new algorithms, definitely not a prime number computer. The reason for this blog is to show how to use invariants. By enforcing q[i] = n *prime[i] at all times we are able to reason about code at other locations of the program. Thus using invariants in our code will improve code quality.

References

  1. O.-J. Dahl,E. W. Dijkstra,C. A. R. Hoare, “Structured Programming”, Academic Press, 1972, London
Advertisements

About The Sunday Programmer

Joe is an experienced C++/C# developer on Windows. Currently looking out for an opening in C/C++ on Windows or Linux.
This entry was posted in Algorithm, Software Engineering. Bookmark the permalink.

3 Responses to Prime Number Algorithm

  1. Pingback: Interesting Algorithms | Joe's Blog

  2. Pamala says:

    You post very interesting articles here. Your blog deserves much more traffic.

    It can go viral if you give it initial boost, i know useful
    tool that can help you, just search in google: svetsern traffic tips

  3. King says:

    You have got interesting content here. Your site can go viral,
    you need some initial traffic only. How to get initial traffic??
    Search google for: marihhu’s tips

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s