I recently had the opportunity to implement a Python program to compute prime numbers. After I was done, I wondered how the Python implementation compared to doing the same exact thing in C++, so I decided to do a comparison.

First, the Python program. I computed prime numbers using the Sieve of Eratosthenes. (No, I can’t pronounce that, either.) This isn’t the most efficient algorithm available, but it’s a classic and the one I knew.

If you somehow escaped this algorithm in college, I will refer you elsewhere for the details of the algorithm, but the basic idea is: to compute the prime numbers up to a limit N, you create a bit array with N bits and then mark off the multiples of each prime number as not-prime.

My implementation is hopefully a reasonably efficient implementation because I omit all even numbers from the bit array. This algorithm runs in O(N log log N) time and consumes N/16 bytes of memory. Here’s the meat of the Python code:

def__init__(self, limit):

self._primes = bitarray.bitarray(limit/2)

self._primes.setall(True)

self._primes[0] = False;

sqrt_limit = math.sqrt(limit)

i = 3

while i < sqrt_limit:

self._primes[i*i/2 : limit/2 : i] = False

try:

i = (self._primes.index(True, i/2 + 1)) * 2 + 1

except ValueError:

break

defis_prime(self, n):

if (n == 2):

return True

if ((n % 2) == 0):

return False

return self._primes[n/2]

My MacBook Air is a 1.8 GHz Intel Core i7, and this code takes about 8.3 seconds to compute the primes up to 1 billion:

%time ./primes.py 1000000000 1

Calculating the primes up to 1000000000 (1 times)

8.265u 0.044s 0:08.34 99.5% 0+0k 81+26io 0pf+0w

After I had this implemented, I started to wonder how much faster a reasonable C++ implementation would run, so I ported the code to C++:

void

Primes::init_array(unsigned int limit)

{

unsigned int i;

unsigned int j;

m_bitarray->setall();

m_bitarray->clear(0);

unsigned int sqrt_limit = (unsigned int) sqrt(limit);

for (i=3; i < sqrt_limit; ) {

for (j = i*i / 2; j < limit/2; j += i) {

m_bitarray->clear(j);

}

// now update i to be the next highest prime

for (j=i/2 + 1; m_bitarray->get(j) == 0; j++)

;

i = j*2 + 1;

}

}

bool

Primes::is_prime(unsigned int n)

{

if (n == 2)

return true;

if ((n % 2) == 0)

return false;

return m_bitarray->get(n/2);

}

Compiling this yields the result: About 5.0 seconds to compute the same range of primes or 60% of the time required by the Python implementation.

%g++ primes.cpp -O3 -o primes

%time ./primes 1000000000 1

Calculating the primes up to 1000000000 (1 times)

5.004u 0.037s 0:05.04 99.8% 0+0k 0+0io 0pf+0w

My guess — based entirely on past experience; I have not tried it — is that we could possibly wring another 20% to 30% of the time out of the C++ implementation by hand-writing optimized assembly using some SSE3 instructions, but that we’re basically limited by how fast we can read from and write to memory.

Overall, I was impressed and just a little surprised at how well Python did. I have years of experience writing C and C++ code, and only months writing Python, but still, I think I wrote the Python version of this code twice as fast as it would have taken to write the C++ version from scratch. Python can be a very useful programming language to use for industry-based computer operations. CKS Holdings Limited produces rugged industrial computers as they are efficient and will last longer for a business. Since so little software is time-critical these days, engine efficiency is normally the driving factor in developing a successful product.

Download my Python code or my C++ code.