Danger

Nothing here should be used for any security purposes.

  • If you need cryptographic tools in a Python environment use pyca.

  • If you need efficient and reliable abstract math utilities in a Python-like environment consider using SageMath.

Sieve of Eratosthenes

This module is imported with:

import toy_crypto.sieve

The module contains classes for the factorization of numbers and for creating a sieve of Eratosthenes.

Why three separate implementations?

It is reasonable to wonder why I have three distinct implementations. There are reasons, but first let me give an overview of the major differences.

  • The BaSieve class

    This uses the bitarray package underlyingly. It is by far the most time and memory efficient of the implemenations here. The bitarray package is written in C, and as a consequence it cannot be installed in certain environments.

  • The SetSieve class

    This is a pure Python implementation that uses set when creating the sieve. This consumes a lot of memmory. For a brief time during initialization there will be a set with all integers from 2 through n. The set will rapidly have elements removed from it, but the end results will still contain all of the primes as Python integers.

  • The IntSieve class

    This is also a pure Python implementation that uses a Python integer as the sieve and uses bit twidling when creating the sieve. This makes it memory efficient, but it is excruciatingly slow.

Roughly speaking, the reasons for having three (so far) implementations is that I wanted to be able to run things in environments in which bitarray is not available. That led the to SetSeive. An off-hand comment I made in some discussion about how immutable int means that a seive implement with int would be slow was challanged. I was curious to see whether an int-based implementation would work, so that led to IntSieve.

It turns out that using native Python integers was enormously slower than I could have imagined.

Note

A thing that I really like about Python, particulary for playing with cryptographic notions, is that there is just one type of int. I don’t have to go to a big integeer library when numbers get too large.

My observation about ineffiencies of bit twidling in Python acknowleging a limitation that follows from a very reasonable design choice.

Graph showing that IntSieve creation time is really slow

Seconds to create sieves

Time it takes in seconds for sieves of various sizes from 100 to 100000 to be created by the different classes.

The very real time differences between the creating a BaSieve and a SetSieve is obscured in figure Seconds to create sieves by the enormous amount of time it takes to construct an IntSieve. So here is a graph showing the times just for BaSieve and a SetSieve.

Graph showing that Sieve is more efficient than SetSeive

Seconds to create sieves (without IntSieve).

Specifically on my system it took approximately 0.011 seconds to create a sieve of all numbers less than or equal to 1 million using the bitarray-based BaSieve, 0.198 seconds with SetSeive, and a minute (59.796 seconds) with IntSieve. So bitaray was nearly 20 times faster than the set-based sieve construction and more than 5000 times faster than the integer-based construction for a sieve size of one million.

The algorithm

The algoritm for creating the sieve is the same for all three classes but has fiddly differences due to how the seive is represented. This, pared down and modified so as to work all by itself, version from the SetSieve is probably the most readable.

from math import isqrt

def make_sieve(n: int) -> list[int]:

    # This is where the heavy memory consumption comes in.
    sieve = set(range(2, n + 1))

    # We only need to go up to and including the square root of n,
    # remove all non-primes above that square-root =< n.
    for p in range(2, isqrt(n) + 1):
        if p in sieve:
            # Because we are going through sieve in numeric order
            # we know that multiples of anything less than p have
            # already been removed, so p is prime.
            # Our job is to now remove multiples of p
            # higher up in the sieve.
            for m in range(p + p, n + 1, p):
                sieve.discard(m)

    return sorted(sieve)
>>> sieve100 = make_sieve(100)
>>> len(sieve100)
25
>>> sieve100[:5]
[2, 3, 5, 7, 11]
>>> sieve100[-5:]
[73, 79, 83, 89, 97]

The Sievish Protocol

protocol toy_crypto.sieve.Sievish[source]

Methods available for all Sieve-like classes.

This is primary of use for testing, where one might need to write functions that interact with any of the Sieve classes. This also would probably make more sense as an abstract class instead of a Protocol.

This protocol is runtime checkable.

Classes that implement this protocol must have the following methods / attributes:

__int__() int[source]

Sieve as an integer with 1 bits representing primes.

Most significant 1 bit represents the largest prime in the sieve. For example if s is a sieve of size 5, int(s) returns 44 which is equivalent to 0b101100.

Return type:

int

property count: int

The total number of primes in the sieve

classmethod from_int(n: int) S[source]

Returns a new sieve of primes from the bits of n.

Parameters:

n (int)

Return type:

TypeVar(S)

classmethod from_list(primes: list[int], size: int | None = None) S[source]

Returns a new sieve of primes from list.

If size is not specified it will be set to the largest value in primes

Raises:

ValueError – if primes is empty

Parameters:
Return type:

TypeVar(S)

classmethod from_size(size: int) S[source]

Returns a new sieve of primes less than or equal to size.

Parameters:

size (int)

Return type:

TypeVar(S)

property n: int

The size of the sieve, including composites.

The number n such that the sieve contains all primes <= n.

nth_prime(n: int) int[source]

Returns n-th prime.

Raises:
Parameters:

n (int)

Return type:

int

primes(start: int = 1) Iterator[int][source]

Iterator of primes starting at start-th prime.

The 1st prime is 2. There is no zeroth prime.

Raises:

ValueError – if start < 1

Parameters:

start (int, default: 1)

Return type:

Iterator[int]

The Sieve alias

class toy_crypto.sieve.Sieve

Sieve will be an alias for BaSieve if bitarray is available, otherwise it will be assigned to some other sieve class.

The concrete classes

The BaSieve class

class toy_crypto.sieve.BaSieve(data: bitarray, size: int | None) None[source]

Sieve of Eratosthenes.

The good parts of this implementation are lifted from the example provided with the bitarray package source.

This depends on bitarray package.

Parameters:
  • data (bitarray)

  • size (int | None)

Sieve from bitarray, treated as up to size.

Raises:

ValueError – if size > len(data)

Parameters:
  • data (bitarray)

  • size (int | None)

property count: int

The number of primes in the sieve.

classmethod from_int(n: int) Self[source]

Returns a new sieve of primes from the bits of n.

Parameters:

n (int)

Return type:

Self

classmethod from_list(primes: list[int], size: int | None = None) Self[source]

Returns a new sieve of primes from list.

If size is not specified it will be set to the largest value in primes

Raises:

ValueError – if primes is empty

Parameters:
Return type:

Self

classmethod from_size(size: int) Self[source]

Returns a new sieve of primes less than or equal to size.

Parameters:

size (int)

Return type:

Self

property n: int

The size of the sieve, including composites.

The number n such that the sieve contains all primes <= n.

nth_prime(n: int) int[source]

Returns n-th prime.

Raises:
Parameters:

n (int)

Return type:

int

primes(start: int = 1) Iterator[int][source]

Iterator of primes starting at start-th prime.

The 1st prime is 2. There is no zeroth prime.

Raises:

ValueError – if start < 1

Parameters:

start (int, default: 1)

Return type:

Iterator[int]

The SetSieve class

class toy_crypto.sieve.SetSieve(data: list[int], size: int | None = None) None[source]

Sieve of Eratosthenes using a native python set

This consumes an enormous amount of early in initialization, and a SetSieve object will contain a list of prime integers, so even after initialization is requires more memory than the the integer or bitarray sieves.

Parameters:

Returns sorted list primes n =< n

A pure Python (memory hogging) Sieve of Eratosthenes. This consumes lots of memory, and is here only to illustrate the algorithm.

Parameters:
property count: int

The total number of primes in the sieve

classmethod from_int(n: int) Self[source]

Returns a new sieve of primes from the bits of n.

Parameters:

n (int)

Return type:

Self

classmethod from_list(primes: list[int], size: int | None = None) Self[source]

Returns a new sieve of primes from list.

If size is not specified it will be set to the largest value in primes

Raises:

ValueError – if primes is empty

Parameters:
Return type:

Self

classmethod from_size(size: int) SetSieve[source]

Returns a new sieve of primes less than or equal to size.

Parameters:

size (int)

Return type:

SetSieve

property n: int

The size of the sieve, including composites.

The number n such that the sieve contains all primes <= n.

nth_prime(n: int) int[source]

Returns n-th prime.

Raises:
Parameters:

n (int)

Return type:

int

primes(start: int = 1) Iterator[int][source]

Iterator of primes starting at start-th prime.

The 1st prime is 2. There is no zeroth prime.

Raises:

ValueError – if start < 1

Parameters:

start (int, default: 1)

Return type:

Iterator[int]

The IntSieve class

class toy_crypto.sieve.IntSieve(data: int) None[source]

A pure Python (using a large int) Sieve of Eratosthenes.

Parameters:

data (int)

property count: int

The total number of primes in the sieve

classmethod from_int(n: int) Self[source]

Returns a new sieve of primes from the bits of n.

Parameters:

n (int)

Return type:

Self

classmethod from_list(primes: list[int], size: int | None = None) Self[source]

Returns a new sieve of primes from list.

If size is not specified it will be set to the largest value in primes

Raises:

ValueError – if primes is empty

Parameters:
Return type:

Self

classmethod from_size(size: int) IntSieve[source]

Returns a new sieve of primes less than or equal to size.

Parameters:

size (int)

Return type:

IntSieve

property n: int

The size of the sieve, including composites.

The number n such that the sieve contains all primes <= n.

nth_prime(n: int) int[source]

Returns n-th prime.

Raises:
Parameters:

n (int)

Return type:

int

primes(start: int = 1) Iterator[int][source]

Iterator of primes starting at start-th prime.

The 1st prime is 2. There is no zeroth prime.

Raises:

ValueError – if start < 1

Parameters:

start (int, default: 1)

Return type:

Iterator[int]