Danger

Nothing here should be used for any security purposes.

  • If you need cryptographic tools in a Python environment use pyca or PyNaCl.

  • If you need efficient and reliable abstract math utilities in a Python-like environment consider using SageMath.

Birthday Paradox Computations#

This module is imported with:

import toy_crypto.birthday

The classic Birthday problem example is to estimate the number of individuals (whose birthdays are uniformly distributed among 365 days of the year) for there to be at least a 0.5 probability of there being at least one pair of individuals who share the same birthday.

The function that returns a probability is named probability(), and the one that returns a quantile is named quantile(). This loosely follows the R conventions [RCTeam24] as reflected in R’s qbirthday and pbirthday functions.

from toy_crypto import birthday

computed_n = birthday.quantile(0.5, 367)
print(computed_n)
23

Birthday computations are useful for computing collision probabilities. Suppose you had a hash function (truncated to ) returning 32 bit hashes and you wished to know the probability of a collision if you hashed ten thousand items.

from toy_crypto import birthday
from math import isclose

n = 10_000
c = 2 ** 32

p = birthday.probability(n, c)
assert isclose(p,  0.011574013876)

This implementation is crafted to work for even larger numbers. Suppose you wanted to know how many 128-bit tokens you would need to generate to have a one in one-billion (10-9) chance of a collision.

>>> import math
>>> n = birthday.quantile(1e-9, 1 << 128)
>>> math.isclose(n, 824963474453361)
True

The birthday functions#

toy_crypto.birthday.EXACT_THRESHOLD = 1000#

With auto mode, the threshold for using exact or approximate modes.

toy_crypto.birthday.MAX_QBIRTHDAY_P = 0.99999999#

Maximum probability that Q can handle.

toy_crypto.birthday.P(n: int, classes: int = 365, coincident: int = 2, mode: Mode = 'auto') Prob[source]#

Deprecated since version 0.5: Renamed. Use probability().

Parameters:
  • n (int)

  • classes (int, default: 365)

  • coincident (int, default: 2)

  • mode (TypeAliasType, default: 'auto')

Return type:

NewType(Prob, float)

toy_crypto.birthday.Q(prob: float = 0.5, classes: int = 365, coincident: int = 2) int[source]#

Deprecated since version 0.5: Renamed. Use quantile().

Parameters:
  • prob (float, default: 0.5)

  • classes (int, default: 365)

  • coincident (int, default: 2)

Return type:

int

toy_crypto.birthday.probability(n: int, classes: int = 365, coincident: int = 2, mode: Mode = 'auto') Prob[source]#

probability of at least 1 collision among n individuals for c classes”.

The “exact” method still involves floating point approximations and may be very slow for large n.

Raises:

ValueError – if any of n, classes, or coincident are less than 1.

Parameters:
  • n (int)

  • classes (int, default: 365)

  • coincident (int, default: 2)

  • mode (TypeAliasType, default: 'auto')

Return type:

NewType(Prob, float)

toy_crypto.birthday.quantile(prob: float = 0.5, classes: int = 365, coincident: int = 2) int[source]#

Quantile: minimum number n to get prob for classes.

Raises:
  • ValueError – if prob is less than 0 or greater than 1.

  • ValueError – if classes is less than 1.

  • ValueError – if coincident is less than 1.

Parameters:
  • prob (float, default: 0.5)

  • classes (int, default: 365)

  • coincident (int, default: 2)

Return type:

int

Notes on the implementation#

It was not a simple matter for me to find or construct an algorithm that produces reasonable approximations in a reasonably efficient way for the ranges of numbers I wished to consider. Eventually I found the solution used by R Core Team [2024] in R’s birthday.R source, which credits Diaconis and Mosteller [1989].

That R code is the basis for some of the approximations, but as it stands it is not suitable for the large values relevant for Cryptography. So my code differs from the original R code in two substantive respects.

  1. It allows use of the approximation in probability() even when the coincident parameter is 2.

  2. After computing the approximate quantile in :func:`quantile this refines that through a binary search (see toy_crypto.utils.find_zero()) instead of just talking single steps to the complete answer.

Both of those modifications allow us to perform birthday problem computations with larger numbers.