As said before, RSA algorithm rests on simple fact: multiplying two large numbers is easy, but factoring the result into original primes is incredibly difficult. Lets see how difficult it is. I prepared tool for that: https://github.com/o-lenczyk/rsa/blob/master/crack_rsa.py and did the measurements using my pc (11th Gen Core i7-11370H)

In RSA, the public key consists of a large number: n (the modulus) and an exponent – e. The private key – d – can only be calculated by someone who knows prime factors of n – p and q. By finding that values, we can calculate d and decrypt the message.

Let me present you couple of algorithms, from most simple implementations, to highly optimized, and compare the results.

Trial division (simple implementation)

Bit SizeTime
8bit0.0000 seconds
16bit0.0022 seconds
32bit2.57 minutes
64bit~21,000 years
128bit~3.8×1023 years (or about 28 trillion times the age of the universe)

Trial division is exponential time algorithm. It means it use most straightforward method, but the runtime grows explosively with complexity, making it impractical for large numbers. The strategy is essentially brute-force search.

Trial division is trying to divide number n by every odd integer. It is not quite practical, because there is huge gap between 32bit primes used, and 64bit

gmpy2 – trial division

Bit SizeTime
8bit0.0000 seconds
16bit0.0016 seconds
32bit5.50 minutes
64bit~45,000 years
128bit~8.3×1023 years (or about 60 trillion times the age of the universe)

gmpy2 is a python library that uses low-level c library. But for this particular usecase, repeatedly calling gmpy2.next_prime() did not help a lot. Overhead of calls was greater than using non-prime numbers for trial division. It turns out, that simply checking for odd numbers is better here.

Pollard’s Rho (custom implementation)

Bit SizeTime
8bit0.0000 seconds
16bit0.0005 seconds
32bit0.0457 seconds
64bit103 minutes
128bit~400,000 years

Pollard’s rho is another exponential time algorithm, but with significant improvement over trial division. Instead of a linear search, it takes a “random walk” through the numbers, and uses cycle-finding trick to discover a number. Because of randomness, it is stochastic algorithm, and there is element of luck involved in finding the solution.

SymPy factorint (trial division only)

Bit SizeTime
8bit0.0001 seconds
16bit0.0039 seconds
32bit2.47 minutes
34bit7.06 minutes
36bit34.05 minutes
38bit~2,5h
64bit~17,000years
128bit~320 quadrillion years (about 23 trillion times the current age of the universe)

Sympy is a python library for computer algebra. It has tools related to number theory. I have used factorint function for cracking RSA. By default, it automatically selecting best algorithm, but i wanted to check how each implementation behaves. It is slightly faster than most basic trial division implementation.

SymPy factorint (Pollard’s Rho only)

Bit SizeTime
8bit0.0001 seconds
16bit0.0031 seconds
32bit0.0730 seconds
34bit0.5716 seconds
36bit1.1293 seconds
38bit1.4727 seconds
40bit1.2474 seconds
64bit105.75 minutes
128bit~1.5 to 2.5 million years, on average.

Sympy Pollard’s rho results are almost identical as custom implementation

SymPy factorint (Pollard’s p-1 only)

Bit SizeTime
8bit0.0001 seconds
16bit0.0011 seconds
32bit0.0045 seconds
34bit0.2573 seconds
36bit0.4208 seconds
38bit0.2225 seconds
40bit0.4577 seconds
64bitwill not work
128bitwill not work

Pollard’s p-1 is a special purpose algorithm, that is not guarantee to work. If the number has a special property, then is broken into primes instantly. Otherwise, the algorithm will not work at all. I will not go into the much details, but it is using fermat’s little theorem, and “smoothness” concept. Key take is: p-1 and q-1 should not be “smooth” if we want to implement secure RSA

SymPy factorint (ECM only)

Bit SizeTime
8bit0.0001 seconds
16bitdid not work
32bitdid not work
34bitdid not work
36bitdid not work
38bitdid not work
40bitdid not work
64bit7.1949 seconds
128bit534.05 minutes
256bit~100 years

ECM (Elliptic Curve Method) is an enhancement of Pollard’s p-1 method, acting as a generalization that adds “another dimension” to the attack.

While the p-1 method is stuck with the single, fixed number p-1, ECM’s breakthrough is its ability to create “alternative universes.” Each new elliptic curve it tries generates a completely different mathematical group, which has its own unique size (or “order”). The algorithm then bets that one of these new orders will be “smooth”.

It is still stochastic algorithm, and sometimes fail, due to “bad luck”.

SymPy factorint (auto algorithm selection)

Bit SizeTime
8bit0.0001 seconds
16bit0.0095 seconds
32bit0.0108 seconds
34bit0.1922 seconds
36bit0.6174 seconds
38bit0.1154 seconds
40bit0.2449 seconds
64bit6.8013 seconds
128bit???

Finally, I let the sympy choose the best algorithm itself. By mixing different methods, the results are clearly the best so far.

SageMath (factorization using Sage)

Bit SizeTime
8bit0.6318 seconds
16bit0.0001 seconds
32bit0.0003 seconds
34bit0.0007 seconds
38bit0.0016 seconds
40bit0.0010 seconds
64bit0.0206 seconds
128bit2.60 minutes
256bit~250 years

SageMath is computer algebra system, with dozen of specialized libraries, easy to use with Python. It includes number theory tools like factorization, that uses highly optimized PARI/GP C library. The results are two orders of magnitude better than plain python.

cypari2 (Pari/GP)

Bit SizeTime
8bit0.0281 seconds
16bit0.0063 seconds
32bit0.0068 seconds
34bit0.0077 seconds
36bit0.0072 seconds
38bit0.0076 seconds
40bit0.0078 seconds
64bit0.0270 seconds
128bit2.59 minutes
256bit~250 years

cypari2 serves the similar purpose, but instead giving an access to broad collection of tools, it provides python interface for PARI/GP alone. Results are almost identical.

python-flint

Bit SizeTime
8bit0.0002 seconds
16bit0.0001 seconds
32bit0.0008 seconds
34bit0.0168 seconds
36bit0.0149 seconds
38bit0.0085 seconds
40bit0.0104 seconds
64bit0.0479 seconds
128bit2.80 minutes
256bit~250 years

Python library that gives an access to FLINT – Fast Library for Number Theory. It is a modern, highly optimized C library for advanced number theory calculations. It is a direct competitor to PARI/GP. Both libraries uses algorithm cascade similar to this:

  • Trial division
  • Pollard’s rho
  • Pollard’s p-1
  • ECM
  • Quadratic Sieve (QS)
  • Number Field Sieve (NFS)

Both QS and NFS are sub-exponential sieve methods. Far more complex algorithms that work by collecting thousands of mathematical relationships and then using linear algebra on massive matrices to find the factors. They are used only by most sophisticated tools.

QS – core concept is that if:

  • x2≡y2(modN))

then most likely:

  • gcd(x - y, N) will be a factor of N

Next step is finding x and y, where again, “smooth” concept is used. Finally, when enough relations is collected, it looks for numbers multiplied together forms a perfect square

NFS – the most powerful factorization algorithm known today. Unbeatable for numbers over 110 digits. Has enormous complexity. It is an abstract over QS. NFS searches for polynomial equation, which has special relationship with number N, we are searching. It is scoring each equation, and spends a lot of time to find best equation, before solving it.

YAFU (standalone tool)

Bit SizeTime
8bit0.0330 seconds
16bit0.0218 seconds
32bit0.0265 seconds
34bit0.0327 seconds
36bit0.0454 seconds
38bit0.0740 seconds
40bit0.1599 seconds
64bit0.2154 seconds
128bit56.2382 seconds
256bit~80-100 years

YAFU – yet another factoring utility. Free, high-performance standalone command line tool. One of most the fastest and most powerful tools for factorization. Created for single purpose, and was 3x faster than SageMath multi tool. It uses own algorithmic cascade and Self-Initializing Quadratic Sieve (SIQS) – more advanced version of QS. SIQS tests few routes first, before digging deeper. Self-Initializing means it is creating next polynomial quickly, instead doing it from scratch. Thanks to that, most time consuming part is done in productive range.

CADO-NFS (standalone tool)

Bit SizeTime
8bittoo small
16bittoo small
32bittoo small
64bittoo small
128bit4.50 minutes
130bit5.56 minutes
132bit8min
256bit~50 years

CADO-NFS is research grade tool for high performance, distributed computing and world record factorizations. It was slower than YAFU, because it is designed for much higher numbers, and more computing power. The overhead was too much for 128x128bit prime.

FactorDB (online database)

Bit SizeTime
8bit0.2378 seconds
16bit0.1419 seconds
32bittoo long
64bittoo long
128bittoo long

Online database that stores known factorizations. Instead of computing, we are asking for the result. Works only for really small numbers

Wolfram Alpha (online API)

Bit SizeTime
8bit2.0049 seconds
16bit1.7616 seconds
32bit1.6952 seconds
34bit1.4012 seconds
36bit1.8980 seconds
38bit1.8980 seconds
40bit1.7296 seconds
64bit1.7817 seconds
128bittoo long

Online engine, powered by Mathematica. Can compute much higher numbers than factordb, but still relatively small for RSA key. For 64x64bit number, it was faster than fastest python implementation.

summary

  • Right algorithm choice can lead to astronomical differences
  • C libraries are hundreds time faster than python implementations
  • huge complexity jump with doubling key size. That is why modern RSA keys are at least 2048 bit length.
  • some algorithms base on pure luck. they can solve the problem fast, but are unreliable
  • most powerful tools use cascade of tools to solve problem efficiently
  • together with small key size “smoothness” is critical key vulnerability

Leave a Reply

Your email address will not be published. Required fields are marked *

+