• Simulacra and Simulation

    rsa 105 – textbook rsa vs padding

    Before starting, let me introduce two terms used in the title:

    • Textbook RSA – the most basic implementation of RSA, used for educational purposes. It is easy to understand, but highly insecure. One of the vulnerabilities of textbook RSA lies in its deterministic nature: encrypting the same message with the same public key always gives the same output. This characteristic can be exploited.
    • Padding – a crucial security measure that makes textbook RSA much more resilient. It involves adding random data to a message before encryption. It removes the deterministic nature of textbook RSA, making it significantly more secure.

    There are multiple attacks that exploit the lack of padding in textbook RSA:

    • LOW EXPONENT ATTACK
      To make it work, the attack needs 2 things:
      – small e
      – short message
      The condition for the attack to work is `m < n^(1/e)`, where n is the modulus and e the public exponent; e.g. with a 512-bit key and a tiny e=3, the maximum message length is 21 ASCII characters. There are two countermeasures: using a reasonably large exponent, and enforcing a minimum message length. Padding to a minimum length prevents messages that are too short (see the sketch after this list).
    • SHORT MESSAGE ATTACK
      Requirements for successful attack:
      – short message
      – no padding
      The idea behind this attack is simple. Knowing that the encrypted message is short, we encrypt every possible message – brute force – and compare the resulting ciphertexts. If we have some assumptions about the message, it is possible to create a dictionary and test the most probable options first. Simple padding (like PKCS#1 v1.5) adds random bytes to the message before encryption. It makes the output nondeterministic, and resilient to this brute-force attack.
    • HOMOMORPHIC ATTACK. It requires two encrypted, unpadded messages. This attack is interesting because it does not reveal the message, but lets the attacker manipulate data. Knowing encrypted(300) and encrypted(500), an attacker can construct encrypted(300*500) by calculating encrypted(300) * encrypted(500). Adding padding removes this property (demonstrated in the sketch below).
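
    Below is a minimal Python sketch of the first and third attacks, using a hypothetical toy modulus (an illustration, not the linked script):

    def icbrt(n: int) -> int:
        """Integer cube root via binary search."""
        lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
        while lo < hi:
            mid = (lo + hi) // 2
            if mid ** 3 < n:
                lo = mid + 1
            else:
                hi = mid
        return lo

    n = (1 << 120) + 451                 # hypothetical toy modulus
    e = 3
    m = int.from_bytes(b"hi", "big")     # short message, so m**3 < n

    # Low exponent attack: m**3 never wraps around n, so c is just m**3
    c = pow(m, e, n)
    assert icbrt(c) == m                 # recover m with a plain cube root

    # Homomorphic attack: E(a) * E(b) == E(a*b) (mod n), no private key needed
    a, b = 300, 500
    assert (pow(a, e, n) * pow(b, e, n)) % n == pow(a * b, e, n)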

    Here you can find a script exploiting the no-padding vulnerability: https://github.com/o-lenczyk/rsa/blob/master/no_padding_attacks.py

    sample output:

    ======================================================================
        SUMMARY TABLE: RSA Textbook Vulnerabilities Across Key Sizes
    ======================================================================
    Key Size    RSA Bits    Encrypt     Low Exp     Homom       Brute
    (primes)    (modulus)   Time        Attack      Attack      Attack
    ----------------------------------------------------------------------
    512         1024        0.000002    0.000045    0.000087    0.000107
    1024        2048        0.000003    0.000030    0.000066    0.000093
    2048        4096        0.000002    0.000036    0.000018    0.000078
    ----------------------------------------------------------------------
    TOTALS                  0.000007    0.000111    0.000171    0.000278
    ======================================================================

    Regardless of the key size, lack of padding is a critical textbook-RSA vulnerability. In all test cases, breaking the encryption was effectively instant.


    rsa 104 – carefully preparing weakest private key ever

    In the previous post I was testing and comparing different factorization algorithms.

    Based on my observations, the two most critical factors determining the time required to crack a private key are:

    • size (bit length)
    • the mathematical properties of its prime factors.

    bit length

    The most straightforward factor. Most of the known algorithms have exponential complexity, so factoring a 2048-bit number is not twice as hard as a 1024-bit number – it is trillions of times harder. The current standard is 4096 bits, because factoring it is completely infeasible with current technology.

    prime factor properties – hidden weakness

    Pollard’s p-1 algorithm introduces the “smoothness” concept. A number is “smooth” if all of its prime factors are small. For example:

    • 3,024 seems reasonably large. But its prime factorization is 2^4 × 3^3 × 7. Since its largest prime factor is just 7, this is a 7-smooth number.
    • 3,026 is very close to 3,024, but its prime factorization is 2 × 17 × 89. Because its largest prime factor (89) is much bigger, it is far less smooth.

    Pollard’s p-1 works incredibly fast if p-1 or q-1 is smooth.
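
    Here is a minimal sketch of the idea (a toy implementation, not the repository code). It computes a = 2^(j!) mod n for growing j; once j! is a multiple of p-1, Fermat’s little theorem forces a ≡ 1 (mod p), and gcd(a-1, n) reveals p:

    from math import gcd

    def pollard_p_minus_1(n: int, bound: int = 100_000) -> int | None:
        """Return a nontrivial factor of n, or None if no factor p
        with a sufficiently smooth p-1 is found below the bound."""
        a = 2
        for j in range(2, bound):
            a = pow(a, j, n)        # now a = 2^(j!) mod n
            g = gcd(a - 1, n)
            if 1 < g < n:
                return g            # p-1 divides j!, so p divides a-1
        return None

    # Toy example: 61 - 1 = 60 = 2^2 * 3 * 5 is 5-smooth
    print(pollard_p_minus_1(61 * 107, bound=10))   # -> 61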

    https://github.com/o-lenczyk/rsa/blob/master/rsa_from_scratch.py#L60

    Above you can find a function that generates smooth primes, which can be used to create an extremely insecure RSA key pair.

    core logic

    1. Dynamic smoothness bound calculation:
      Instead of a fixed bound, the function calculates an initial smoothness bound based on the required prime length. This ensures that for larger primes there is a larger pool of small primes to work with. Initially I tried a fixed, small bound, which led to extremely long generation times.
    2. Generating small primes:
      A helper function generates all prime numbers up to the current smoothness_bound. These primes are the building blocks of p-1.
    3. Constructing p-1:
      The function multiplies small primes until p-1 is close to the desired length of p.
    4. Primality test of (p-1)+1:
      Initially this was done with sympy, but a C library is many times faster.
    5. Smoothness check:
      Calculating the maximum prime power of p-1 (e.g. 2^8, 3^5). Initially I was checking only the maximum prime factor, but this was insufficient: if p-1 contained a prime power like 17^6 = 24,137,569 while the bound was set to 200,000, the attack would fail, or take significantly more time.
    6. max_prime_power threshold:
      The function rejects primes whose maximum prime power exceeds a predefined threshold – currently 20,000,000. Initially it was set to 200,000, but finding such a prime took a long time. In general, the harder we look for vulnerable primes, the faster Pollard’s p-1 will crack the resulting key.
    7. Iterative bound increase:
      If no prime is found within 10,000 attempts, the current smoothness bound is doubled (a condensed sketch of this loop follows below).
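
    A condensed sketch of this generation loop (simplified and hypothetical, not the repository’s exact function):

    import random
    from sympy import isprime, primerange

    def generate_smooth_prime(bits: int, smoothness_bound: int = 10_000) -> int:
        """Find a prime p such that p-1 is a product of small primes."""
        small_primes = list(primerange(3, smoothness_bound))
        while True:
            p_minus_1 = 2                            # keep p-1 even
            while p_minus_1.bit_length() < bits - 1:
                p_minus_1 *= random.choice(small_primes)
            candidate = p_minus_1 + 1
            if candidate.bit_length() == bits and isprime(candidate):
                return candidate

    print(generate_smooth_prime(128))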

    vulnerable key

    Finding such a weak example is not an easy task, but after tuning a few parameters I was able to generate a super-smooth 4096-bit key:
    https://github.com/o-lenczyk/rsa/blob/master/smooth-keys.txt

    Cracking it took just 8 seconds. Cracking almost any other 4096-bit key is not feasible.

    A unique characteristic of this p-1 is its maximum prime factor, 32,309, which is also its maximum prime power. That is an extremely small factor for a number of about 2.39×10^599 – two hundred thirty-nine octononagintacentillion, or roughly 10^519 times the number of atoms in the observable universe. Interestingly, the computation took less than an hour to finish. Below is the full list of factors of the vulnerable p-1:

    Factors of p-1: {2: 1, 47: 1, 269: 1, 383: 1, 491: 1, 709: 1, 967: 1, 1277: 1, 1279: 1, 1381: 1, 1657: 1, 1667: 1, 2377: 1, 2887: 1, 2957: 1, 2969: 1, 3389: 1, 3793: 1, 4201: 1, 4349: 1, 4357: 1, 4517: 1, 4919: 1, 5407: 1, 5563: 1, 5573: 1, 5647: 1, 5867: 1, 6301: 1, 6577: 1, 6637: 1, 6733: 1, 6883: 1, 6959: 1, 7027: 1, 7459: 1, 7537: 1, 7879: 1, 7949: 1, 8179: 1, 8209: 1, 8297: 1, 8423: 1, 8627: 1, 8693: 1, 8699: 1, 8731: 1, 8821: 1, 8963: 1, 9311: 1, 9467: 1, 9623: 1, 9923: 1, 10061: 1, 10987: 1, 11093: 1, 11161: 1, 11621: 1, 12041: 1, 12253: 1, 12511: 1, 12613: 1, 12899: 1, 13109: 1, 13291: 1, 13469: 1, 13591: 1, 13711: 1, 14173: 1, 14431: 1, 14533: 1, 14561: 1, 14731: 1, 14747: 1, 14869: 1, 14951: 1, 15013: 1, 15139: 1, 15277: 1, 15559: 1, 15773: 1, 15907: 1, 15919: 1, 15959: 1, 16333: 1, 16921: 1, 16927: 1, 17033: 1, 17321: 1, 17327: 1, 17341: 1, 17683: 1, 17827: 1, 17957: 1, 18131: 1, 18223: 1, 18587: 1, 19267: 1, 19429: 1, 19463: 1, 19751: 1, 19867: 1, 19973: 1, 20123: 1, 20327: 1, 20389: 1, 20483: 1, 21397: 1, 21401: 1, 21529: 1, 21559: 1, 21839: 1, 22037: 1, 22073: 1, 22109: 1, 22147: 1, 22271: 1, 22409: 1, 22469: 1, 22619: 1, 22811: 1, 23011: 1, 23173: 1, 23677: 1, 25733: 1, 25799: 1, 26399: 1, 26731: 1, 27061: 1, 27673: 1, 27773: 1, 27917: 1, 28051: 1, 28297: 1, 28447: 1, 29131: 1, 29179: 1, 29201: 1, 29387: 1, 29483: 1, 29573: 1, 29587: 1, 29663: 1, 29983: 1, 30161: 1, 30203: 1, 30661: 1, 30949: 1, 31547: 1, 31667: 1, 31751: 1, 32261: 1, 32309: 1}

    what to do

    The easiest way to mitigate the risk is to generate a “safe” prime:

    • a prime p where (p-1)/2 is also prime. This means p-1 has a large prime factor, so it is not “smooth”. This structure completely neutralizes the vulnerability exploited by algorithms like Pollard’s p-1 (see the sketch below).
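
    A minimal sketch of safe prime generation by rejection sampling (far too slow for real key sizes, where dedicated library routines should be used):

    from sympy import isprime, randprime

    def generate_safe_prime(bits: int) -> int:
        while True:
            q = randprime(2**(bits - 2), 2**(bits - 1))   # q has bits-1 bits
            p = 2 * q + 1
            if isprime(p):
                return p        # p-1 = 2q contains the huge prime factor q

    print(generate_safe_prime(128))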


    rsa 103 – cracking

    As said before, the RSA algorithm rests on a simple fact: multiplying two large numbers is easy, but factoring the result into the original primes is incredibly difficult. Let’s see how difficult it is. I prepared a tool for that: https://github.com/o-lenczyk/rsa/blob/master/crack_rsa.py and did the measurements on my PC (11th Gen Core i7-11370H).

    In RSA, the public key consists of a large number n (the modulus) and an exponent e. The private key d can only be calculated by someone who knows the prime factors of n: p and q. By finding those values, we can calculate d and decrypt the message.

    Let me present a couple of algorithms, from the most simple implementations to highly optimized ones, and compare the results.

    Trial division (simple implementation)

    Bit Size    Time
    8 bit       0.0000 seconds
    16 bit      0.0022 seconds
    32 bit      2.57 minutes
    64 bit      ~21,000 years
    128 bit     ~3.8×10^23 years (about 28 trillion times the age of the universe)

    Trial division is an exponential-time algorithm: the most straightforward method, but its runtime grows explosively with input size, making it impractical for large numbers. The strategy is essentially a brute-force search.

    Trial division simply tries to divide n by every odd integer. It is nowhere near practical for real keys – note the huge gap between the 32-bit and 64-bit results.
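
    A minimal sketch of the approach (illustrative, not the benchmarked code):

    from math import isqrt

    def trial_division(n: int) -> int:
        """Return the smallest prime factor of n."""
        if n % 2 == 0:
            return 2
        for d in range(3, isqrt(n) + 1, 2):
            if n % d == 0:
                return d
        return n                # n itself is prime

    print(trial_division(65521 * 65537))   # -> 65521 (two 16-bit primes)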

    gmpy2 – trial division

    Bit Size    Time
    8 bit       0.0000 seconds
    16 bit      0.0016 seconds
    32 bit      5.50 minutes
    64 bit      ~45,000 years
    128 bit     ~8.3×10^23 years (about 60 trillion times the age of the universe)

    gmpy2 is a Python library that wraps the low-level GMP C library. But for this particular use case, repeatedly calling gmpy2.next_prime() did not help much: the overhead of the calls was greater than the cost of trying non-prime divisors. It turns out that simply checking odd numbers is better here.

    Pollard’s Rho (custom implementation)

    Bit Size    Time
    8 bit       0.0000 seconds
    16 bit      0.0005 seconds
    32 bit      0.0457 seconds
    64 bit      103 minutes
    128 bit     ~400,000 years

    Pollard’s rho is another exponential-time algorithm, but a significant improvement over trial division. Instead of a linear search, it takes a “random walk” through the numbers and uses a cycle-finding trick to discover a factor. Because of the randomness it is a stochastic algorithm, and there is an element of luck involved in finding the solution.
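
    A minimal sketch using Floyd’s cycle detection (illustrative; the benchmarked implementation may differ in details):

    import random
    from math import gcd

    def pollard_rho(n: int) -> int:
        """Return a nontrivial factor of an odd composite n."""
        while True:
            c = random.randrange(1, n)
            f = lambda x: (x * x + c) % n   # pseudo-random walk
            x = y = 2
            d = 1
            while d == 1:
                x = f(x)                    # tortoise: one step
                y = f(f(y))                 # hare: two steps
                d = gcd(abs(x - y), n)
            if d != n:                      # bad luck: retry with a new c
                return d

    print(pollard_rho(65521 * 65537))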

    SymPy factorint (trial division only)

    Bit Size    Time
    8 bit       0.0001 seconds
    16 bit      0.0039 seconds
    32 bit      2.47 minutes
    34 bit      7.06 minutes
    36 bit      34.05 minutes
    38 bit      ~2.5 hours
    64 bit      ~17,000 years
    128 bit     ~320 quadrillion years (about 23 trillion times the current age of the universe)

    SymPy is a Python library for computer algebra that includes number theory tools. I used the factorint function for cracking RSA. By default it automatically selects the best algorithm, but I wanted to check how each implementation behaves. It is slightly faster than the most basic trial division implementation.
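
    Restricting factorint to a single method looks like this (the use_* flags are documented in SymPy; use_ecm requires a recent release):

    from sympy import factorint

    n = 999983 * 1000003    # two 20-bit primes
    print(factorint(n, use_rho=False, use_pm1=False, use_ecm=False))    # trial division only
    print(factorint(n, use_trial=False, use_pm1=False, use_ecm=False))  # Pollard's rho only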

    SymPy factorint (Pollard’s Rho only)

    Bit Size    Time
    8 bit       0.0001 seconds
    16 bit      0.0031 seconds
    32 bit      0.0730 seconds
    34 bit      0.5716 seconds
    36 bit      1.1293 seconds
    38 bit      1.4727 seconds
    40 bit      1.2474 seconds
    64 bit      105.75 minutes
    128 bit     ~1.5 to 2.5 million years, on average

    SymPy’s Pollard’s rho results are almost identical to the custom implementation.

    SymPy factorint (Pollard’s p-1 only)

    Bit Size    Time
    8 bit       0.0001 seconds
    16 bit      0.0011 seconds
    32 bit      0.0045 seconds
    34 bit      0.2573 seconds
    36 bit      0.4208 seconds
    38 bit      0.2225 seconds
    40 bit      0.4577 seconds
    64 bit      will not work
    128 bit     will not work

    Pollard’s p-1 is a special-purpose algorithm that is not guaranteed to work. If the number has a special property, it is broken into primes instantly; otherwise, the algorithm will not work at all. I will not go into much detail, but it uses Fermat’s little theorem and the “smoothness” concept. The key takeaway: p-1 and q-1 should not be “smooth” if we want to implement secure RSA.

    SymPy factorint (ECM only)

    Bit Size    Time
    8 bit       0.0001 seconds
    16 bit      did not work
    32 bit      did not work
    34 bit      did not work
    36 bit      did not work
    38 bit      did not work
    40 bit      did not work
    64 bit      7.1949 seconds
    128 bit     534.05 minutes
    256 bit     ~100 years

    ECM (Elliptic Curve Method) is an enhancement of Pollard’s p-1 method, acting as a generalization that adds “another dimension” to the attack.

    While the p-1 method is stuck with the single, fixed number p-1, ECM’s breakthrough is its ability to create “alternative universes.” Each new elliptic curve it tries generates a completely different mathematical group, which has its own unique size (or “order”). The algorithm then bets that one of these new orders will be “smooth”.

    It is still a stochastic algorithm, and it sometimes fails due to “bad luck”.

    SymPy factorint (auto algorithm selection)

    Bit Size    Time
    8 bit       0.0001 seconds
    16 bit      0.0095 seconds
    32 bit      0.0108 seconds
    34 bit      0.1922 seconds
    36 bit      0.6174 seconds
    38 bit      0.1154 seconds
    40 bit      0.2449 seconds
    64 bit      6.8013 seconds
    128 bit     ???

    Finally, I let SymPy choose the best algorithm itself. By mixing different methods, the results are clearly the best so far.

    SageMath (factorization using Sage)

    Bit Size    Time
    8 bit       0.6318 seconds
    16 bit      0.0001 seconds
    32 bit      0.0003 seconds
    34 bit      0.0007 seconds
    38 bit      0.0016 seconds
    40 bit      0.0010 seconds
    64 bit      0.0206 seconds
    128 bit     2.60 minutes
    256 bit     ~250 years

    SageMath is a computer algebra system with dozens of specialized libraries, easy to use from Python. It includes number theory tools like factorization, which uses the highly optimized PARI/GP C library. The results are two orders of magnitude better than plain Python.

    cypari2 (Pari/GP)

    Bit Size    Time
    8 bit       0.0281 seconds
    16 bit      0.0063 seconds
    32 bit      0.0068 seconds
    34 bit      0.0077 seconds
    36 bit      0.0072 seconds
    38 bit      0.0076 seconds
    40 bit      0.0078 seconds
    64 bit      0.0270 seconds
    128 bit     2.59 minutes
    256 bit     ~250 years

    cypari2 serves a similar purpose, but instead of giving access to a broad collection of tools, it provides a Python interface to PARI/GP alone. The results are almost identical.

    python-flint

    Bit Size    Time
    8 bit       0.0002 seconds
    16 bit      0.0001 seconds
    32 bit      0.0008 seconds
    34 bit      0.0168 seconds
    36 bit      0.0149 seconds
    38 bit      0.0085 seconds
    40 bit      0.0104 seconds
    64 bit      0.0479 seconds
    128 bit     2.80 minutes
    256 bit     ~250 years

    python-flint is a Python library that gives access to FLINT – the Fast Library for Number Theory, a modern, highly optimized C library for advanced number theory calculations. It is a direct competitor to PARI/GP. Both libraries use an algorithm cascade similar to this:

    • Trial division
    • Pollard’s rho
    • Pollard’s p-1
    • ECM
    • Quadratic Sieve (QS)
    • Number Field Sieve (NFS)

    Both QS and NFS are sub-exponential sieve methods: far more complex algorithms that work by collecting thousands of mathematical relations and then using linear algebra on massive matrices to find the factors. They are used only by the most sophisticated tools.

    QS – the core concept is that if:

    • x^2 ≡ y^2 (mod N)

    then most likely:

    • gcd(x - y, N) will be a nontrivial factor of N

    The next step is finding such x and y, where again the “smooth” concept is used. Finally, when enough relations are collected, it looks for a subset of numbers that, multiplied together, forms a perfect square.
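
    A tiny worked demonstration of the congruence-of-squares step (toy numbers, no actual sieving):

    from math import gcd

    N = 77
    x, y = 9, 2        # 9^2 = 81 ≡ 4 = 2^2 (mod 77), and 9 ≢ ±2 (mod 77)
    assert (x * x - y * y) % N == 0
    print(gcd(x - y, N), gcd(x + y, N))    # -> 7 11, the factors of 77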

    NFS – the most powerful factorization algorithm known today, unbeatable for numbers over about 110 digits, and enormously complex. It is an abstraction over QS: NFS searches for a polynomial that has a special relationship with the number N being factored, scores each candidate, and spends a lot of time finding the best polynomial before solving it.

    YAFU (standalone tool)

    Bit Size    Time
    8 bit       0.0330 seconds
    16 bit      0.0218 seconds
    32 bit      0.0265 seconds
    34 bit      0.0327 seconds
    36 bit      0.0454 seconds
    38 bit      0.0740 seconds
    40 bit      0.1599 seconds
    64 bit      0.2154 seconds
    128 bit     56.2382 seconds
    256 bit     ~80-100 years

    YAFU – Yet Another Factoring Utility – is a free, high-performance, standalone command-line tool, and one of the fastest and most powerful tools for factorization. Built for this single purpose, it was 3x faster than the SageMath multi-tool. It uses its own algorithmic cascade and the Self-Initializing Quadratic Sieve (SIQS), a more advanced version of QS. SIQS tests a few routes first before digging deeper, and “self-initializing” means it creates the next polynomial quickly instead of building it from scratch. Thanks to that, the most time-consuming work is spent in a productive range.

    CADO-NFS (standalone tool)

    Bit Size    Time
    8 bit       too small
    16 bit      too small
    32 bit      too small
    64 bit      too small
    128 bit     4.50 minutes
    130 bit     5.56 minutes
    132 bit     ~8 minutes
    256 bit     ~50 years

    CADO-NFS is a research-grade tool for high-performance, distributed computing and world-record factorizations. It was slower than YAFU because it is designed for much larger numbers and far more computing power; the overhead was too much for numbers this small.

    FactorDB (online database)

    Bit Size    Time
    8 bit       0.2378 seconds
    16 bit      0.1419 seconds
    32 bit      too long
    64 bit      too long
    128 bit     too long

    FactorDB is an online database that stores known factorizations. Instead of computing, we simply ask for the result. It works only for really small numbers.

    Wolfram Alpha (online API)

    Bit Size    Time
    8 bit       2.0049 seconds
    16 bit      1.7616 seconds
    32 bit      1.6952 seconds
    34 bit      1.4012 seconds
    36 bit      1.8980 seconds
    38 bit      1.8980 seconds
    40 bit      1.7296 seconds
    64 bit      1.7817 seconds
    128 bit     too long

    An online engine powered by Mathematica. It can handle much larger numbers than FactorDB, but still small ones relative to an RSA key. For the 64-bit case, it was faster than the fastest Python implementation.

    summary

    • The right algorithm choice can lead to astronomical differences
    • C libraries are hundreds of times faster than pure Python implementations
    • There is a huge complexity jump with each doubling of key size – that is why modern RSA keys are at least 2048 bits long
    • Some algorithms rely on pure luck: they can solve the problem fast, but are unreliable
    • The most powerful tools use a cascade of algorithms to solve the problem efficiently
    • Together with small key size, “smoothness” is a critical key vulnerability


    rsa 102 – implementation

    In a previous post, I introduced the RSA algorithm and outlined its importance. Here, I will present a naive Python implementation. Later, I will highlight its shortcomings and address the flaws.

    The code can be found here: https://github.com/o-lenczyk/rsa

    Let’s explain two helper functions:

    def gcd(candidate_exponent, totient_phi):
        while totient_phi != 0:
            candidate_exponent, totient_phi = totient_phi, candidate_exponent % totient_phi
        return candidate_exponent

    This function computes greatest common divisor. Example:

    If you call gcd(14, 8), the steps are:

    • (14, 8) → (8, 14 % 8 = 6)
    • (8, 6) → (6, 8 % 6 = 2)
    • (6, 2) → (2, 6 % 2 = 0)
    • (2, 0) → returns 2

    So, the GCD of 14 and 8 is 2.

    def mod_inverse(encryption_exponent_e, totient_phi):
        m0, x0, x1 = totient_phi, 0, 1
        if totient_phi == 1:
            return 0
        while encryption_exponent_e > 1:
            q = encryption_exponent_e // totient_phi
            totient_phi, encryption_exponent_e = encryption_exponent_e % totient_phi, totient_phi
            x0, x1 = x1 - q * x0, x0
        if x1 < 0:
            x1 += m0
        return x1

    This function finds a number that, when multiplied by encryption_exponent_e, leaves a remainder of 1 when divided by totient_phi – the modular inverse. RSA needs it to calculate the private key.

    • Save the original value of totient_phi (call it m0).
    • Set up two numbers, x0 and x1, to help keep track of the calculation.
    • If totient_phi is 1, return 0 (no inverse exists).
    • Repeat these steps while encryption_exponent_e is greater than 1:
      • Divide encryption_exponent_e by totient_phi and get the whole number part (call it q).
      • Update encryption_exponent_e and totient_phi to the next pair of numbers in the process (like in the Euclidean algorithm).
      • Update x0 and x1 to keep track of the calculation.
    • If the answer (x1) is negative, add m0 to make it positive.
    • Return x1 — this is the modular inverse.
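
    For example, mod_inverse(3, 40) returns 27, since 3 × 27 = 81 ≡ 1 (mod 40).
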
    import random

    def generate_keypair(p, q):
        if p == q:
            raise ValueError('p and q cannot be equal.')
        
        n = p * q
        phi = (p-1) * (q-1)
        
        # Choose an integer e such that e and phi(n) are coprime
        e = random.randrange(1, phi)
        
        # Use Euclid's Algorithm to verify that e and phi(n) are coprime
        g = gcd(e, phi)
        while g != 1:
            e = random.randrange(1, phi)
            g = gcd(e, phi)
            
        # Use Extended Euclidean Algorithm to generate the private key
        d = mod_inverse(e, phi)
        
        # Return public and private keypair
        # Public key is (e, n) and private key is (d, n)
        return ((e, n), (d, n))

    With the two helper functions above, generating a keypair is fairly straightforward.

    The last two functions are encrypt and decrypt:

    def encrypt(public_key, plaintext):
        e, n = public_key
        cipher = [pow(ord(char), e, n) for char in plaintext]
        return cipher

    What is happening here:

    • ord(char) converts a character (like 'A') into its ASCII numeric value (e.g., 65)
    • pow(ord(char), e, n) computes (m ^ e) mod n – RSA encryption formula
    • [ … for char in plaintext] does this for every character in the plaintext, producing a list of encrypted numbers
    def decrypt(private_key, ciphertext):
        d, n = private_key
        plain = [chr(pow(char, d, n)) for char in ciphertext]
        return ''.join(plain)

    Last but not least:

    • for each number, compute (char ^ d) mod n – the RSA decryption formula
    • chr(…) – Converts the resulting number back to a character using its ASCII code
    • return ''.join(plain) – joins the list of characters into a single string

    The rest is a matter of generating a key pair and using the encrypt and decrypt functions.
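
    A quick usage sketch of the above functions (toy primes, far too small for real use):

    public, private = generate_keypair(61, 53)
    cipher = encrypt(public, "HELLO")
    print(decrypt(private, cipher))    # -> HELLO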


    RSA 101 – introduction

    RSA stands for Rivest-Shamir-Adleman, who first described the algorithm in 1977. It is used in public-key cryptography and is essential for modern internet protocols. Key applications are:

    • TLS/SSL. Whenever you visit a secure website, RSA can be used to exchange the symmetric session key
    • Digital signatures. RSA provides a way to verify the authenticity and integrity of a message: the sender uses the private key to “sign” a document, and the recipient uses the public key to verify the signature
    • Email security. Protocols like PGP use RSA to encrypt emails
    • VPN. RSA is used during the setup of a secure VPN connection to authenticate the parties
    • Cryptocurrency and blockchain. RSA can be used in key management and signing, but in most cases Elliptic Curve Cryptography is used instead

    Public key cryptography

    RSA is an asymmetric encryption algorithm. This means it uses a pair of keys:

    • public key: can be shared with anyone
    • private key: must be kept secret

    Public key is used for encryption, private key is used for decryption.

    why rsa is so important

    RSA solves a very important problem: how do you exchange encryption keys before establishing a secure connection? This was historically a logistical challenge. RSA bypasses the issue by never sharing a secret key; instead, only an encryption (public) key is shared.

    mathematical asymmetry

    The key concept that makes RSA so resilient is the asymmetry of computational difficulty between two mathematical operations: multiplication and prime factorization.

    • easy direction: multiplication. It is simple even for huge numbers – a straightforward process that takes a computer milliseconds.
    • hard direction: factorization. Recovering the prime factors of a large number is computationally intensive. It can take thousands of years to complete, even for supercomputers.

    generating Public key

    Public key is a tuple (e,n).

    • e – encryption exponent
    • n – RSA modulus
    • p – relatively big prime number
    • q – relatively big prime number
    n = p*q

    To calculate e, we must first find the Euler totient (φ):

    φ = (p-1)(q-1)

    and e must be coprime to that number:

    • no common factors other than 1; in other words, the greatest common divisor of e and (p-1)(q-1) is 1

    A common choice for e is 2^16+1 = 65537. It is a prime number, and its binary representation 10000000000000001 (only two 1 bits) makes exponentiation computationally efficient. It is also large enough to avoid the security risks associated with small exponents.

    example of public key

    Let’s calculate a simple public key:

    p=5
    q=11
    n=p*q=55
    φ = (p-1)(q-1) = 4*10 = 40
    e = 3
    (e,n) = (3,55)

    Our public key is (3,55)

    encrypting the message

    To encrypt the message, the sender has to calculate the number c:

    c ≡ m^e (mod n)
    • m: original message
    • e: exponent, taken from public key
    • n: modulus, taken from public key

    Assume that we want to send the number 7 (for any reason). In practice, the ASCII table can be used to convert letters to numbers first.

    m = 7
    c ≡ 7^3 (mod 55) ≡ 343 (mod 55) ≡ 13

    decrypting message

    To decrypt, we first need to solve this equation:

    d·e ≡ 1 (mod φ) ≡ 1 (mod (p-1)(q-1))
    3d ≡ 1 (mod 40)
    d ≡ 27 (mod 40)

    Having d, we can decrypt the message:

    • m: decrypted message
    • c: encrypted message
    • d: private, decryption exponent
    • n: rsa modulus
    m ≡ c^d (mod n)
    m ≡ 13^27 (mod 55) ≡ 7
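
    The whole example can be verified in a few lines of Python (pow with exponent -1 computes the modular inverse, Python 3.8+):

    p, q = 5, 11
    n, phi = p * q, (p - 1) * (q - 1)
    e = 3
    d = pow(e, -1, phi)        # -> 27, the decryption exponent
    c = pow(7, e, n)           # encrypt: 7^3 mod 55 = 13
    print(d, c, pow(c, d, n))  # -> 27 13 7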


    SPF and DKIM records

    SPF and DKIM are email authentication mechanisms that help mail servers verify that an email actually comes from the domain it claims to be from. Without them, your emails are much more likely to be marked as spam or rejected.

    SPF – Sender Policy Framework.

    SPF lets mail servers check which servers are allowed to send emails on behalf of your domain. To make it work, publish a TXT record in your domain’s DNS that lists the IP addresses or services allowed to send email for your domain.

    v=spf1 include:_spf.google.com include:mx.ovh.com ~all
    
    • v=spf1 – SPF version, mandatory
    • include:_spf.google.com – authorizes Google Workspace servers to send email on behalf of my domain
    • include:mx.ovh.com – authorizes OVH’s mail servers to send emails for my domain
    • ~all – soft fail: any server not listed will be marked suspicious, but the mail is still delivered (-all is a hard fail)

    DKIM – Domain Keys Identified Mail

    It uses a pair of cryptographic keys:

    • Private key: stored on the mail server, used to sign outgoing emails.
    • Public key: published in DNS, used by receiving servers (like Gmail) to verify the signature.

    When Gmail receives an email from your domain:

    • It reads the DKIM signature in the email header.
    • It fetches the public key from your DNS.
    • If the signature matches, DKIM passes. If it doesn’t, DKIM fails, and Gmail may block the email or mark it as spam.

    To configure it, set up the mail server and add two CNAME records:

    CNAME 1: ovhmo2899381-selector1._domainkey.piasecki.it  ovhmo2899381-selector1._domainkey.2779185.op.dkim.mail.ovh.net. 
    CNAME 2: ovhmo2899381-selector2._domainkey.piasecki.it  ovhmo2899381-selector2._domainkey.2779184.op.dkim.mail.ovh.net.

    CNAME is an alias, or in other words: “when someone looks up this name, go check another domain instead”.
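
    To check that the records are in place, query them directly; a TXT lookup on the selector follows the CNAME to the key published by OVH:

    dig +short TXT piasecki.it
    dig +short TXT ovhmo2899381-selector1._domainkey.piasecki.it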

    To wrap everything up: SPF verifies which servers are allowed to send emails for your domain, and DKIM verifies that the email content is authentic and was not tampered with.


    Docker registry

    A Docker registry is an integral part of every k8s cluster. It stores standalone application executables in the form of Docker images. In general, we have two options for storing Docker images:

    • public repositories, like docker hub, or gitlab registry
    • self hosted docker registry

    As always, self-hosting gives a set of advantages:

    • enhanced security and privacy
    • reduced dependency on external services – e.g. Docker Hub imposes limits on anonymous and free accounts
    • compliance requirements
    • control over features and updates
    • higher download/upload speed and lower latencies
    • additional options; for a Docker registry that would be multi-architecture support or automated vulnerability assessments

    Let’s take a look at possible self-hosted options:

    • docker registry provided by docker. Simple, lightweight solution with basic features only.
    • Harbor – built on top of the Docker registry. Adds user management, image replication, a vulnerability scanner, and more. But with a bigger feature list comes a bigger footprint.
    • Quay – Red Hat’s Docker registry with a feature list similar to Harbor’s. Adds integration with OpenShift
    • Portus – not maintained for a long time
    • Artifactory – universal artifacts manager
    • Nexus – another multi-purpose repository manager
    • GitLab Container Registry – part of the GitLab package

    As I am interested in hosting Docker images only, the best choice would be the simple Docker registry or Harbor/Quay. The first option seems best for a simple lab environment, but for educational purposes I will choose Harbor. If memory/CPU becomes an issue, I can migrate to the plain Docker registry.

    Because Harbor is available as a helm chart, start by adding the helm repository:

    helm repo add harbor https://helm.goharbor.io
    helm fetch harbor/harbor --untar

    Now, prepare the values. The full list can be found here: https://github.com/goharbor/harbor-helm/blob/main/values.yaml

    The values I used:

    expose:
      ingress:
        hosts:
          core: registry.piasecki.it
    externalURL: https://registry.piasecki.it
    harborAdminPassword: "redacted"
    nginx:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    portal:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    core:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    jobservice:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    registry:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    trivy:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
    database:
      internal:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: kubernetes.io/arch
                      operator: In
                      values:
                        - amd64
    redis:
      internal:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: kubernetes.io/arch
                      operator: In
                      values:
                        - amd64
    exporter:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64

    It is a little bit repetitive, but it is what it is. I had to add the affinity because Harbor does not support the ARM architecture, and my cluster is a mix of AMD64 and ARM nodes.
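
    As a side note, YAML anchors can cut the repetition down, since helm resolves them while parsing the values file (a sketch; the extra top-level key is hypothetical and ignored by the chart):

    x-amd64-affinity: &amd64affinity
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/arch
                  operator: In
                  values:
                    - amd64

    nginx:
      affinity: *amd64affinity
    portal:
      affinity: *amd64affinity
    core:
      affinity: *amd64affinity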

    In k8s, affinity is a preference used during workload scheduling. There are two options: flexible and strict. The flexible one is preferredDuringSchedulingIgnoredDuringExecution, where the scheduler tries to find a node that meets the rule, but if no such node is available, the pod is still scheduled somewhere else. That is not the case with requiredDuringSchedulingIgnoredDuringExecution – the workload will simply not be scheduled. Both have the IgnoredDuringExecution suffix: if labels or taints change after the pods are scheduled, the change is ignored for already-running pods.

    Install the harbor with:

    helm upgrade --install harbor harbor/harbor --namespace harbor --create-namespace -f values.yaml --debug 

    After a few moments, the pods will show up.

    NAME                                READY   STATUS    RESTARTS      AGE
    harbor-core-77db798559-95t49        1/1     Running   0             109s
    harbor-database-0                   1/1     Running   0             109s
    harbor-jobservice-5757787b4-xb68f   1/1     Running   0             109s
    harbor-portal-6ccd874b6-bmjkg       1/1     Running   0             109s
    harbor-redis-0                      1/1     Running   0             109s
    harbor-registry-9847665dc-t58m7     2/2     Running   0             109s
    harbor-trivy-0                      1/1     Running   0             109s

    Now, add a DNS entry for the hostname used in the config. I will edit the Pi-hole custom dnsmasq ConfigMap:

    k edit cm pihole-custom-dnsmasq --namespace pihole

    and restart the deployment:

    k rollout restart deployment/pihole 

    check if working:

    dig +short registry.piasecki.it                                                                                                            
    192.168.1.43
    192.168.1.23

    The chart has already created an ingress, so we can open the page in the browser.

    If everything works, simply execute docker login:

    docker login https://registry.piasecki.it 

    and provide username and the password.

    Let’s check if it works:

    docker pull ubuntu
    docker tag ubuntu registry.piasecki.it/library/ubuntu:latest
    
    docker push registry.piasecki.it/library/ubuntu:latest                                                                                     
    
    The push refers to repository [registry.piasecki.it/library/ubuntu]
    687d50f2f6a6: Pushed
    latest: digest: sha256:b0c08a4b639b5fca9aa4943ecec614fe241a0cebd1a7b460093ccaeae70df698 size: 529

    There are two ways to authorize k3s to pull images from the Docker registry:

    • a cluster-level registries.yaml configuration on each k3s node
    • Kubernetes imagePullSecrets

    I will cover the second option, as it is a little bit more complicated.

    Docker credentials are stored in ~/.docker/config.json. Let’s create a secret from that file:

    kubectl create secret generic regcred \
        --from-file=.dockerconfigjson=/home/{{user}}/.docker/config.json \
        --type=kubernetes.io/dockerconfigjson \
        --namespace=kube-system
    
    secret/regcred created

    you can inspect it:

    k get secret -n kube-system regcred -o yaml                                                                                                

    The easiest way to use it across the whole cluster is to add it as an imagePullSecret to the default service account:

    kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "regcred"}]}'

    More granular options would be adding imagePullSecrets to custom service accounts, or even to single workloads, as in the sketch below.
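
    For example, a single pod can reference the secret directly (hypothetical manifest):

    apiVersion: v1
    kind: Pod
    metadata:
      name: harbor-test-pod
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: app
          image: registry.piasecki.it/library/ubuntu:latest
          command: ["sleep", "3600"]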

    Now let’s check if everything works:

    kubectl run harbor-test-pod --image=registry.piasecki.it/library/ubuntu:latest --restart=Never --command -- sleep 3600 

    That sums up the Harbor installation on k3s. In upcoming posts I will cover more advanced Harbor features, such as LDAP integration, vulnerability scanning, or image replication.


    Unbound: self-hosted DNS resolver

    For those who do not know, the Domain Name System is the internet component that translates human-friendly domain names (like blog.piasecki.it) into IP addresses. The system is hierarchical, and before getting an IP, you need to ask a couple of servers first.

    • Recursive DNS resolver – usually run by your ISP; alternatively provided by third-party services like Google or Cloudflare.
    • Root server lookup – the answer is the authoritative server for the top-level domain (TLD)
    • TLD query – this returns the authoritative server for the specific domain and its subdomains

    Unbound is a recursive DNS resolver, so you can use it without relying on third-party resolvers.

    The main consideration behind using Unbound is privacy – you will not use Google or ISP as a proxy to fetch the IP.

    Another Unbound use case is bypassing content-blocking mechanisms implemented by the ISP. The provider can block traffic in two ways:

    • DNS blocking – very easy to set up and maintain, but not very effective
    • IP blocking – challenging to implement. There is a risk of overblocking, for example when dealing with Content Delivery Networks. It requires constant updates, because sites can change their IP addresses. And it can be bypassed by using a VPN.

    Because of that, domain name blocking is the most popular content-blocking approach. Unbound can be used to bypass it by executing the queries directly.

    Other unbound features are:

    • Caching – frequently used domains are resolved faster. It takes about 10-12 ms to ping Google/Cloudflare, but only 3 ms for my k3s host
    • Prefetching – when a DNS entry is about to expire, Unbound automatically refreshes it, making the subsequent query even faster
    • DNSSEC – guarantees the authenticity of DNS records using public key infrastructure. It also confirms that DNS responses have not been modified in transit. I have not configured it yet – I will cover it later.

    Without further ado, here is the configuration I used:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: unbound-main-conf
    data:
      unbound.conf: |
        server:
        verbosity: 0
        interface: 0.0.0.0
        port: 53
        do-ip4: yes
        do-udp: yes
        do-tcp: yes
    
        cache-max-ttl: 86400       # Maximum time (in seconds) to cache a record 
        cache-min-ttl: 0           # Minimum time (in seconds) to cache a record
        msg-cache-size: 50m        # Message cache size
        rrset-cache-size: 100m     # Resource record cache size
        infra-cache-numhosts: 10000  # Number of hosts in the infrastructure cache
        prefetch: yes              # Enable prefetching of popular cache entries
        prefetch-key: yes          # Also prefetch DNSKEY and other key information
    
        access-control: 0.0.0.0/0 allow   # Allow all IPv4 clients
        access-control: ::0/0 allow       # Allow all IPv6 clients
    
        do-ip6: no
        prefer-ip6: no
        #root-hints: "/var/lib/unbound/root.hints"
        harden-glue: yes
        harden-dnssec-stripped: yes
        use-caps-for-id: no
        edns-buffer-size: 1232
        prefetch: yes
        num-threads: 1
        so-rcvbuf: 1m
    
        private-address: 192.168.1.0/24
        private-address: 169.254.0.0/16
        private-address: 172.16.0.0/12
        private-address: 10.0.0.0/8
        private-address: fd00::/8
        private-address: fe80::/10
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: unbound-a-records-conf
    data:
      a-records.conf: |
        server:
            # Your internal network name and addresses go here
            private-domain: "your-awesome-domain.local"
            domain-insecure: "your-awesome-domain.local"
            local-zone: "your-awesome-domain.local" transparent
    
            local-data: "k8s.your-awesome-domain.local IN A 172.30.0.1"
            #local-data-ptr: "172.30.0.1 your-awesome-domain.local"
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: unbound-forward-records-conf
    data:
      forward-records.conf: |
        forward-zone:
            # Forward all queries (except those in cache and local zone) to
            # upstream recursive servers
            name: "."
            # Queries to this forward zone use TLS
            forward-tls-upstream: yes
    
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: unbound-srv-records-conf
    data:
      srv-records.conf: |
        # SRV records
        # _service._proto.name. | TTL | class | SRV | priority | weight | port | target.
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dns
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: unbound-dns
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: unbound-dns
        spec:
          nodeSelector:
            kubernetes.io/arch: amd64
          containers:
            - image: mvance/unbound
              imagePullPolicy: IfNotPresent
              name: dns-unbound
              ports:
                - containerPort: 53
                  protocol: UDP
              resources: {}
              volumeMounts:
                - name: unbound-main-conf-volume
                  mountPath: /opt/unbound/etc/unbound/unbound.conf
                  subPath: unbound.conf
                - name: unbound-a-conf-volume
                  mountPath: /opt/unbound/etc/unbound/a-records.conf
                  subPath: a-records.conf
                - name: unbound-forward-conf-volume
                  mountPath: /opt/unbound/etc/unbound/forward-records.conf
                  subPath: forward-records.conf
                - name: unbound-srv-conf-volume
                  mountPath: /opt/unbound/etc/unbound/srv-records.conf
                  subPath: srv-records.conf
          restartPolicy: Always
          volumes:
            - name: unbound-main-conf-volume
              configMap:
                name: unbound-main-conf
            - name: unbound-a-conf-volume
              configMap:
                name: unbound-a-records-conf
            - name: unbound-forward-conf-volume
              configMap:
                name: unbound-forward-records-conf
            - name: unbound-srv-conf-volume
              configMap:
                name: unbound-srv-records-conf
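
    Once the pod is running, point a client at the service and test it directly (hypothetical service IP):

    dig @192.168.1.43 +short blog.piasecki.it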

    All that’s left is to reconfigure Pi-hole: remove the external upstream DNS servers and add Unbound.

    And since we are talking about security and privacy in the context of DNS, here are handy filters for Pi-hole:


    k3s restore etcd

    Etcd backups proved their usefulness faster than I thought. I screwed up the configuration so badly that I didn’t even recognize it anymore. I did not want to waste more time and just restored the DB from a backup. Here are the commands:

    list backups

    k3s etcd-snapshot ls                                                                                                    
    (...)
    k3s_etcd_snapshot-fedora-1.home-1732186802.zip file:///var/lib/rancher/k3s/server/db/snapshots/k3s_etcd_snapshot-fedora-1.home-1732186802.zip 12785806 2024-11-21T12:00:02+01:00
    k3s_etcd_snapshot-fedora-1.home-1732208402.zip file:///var/lib/rancher/k3s/server/db/snapshots/k3s_etcd_snapshot-fedora-1.home-1732208402.zip 12729498 2024-11-21T18:00:02+01:00

    stop nodes

    execute on all of the nodes

    systemctl stop k3s

    restore backup

    on the node where backup is located

    k3s server \
      --cluster-reset \
      --cluster-reset-restore-path=<PATH-TO-SNAPSHOT>

    start the control plane

    systemctl start k3s

    remove the DB on the remaining nodes

    rm -rf /var/lib/rancher/k3s/server/db/

    start the rest of the nodes

    systemctl start k3s

    I hope I will not be in this situation again soon. But it is good to know there is a reliable restore mechanism in case of any other issues.


    infracost

    If you have been using terraform extensively, you have probably seen output like this:

    • 84 resources to add
    • 17 resources to modify
    • 28 resources to remove

    You can carefully review the changes, but there is always room for error. When deploying resources to the public cloud, you have to be extra careful, because mistakes in the configuration can be costly. You can easily create a fleet of 100 servers instead of 10, or enable expensive settings by mistake.

    Those scenarios can be mitigated to some degree by having budget caps, budget alerts, or budget monitoring. But they all trigger post factum, when some of the damage is already done.

    Recently, I was wondering if there is a tool for estimating infrastructure cost before running terraform apply. Infracost does exactly that: it checks the terraform plan for changes, then contacts the Infracost API for cloud prices.

    Because it is a CLI tool, it can be easily integrated into existing pipelines and become part of the merge process.
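
    For example, a pipeline could store a baseline and show the cost delta on every merge request (a sketch using documented infracost subcommands; paths are hypothetical):

    infracost breakdown --path . --format json --out-file baseline.json
    infracost diff --path . --compare-to baseline.json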

    Sample output:

    infracost breakdown --path . --terraform-var-file=dev.tfvars --show-skipped
    INFO Autodetected 1 Terraform project across 1 root module
    INFO Found Terraform project main at directory . using Terraform var files softserve.tfvars
    WARN 2 google_container_node_pool prices missing across 1 resource
         2 google_container_cluster prices missing across 1 resource
    
    
    Project: main
    
     Name                                                                                                     Monthly Qty  Unit                    Monthly Cost   
                                                                                                                                                                  
     module.gke_cluster.google_container_cluster.primary                                                                                                          
     ├─ Cluster management fee                                                                                        730  hours                         $73.00   
                                                                                                                                                                  
     google_compute_global_address.public_gw_addr                                                                                                                 
     └─ IP address (unused)                                                                                           730  hours                          $7.30   
                                                                                                                                                                  
     module.gke_cluster.google_container_node_pool.pools["app-node-pool"]                                                                                         
     ├─ Instance usage (Linux/UNIX, on-demand, e2-small)                                                            2,190  hours                      not found   
     └─ Balanced provisioned storage (pd-balanced)                                                                    300  GB                         not found   
                                                                                                                                                                  
     google_kms_crypto_key.gke_crypto_key                                                                                                                         
     ├─ Key versions                                                                                    Monthly cost depends on usage: $0.06 per months           
     └─ Operations                                                                                      Monthly cost depends on usage: $0.03 per 10k operations   
                                                                                                                                                                  
     module.ssl_certificates["grafana"].google_dns_record_set.challenge["grafana-dev.goog.onefor.fun"]                                                            
     └─ Queries                                                                                         Monthly cost depends on usage: $0.40 per 1M queries       
                                                                                                                                                                  
     module.ssl_certificates["webapp"].google_dns_record_set.challenge["webdev.goog.onefor.fun"]                                                                  
     └─ Queries                                                                                         Monthly cost depends on usage: $0.40 per 1M queries       
                                                                                                                                                                  
     OVERALL TOTAL                                                                                                                                      $80.30 
    
    *Usage costs can be estimated by updating Infracost Cloud settings, see docs for other options.
    
    ──────────────────────────────────
    37 cloud resources were detected:
    ∙ 6 were estimated
    ∙ 23 were free
    ∙ 8 are not supported yet, see https://infracost.io/requested-resources:
      ∙ 2 x google_certificate_manager_certificate
      ∙ 2 x google_certificate_manager_dns_authorization
      ∙ 1 x google_binary_authorization_policy
      ∙ 1 x google_certificate_manager_certificate_map
      ∙ 1 x google_certificate_manager_certificate_map_entry
      ∙ 1 x google_compute_security_policy
    
    
    Project ┃ Baseline cost ┃ Usage cost* ┃ Total cost ┃
    main    ┃           $80 ┃           - ┃        $80 ┃

    I am surprised this is not already a built-in part of the public clouds. It would be so much more convenient for users to know the approximate price beforehand, instead of using price calculators manually.
