Why You Should Totally Roll Your Own AES Cryptography

Software developers are usually told to ‘never write your own cryptography’, and the past decades offer plenty of examples where DIY crypto routines caused real damage. This is also how [Francis Stokes] introduces his article on rolling your own crypto system. Even if you understand the mathematics behind a cryptographic system like AES (symmetric encryption), assumptions made by your code, along with side-channel and many other types of attacks, can nullify your efforts.

So then why write an article on doing exactly what you’re told not to do? The answer lies in the often-forgotten addendum to ‘don’t roll your own crypto’, which is ‘for anything important’. [Francis]’s tutorial on how to implement AES is incredibly informative as an introduction to symmetric key cryptography for software developers, and demonstrates a number of obvious weaknesses that users of an AES library may not be aware of.

This is why any developer who uses cryptography in some fashion should absolutely roll their own crypto: to take a peek inside what is usually a library’s black box, and to better understand how the mathematical principles behind AES are translated into a real-world system. It may also be very instructive if your goal is to become a security researcher whose day job is to find the flaws in these systems.

Essentially: definitely do try this at home, just keep your DIY crypto away from production servers :)

27 thoughts on “Why You Should Totally Roll Your Own AES Cryptography”

  1. I always translate to French, ROT13, put it through a Polybius/Playfair square then encode it with a one time pad based on values derived from reel to reel tape of 20M band radio emissions from Jupiter recorded in 1974, then finally 56 bit DES with the key PA55W0RD, just so they feel like they’re getting somewhere when they start.

    1. Not sure those Voyager transmissions meet the requirements for randomness. In fact, I feel confident that with guessed plaintext and frequency analysis, that’s all totally doable.

  2. In fact it’s: don’t roll your own crypto, and do not accelerate your crypto.
    Yes, any accelerator is riddled with side-channel attacks (mostly power) which leak keys. That’s why serious (i.e. certified) crypto is mostly done in software (unless you use security-aware hardware, which is very uncommon).

    1. I wonder if anyone has done any analysis of the cryptography stuff built into modern CPUs (modern x86 chips for example include dedicated AES-specific instructions)

      1. Oddly enough I would probably totally trust the AES-specific instructions. The part I would not fully trust is Intel’s DRNG (digital random number generator)*.

        But as long as it is not the only source of entropy, it is far better than nothing.

        ARM chips mostly use ring oscillators (an odd number of NOT gates in a ring) as their source of randomness. The only data I’ve seen about how Intel sources its physical entropy is that the source runs “asynchronously on a self-timed circuit and uses thermal noise within the silicon to output a random stream of bits at the rate of 3 GHz” **. To me that sounds like its mode of operation is going to be extremely similar to a ring oscillator. But the fact that it “needs no dedicated external power supply to run, instead using the same power supply as other core logic” ** would ring many alarm bells for me.

        If you have ever built your own random number generator circuits and tested the output with the “dieharder” software, you quickly realise how easy it is to get very poor random numbers out of any physical entropy source, and then to totally hide how bad it really is with a good cryptographic hashing algorithm (which Intel and ARM do use; I’m guessing AMD would be the same, but I have not looked into it).

        * https://en.wikipedia.org/wiki/RDRAND#Reception
        ** https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html
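The point about hashing hiding a poor entropy source can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the "physical source" is simulated by a seeded PRNG that emits a 1 bit 90% of the time, the function names are hypothetical, and a simple ones-fraction count stands in for a real test suite such as dieharder. It does not model Intel's or ARM's actual conditioning logic.

```python
import hashlib
import random


def biased_bits(n, p_one=0.9, seed=1974):
    """Simulated poor entropy source: emits 1 with probability p_one."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


def pack(bits):
    """Pack a list of bits into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def ones_fraction(data):
    """Fraction of 1 bits in the data (a crude monobit check)."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))


def whiten(data, block=64):
    """Hash each 64-byte block of raw samples down to a 32-byte digest."""
    return b"".join(
        hashlib.sha256(data[i:i + block]).digest()
        for i in range(0, len(data), block)
    )


raw = pack(biased_bits(80_000))
hashed = whiten(raw)
print(f"raw ones fraction:    {ones_fraction(raw):.3f}")
print(f"hashed ones fraction: {ones_fraction(hashed):.3f}")
```

The hashed stream looks balanced even though each raw bit here carries only about 0.15 bits of min-entropy, so statistical tests on the final output say little about the quality of the underlying source.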

  3. 1. You must, MUST, encode the data first to remove the dependencies in the data structure. Huffman coding is a good idea, but not optimal.
    2. Take any standard encryption function and apply it many times, 100K or more. Sometimes add data or use different functions.
    The time to decrypt becomes very large.

    Think about it: what are you decoding if you don’t know what you are searching for?
    (But the admin deletes comments anyway.)

  4. This gives you a conceptual understanding on how AES works, but not about what makes encryption secure.

    Ex: There is an S-box to substitute bytes into other bytes.
    It is important to grasp this concept; however, it doesn’t tell you why specific bytes are translated into other bytes. It is not entirely random. And if you don’t understand why that is, or how you can analyse the result for strength, you will most likely do something random when designing your own crypto, which is one of the reasons for the “Don’t roll your own” advice.

    However, understanding crypto is important as a lot of people still have problems simply configuring encryption in a given situation. (Operational mode, key length, number of rounds, etc…)

  5. AES is absolutely my favorite two way encryption algorithm! What makes it so great? All the constants can be generated and all the operations are simple! It’s no wonder that it’s implemented in silicon everywhere because it requires so very little to implement it.
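As a rough illustration of "all the constants can be generated", here is a minimal Python sketch that derives the AES S-box and the AES-128 round constants from the field arithmetic described in FIPS 197. The function names are hypothetical, and the brute-force inverse search is chosen for readability, not speed.

```python
def xtime(a):
    """Multiply by 2 in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gmul(a, b):
    """Multiply two GF(2^8) elements using shift-and-add."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a, b = xtime(a), b >> 1
    return p


def make_sbox():
    """S-box entry = affine transform of the multiplicative inverse."""
    box = []
    for x in range(256):
        # Brute-force the inverse; 0 has none and is mapped to 0.
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            # XOR in the inverse rotated left by 1, 2, 3 and 4 bits.
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        box.append(s ^ 0x63)
    return box


def make_rcon(count=10):
    """Round constants are successive powers of 2 in GF(2^8)."""
    rcon, value = [], 1
    for _ in range(count):
        rcon.append(value)
        value = xtime(value)
    return rcon


SBOX = make_sbox()
RCON = make_rcon()
print(" ".join(f"{b:02x}" for b in SBOX[:16]))
print(" ".join(f"{b:02x}" for b in RCON))
```

This also touches on the earlier comment about why specific bytes map to specific bytes: the inverse supplies the non-linearity, and the affine step removes fixed points such as 0 mapping to 0.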

    1. I like RC4, a great stream cipher in terms of simplicity, and stronger than TEA without really being any more complex. It probably has no place outside of a classroom these days. But I value a learning experience more than I value production-ready.
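To show how little code RC4 needs, here is a classroom-style Python sketch of its key scheduling and keystream generation. The function names are hypothetical, and, in line with the comment above, this is for learning only: RC4 has known biases and should not protect real data.

```python
def rc4_keystream(key):
    """Yield RC4 keystream bytes for the given key (bytes)."""
    # Key-scheduling algorithm: permute 0..255 under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Pseudo-random generation: keep swapping and emit one byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4(key, data):
    """Encrypt or decrypt by XORing the data with the keystream."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))
```

Because encryption is a plain XOR with the keystream, running the same function a second time with the same key recovers the plaintext.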

  6. Rolling your own gives you a chance to tinker with the parameters and learn a lot about cryptography. Things I want to take a look at now: If we replace the S-box with a linear function (S=S+1), what breaks? If we only run 2 rounds instead of the full rounds, can we crack it easily? More advanced but definitely within reach of the people here: get your ChipWhisperer or other side-channel tools and play with the timing analysis. Comparing timing across different levels of compiler optimization could be a fun project.

    The last one, which I’ve played with quite a bit, is Differential Fault Analysis (DFA), where we use fault injection to get the chip to make a computational mistake and leak the key. If we insert a glitch in round 9 after ShiftRows and before MixColumns, we get 4 bytes of corruption. (More bytes = glitch too early, fewer bytes = too late.) After ~100 unique faults, we can compute the key.
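The first question in this comment, what breaks when the S-box is linear, can be explored with a compact AES-128 sketch in Python that takes the S-box as a parameter. Two assumptions are worth stating loudly: the function names are hypothetical, and the substitute S-box used here is `x ^ 0x63` (XOR with a constant), not the `S+1` from the comment, because integer addition has carries and is not linear over GF(2). With an XOR-affine S-box every AES step is affine, so for a fixed key E(a) ^ E(b) ^ E(c) equals E(a ^ b ^ c).

```python
def xtime(a):
    """Multiply by 2 in GF(2^8) modulo the AES polynomial."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a

def gmul(a, b):
    """Multiply two GF(2^8) elements using shift-and-add."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a, b = xtime(a), b >> 1
    return p

def real_sbox():
    """Standard AES S-box: field inverse followed by the affine map."""
    box = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        box.append(s ^ 0x63)
    return box

def expand_key(key, sbox):
    """AES-128 key schedule: 11 round keys of 16 bytes each."""
    w = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [sbox[b] for b in t[1:] + t[:1]]  # RotWord, then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]

def mix_columns(s):
    """Multiply each 4-byte column by the fixed MixColumns matrix."""
    out = []
    for c in range(0, 16, 4):
        a0, a1, a2, a3 = s[c:c + 4]
        out += [gmul(a0, 2) ^ gmul(a1, 3) ^ a2 ^ a3,
                a0 ^ gmul(a1, 2) ^ gmul(a2, 3) ^ a3,
                a0 ^ a1 ^ gmul(a2, 2) ^ gmul(a3, 3),
                gmul(a0, 3) ^ a1 ^ a2 ^ gmul(a3, 2)]
    return out

def encrypt_block(block, key, sbox):
    """Encrypt one 16-byte block with AES-128 using the given S-box."""
    rk = expand_key(key, sbox)
    s = [b ^ k for b, k in zip(block, rk[0])]
    for r in range(1, 11):
        s = [sbox[b] for b in s]                            # SubBytes
        s = [s[(i + 4 * (i % 4)) % 16] for i in range(16)]  # ShiftRows
        if r != 10:                                         # no MixColumns in last round
            s = mix_columns(s)
        s = [b ^ k for b, k in zip(s, rk[r])]               # AddRoundKey
    return bytes(s)

def xor3(a, b, c):
    """Byte-wise XOR of three equal-length byte strings."""
    return bytes(x ^ y ^ z for x, y, z in zip(a, b, c))

def is_xor_linear(sbox, key):
    """Check E(a) ^ E(b) ^ E(c) == E(a ^ b ^ c) for three fixed blocks."""
    a, b, c = bytes(range(16)), bytes(range(16, 32)), b"\x5a" * 16
    lhs = xor3(*(encrypt_block(m, key, sbox) for m in (a, b, c)))
    return lhs == encrypt_block(xor3(a, b, c), key, sbox)

KEY = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
AFFINE_SBOX = [x ^ 0x63 for x in range(256)]  # assumed stand-in for a "linear" S-box
print("affine S-box is XOR-linear:", is_xor_linear(AFFINE_SBOX, KEY))
print("real S-box is XOR-linear:  ", is_xor_linear(real_sbox(), KEY))
```

When the relation holds, the cipher behaves like a linear map plus a constant, so a modest number of known plaintext/ciphertext pairs would be enough to predict other ciphertexts without ever recovering the key. The real S-box is what prevents this.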
