Code So Sneaky You Have To Explain It

Your mission, should you choose to accept it, is to code a program that leaks information to the user but does so in a way that can’t be discovered in a code audit. This was the challenge for the 2014 Underhanded C contest; the seventh time they’ve held the event. [Richard Mitton] took part and wrote a very entertaining entry. He didn’t win, but he did just share the details of his super-sneaky code.

The challenge set out for the Citizen-Four-like coders set up a scenario where they were writing a program for a shady company (or sketchy government entity) which makes completely secret decisions based on publicly posted social media. The twist is they were tasked with getting code past an audit that leaked the decisions made by this program to the users being secretly observed.

Above is the core trick which [Richard] used after taking inspiration from Heartbleed. The struct assignment has an off-by-one error in it which is shown corrected in the lower code block. This, used in conjunction with malloc and free, allows memory to be used under the guise of storage during the encryption process. Secretly, this same bit of memory is accessed later and leaked to the user being targeted.

Have your own Underhanded C that you’re dying to share? We want to hear about it so send us a tip!

45 thoughts on “Code So Sneaky You Have To Explain It

      1. Why can’t the open bracket can’t be on the same line as first item in the declaration if you are so concern about a new line?

        Anything to make code readable by showing which bracket belongs to for someone else that might want to read the code what should be a priority – at least for the open source projects.

  1. it seems that this example is based on malloc and free not clearing or randomizing mem buffers. I wonder if openbsd would be affected?

    because its more efficient to NOT clear ram before you give it to a user, I guess that’s a problem that sneaky people can use. is it still a good idea to give uncleared buffers from the os to the user, in this day and age?

      1. Even then that alone isn’t always enough. Because the memory isn’t touched after being written to before it’s freed the compiler is free to optimize your overwrite out of the program. It’s been one big issue for a number of libraries that need secure storage of data in memory. The only way to do it is with compiler extensions that say “don’t optimize this out or I’ll take you out back and shoot you”, along with telling the kernel this particular page can’t be swapped out for good measure. It’s not a simple problem to solve at the application level.

        1. Wait, how would the compiler know it could do that? The writes are to a pointer returned by malloc(), and then passed to free(), and free is prototyped as taking “void *”. Which means free is explicitly saying “I can screw with the stuff you send me” (which, of course, it can, when it passes it to someone else).

          Unless this is a specific optimization for malloc/free, the compiler can’t get rid of anything there. It has absolutely no idea that free isn’t going to use that data.

      1. no, it’s a better reason to learn the C API properly, calloc vs malloc, malloc doesn’t necessarily clear, calloc does. memory alloc versus clear alloc.

        it’s almost shocking how many people don’t know this simple difference.

          1. Not knowing the language/API is common in a lot of languages, it’s not anything special about C. There are good reasons for the differences between the two. even if you don’t know calloc exists, knowing how malloc works is important.

  2. Uhm malloc/free are user space convenience implementations, not system calls. And in C++, new keyword also chain to malloc in most libraries. The only way to actually allocate memory in C is through brk/sbrk. And most OSs have an option on whether to pre-initialize newly mapped heap pages. The bug is actually in HR with a hiring manager making an offer to someone who doesn’t know that.

    Of course you are going to get memory from malloc you just free’d unless you do due diligence to audit and/or change the implementation or user space heap management for a secure application. And the +1 has nothing to do with anything. The same exploit can be made with or without it.

    1. >The only way to actually allocate memory in C is through brk/sbrk.

      :)

      >Of course you are going to get memory from malloc you just free’d unless you do due
      >diligence to audit and/or change the implementation or user space heap management
      >for a secure application.

      I argued about this with someone when the heartbleed stuff came out; Mixing sensitive data in with the same pools of memory that get handed out to other parts of an application seems like a bad idea(tm) to me and also having sensitive data for many unrelated clients connected via sockets etc floating around in the same address space seems pretty bad too.

  3. I did not spot the error in the first code snippet because there is no error.
    The presumed “error” is probably in the code which does something with the buffer.

    If “char buffer [140*4]; is supposed to have a meaning to reserve a buffer for for 140 items with a size of 4 characters it’s probably saver to write:
    char buffer [ (140 + 1) * 4];

    Personally I don’t like these magic velue’s very much and would probably use another typedef and/or a #define for the buffer size.

    P.s: When you #define the buf size and also use this #define in the loops which step through your buffer then your’re a bit closer to “self documenting code” and you only have to change the single #define to change the buffer size (limited ram on avr’s etc).

    1. Yeah. You might get 140*5 instead the other way. On the other hand, it wouldn’t hurt to use index 0 in the rest of the code. Save an extra piece of memory too. If you make an array size 140 * 4, you get 560 things in it. I think the compiler would evaluate the math first anyway.

      A better practice for making 140 four byte entries would be to make a struct of 4 bytes and then making that into an array. Plus you can union the four bytes into an int and long int for quick practical stuff and still use the individual bytes. Laziness is OK in programmers, but they should have the right kind of functional lazy.

      1. I think the 0 index itself is what’s being used to get the info out. All you have to do is let the buffer overflow (or underflow in this case) when the writing takes place and read the unprotected memory (0 index of the current or MSB/LSM of the adjacent array) at your leisure with another program. Actually, it wouldn’t even have to go that far now that I think about it. Though, I could be wrong.

  4. How am I supposed to know what the array is supposed to be holding. If each item in the array is supposed to be 4 bytes then I would pick a type that explicitly holds 4 bytes. I think doing calculations for buffer lengths is sloppy. Use a const for the length and a proper type.

  5. The original article had the best for last: “If you try to debug it using Visual Studio, the bug vanishes! Running a program (even a release-build one) from inside the debugger enables the CRT debug heap, which automatically clears data upon free.”

    1. This is actually wrong. The CRT heap puts either “FEEEFEEE” for free’d memory and “CCCCCCCC” marker in the last/first bytes of the buffers and then checks for this “magic” number when de/allocating. If it does not match, it asserts. This is smart, but if you happen to store these “magic” value in there, you’ll get an (invalid) assert you can’t get rid of.

  6. Personally I’d probably do something more nefarious with the compiler and the way that it packs structures.
    For instance if you made the time value 64 bits ( because you are ‘worried’ about the 2038 problem ) and you make the compiler align it to 64 bits (because you are ‘worried’ about performance of accessing the time variable when it is non-aligned), then you gain a 2 byte gap between the 140 byte buffer and the 8 byte time, which you can leak info into.

  7. BTW, it’s another reason why you should run your software through Valgrind (or AddressSanitizer). It does not take much more time, and this is going to be spot very fast. Whenever the program is reading out of the expected area, Valgrind will complain, and tell you to fix your program

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s