TruffleHog Sniffs GitHub for Secret Keys

Secret keys are quite literally the key to security in software development. If a malicious actor gains access to the keys securing your data, you’re toast. The problem is, to use keys, you’ve got to write them down somewhere, oftentimes in the source code itself. TruffleHog has come along to sniff out those secret keys in your GitHub repository.

It’s an ingenious trick: a Python script goes through the commit history of a repository, looks at every string of text longer than 20 characters, and analyzes its Shannon entropy. This is a mathematical way of measuring how closely the string resembles a random run of numbers and letters. If it has high entropy, it’s probably a key of some sort.
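
To make the idea concrete, here is a minimal sketch of an entropy check in Python. It is not TruffleHog's actual code: the `CANDIDATE` pattern, the `find_suspects` helper, and the 4.0 bits-per-character threshold are all assumptions chosen for illustration, and it scans a single blob of text rather than a commit history.

```python
import math
import re

# Assumption: candidates are runs of more than 20 base64-style characters.
# The article only says "every string of text longer than 20 characters".
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_-]{21,}")


def shannon_entropy(text):
    """Return the Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    total = len(text)
    # H = -sum(p * log2(p)) over the observed character frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_suspects(blob, threshold=4.0):
    """Yield (token, entropy) for long tokens scoring above threshold.

    The default threshold is a hypothetical value, not one from the article.
    """
    for token in CANDIDATE.findall(blob):
        score = shannon_entropy(token)
        if score > threshold:
            yield token, score
```

A string of one repeated character scores 0 bits per character, while a string of 32 distinct characters scores 5, so a long token that lands near the top of that range is worth a second look. Running `find_suspects` over the text of each commit diff would approximate the behavior described above.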

Sharing source code is always a double-edged sword for security. Any flaws are out for all to see, and there are both those who will exploit the flaws and those who will help fix them. It’s a matter of opinion whether the benefits outweigh the risks, but it’s hard to argue with the labor benefits of getting more eyes on the code to hunt for bugs. It’s our guess, though, that a lot of readers have accidentally committed secret keys to a git repository and had to revert before pushing. This tool can crawl any publicly posted git repo, but it might be just as useful in security audits of your own codebase, to ensure that accidentally exposed keys are invalidated and replaced.

For a real world example of stolen secret keys, read up on this HDMI breakout that sniffs HDCP keys.

21 thoughts on “TruffleHog Sniffs GitHub for Secret Keys”

  1. I bet TruffleHog would catch public keys. Those aren’t necessarily security holes. For example, if you’re going to set up a web of trust of some sort, you’re probably going to compile the trust anchor key (or certificate) into your code.

    Of course, it’s on you to ensure that you closely hold and protect the related *private* key…

    Many high-value trust anchor private keys exist solely on paper. You use a computer with no network connections at all to generate the root key pair, then you have it print out the private key. You then immediately generate a subsidiary key pair and turn its public key into a certificate signed by the root private key. You then print out that subsidiary cert and key, then you turn off the computer (and if you’re paranoid, you shred it and throw the pieces into a dumpster fire). The printed-out root private key you lock away in a safe, and you take the two certificates and the subsidiary private key and load them into a connected computer. The subsidiary private key you lock away with whatever protection is appropriate for your application. The two certificates you can publish.

    You do all of that so that if the private key you actually use is ever compromised, you can pull the root private key out of the safe and use that to issue a CRL and a new certificate. You also mitigate that compromise risk by setting a low(er) expiration date on the subsidiary certificate and shortly before it expires you go back and perform the ritual of issuing a new subsidiary cert from the paper root using a disconnected computer. You do all of this because there is no easy remediation to a compromise of the root key other than a universal, global firmware update and flag day, which is comparatively very, very expensive.

    1. That’s the whole point though. While everything you say is true, for people who are not making root keys, sometimes the private key is in code that accidentally gets committed, or the site is accessing some database with a simple password and that gets committed. I would be willing to bet there are loads of these passwords hiding in GitHub commits.

  2. I spend half of my “security hat” time chastising developers for putting keys into source code. You can never say never, but I have yet to come across a case in practice where it wasn’t a mistake or just laziness.

    1. Finding a way to contact the project owner or authors is not hard. Commits carry an email address in the author field, and GitHub probably also has a way to send a message to users, through a pull request or by some other means.

      So a tool that looks for keys, passwords or secrets could be used to find the issues and notify the people who can fix them.

      However, the question is how to prevent the huge number of potential false positives:
      – making a tool that understands how to find passphrases might be hard
      – I’d guess many passphrases would default to CHANGEME in the configuration files.
      – the configuration files might be shipped in locations where they are not actually used, but instead serve as documentation or examples.
      – Many software packages ship with default SSL certificates, and the user is then expected to generate new ones. I wonder whether such a practice makes sense, though: ssh, for instance, will generate new keys at boot if none are present. Now that we have Let’s Encrypt, such software could use its protocol to generate valid keys.
      – “Rooting” software: Some consumer devices running GNU/Linux or Android don’t allow the user access to the underlying system. The Archos 605 WiFi is one example among many. In that case there are tools that exploit security issues in the default firmware to help users regain control of their devices. They sometimes ship dropbear binaries with a hardcoded password.
      – Setups where security lies elsewhere: SSH is easily available everywhere. With a LEDE image that has no web interface, you still need to log in to the device to set up a passphrase or SSH keys. You can, however, connect the device directly to your computer, which prevents an attacker from connecting to it before you do.

      A good way to solve some of the problems above (rooting and OpenWRT/LEDE) would be to inject the user’s keys or password into the image or binary before installing it on the device, but AFAIK no such tool currently exists.

      If there is a way to keep the number of false positives low (though that would probably mean missing a large number of keys), it would be great to have a tool that scans for private keys and emails the user.
      Would letting the repository owner indicate whether private keys are expected in the repository be a good way to deal with false positives?

      Another thing to do would be to find a way to deal with many of the issues above. Some well-known software ships with default SSL certificates, which are publicly known.
      Some documentation on good practices to follow might be a start toward dealing with these.


    1. Here we go again… somebody writes “NotAHack” and fails to provide a link to their hack-worthy work. OK, you are right, Hackaday, please refund his subscription fees.

      Disdain is cheap. Provide something better.

      1. The SD die will probably end up on one shredded piece, intact except for the bond wires, so the data would be recoverable.

        If you just write all ones to an SD card then it’s deleted and wouldn’t even be forensically recoverable.

        On the other hand, HDD data is much more recoverable, so you need about 35 passes of random data, or just take the platter(s) out and fold them, that works well :)

          1. If it were simply an urban myth then the US Department of Defense would not have a specification for wiping data.
            Standard DoD 5220.22-M

            Sure, I take your point that 35 passes is not necessary any more, but 7 random passes is probably as good as a complementary pass and then one random pass. I don’t have any software that does complementary passes though.

            And as for bad sectors, if your software won’t write to bad sectors then you’re not taking security seriously.

            Some people believe that an OS like Windows will *delete* files when in reality the file remains fully intact. Windows just hides it from you.

          2. I know there are specs that say wipe 35 times, but maybe the military guys who wrote them were just really careful/paranoid, or the specs are simply really old?
            Concerning bad sectors, I was talking about the controller on the SD card that maps access from one sector to another because the original sector has gone bad. You can’t overwrite the bad sector, except maybe by using special commands to the SD card.
            Yes, I know about Windows, and sometimes it’s a good thing, especially with people who are not so good with computers…

          3. It’s a change in technology. Hard drives are analogue and on the old ones you could pull three layers of data off them. Newer hard drives use the same margin more resourcefully to get higher capacities.
