How The NSA Can Read Your Emails

Since [Snowden]’s release of thousands of classified documents in 2013, one question has tugged at the minds of security researchers: how, exactly, did the NSA apparently intercept VPN traffic and decrypt SSH and HTTPS, allowing it to read millions of personal, private emails from people around the globe? Every guess is necessarily speculation, but a paper presented at the ACM Conference on Computer and Communications Security might shed some light on how the NSA appears to have broken some of the most widespread encryption used on the Internet (PDF).

The relevant cryptography discussed in the paper is Diffie–Hellman key exchange (D-H), the method HTTPS, SSH, and VPNs use to agree on encryption keys. D-H relies on a shared, very large prime number. By performing many, many computations, an attacker could pre-compute a ‘crack’ on an individual prime number, then apply a relatively small computation to decrypt any individual connection that uses that prime number. If all applications used a different prime number, this wouldn’t be a problem. This is the difference between cryptography theory and practice; 92% of the top 1 Million Alexa HTTPS domains use the same two prime numbers for D-H. An attacker could pre-compute a crack on those two prime numbers and consequently be able to read nearly all Internet traffic through those servers.
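
For readers who have not seen D-H written out, here is a minimal sketch of the exchange in Python. The prime `P`, the base `G`, and the function names are assumptions chosen for illustration, not parameters from the paper; real deployments use vetted groups with much larger primes. The point is only that both sides, and every other user of the same software, work modulo the same public prime.

```python
import secrets

# Toy parameters for illustration only (assumptions, not from the paper).
P = 2**127 - 1   # a Mersenne prime, shared publicly by every client and server
G = 3            # public base; assumed here, not checked to be a generator

def make_keypair(p=P, g=G):
    """Return (private exponent x, public value g^x mod p)."""
    private = secrets.randbelow(p - 3) + 2   # uniform in [2, p - 2]
    public = pow(g, private, p)
    return private, public

def shared_secret(their_public, my_private, p=P):
    """Combine the other side's public value with our private exponent."""
    return pow(their_public, my_private, p)

# Each side publishes only its public value; both derive the same secret.
alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()
assert shared_secret(bob_pub, alice_priv) == shared_secret(alice_pub, bob_priv)
```

An eavesdropper sees `P`, `G`, and both public values. Recovering either private exponent from them is the discrete logarithm problem, and that is the computation the attack described here front-loads for a given prime.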

This sort of attack was discussed last spring by the usual security researchers, and in the meantime the researchers behind the paper have been hard at work. The earlier discussion focused on 512-bit D-H primes and the LogJam exploit. Since then, the researchers have focused on the possibility of cracking longer 768- and 1024-bit D-H primes. They conclude that an attacker with the resources to crack a single 1024-bit prime could decrypt traffic to 66% of IPsec VPNs and 26% of SSH servers.
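
The "precompute once per prime, then reuse" structure can be sketched with a much simpler algorithm than the one the researchers analyze. The example below uses baby-step giant-step, which is a swapped-in technique and not the number field sieve from the paper; the tiny prime and the function names are assumptions for illustration. Unlike the real attack, the per-connection step here is not dramatically cheaper than the precomputation, so treat it as a picture of the reuse, not of the cost.

```python
from math import isqrt

def precompute(p, g):
    """One-time work that depends only on the shared prime p and base g."""
    m = isqrt(p - 1) + 1
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)   # remember the smallest j with g^j = value
        value = value * g % p
    giant = pow(g, -m, p)            # g^(-m) mod p (needs Python 3.8+)
    return m, table, giant

def discrete_log(target, p, precomputed):
    """Per-connection work: find x with g^x = target (mod p), or None."""
    m, table, giant = precomputed
    gamma = target % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * giant % p
    return None

# Toy demo: one table for the shared prime, reused against two key exchanges.
p, g = 1_000_003, 2
pre = precompute(p, g)
for a, b in ((123_456, 654_321), (42, 999_999)):
    a_pub, b_pub = pow(g, a, p), pow(g, b, p)
    x = discrete_log(a_pub, p, pre)          # attacker sees only a_pub and b_pub
    assert pow(b_pub, x, p) == pow(b_pub, a, p)   # same shared secret recovered
```

The recovered `x` may differ from the original private exponent by a multiple of the order of `g`, but it still yields the same shared secret, which is all an eavesdropper needs.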

There is a bright side to this revelation: the ability to pre-compute the ‘crack’ on these longer primes is a capability that can only be attained by nation states, as it’s on a scale that has been compared to cracking Enigma during WWII. The hardware alone to accomplish this would cost millions of dollars, and although this computation could be done faster with dedicated ASICs or other specialized hardware, this too would require an enormous outlay of cash. The downside to this observation is, of course, that the capability to decrypt the most prevalent encryption protocols may be in the hands of our governments. This includes the NSA, China, and anyone else with hundreds of millions of dollars to throw at a black project.

86 thoughts on “How The NSA Can Read Your Emails”

  1. Well, with lots of computer cycles the NSA can crack ahead on primes and keep on cracking on larger and larger prime sets as technology advances. Then they will accumulate a dossier that can apply to any newly encrypted information.
    Of course, the problem gets large quickly. Anyone could go to 2048 or 4096 or 8192 bits in their encryption primes. This would render useless anything the NSA could toss at them, until the fabled quantum decryption defeats them. Of course, we can use quantum encryption….

    1. But then, the encryption technologies will only be utilised by the masses when widely acceptable standards say they will. And guess who has a big seat on most of those standard committees?

      This, naturally only affects the masses. Given that alliances like FIVEEYES allow those few nation-States likely to have invested these sorts of resources to funnel masses of internal and external traffic (“I’ll give you your citizen’s bulk data if you give me mine”) anyone seriously running guns will be ahead of the curve. The ones who aren’t have already been caught.

    2. There are a _lot_ of 2048 bit primes, and even with millions of dollars of hardware, precomputing for just one is expensive and time consuming. It’s completely impractical to guess and do this for any prime someone might theoretically choose in the future.

    1. Well Bitcoin does represent a massive amount of distributed processing power. I would be almost surprised if someone wasn’t using it for something above and beyond distributing virtual cookies.

      1. Those hundreds of billions spent, were likely justified to have done this, while bitcoin did it. That is how i get in trouble. Hopefully enough people flame me and tell me to put on a tin hat so they figure I tarred and feathered myself and no further rendition is needed. lol.

    1. The RSA key isn’t the point of attack here, it’s the DH exchange. That’s specified at ciphersuite selection time during the TLS handshake, rather than at keygen time.

      …but you should probably be moving up from 2048-bit RSA keys anyways.

  2. A few things to note:

    1/ The headline probably isn’t accurate as long as you’re using a webmail provider, because most of those have moved to ECDH. Unlike normal DH, ECDH isn’t believed to be impacted by this.

    2/ Of the groups allowed by OpenSSH, only Oakley group 2 is vulnerable. Remove that from your config and you should be fine (although PFS is broken). You can also generate your group yourself, although you’re going to want to start it right before you go on vacation or something.

    3/ Since it didn’t get much mention, I’ll bring it up- IPSEC (actually IKE) specifies group 2 params, which are vulnerable to this attack.

    1. As for point #1…

      Does the encryption type really matter when your own government forces a provider to hand over said emails? Didn’t Google have an issue with this in China a few years back?

      Yeah, I know Google isn’t based in China, but the user was.

      1. Yes, it does, even if it’s only a herd protection.

        By forcing governments to use lawful processes to get access to user data we ensure 1/ that there is a limit to the scale at which they can operate, 2/ that individual applications of the process can be contested (as both Google and Microsoft routinely do), and 3/ that tools like transparency reports and canaries remain effective indicators to the public of how their government is acting.

        Of course, this only applies pressure to the extent that 1/ anybody cares and 2/ democracy works; both of those claims deserve careful and ongoing scrutiny. But we know that in some cases governments respond to pressure on these topics, and we can hope that they will do so again.

        Neither of which is a huge comfort in the moment, of course- but it’s much better than nothing.

    1. 128 hex values, five colour groupings.
      512 bits. Key size? Logjam exploit?
      Digits group: hex; nibbles = ASCII
      11 maroon: 90 / 45f / 89e3e / 7; 90 45 f8 9e 3e 7 = Eøž>
      11 teal: dbba0 / 1 / 806 / 8f; db ba 01 80 68 f = Ûº€h
      16 brown: 8045f / 1a9f3 / 976 /8c0; 80 45 f1 a9 f3 97 68 c0 = Eñ©ó—hÀ
      13 lime: a4 / f83 / 1df8fd / ea; a4 f8 31 df 8f de a = ¤ø1ߏÞ
      8 purple: 37 / 18 / 3 / 035; 37 18 30 35 = 705

      I got nothing. Maybe it’s a prime? Or just pretty?

      1. The full number is;
        9FDB8B8A 004544F0 045F1737 D0BA2E0B 274CDF1A 9F588218
        FB435316 A16E3741 71FD19D8 D8F37C39 BF863FD6 0E3E3006
        80A3030C 6E4C3757 D08F70E6 AA87103

        As rj noted, this is the default D-H key for Apache

        First three bytes of maroon set are the html value for a purple
        First three bytes of teal set are the html value for a goldenrod
        First three bytes of brown set are the html value for a different purple
        First three bytes of lime set are the html value for lime
        First three bytes of purple set are the html value for a dark royal purple

        Still got no idea on the coloursets

  3. There is no such thing as privacy on a network. Rather, the way they have been engineered ensures that privacy is the last thing you may have. Just play nice, and keep your private matters well away.

  4. The signal-to-noise ratio at HAD continues to drop. This is a topic of interest to system administrators that has been covered by approximately four hundred and seventy two tech blogs today. There are hundreds of worthy projects on hackaday.io that have never been covered in the blog. Why not show off the hacks and leave /. to cover the generic tech news?

    1. Sorry, I don’t visit 472 tech blogs. I visit 3-5 websites daily. I enjoyed reading this article. For all the crap on HaD, this is a peculiar complaint. Do not cover big stories because other competing sites will cover them? At most I’m scrolling through 10 posts on HaD a day. Is this really complaint-worthy? Maybe, I’ve made content complaints before… I have a complaint now, your signal to noise analogy is totally garbage.

      1. My comment is here to push back against the trend at HAD to blog about general tech topics rather than focus on their core content. You say you’ve made content complaints before so I assume you agree that the comments are an appropriate venue for those complaints. I’m not overly concerned that you think your complaints were valid and mine are not. Since there is no other way to give feedback to the editors on what I want to see at HAD, I’m using the comments. If they add a Reddit-style rating system I’ll use that and not bother you anymore.

        Sorry you didn’t understand the SNR analogy. You see, signal refers to the content you would like to receive. In my case that would be articles about hacker friendly hardware and the projects people have built with it. Noise would be general tech articles that read like they were lifted from USA Today. So, more fluff pieces or fewer hacks drops the SNR. The math works out.

    2. While I certainly understand the point of the OP, I agree with Teller. Many of us simply do not have time to ingest the content from that many sites in a day.

      I am much more contented to see Brian’s take on this issue than some unknowledgeable junior desk jockey, or worse, an AI reporter.

      Keep up the great work!

  5. “This is the difference between cryptography theory and practice; 92% of the top 1 Million Alexa HTTPS domains use the same two prime numbers for D-H.”
    This is an example of one of the biggest problems in security software engineering: advanced encryption algorithms are worthless if the implementation is poor.

    1. Unfortunately, this was an example of reasonable practice until very recently. It’s very similar to the way in which F4 is used as the public exponent in basically all RSA operations, despite effectively cutting your key size in half.

      1. You can only take your encryption so far before future-proofing comes at the cost of actually being usable today. I don’t think it’s possible to avoid getting screwed by progress, at least not 100% of the time.

      2. It’s actually still arguably a reasonable practice now, because if you pick random primes, you have to make your protocol significantly more complicated in order to exchange them securely, and make sure your opponent isn’t deliberately choosing unsuitable ones. A simpler approach is to choose common primes that are so large they can’t reasonably be precomputed even by a nation-state.

  6. Seems to me the NSA could just use a social engineering approach:
    Hi, AT&T, Google, et al. This is the NSA. In the interests of national security, please hand over your private encryption keys, or functionally equivalent post-decryption backdoor access. Failure to comply could be considered an act of treason. Have a nice day.

    1. The US doesn’t require key escrow, but does require companies to provide user data they already have when presented with a lawful (eg, not overbroad) court order. Typically Google, Microsoft, etc will fight those court orders; I would be shocked if they complied with a request for private key material that could compromise users other than those for which a specific court order had been issued (doing so would not generally be obligatory under US law, and could be illegal). All of this is time consuming and expensive for the NSA, and it feeds transparency reports and counterintelligence engineering efforts at the big tech firms, who generally dislike being patsies for The Man.

      All of this is to say that if the NSA has to go to the service providers and ask them to turn over material, you’re in a way better place than if they can arbitrarily decrypt your traffic.

      1. geremycondra: To me “key-escrow” is a red herring. I still believe that they use BEACONS to back door your router and a specially designed Intel CPU chip to back door your PC. Key-escrow stuff is, to me, to throw you off the scent that they can watch you develop your pre-encrypted messages, etc. BEFORE you run them through your favorite cryptography software, like going directly to the horse’s mouth as it were. You’d be surprised what a National Security Letter will make a chip manufacturer, logistics company (i.e. FedEx, etc.), and router manufacturer do for National Security. And FISA warrants are not always requested if it’s a “rush-job”. They can fix that after the fact – I’m guessing…

        SOTB

        1. While NSA etc clearly can backdoor devices, they also place a premium on not getting detected. Where possible, they will prefer a lower risk method– like passive collection– to anything that might lead to them getting caught red handed.

          Not that they don’t take risks, of course– just that they avoid them when they can.

          1. geremycondra: I’m with you on the “passive collection” methods and not getting caught. However passive is like drinking from a firehose (i.e. Echelon etc.). But the focused approach of “back-dooring” is a “targeted” event and holds down the firehose effect. Getting caught is hard as the PC backdoor is presently part of ALL Intel chips (arguably/allegedly). Allegedly the evolutionary grandchild of MYK-78 VLSI. Good luck finding it in your PC/MAC. Also the Cisco (et al) router BEACON exploit is firmware based. Good luck reverse engineering it with sister-agency FCC running interference preventing any dabbling with your future router firmware. Even if you could get in there, could you analyze firmware code to find the BEACON back-door?

  7. Firstly: PRIME NUMBERS??? Why is that limited math function useful for cryptography? I would think NON-PRIMES would work better. Just wondering…

    Secondly: How to bypass ALL cryptography? How about a secret “back door” to your computer that can see the PRE-encrypted data in real-time or store & forward it in non-real-time? How could someone do that? I mean they couldn’t put a BEACON function in your router’s firmware before it’s delivered to you allowing a back-door to your local LAN; could they? Then being on your LAN UN-authorized by you (and invisible as they need no separate TCP/IP address as they are essentially your router now) they couldn’t access a very secret function (or “chip”) secretly installed in your motherboard at the manufacturer by secret natsec letter prearrangements could they? Of course they wouldn’t activate any of this bigbro stuff without a proper FISA warrant would they?

    Remember that movie SNEAKERS (1992) – “No more secrets…”. And don’t forget ENEMY OF THE STATE (1998). What was Tony Scott smokin’ back in 1998? Life imitating art or vice versa?

    1. There has to be some browser addon that runs a bayesian filter or something to recognize schizophrenic ranting. It could look for triple question marks, all-caps, bold, and italics all appearing in the same paragraph, then place an appropriate warning before the text.

      :V

      1. Thanks for all the anti-schizo tips … However all that I said was from news media not my demented mind,,, :P

        Re: ALCOA… MIT said that aluminum hats un-grounded only enhances RF effect. BTW tin and aluminum are two different metals… Does use of too many ellipses make me look crazy too? :P

  8. Algebra101 says: One cannot “crack” (factor) a prime number. Only composite numbers may be factored. News reports and commentaries which betray an ignorance of this fact are awarded a -1 on their Credibility stat.

    1. If you read the research, or any of the many detailed posts on it, you’d learn that it has nothing to do with factoring prime numbers, it’s about a precomputation attack that you can compute when you know what numbers were selected.

    2. Algebra101: Although others are making it look like you’re making a straw-man argument… I know exactly what you mean. It seems some are a little too preoccupied trying to lead us down NSA-CSS-inspired rabbit holes. The fact that PRIMES really don’t give you many strong cryptography functions is inescapable. A large subset of factored non-primes (i.e. composite numbers) would. To “crack” a method that uses them would take a herculean mathematical feat worthy of the supercomputers at Bluffdale. That’s if they worked out their initial building power problems at NSA UDC.

      Does Diffie-Hellman really use “PRIME” numbers or is it just convenient to strategically use a deliberate misnomer to muddy the waters a bit? I mean go back to your High School math class. If you factor a huge NP you have a matrix of seemingly random numbers (but not really random at all) – use as your public-key basis. Then one could use the divisors as a private-key.

      Before you flame me it is not easy to explain a complex math function like this and sound sane. I am not good at ‘splainin’ stuff like this Lucy… :-P

  9. Very interesting read. On a public key system, or in any encryption where the key can be verified to be valid, some amount of computing power can find the key. We simply don’t know how much computing power is being used on our encryption. It’s paranoid, but if you think that something should not be public knowledge, do not rely on an electronic device; even if the encryption itself may in some cases be strong, there could be a back door. Purchases on the internet and bank transactions seem to be unaffected because whoever takes the money could be easily identified. Our mobile phones constantly give out updates on our position and our network of friends. I love electronic things and technology, but they have to be used carefully; thanks for the reminder.

      1. Yes, I agree OTP is a method of cryptography where the key cannot be verified to be valid (every key will work), but it is impractical. There is an interesting project here on Hackaday that uses it.

  10. Are we not just all assuming the NSA/GCHQ can read anything they damn well please from any of our devices any time, by means we probably haven’t even realised might exist? A brief glance at the snowden docs or back through the history of intelligence might lead anyone to assume it’s pretty futile worrying about them, and securing our shit against “amateur” or criminal hacking is a better use of our time.

    1. I think it’s normal for the potential to be under that level of scrutiny to make people uncomfortable and paranoid, even if it weren’t possible to end up on some kind of no-fly list or other comical orwellian mishap. Not everyone can stop worrying and learn to love the bomb.

  11. The $1,000,000 question is: is the NSA going to be commissioned by the MPAA, the RIAA, and other copyright holders, or even hand over info, for the copyright wars?

    or will the nsa keep the technology and info for the war on terrorism?

  12. “There is a bright side to this revelation: the ability to pre-compute the ‘crack’ on these longer primes is a capability that can only be attained by nation states”

    But they hacked the servers of the US security clearance and such, meaning that basically many hackers only have to hack the US government computers to either get the prime-crack, or to get access to their supercomputers and have them calculate it for you.

    And talking of prime crack, I’m sure you find the physical kind too in government offices :)

    1. “… can only be attained by nation states.”

      Yeah this statement is problematic for too many reasons. There are many, non nation state entities with sufficient resources and the requisite motivation to precompute these hashes. Tens of millions is a drop in the bucket for certain individuals in our nation; especially when compared to the gains to be had from scrubbing the emails of the top ten percent of stock traders and other economic analysts.

  13. Look into the history of AES… what used to be called Rijndael.. There was a competition sponsored by NIST (and the NSA) back in the 90’s to find a successor to 3DES and as I was into crypto at the time, I specifically remember that one of the requirements was that the winning algorithm must have a master key… This fact seems to have conveniently disappeared from the interwebs…. AES has a master key that the NSA has used for years…. no complex breaking of code, just a simple application of the master key…
