Cloudbleed — Your Credentials Cached In Search Engines

In case you are still wondering about the SHA-1 being broken and if someone is going to be spending hundreds of thousands of dollars to create a fake Certificate Authority and sniff your OkCupid credentials, don’t worry. Why spend so much money when your credentials are being cached by search engines?… Wait, what?

A serious combination of bugs, dubbed Cloudbleed by [Tavis Ormandy], lead to uninitialized memory being present in the response generated by the reverse proxies and leaked to the requester. Since these reverse proxies are shared between Cloudflare clients, this makes the problem even worst, since random data from random clients was leaking. It’s sort of like Heartbleed for HTTP requests. The seriousness of the issue can be fully appreciated in [Tavis] words:

“The examples we’re finding are so bad, I cancelled some weekend plans to go into the office on Sunday to help build some tools to cleanup. I’ve informed cloudflare what I’m working on. I’m finding private messages from major dating sites, full messages from a well-known chat service, online password manager data, frames from adult video sites, hotel bookings. We’re talking full https requests, client IP addresses, full responses, cookies, passwords, keys, data, everything.”

sexAccording to Cloudflare, the leakage can include HTTP headers, chunks of POST data (perhaps containing passwords), JSON for API calls, URI parameters, cookies and other sensitive information used for authentication (such as API keys and OAuth tokens). An HTTP request to a Cloudflare web site that was vulnerable could reveal information from other unrelated Cloudflare sites.

Adding to this problem, search engines and any other bot that roams free on the Internet, could have randomly downloaded this data. Cloudflare released a detailed incident report explaining all the technicalities of what happened and how they fixed it. It was a very quick incident response with initial mitigation in under 47 minutes. The deployment of the fix was also quite fast. Still, while reading the report, a sense that Cloudflare downplayed this issue remains. According to Cloudflare, the earliest date that this problem could have started is 2016-09-22 and the leak went on until 2017-02-18, five months, give or take.

Just to reassure the readers and not be alarmist, there is no evidence of anyone having exploiting what happened. Before public exposure, Cloudflare worked in proximity with search engines companies to ensure memory was scrubbed from search engine caches from a list of 161 domains they had identified. They also report that Cloudflare has searched the web (!), in sites like Pastebin, for signs of leaks and found none.

On the other hand, it might be very well impossible to know for sure if anyone has a chunk of this data cached away somewhere in the aether. It’s impossible to know. What we would really like to know is: does [Tavis] get the t-shirt or not?

25 thoughts on “Cloudbleed — Your Credentials Cached In Search Engines

        1. It’s called a Devil’s Tooth, and apparently is very hot in taste. That photo, by the way, is one of the very best of this species – in the wild it rarely presents that beautifly.

    1. Yeah, I was thrown off pretty badly by the picture, too. I’ve never heard of trypophobia before, but I’ve totally been averse to that sort of pattern for a long time. Weird that that’s actually a thing other than my own quirkiness.

    1. Well, it’s like asking someone to try and not think about pink elephants… He came up with the name and then said he’s not going to use it, but he already invented it… it’s sort of a conundrum. But thanks for the tip [g].

    1. Security requires that you know where all your buffers are and what state they are left in when your done. Any language abstraction that does not explicitly mange this (as bare C does) will have issues. The code that this bug lived in was not written in C/C++, like most higher level languages it ran on top of C/C++. So unless your magic new language gets translated directly from MAGIC to machine code, the likely hood that its actually running on top of C/C++ is almost 100%.

      1. Rust does not sit on top of C/C++, it can be as low-level as C and as unsafe too. But it gives tools to avoid common C/C++ pitfalls while having a “magic” level of abstraction, if I understood well your requirements.

      2. This is not the case for Rust at all.

        I recommend you check it out, in a world with a dozen “best next language” a day Rust is doing rather well.

        In this case where you are parsing a string in c/c++, you might iterate over a collection in rust.

        (In this case you no longer need to worry about passing around the end point as that will be in with the collection).

        A good example is that buffer overflow which esentially boils down to


        In rust (you could do that) you would do


        Which could return a some or a none type.

        fn main() {
        let mut coll = vec!();
        match coll.get(0) {
        Some(chr) => println!(“That is a {}.”, chr),
        None => println!(“We done fucked up”)

        Omit the None and Rust will complain that you must cover a None return. (In general pattern matching should be exhaustive and Rust will warn you if you if it isn’t).

        Notice the mut before coll? Otherwise coll would be immutable and coll.push(1) would not compile.

        That’s important because in any scope there can only be 0 or 1 mutable references to a variable (and there goes memory race conditions).

        These compile time checks in theory should have little impact on the binary. So yes C/C++ binary may be smaller or faster but only at the sake of having bugs like this article.

        If you really need to do something unsafe then you can do it in an unsafe { … } scope.

    2. How about all of you actually read the blog post about how this worked out?

      Absolutely none of the buggy applications were written in C, and this is a perfect argument against using safe-ish languages like rust.

      They were written in another, higher level language which was translated into C as an intermediary, which was then compiled. This means that the C code was auto-generated and was never, ever seen by human eyes. The pointer overrun issue, which could have been corrected with a >= instead of a ==, went unseen because of this. All of this occurred because of an unfamiliarity with the high level language being used, and how it would translate down into generated C.

      It could be argued that had they not opted to go for the safe-ish higher level language they did, this kind of issue would have been caught and fixed as a typical thing to look for in C. And that’s the real danger with appealing to a ‘safer’ language. If it’s not ABSOLUTELY rock solid in how it translates then you can very easily create hidden bugs like this by not seeing how the high level concepts that are supposed to save you trouble get mechanically and rigidly implemented as lower level ones.

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.