Steganography in xkcd comics without the img alt tag

Inspired by a recent Hackaday post, [austin] decided to try his hand at steganography. Steganography, or ‘concealed writing’, has come a long way from ancient Greek slaves/couriers shaving their heads, tattooing a message on their scalps, and regrowing their hair. We recently saw a music file masquerading as a picture of a kitten, but that method of hiding data required running a Ruby script. [austin] thought steganography would be a great way to hone his JavaScript skills, so he made an image encoder and decoder purely in JS and HTML.

Like the previous incarnation, [austin]’s work takes a regular .PNG image file and hides stuff in the pixel data. The lowest bits of each pixel are modified (three bits each from the red and blue channels, two bits from the green; a good choice, since the human eye is very sensitive to green), so eight bits, or one byte, of the embedded file fit into each pixel of the .PNG image.
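To make the bit budget concrete, here is a minimal JavaScript sketch of packing one byte into a single pixel with a 3-2-3 split. The function names and the choice of which payload bits go to which channel are assumptions made for illustration; [austin]’s encoder may well lay the bits out differently.

```javascript
// Hypothetical helpers: the bit layout below is an assumption,
// not necessarily the one used by [austin]'s encoder.

// Hide one payload byte in the low bits of an [r, g, b] pixel:
// 3 bits in red, 2 bits in green, 3 bits in blue.
function embedByte([r, g, b], byte) {
  const redBits = (byte >> 5) & 0b111;  // top 3 bits of the byte
  const greenBits = (byte >> 3) & 0b11; // middle 2 bits
  const blueBits = byte & 0b111;        // bottom 3 bits
  return [
    (r & 0b11111000) | redBits,
    (g & 0b11111100) | greenBits,
    (b & 0b11111000) | blueBits,
  ];
}

// Recover the byte by reading the same low bits back out.
function extractByte([r, g, b]) {
  return ((r & 0b111) << 5) | ((g & 0b11) << 3) | (b & 0b111);
}

const stego = embedByte([255, 255, 255], 0x41); // hide the letter "A" in a white pixel
console.log(stego);              // [ 250, 252, 249 ]
console.log(extractByte(stego)); // 65
```

With this split, red and blue can shift by at most 7 levels out of 255 and green by at most 3, which is why the change is hard to spot by eye.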

For an example, [austin] embedded some stuff inside the xkcd comic underneath this post’s title. Even though the image is mostly white, we can’t see anything wrong with the colors. If you’d like to decode the message, [austin] put his encoder and decoder up on GitHub. Feel free to take a shot at it.

Comments

  1. M4CGYV3R says:

    Is there something that prevents using the Alt-text tags in HTML after doing the stego conversion?

    I was under the impression alt text had nothing to do with the image itself and was just an HTML feature.

    • florin says:

      You’re right, the image has nothing to do with the HTML. The title of this submission is intentionally misleading as XKCD has absolutely nothing to do with this project, except that Tus used an XKCD image as a demonstration of what he did. Brian’s analogy that hiding data in XKCD images is somehow comparable to hiding text in HTML IMG ALT properties is laughable.

      • It was a poke at Randall Munroe’s use of the alt tag for a ‘second punchline’ in xkcd comics. So yeah, a joke, but I thought it would have been better received.

        • florin says:

          I’m sorry, Brian, I’m allergic to using this kind of reference to attract readers. I think I may have overdosed on the Internet… After following this blog for a while, I understand that it’s not common to do that here, but I felt like I had to say something about it. Sorry if I sounded like a dick, keep up the good work!

    • Eirinn says:

      You should be able to use the alt tag as per usual.

  2. florin says:

    It should be noted that stego is very easy to detect in drawings (like the XKCD), because it is often obvious that some pixels should be white but aren’t. Yes, the password does encrypt the data, and this is an amazingly simple, working and praise-worthy example made by Tuseroni, but anybody who wants to use steganography seriously should read up on which images to use and how secure and detectable the result is (Wikipedia is a good place to start).

    Wonderful job, Tus!

  3. War_Spigot says:

    Could you hide the data for a smaller version of the picture inside of the picture’s data?

    • Urza9814 says:

      You could probably hide the data for the picture itself inside the picture. It’s black and white — one bit per pixel. And you’re taking eight bits per pixel to hide your data. Hell, you could maybe even store a LARGER version of the same image.

      • austin says:

        as an example, in one case i encoded a text file into an image, put the image, the encoder, the decoder, and a text file with the password to that image in a zip, and encoded THAT into another image.

        it has to do largely with how many pixels are in the cover image. so if an image had 1024×768 resolution it could hold (1024*768) = 786,432 bytes, or ~786 kilobytes. the upper bound for storing data is set by how many pixels the image has.

        encoding in audio should allow for bigger files, video even bigger still (but i dont know if i can do video since there doesnt seem to be an api for writing to video tags like there is for audio, the processing of video would be simpler though because it would be just like processing for image but a lot more of them.)
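The capacity arithmetic in this thread can be checked with a few lines of JavaScript. This is only a sketch: `capacityBytes` is a hypothetical helper, and it assumes one payload byte per pixel (3 + 2 + 3 bits) with no space reserved for a header. The second half tests Urza9814’s idea under the further assumption that the comic is stored as raw, uncompressed 1-bit pixels.

```javascript
// Hypothetical helper: payload capacity of a cover image,
// assuming 8 hidden bits per pixel and no header overhead.
function capacityBytes(width, height, bitsPerPixel = 8) {
  return Math.floor((width * height * bitsPerPixel) / 8);
}

const bytes = capacityBytes(1024, 768);
console.log(bytes);        // 786432
console.log(bytes / 1000); // 786.432 (kilobytes)
console.log(bytes / 1024); // 768 (kibibytes)

// A raw 1-bit-per-pixel copy of the same 1024x768 image (an assumption;
// real comics are anti-aliased and usually stored compressed).
const oneBitCopyBytes = Math.ceil((1024 * 768) / 8);
console.log(oneBitCopyBytes);         // 98304
console.log(bytes / oneBitCopyBytes); // 8
```

Under those assumptions the cover image has eight times the room a 1-bit copy of itself would need, so a larger black-and-white version would fit too.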

    • it’s just bytes. You could hide anything. More realistically, if you were trying to be sneaky, you wouldn’t hide the actual data there. You’d hide a URL (probably a shortened one) to the actual data you wanted to hide (which would of course be encrypted).

      Example, a typical “bit.ly” shortened URL is http://bit.ly/xxxxxxxx where the x’s represent a hash used to reference the actual URL. If you wanted to be sneaky, your algorithm could specify that it’s always a bit.ly URL, so you could leave off the “http://bit.ly/” part. Now, you’re only having to store 6 or so bytes of data across the entire image, plus whatever header data you need to identify the image as having your stuff hidden in it and what version of hiding system you used. Figure maybe 10-12 bytes of data would do it.

      Of course, you wouldn’t use bit.ly. You’d make and host your own shortener, and it need not be a short domain name because that’s not stored in the image. You could even ROTATE the domain name regularly so that only people getting updates on a regular basis would be able to have their software follow the path to the actual hidden data.
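As a rough sketch of how small such a payload could be, the JavaScript below packs a marker, a version number, a length and the short-URL slug into a byte array. Everything about the layout is made up for illustration: the two-byte “SG” marker, the field order and the names `packSlug` and `unpackSlug` are assumptions, not part of any existing tool.

```javascript
// Hypothetical payload layout (an assumption for this sketch):
// 2-byte marker, 1-byte version, 1-byte slug length, then the ASCII slug.
const MAGIC = [0x53, 0x47]; // "SG", an arbitrary marker

function packSlug(slug, version = 1) {
  const body = Array.from(slug, (ch) => ch.charCodeAt(0));
  if (body.length > 255) throw new RangeError("slug too long");
  if (body.some((code) => code > 0x7f)) throw new RangeError("slug must be ASCII");
  return Uint8Array.from([...MAGIC, version, body.length, ...body]);
}

function unpackSlug(bytes) {
  // No marker means this image carries no payload in this format.
  if (bytes[0] !== MAGIC[0] || bytes[1] !== MAGIC[1]) return null;
  const length = bytes[3];
  const slug = String.fromCharCode(...bytes.slice(4, 4 + length));
  return { version: bytes[2], slug };
}

const payload = packSlug("xxxxxxxx");
console.log(payload.length);      // 12
console.log(unpackSlug(payload)); // { version: 1, slug: 'xxxxxxxx' }
```

An eight-character slug plus a four-byte header comes to 12 bytes, in line with the estimate above, and the decoder would prepend whatever domain is current before fetching.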

  4. orenbeck says:

    Sometimes, more is learned in the creation of an excellent hack such as this, than is first appreciated. The same sense of adding a whole realm and its derivatives to our Grimoire is to be found by studying how they did it. Points to austin for gifting Hackerdom with a touchstone primer.

    When any of us master replicating someone else’s implementation, the experience points we gain are priceless. If nothing else, for the addictive buzz in our Egoboost. It’s that way for me. Might not we call that Egoboo affect alteration a-mindhacking multiplier?

    I’d even go so far as to make it a challenge to us at replicating some featured hacks.. possibly as a cloning to cement the “how-to” in our minds? Or better still- OUR personal best at going further on their path..

    Now I’ve got to block out some time for replicating how this Hack is best applied in my uses for it:}

  5. I love the way he’s implemented it with Javascript, because it means you could have a very simple plugin for most browsers that would overlay the hidden message across the image as you viewed websites. You could even have a javascript file hosted on a secured server somewhere and reference it with a resource tag on the web page, so that only users with proper access (a token cookie or a certificate, for example) would load the javascript to reveal the text, while everyone else would get an essentially empty javascript file.

    This would be an ideal way to have a “hidden” web underneath the current web without necessarily having to hide the servers. I think if you were to do it for real, you’d want to obscure things a bit more, like having the identifier that indicates content exists (and where it starts, and which algorithm to use for decryption) be located at an offset byte computed with a salted hash of the image itself, so that without the salt you couldn’t even programmatically identify which images on the net contained the hidden content.

    @M4CGYV3R – I think the reference to alt text is simply a way of saying that he doesn’t mean the pseudo-steganographic (oh yeah, I made up that word) way he uses alt tags to add to the humor or message in his comics. He’s saying this is “real” steganography.

  6. S says:

    This is what it looks like with auto gamma adjusted:

    I realize that true stealth wasn’t the author’s intention. It’s interesting as a proof of concept and will certainly fool most casual, unaware viewers, but not even the simplest systematic scan.

    • S says:
    • As you say, he wasn’t looking to be super secret with it. As a proof of concept it’s awesome. If you were doing it in a way you wanted real secrecy, you’d likely use far more complex images, and your algorithm would be set up to skip “pure” color pixels (let’s say any of the named colors in HTML), especially white and black. You’d probably also want to avoid some ranges common in flesh tones.

      • Dax says:

        The error is least noticeable in pure white and black because displays tend to “clip” the black end, and the relative error to a full white is small.

        The signal is most noticeable in the midtones, where it looks like extra noise that shouldn’t be there.
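For anyone who wants to experiment with the “skip the pure pixels” suggestion, here is one way it could be sketched in JavaScript. The rule itself (skip anything whose upper bits are all ones or all zeros) and the name `isUsable` are assumptions for illustration. The important detail is that the test looks only at the upper bits that a 3-2-3 embedding never touches, so the encoder and the decoder always agree on which pixels were skipped.

```javascript
// Hypothetical skip rule: ignore near-white and near-black pixels.
// Only the untouched upper bits are inspected (upper 5 of red and blue,
// upper 6 of green), so writing the low bits cannot change the verdict.
function isUsable([r, g, b]) {
  const hi = [r >> 3, g >> 2, b >> 3];
  const nearWhite = hi[0] === 31 && hi[1] === 63 && hi[2] === 31;
  const nearBlack = hi[0] === 0 && hi[1] === 0 && hi[2] === 0;
  return !(nearWhite || nearBlack);
}

console.log(isUsable([255, 255, 255])); // false
console.log(isUsable([0, 0, 0]));       // false
console.log(isUsable([128, 64, 200]));  // true
```

Whether skipping white and black actually helps is open to debate, as Dax’s reply shows; the sketch only demonstrates that such a rule can be applied consistently on both ends.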

    • austin says:

      hmm interesting. however there is one saving grace, they do still have to get the password, and they have to use the same random number generator…but i always figured if they found the data it wouldnt be much to figure out the order just from looking for the header of the underlying data (which they could probably just try every header there’s not that many)

      but you are correct there is clearly more work to be done here
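The password-plus-random-number-generator idea austin mentions can be sketched roughly as follows: derive a seed from the password, then use a seeded generator to shuffle the order in which pixels are visited. The hash (FNV-1a style), the generator (mulberry32 style) and all function names here are stand-ins chosen for this example. The post does not say which generator the real encoder uses, and neither of these is cryptographically secure.

```javascript
// FNV-1a-style hash: turn the password into a 32-bit seed.
// An assumption for this sketch, not a cryptographic hash.
function seedFromPassword(password) {
  let h = 0x811c9dc5;
  for (let i = 0; i < password.length; i++) {
    h ^= password.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32-style generator: deterministic floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle of the pixel indices, driven by the seeded generator.
// The same password always yields the same visiting order.
function pixelOrder(pixelCount, password) {
  const rand = mulberry32(seedFromPassword(password));
  const order = Array.from({ length: pixelCount }, (_, i) => i);
  for (let i = pixelCount - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

console.log(pixelOrder(10, "hunter2"));
```

The encoder writes byte k of the payload into pixel `order[k]`, and the decoder rebuilds the same order from the password. As austin notes, this only scrambles the order; someone who has already spotted the modified pixels could still hunt for a known file header.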

  7. crow1170 says:

    Haha! I see what you did there. I had to install blender to do it, but that’s quite clever.

  8. Mojo says:

    I’ve been working on something like this for a client, that uses the canvas element to pull stego’d js from an image, bypassing the need to compromise the script scanner on the target device.

    • austin says:

      hmm curious what the reason is for that.
      it would allow you to put a JS file on someones server but you couldnt link straight to it, nor would you be able to use it without having a javascript script running on the page to decode it from the image. also the canvas cannot read images cross origin without permission from the server.

      however you could make a polyglot gif that is simultaneously a gif and a js file, that gif could then be directly linked to and used with
      (http://en.wikipedia.org/wiki/Polyglot_%28computing%29)

  9. YouCantSeeMe says:

    But won’t this interfere with the steggo data that Randall already hides in the images?

  10. Headbonk says:

    Sounds like the thing the guys on Spore came up with to store creature data in the images

  11. Andreas says:

    I made a comparison similar to S’s:
    http://foto.photosphere.nl/directories/misc/comparison%20open%20source%20violin.png
    The first is the original comic, the second is the comic with hidden data, third is original with sharpening on 99, fourth is comic with hidden data and also sharpening on 99.

  12. Truth says:

    It made me laugh when I missed that the data was also compressed, probably the default for a Blender3D file. I really do love the UNIX “file” command and its associated magic library.

    $ file vziYDD7k.part
    vziYDD7k.part: gzip compressed data, from NTFS filesystem (NT)
    $ zcat vziYDD7k.part > vziYDD7k
    $ file vziYDD7k
    vziYDD7k: Blender3D, saved as 32-bits little endian with version 2.52.0005
    $
