Steganography In Xkcd Comics Without The Img Alt Tag

March 10, 2012

Inspired by a recent Hackaday post, [austin] decided to try his hand at steganography. Steganography, or ‘concealed writing’, has come a long way from ancient Greek slaves/couriers shaving their heads, tattooing a message on their scalps, and regrowing their hair. We recently saw a music file masquerading as a picture of a kitten, but that method of hiding data required running a Ruby script. [austin] thought steganography would be a great way to hone his JavaScript skills, so he made an image encoder and decoder purely in JS and HTML.

Like the previous incarnation, [austin]’s work takes a regular .PNG image file and hides stuff in the pixel data. A few of the lower bits for each pixel are modified (three bits from the red and blue, two bits from the green – a good choice, the human eye is very sensitive to green) and a file is embedded inside the .PNG image.
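
The bit allocation described above (three low bits of red, two of green, three of blue) adds up to exactly one byte per pixel. The following is a minimal Python sketch of that idea, not [austin]’s JavaScript: the function names, the way a byte is split across the channels, and the sequential pixel order are all assumptions made for illustration.

```python
def embed_byte(pixel, byte):
    """Hide one byte in an (r, g, b) pixel: 3 bits in red, 2 in green, 3 in blue."""
    r, g, b = pixel
    r = (r & 0b11111000) | (byte >> 5)            # top 3 bits of the byte
    g = (g & 0b11111100) | ((byte >> 3) & 0b11)   # middle 2 bits
    b = (b & 0b11111000) | (byte & 0b111)         # bottom 3 bits
    return (r, g, b)


def extract_byte(pixel):
    """Reassemble the hidden byte from the low bits of each channel."""
    r, g, b = pixel
    return ((r & 0b111) << 5) | ((g & 0b11) << 3) | (b & 0b111)


def embed(pixels, payload):
    """Write the payload into the first len(payload) pixels (assumed order)."""
    if len(payload) > len(pixels):
        raise ValueError("payload does not fit in the cover image")
    out = list(pixels)
    for i, byte in enumerate(payload):
        out[i] = embed_byte(out[i], byte)
    return out


def extract(pixels, length):
    """Read `length` hidden bytes back out of the pixel list."""
    return bytes(extract_byte(p) for p in pixels[:length])
```

With this split, red and blue can each shift by at most 7 levels out of 255 and green by at most 3, which is why the change is hard to see by eye.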

For an example, [austin] embedded some stuff inside the xkcd comic underneath this post’s title. Even though the image is mostly white, we can’t see anything wrong with the colors. If you’d like to decode the message, [austin] put his encoder and decoder up on GitHub. Feel free to take a shot at it.

28 thoughts on “Steganography In Xkcd Comics Without The Img Alt Tag”

M4CGYV3R says:

March 10, 2012 at 8:58 am

Is there something that prevents using the Alt-text tags in HTML after doing the stego conversion?

I was under the impression alt text had nothing to do with the image itself and was just an HTML feature.

1. florin says:
  
  March 10, 2012 at 9:14 am
  
  You’re right, the image has nothing to do with the HTML. The title of this submission is intentionally misleading as XKCD has absolutely nothing to do with this project, except that Tus used an XKCD image as demonstration of what he did. Brian’s analogy that hiding data in XKCD images is somehow comparable to hiding text in HTML IMG ALT properties is laughable.
  
  1. Brian Benchoff says:
    
    March 10, 2012 at 11:02 am
    
    It was a poke at Randall Munroe’s use of the alt tag for a ‘second punchline’ in xkcd comics. So yeah, a joke, but I thought it would have been better received.
    
    1. florin says:
      
      March 10, 2012 at 11:47 am
      
      I’m sorry, Brian, I’m allergic to using these kinds of references to attract readers. I think I may have overdosed on the Internet… After following this blog for a while, I understand that it’s not common to do that here, but I felt like I had to say something about it. Sorry if I sounded like a dick, keep up the good work!
      
2. Eirinn says:
  
  March 10, 2012 at 9:43 am
  
  You should be able to use the alt tag as per usual.
  
  1. No One says:
    
    March 10, 2012 at 12:22 pm
    
    Randy uses the “title” tag, not the “alt” tag to add an extra punchline to jokes. Get it right, people.
    
  2. John Bokma says:
    
    March 11, 2012 at 1:11 pm
    
    @No One “Randy uses the “title” tag, not the “alt” tag to add an extra punchline to jokes. Get it right, people.”
    
    Neither are tags, they are attribute names.
    
florin says:

March 10, 2012 at 9:09 am

It should be noted that it is very easy to detect stego in drawings (like the XKCD), because it is often obvious that some pixels should be white, but they aren’t. Yes, the password does encrypt the data and this is an amazingly simple, working and praise-worthy example made by Tuseroni, but if anybody wants to seriously use steganography, they should read up on which images should be used and how secure and detectable the method is (Wikipedia is a good place to start).

Wonderful job, Tus!

1. mike says:
  
  March 12, 2012 at 4:03 pm
  
  yeah, if you use any image with solid colors, or the original is available in some way, then it’s trivial to detect that something was done to it
  
  you may need the password to extract the exact data, but it’s obvious something is hidden
  
  that’s why it’s best to delete the originals after hiding the data in them
  
  ive also used http://sourceforge.net/projects/hide-in-picture/ to do the same task
  
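
The point florin and mike make in this thread can be made concrete: in a flat line drawing most pixels are exactly white or exactly black, so pixels that are almost, but not quite, pure stand out. Here is a rough Python sketch of such a check. The function name and the tolerance of 7 levels (the largest change a 3-bit modification can cause) are assumptions, and an anti-aliased drawing will legitimately contain some near-pure pixels, so this is a hint rather than proof.

```python
def suspicious_fraction(pixels, tolerance=7):
    """Fraction of (r, g, b) tuples that sit within `tolerance` of pure
    white or pure black on every channel without being exactly pure."""
    def near_but_not_pure(pixel):
        near_white = all(255 - c <= tolerance for c in pixel)
        near_black = all(c <= tolerance for c in pixel)
        pure = pixel in ((255, 255, 255), (0, 0, 0))
        return (near_white or near_black) and not pure

    if not pixels:
        return 0.0
    return sum(near_but_not_pure(p) for p in pixels) / len(pixels)
```

A clean black-and-white drawing should score close to zero, while one with data in its low bits will score much higher.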
War_Spigot says:

March 10, 2012 at 9:11 am

Could you hide the data for a smaller version of the picture inside of the picture’s data?

1. Urza9814 says:
  
  March 10, 2012 at 9:50 am
  
  You could probably hide the data for the picture itself inside the picture. It’s black and white — one bit per pixel. And you’re taking eight bits per pixel to hide your data. Hell, you could maybe even store a LARGER version of the same image.
  
  1. austin says:
    
    March 10, 2012 at 10:05 am
    
    as an example, in one case i encoded a text file into an image, put the image, the encoder, the decoder, and a text file with the password to that image in a zip, and encoded THAT into another image.
    
    it has to do largely with how many pixels are in the cover image. so if an image had 1024×768 resolution it could hold (1024*768) bytes, or 786,432 bytes (~786 kilobytes). the upper bound for storing data depends on how many pixels the image has.
    
    encoding in audio should allow for bigger files, video even bigger still (but i don’t know if i can do video since there doesn’t seem to be an api for writing to video tags like there is for audio, the processing of video would be simpler though because it would be just like processing for image but a lot more of them.)
    
2. Andrew Pollack says:
  
  March 10, 2012 at 9:58 am
  
  it’s just bytes. You could hide anything. More realistically, if you were trying to be sneaky, you wouldn’t hide the actual data there. You’d hide a URL (probably a shortened one) to the actual data you wanted to hide (which would of course be encrypted).
  
  Example, a typical “bit.ly” shortened URL is http://bit.ly/xxxxxxxx where the x’s represent a hash used to reference the actual URL. If you wanted to be sneaky, your algorithm could specify that it’s always a bit.ly URL, so you could leave off the “http://bit.ly/” part. Now, you’re only having to store 6 or so bytes of data across the entire image, plus whatever header data you need to identify the image as having your stuff hidden in it and what version of hiding system you used. Figure maybe 10-12 bytes of data would do it.
  
  Of course, you wouldn’t use bit.ly. You’d make and host your own shortener, and it need not be a short domain name because that’s not stored in the image. You could even ROTATE the domain name regularly so that only people getting updates on a regular basis would be able to have their software follow the path to the actual hidden data.
  
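
The capacity arithmetic in this thread is easy to check. With 3 + 2 + 3 = 8 hidden bits per pixel, the cover image holds one byte per pixel, and a 1-bit black-and-white copy of the same picture needs only one eighth of that. The sketch below uses hypothetical function names and ignores any header or password overhead the real encoder adds.

```python
def capacity_bytes(width, height, bits_per_pixel=8):
    """Upper bound on hidden payload size for a cover image."""
    return width * height * bits_per_pixel // 8


def one_bit_image_bytes(width, height):
    """Size of an uncompressed 1-bit-per-pixel image, rounded up to whole bytes."""
    return (width * height + 7) // 8


print(capacity_bytes(1024, 768))                                   # 786432
print(capacity_bytes(1024, 768) // one_bit_image_bytes(1024, 768)) # 8
```

So a 1024×768 cover has room for eight uncompressed 1-bit copies of itself, which supports the suggestion that a larger version of the same drawing could fit.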
orenbeck says:

March 10, 2012 at 9:11 am

Sometimes, more is learned in the creation of an excellent hack such as this, than is first appreciated. The same sense of adding a whole realm and its derivatives to our Grimoire is to be found by studying how they did it. Points to austin for gifting Hackerdom with a touchstone primer.

When any of us master replicating someone else’s implementation, the experience points we gain are priceless. If nothing else, for the addictive buzz in our Egoboost. It’s that way for me. Might not we call that Egoboo affect alteration a-mindhacking multiplier?

I’d even go so far as to make it a challenge to us at replicating some featured hacks.. possibly as a cloning to cement the “how-to” in our minds? Or better still- OUR personal best at going further on their path..

Now I’ve got to block out some time for replicating how this Hack is best applied in my uses for it:}

Andrew Pollack says:

March 10, 2012 at 9:19 am

I love the way he’s implemented it with Javascript, because it means you could have a very simple plugin for most browsers that would overlay the hidden message across the image as you viewed websites. You could even have a javascript file hosted on a secured server somewhere and reference it with a resource tag on the web page, so that only users with proper access (a token cookie or a certificate, for example) would load the javascript to reveal the text, while everyone else would get an essentially empty javascript file.

This would be an ideal way to have a “hidden” web underneath the current web without necessarily having to hide the servers. I think if you were to do it for real, you’d want to obscure things a bit more, like having the identifier that indicates content exists (and where it starts, and which algorithm to use for decryption) be located at an offset byte computed with a salted hash of the image itself, so that without the salt you couldn’t even programmatically identify which images on the net contained the hidden content.

@M4CGYV3R – I think the reference to alt text is simply a way of saying that he doesn’t mean the pseudo-steganographic (oh yeah, I made up that word) way he uses alt tags to add to the humor or message in his comics. He’s saying this is “real” steganography.

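
The salted-offset idea in the comment above could look roughly like this in Python. Everything here is an assumption made for illustration: the function name, the use of HMAC-SHA256 as the keyed hash (swapped in for a plain salted hash), and hashing only the high bits of each channel so the result does not change once data has been written into the low bits.

```python
import hashlib
import hmac


def header_offset(pixels, salt, header_len):
    """Pick the pixel index where a hidden header of `header_len` pixels starts.

    `pixels` is a list of (r, g, b) tuples and `salt` is a bytes secret.
    Only the bits that embedding leaves untouched are hashed.
    """
    stable = bytes(
        channel
        for r, g, b in pixels
        for channel in (r & 0xF8, g & 0xFC, b & 0xF8)
    )
    digest = hmac.new(salt, stable, hashlib.sha256).digest()
    slots = len(pixels) - header_len + 1
    if slots <= 0:
        raise ValueError("image too small for the header")
    return int.from_bytes(digest, "big") % slots
```

Without the salt, a scanner would have to try every possible offset in every image it examines, which is the extra obscurity the comment is after.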
S says:

March 10, 2012 at 9:32 am

This is what it looks like with auto gamma adjustment:

I realize that true stealth wasn’t the author’s intention, and it’s interesting as a proof of concept and will certainly fool most casual, unaware viewers, but not even the simplest systematic scan.

1. S says:
  
  March 10, 2012 at 9:43 am
  
  Err… the link got lost?
  http://img824.imageshack.us/img824/2097/violinadjusted.png
  
2. Andrew Pollack says:
  
  March 10, 2012 at 9:47 am
  
  As you say, he wasn’t looking to be super secret with it. As a proof of concept it’s awesome. If you were doing it in a way you wanted real secrecy, you’d likely use far more complex images, and your algorithm would be set up to skip “pure” color pixels (let’s say any of the named colors in html), especially white and black. You probably also want to avoid some ranges common in flesh tones.
  
  1. Dax says:
    
    March 10, 2012 at 6:48 pm
    
    The error is least noticeable in pure white and black because displays tend to “clip” the black end, and the relative error to a full white is small.
    
    The signal is most noticeable in the midtones, where it looks like extra noise that shouldn’t be there.
    
3. austin says:
  
  March 10, 2012 at 10:18 am
  
  hmm interesting. however there is one saving grace: they do still have to get the password, and they have to use the same random number generator… but i always figured that if they found the data it wouldn’t take much to figure out the order just by looking for the header of the underlying data (they could probably just try every header, there’s not that many)
  
  but you are correct there is clearly more work to be done here
  
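
austin’s remark about needing both the password and the same random number generator suggests that the pixel order is derived from the password. A minimal Python sketch of that idea follows. It is not his implementation: the function name and the SHA-256 seeding are assumptions, and Python’s `random` module (a Mersenne Twister) is not cryptographically secure, so this only illustrates the mechanism.

```python
import hashlib
import random


def pixel_order(password, pixel_count):
    """Return a password-dependent permutation of pixel indices."""
    seed = int.from_bytes(hashlib.sha256(password.encode("utf-8")).digest(), "big")
    order = list(range(pixel_count))
    random.Random(seed).shuffle(order)  # same seed and generator give the same order
    return order
```

The encoder would write payload byte `i` into pixel `order[i]` and the decoder would read them back in the same sequence. As the thread points out, scattering the data this way hides its order but not its presence.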
crow1170 says:

March 11, 2012 at 3:12 am

Haha! I see what you did there. I had to install blender to do it, but that’s quite clever.

1. austin says:
  
  March 11, 2012 at 12:11 pm
  
  yeah i thought id be meta with it
  btw the project for that can be found here:
  http://www.thingiverse.com/thing:3193
  i forgot to cite them in the project.
  
Mojo says:

March 11, 2012 at 4:51 am

I’ve been working on something like this for a client, that uses the canvas element to pull stego’d js from an image, bypassing the need to compromise the script scanner on the target device.

1. austin says:
  
  March 11, 2012 at 12:22 pm
  
  hmm curious what the reason is for that.
  it would allow you to put a JS file on someone’s server, but you couldn’t link straight to it, nor would you be able to use it without having a javascript script running on the page to decode it from the image. also, the canvas cannot read images cross origin without permission from the server.
  
  however you could make a polyglot gif that is simultaneously a gif and a js file, that gif could then be directly linked to and used with
  (http://en.wikipedia.org/wiki/Polyglot_%28computing%29)
  
YouCantSeeMe says:

March 11, 2012 at 7:51 am

But won’t this interfere with the steggo data that Randall already hides in the images?

Headbonk says:

March 11, 2012 at 12:39 pm

Sounds like the thing the guys on Spore came up with to store creature data in the images

Andreas says:

March 12, 2012 at 12:45 am

I made a comparison similar to S’s:
http://foto.photosphere.nl/directories/misc/comparison%20open%20source%20violin.png
The first is the original comic, the second is the comic with hidden data, third is original with sharpening on 99, fourth is comic with hidden data and also sharpening on 99.

Truth says:

March 12, 2012 at 3:47 am

It made me laugh when I missed that the data was also compressed, probably the default for a Blender3D file. I really do love the UNIX “file” command and its associated magic library.

$ file vziYDD7k.part
vziYDD7k.part: gzip compressed data, from NTFS filesystem (NT)
$ zcat vziYDD7k.part > vziYDD7k
$ file vziYDD7k
vziYDD7k: Blender3D, saved as 32-bits little endian with version 2.52.0005
$

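
The `file` session above works by matching signature bytes at the start of the data. The same two checks can be sketched in Python. The helper names are hypothetical and only two signatures are covered: the gzip magic bytes `1f 8b` and the ASCII string `BLENDER` that opens an uncompressed .blend file.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
BLENDER_MAGIC = b"BLENDER"


def identify(data):
    """Guess the type of a byte string from its leading signature bytes."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(BLENDER_MAGIC):
        return "blender"
    return "unknown"


def unwrap(data):
    """Peel off gzip layers until the data is no longer gzip-compressed."""
    while identify(data) == "gzip":
        data = gzip.decompress(data)
    return data
```

Calling `identify(unwrap(extracted_bytes))` on the decoded payload mirrors the two `file` invocations and the `zcat` step in the comment.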