Cheating AI Caught Hiding Data Using Steganography

AI today is like a super fast kid going through school whose teachers need to be smarter than if not as quick. In an astonishing turn of events, a (satelite)image-to-(map)image conversion algorithm was found hiding a cheat-sheet of sorts while generating maps to appear as it if had ‘learned’ do the opposite effectively[PDF].

The CycleGAN is a network that excels at learning how to map image transformations such as converting any old photo into one that looks like a Van Gogh or Picasso. Another example would be to be able to take the image of a horse and add stripes to make it look like a zebra. The CycleGAN once trained can do the reverse as well, such as an example of taking a map and convert it into a satellite image. There are a number of ways this can be very useful but it was in this task that an experiment at Google went wrong.

A mapping system started to perform too well and it was found that the system was not only able to regenerate images from maps but also add details like exhaust vents and skylights that would be impossible to predict from just a map. Upon inspection, it was found that the algorithm had learned to satisfy its learning parameters by hiding the image data into the generated map. This was invisible to the naked eye since the data was in the form of small color changes that would only be detected by a machine. How cool is that?!

This is similar to something called an ‘Adversarial Attack‘ where tiny amounts of hidden data in an image or other data-set will cause an AI to produce erroneous output. Small numbers of pixels could cause an AI to interpret a Panda as a Gibbon or the ocean as an open highway. Fortunately there are strategies to thwart such attacks but nothing is perfect.

You can do a lot with AI, such as reliably detecting objects on a Raspberry Pi, but with Facial Recognition possibly violating privacy some techniques to fool AI might actually come in handy.

Shakespeare in a Zip in a RAR, Hidden in an Image on Twitter

Steganography involves hiding data in something else — for example, encoding data in a picture. [David Buchanan] used polyglot files not to hide data, but to send a large amount of data in a single Twitter post. We don’t think it quite qualifies as steganography because the image has a giant red UNZIP ME printed across it. But without it, you might not think to run a JPG image through your unzip program. If you did, though, you’d wind up with a bunch of RAR files that you could unrar and get the complete works of the Immortal Bard in a single Tweet. You can also find the source code — where else — on Twitter as another image.

What’s a polyglot file? Jpeg images have an ICC (International Color Consortium) section that defines color profiles. While Twitter strips a lot of things out of images, it doesn’t take out the ICC section. However, the ICC section can contain almost anything that fits in 64 kB up to a limit of 16 MB total.

The ZIP format is also very flexible. The pointer to the central directory is at the end of the file. Since that pointer can point anywhere, it is trivial to create a zip file with extraneous data just about anywhere in the file.

Continue reading “Shakespeare in a Zip in a RAR, Hidden in an Image on Twitter”

Pea-Whistle Steganography

Do you see the patterns everywhere around you? No? Look closer. Still no? Look again. OK, maybe there’s nothing there.

[Oona Räisänen] hears signals and then takes them apart. And even when there’s nothing there, she’s thinking “what if they were?” Case in point: could one hypothetically transmit coded information in the trilling of a referee’s whistle at the start of a soccer match?

acme-spektriTo you, the rapid pitch changes made by the little ball that’s inside a ref’s whistle sounds like “trilling” or “warbling” or something. To [Oona], it sounds like frequency-shift key (FSK) modulation. Could you make a non-random trilling, then, that would sound like a normal whistle?

Her perl script says yes. It takes the data you want to send, encodes it up as 100 baud FSK, smoothes it out, adds some noise and additional harmonics, and wraps it up in an audio file. There’s even a couple of sync bytes at the front, and then a byte for packet size. Standard pea-whistle protocol (PWP), naturally. If you listen really closely to the samples, you can tell which contains data, but it’s a really good match. Cool!

[Oona] has graced our pages before, naturally. From this beautiful infographic tracing out a dial-up modem handshake to her work reversing her local bus stop information signs or decoding this strange sound emitted by a news helicopter, She’s full of curiosity and good ideas — a hacker’s hacker. Her talk on the bus stop work is inspirational.. She’s one of our secret heroes!

Stegosploit: Owned by a JPG

We’re primarily hardware hackers, but every once in a while we see a software hack that really tickles our fancy. One such hack is Stegosploit, by [Saumil Shah]. Stegosploit isn’t really an exploit, so much as it’s a means of delivering exploits to browsers by hiding them in pictures. Why? Because nobody expects a picture to contain executable code.

stegosploit_diagram[Saumil] starts off by packing the real exploit code into an image. He demonstrates that you can do this directly, by encoding characters of the code in the color values of the pixels. But that would look strange, so instead the code is delivered steganographically by spreading the bits of the characters that represent the code among the least-significant bits in either a JPG or PNG image.

OK, so the exploit code is hidden in the picture. Reading it out is actually simple: the HTML canvas element has a built-in getImageData() method that reads the (numeric) value of a given pixel. A little bit of JavaScript later, and you’ve reconstructed your code from the image. This is sneaky because there’s exploit code that’s now runnable in your browser, but your anti-virus software won’t see it because it wasn’t ever written out — it was in the image and reconstructed on the fly by innocuous-looking “normal” JavaScript.

232115_1366x1792_scrotAnd here’s the coup de grâce. By packing HTML and JavaScript into the header data of the image file, you can end up with a valid image (JPG or PNG) file that will nonetheless be interpreted as HTML by a browser. The simplest way to do this is send your file myPic.JPG from the webserver with a Content-Type: text/html HTTP header. Even though it’s a totally valid image file, with an image file extension, a browser will treat it as HTML, render the page and run the script it finds within.

The end result of this is a single image that the browser thinks is HTML with JavaScript inside it, which displays the image in question and at the same time unpacks the exploit code that’s hidden in the shadows of the image and runs that as well. You’re owned by a single image file! And everything looks normal.

We like this because it combines two sweet tricks in one hack: steganography to deliver the exploit code, and “polyglot” files that can be read two ways, depending on which application is doing the reading. A quick tag-search of Hackaday will dig up a lot on steganography here, but polyglot files are a relatively new hack.

[Ange Ablertini] is the undisputed master of packing one file type inside another, so if you want to get into the nitty-gritty of [Ange]’s style of “polyglot” file types, watch his talk on “Funky File Formats” (YouTube). You’ll never look at a ZIP file the same again.

Sweet hack, right? Who says the hardware guys get to have all the fun?

Random Parcel Launches Steganographic Compulsion

A mysterious CD arrives in the mail with a weird handwritten code on it. What should you do? Put it in the computer and play the thing, of course!

Some might be screaming at their screens right now… this is how modern horror films start and before you know it the undead are lurking behind you waiting to strike. Seasonal thrills aside, this is turning into an involved community effort to solve the puzzle. [Johny] published the video and posted a thread on reddit.

We ran a similar augmented reality game to launch the 2014 Hackaday Prize solved by a dedicated group of hackers. It’s really hard to design puzzles that won’t be immediately solved but can eventually be solved with technology and a few mental leaps. When we come across one of these extremely clever puzzles, we take note.

This has all the hallmarks of a good time. The audio spectrogram shows hidden data embedded in the file — a technique known as steganography. There are some real contortions to make meaning from this. When you’re looking for a solution any little hit of a pattern feels like you’ve found something. But searching for the decrypted string yields a YouTube video with the same name; we wonder if they’ve tried to recover steganographic data from that source?

[Johny] mentions that this parcel was unsolicited and that people have suggested it’s a threat or something non-sensical in its entirety. We’re hoping it’s a publicity stunt and we’re all disappointed in the end, because solving the thing is the best part and publicity wouldn’t work if there was no solution.

The bright minds of the Hackaday community should be the ones who actually solve this. So get to work and let us know what you figure out!

Transfer Data via YouTube

The original steganography technique dates back to 440 BC (according to Wikipedia) when a Greek wrote secret messages on a piece of wood, covered it in wax, and then wrote innocent text on the wax. The term, in general, means hiding a message in something that looks harmless. The LVDO project (and a recent Windows fork) says it is steganography, but we aren’t quite sure it meets the definition. What it does is converts data into a video that you can transfer like any other video. A receiver that knows what LVDO parameters you used to create the video can extract the data (although, apparently, the reproduction is not always completely error-free).

The reason we aren’t sure if this really counts as steganography is that–judging from the example YouTube video (which is not encoded)–the output video looks like snow. It uses a discrete cosine transform to produce patterns. If you are the secret police, you might not know what the message says, but you certainly know it must be something. We’d be more interested in something that encodes data in funny cat videos, for example.

Continue reading “Transfer Data via YouTube”

Encoding Data in Packet Delays

If you’ve ever been to a capture the flag hacking competition (CTF), you’ve probably seen some steganography challenges. Steganography is the art of concealing data in plain sight. Tools including secret inks that are only visible under certain light have been used for this purpose in the past. A modern steganography challenge will typically require you to find a “flag” hidden within an image or file.

[Anfractuosus] came up with a method of hiding packets within a stream of network traffic. ‘Timeshifter’ encodes data as delays between packets. Depending on the length of the delay, each packet is interpreted as a one or zero.

To do this, a C program uses libnetfilter_queue to get access to packets. The user sets up a network rule using iptables, which forwards traffic to the Timeshifter program. This is then used to send and receive data.

All the code is provided, and it makes for a good example if you’ve ever wanted to play around with low-level networking on Linux. If you’re interested in steganography, or CTFs in general, check out this great resource.