Steganography involves hiding data in something else — for example, encoding data in a picture. [David Buchanan] used polyglot files not to hide data, but to send a large amount of data in a single Twitter post. We don’t think it quite qualifies as steganography because the image has a giant red UNZIP ME printed across it. But without it, you might not think to run a JPG image through your unzip program. If you did, though, you’d wind up with a bunch of RAR files that you could unrar and get the complete works of the Immortal Bard in a single Tweet. You can also find the source code — where else — on Twitter as another image.
What’s a polyglot file? Jpeg images have an ICC (International Color Consortium) section that defines color profiles. While Twitter strips a lot of things out of images, it doesn’t take out the ICC section. However, the ICC section can contain almost anything that fits in 64 kB up to a limit of 16 MB total.
The ZIP format is also very flexible. The pointer to the central directory is at the end of the file. Since that pointer can point anywhere, it is trivial to create a zip file with extraneous data just about anywhere in the file.