Audio Fingerprinting Skips A Show’s Intro, Reliably

Lacking a DVD drive, [jg] was watching a TV series in the form of a bunch of .avi video files. Of course, when every episode contains a full intro, it is only a matter of time before that gets too annoying to sit through.

Chapter breaks reliably inserted around the intro, even when it doesn’t always occur in the same place.

The usual method of skipping the intro on a plain video file is a simple one:

  1. Manually drag the playback forward past the intro.
  2. Oops that’s too far, bring it back.
  3. Ugh reversed it too much, nudge it forward.
  4. Okay, that’s good.

[jg] was certain there was a better way, and the solution was using audio fingerprinting to insert chapter breaks. The plain video files now have a chapter breaks around the intro, allowing for easy skipping straight to content. The reason behind selecting this method is simple: the show intro is always 52 seconds long, but it isn’t always in the same place. The intro plays somewhere within the first two to five minutes of an episode, so just skipping to a specific timestamp won’t do the trick.

The first job is to extract the audio of an intro sequence, so that it can be used for fingerprinting. Exporting the first 15 minutes of audio with ffmpeg easily creates a wav file that can be trimmed down with an audio editor of choice. That clip gets fed into the open-source SoundFingerprinting library as a signature, then each video has its audio track exported and the signature gets identified within it. SoundFingerprinting therefore detects where (down to the second) the intro exists within each video file.

Marking out chapter breaks using that information is conceptually simple, but ends up being a bit roundabout because it seems .avi files don’t have a simple way to encode chapters. However, .mkv files are another matter. To get around this, [jg] first converts each .avi to .mkv using ffmpeg then splices in the chapter breaks with mkvmerge. One important element is that the reformatting between .avi and .mkv is done without completely re-encoding the video itself, so it’s a quick process. The result is a bunch of .mkv files with chapter breaks around the intro, wherever it may be!

The script is available here for anyone to play with, and the project page is a good learning reference because [jg] kindly provides all the command-line options used for each tool. Interested in using audio fingerprinting in your own projects? Remember to also check out Olaf, the Overly Lightweight Acoustic Fingerprinting method that can be implemented in embedded systems and web browsers.

Hide Secret Messages In Plain Sight With Zero-Width Characters

Fingerprinting text is really very nifty; the ability to encode hidden data within a string of characters opens up a large number of opportunities. For example, someone within your team is leaking confidential information but you don’t know who. Simply send each team member some classified text with their name encoded in it. Wait for it to be leaked, then extract the name from the text — the classic canary trap.

Here’s a method that hides data in text using zero-width characters. Unlike various other ways of text fingerprinting, zero width characters are not removed if the formatting is stripped, making them nearly impossible to get rid of without re-typing the text or using a special tool. In fact you’ll have a hard time detecting them at all – even terminals and code editors won’t display them.

To make the process easy to perform, [Vedhavyas] created a command line utility to embed and extract a payload using any text. Each letter in the secret message is converted to binary, then encoded in zero-width characters. A zero-width-non-joiner character is used for 0, and a zero-width-space character for 1.

[Vedhavyas’] tool was inspired by a post by [Tom], who uses a javascript example (with online demo) to explain what’s going on. This lets you test out the claim that you can paste the text without losing the hidden data. Try pasting it into a text editor. We were able to copy it again from there and retrieve the data, but it didn’t survive being saved and cat’d to the command line.

Of course, to get your encoding game really tight, you should be looking at getting yourself an enigma wristwatch

Continue reading “Hide Secret Messages In Plain Sight With Zero-Width Characters”

Panopticlick: You Are A Beautiful And Unique Snowflake

We all like to think we’re unique, but when it comes to remaining anonymous online that’s probably not such a good idea. By now, it’s common knowledge that advertising firms, three-letter agencies, and who-knows-who-else want to know what websites you’re visiting and how often. Persistent tracking cookies, third-party cookies, and “like” buttons keep tabs on you at all times.

For whatever reason, you might want to browse anonymously and try to plug some of the obvious sources of identity leakage. The EFF and their Panopticlick project have bad news for you.

The idea behind Panopticlick is simple: to try to figure out how identifiable you are even if you’re not accepting cookies, or if you’ve disabled Flash, or if you’re using “secure” browsers. To create a fingerprint of your browser, Panopticlick takes all the other little bits of identifying information that your browser gives up, and tries to piece them together.

For a full treatment of the project, see this paper (PDF). The takeaway from the project is that the information your browser gives up to servers can, without any cookies, specifically identify you.

fooFor instance, a server can query which plugins your browser supports, and if you’ve installed anything a tiny bit out of the ordinary, you’re fingerprinted. Your browser’s User Agent strings are often over-specific and tell which browser sub-sub-sub version you’re running on which OS platform. If you’re running Flash, it can report back which fonts you’ve got installed on your system. Any of these can be easily as rare as one-in-a-million. Combining them together (unless they’re all highly correlated) can fingerprint you uniquely.

You can’t necessarily win. If you disable Flash, the remote site doesn’t get your font list, but since only one in five browsers runs with Flash disabled, you’re still giving up two bits of information. If you run a “privacy-enhancing” niche browser, your chances of leaving a unique fingerprint go through the roof unless you’re also forging the User Agent strings.

I ran the Panopticlick experiment twice, once with a Firefox browser and once with an obscure browser that I actually use most of the time (dwb). Firefox runs a Flash blocker standard, so they didn’t get my font list. But still, the combination of browser plugins and a relatively new Firefox on Linux alone made me unique.

It was even worse for the obscure browser test. Only one in 1.4 million hits use dwb, so that alone was bad news. I also use a 4:3 aspect-ratio monitor, with 1280×1024 pixels at 24-bit color depth, which is apparently a one-in-twenty-four occurrence. Who knew?

fooFinally, I tried out the Tor browser, which not only routes your traffic through the Tor network, but also removes a lot of the specific data about your session. It fared much better, making me not uniquely identifiable: instead only one in a thousand. (Apparently a lot of people trying out the Panopticlick site ran Tor browser.)

If you’re interested in online anonymity, using something like Tor to obscure your IP address and disabling cookies is a good start. But Panopticlick points out that it may not be enough. You can never use too many layers of tinfoil when making your hat.

Try it out, and let us know in the comments how you fare.

Avoiding OS Fingerprinting In Windows

[Irongeek] has been working on changing the OS fingerprint of his Windows box. Common network tools like Nmap, P0f, Ettercap, and NetworkMiner can determine what operating system is being run by the behavior of the TCP/IP stack. By changing this behavior, you can make your system appear to be another OS. [Irongeek] started writing his own tool by checking the source of Security Cloak to find out what registry keys needed to be changed. His OSfuscate tool lets you define your own .os fingerprint file. You can pretend to be any number of different systems from IRIX to Dreamcast. Unfortunately this only works for TCP/IP. Other methods, like Satori‘s DHCP based fingerprinting, still work and need to be bypassed by other means. Yes, this is just “security through obscurity”, but it is something fun to play with.