Audio Fingerprinting Skips A Show’s Intro, Reliably

Lacking a DVD drive, [jg] was watching a TV series in the form of a bunch of .avi video files. Of course, when every episode contains a full intro, it is only a matter of time before that gets too annoying to sit through.

Chapter breaks reliably inserted around the intro, even when it doesn’t always occur in the same place.

The usual method of skipping the intro on a plain video file is a simple one:

  1. Manually drag the playback forward past the intro.
  2. Oops that’s too far, bring it back.
  3. Ugh reversed it too much, nudge it forward.
  4. Okay, that’s good.

[jg] was certain there was a better way, and the solution was using audio fingerprinting to insert chapter breaks. The plain video files now have chapter breaks around the intro, allowing for easy skipping straight to content. The reason behind selecting this method is simple: the show intro is always 52 seconds long, but it isn’t always in the same place. The intro plays somewhere within the first two to five minutes of an episode, so just skipping to a specific timestamp won’t do the trick.

The first job is to extract the audio of an intro sequence, so that it can be used for fingerprinting. Exporting the first 15 minutes of audio with ffmpeg easily creates a wav file that can be trimmed down with an audio editor of choice. That clip gets fed into the open-source SoundFingerprinting library as a signature, then each video has its audio track exported and the signature is searched for within it. SoundFingerprinting then reports where (down to the second) the intro sits within each video file.
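For anyone wanting to try the audio export step, here is a minimal Python sketch of how it might be driven. It is not [jg]'s script: the function names are made up for illustration, and the choice to downmix to mono is an assumption. The ffmpeg flags themselves (`-i`, `-t`, `-vn`, `-ac`) are standard ones.

```python
import subprocess


def audio_export_command(video, wav, seconds=15 * 60):
    """Build an ffmpeg command that exports the opening audio of a video as WAV.

    Hypothetical helper; the 15-minute default mirrors the article, the mono
    downmix is an assumption made to keep the file small.
    """
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it already exists
        "-i", str(video),    # input video file
        "-t", str(seconds),  # keep only the first `seconds` seconds
        "-vn",               # drop the video stream, audio only
        "-ac", "1",          # downmix to a single channel
        str(wav),            # output format is inferred from the .wav extension
    ]


def export_audio(video, wav, seconds=15 * 60):
    """Run the export; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(audio_export_command(video, wav, seconds), check=True)
```

Calling `export_audio("episode01.avi", "episode01.wav")` would need ffmpeg on the PATH. The resulting wav file is what gets trimmed by hand into the 52-second intro signature, or searched for that signature.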

Marking out chapter breaks using that information is conceptually simple, but ends up being a bit roundabout because it seems .avi files don’t have a simple way to encode chapters. However, .mkv files are another matter. To get around this, [jg] first converts each .avi to .mkv using ffmpeg then splices in the chapter breaks with mkvmerge. One important element is that the reformatting between .avi and .mkv is done without completely re-encoding the video itself, so it’s a quick process. The result is a bunch of .mkv files with chapter breaks around the intro, wherever it may be!
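As a rough illustration of the remux-then-chapter step, the Python sketch below builds a chapter list in the simple `CHAPTERnn=` text format that mkvmerge accepts, plus the two command lines involved. The helper names and chapter titles are invented for this example, and the exact options [jg] used may differ; `-c copy` and `--chapters` are the standard ffmpeg and mkvmerge options for stream copying and chapter import.

```python
INTRO_SECONDS = 52  # intro length stated in the article


def format_ts(seconds):
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def chapter_file(intro_start, intro_length=INTRO_SECONDS):
    """Return simple-format chapter text with breaks around the intro.

    Chapter names are illustrative placeholders.
    """
    marks = [(intro_start, "Intro"), (intro_start + intro_length, "Episode")]
    if intro_start > 0:
        # Anything before the intro gets its own leading chapter.
        marks.insert(0, (0, "Opening"))
    lines = []
    for number, (start, name) in enumerate(marks, start=1):
        lines.append(f"CHAPTER{number:02d}={format_ts(start)}")
        lines.append(f"CHAPTER{number:02d}NAME={name}")
    return "\n".join(lines) + "\n"


def remux_command(avi, mkv):
    """ffmpeg command that changes the container without re-encoding."""
    return ["ffmpeg", "-i", str(avi), "-c", "copy", str(mkv)]


def add_chapters_command(mkv_in, chapters_txt, mkv_out):
    """mkvmerge command that writes a copy of the file with chapters attached."""
    return ["mkvmerge", "-o", str(mkv_out), "--chapters", str(chapters_txt), str(mkv_in)]
```

For an intro detected at 3 minutes 7 seconds, `chapter_file(187)` yields chapters starting at 00:00:00.000, 00:03:07.000 and 00:03:59.000. Writing that text to a file and passing it to the mkvmerge command gives a player two chapter marks to jump between.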

The script is available here for anyone to play with, and the project page is a good learning reference because [jg] kindly provides all the command-line options used for each tool. Interested in using audio fingerprinting in your own projects? Remember to also check out Olaf, the Overly Lightweight Acoustic Fingerprinting method that can be implemented in embedded systems and web browsers.

23 thoughts on “Audio Fingerprinting Skips A Show’s Intro, Reliably”

  1. I remember in “Contact” by C. Sagan, there was this company specialized in devices for skipping commercials, government announcements and religion-related talks. In the Spanish translation I read they were called “Pamplinex, predicanex” and I cannot remember the other one. Maybe the future is there already…

      1. If so, do you think that AdBlockers online should be “very illegal”?

        Here you can read more about commercial skipping: https://www.wikiwand.com/en/Commercial_skipping

        I have a vague memory of a post on hackaday a few years ago related to commercial skipping. They used the closed caption channel to detect when there was a commercial break and automatically mute their TV. Some quick googling didn’t give me the post I was looking for.

        1. I used to run mythtv back in 2001 or so… It had a method of removing/skipping commercials back then. I think it used some sort of “fading to black” in the video to detect… It wasn’t perfect, especially for dark shows, but worked probably 90% of the time.

    1. Seems to me that there are a lot of HaD readers who are probably lacking an optical drive due to it being in pieces on their workbench as part of a half completed project from 2 months ago.

  2. Thanks for making me aware of mkvmerge, I was looking for a way to do chapters in some video files I own.

    And even if I own many optical drives, I don’t think gatekeeping people that don’t have at least one is useful.

  3. Do you really need fingerprinting? Convolution should find the intro as well, and better probably, because fingerprinting is used to handle noise and shorter randomly cut sequences.

  4. OK but how does it fare when the producers change the intro a little to add a Jingle Bell, for the Christmas specials, or even some spooky ghost sounds for Halloween specials, etc.
    We need to go Deep Learning.

  5. Skipping an intro takes me 4 seconds (pressing the right arrow key 18 times). An intro lasts on average 12 episodes before. In total: 48 seconds lost.

    Now tell me if I can do this entire fingerprinting process in 48 seconds?

    And the resulting files wouldn’t be the originals, so the new CRCs would fail to be recognized by some systems (e.g. anidb.net)

  6. “The ability to skip through television show “intros” requires a Plex Pass subscription for Server admin account and the Plex account used in the player app.”

    Yeah…no. You’re also automatically assuming everyone is going to roll over to PLEX when more people use MPC or VLC because they’re not having subscriptions dangled in front of them.
