Audio Fingerprinting Skips A Show’s Intro, Reliably

Lacking a DVD drive, [jg] was watching a TV series in the form of a bunch of .avi video files. Of course, when every episode contains a full intro, it is only a matter of time before that gets too annoying to sit through.

Chapter breaks are reliably inserted around the intro, even though it doesn’t always occur in the same place.

The usual method of skipping the intro on a plain video file is a simple one:

  1. Manually drag the playback forward past the intro.
  2. Oops, that’s too far; bring it back.
  3. Ugh, reversed it too much; nudge it forward.
  4. Okay, that’s good.

[jg] was certain there was a better way, and the solution was using audio fingerprinting to insert chapter breaks. The plain video files now have chapter breaks around the intro, allowing for easy skipping straight to content. The reason behind selecting this method is simple: the show intro is always 52 seconds long, but it isn’t always in the same place. The intro plays somewhere within the first two to five minutes of an episode, so simply skipping ahead to a fixed timestamp won’t do the trick.

The first job is to extract the audio of an intro sequence so that it can be used for fingerprinting. Exporting the first 15 minutes of audio with ffmpeg easily creates a wav file that can be trimmed down with an audio editor of choice. That clip gets fed into the open-source SoundFingerprinting library as a signature, then each episode’s audio track is exported and searched for that signature. In this way, SoundFingerprinting detects where (down to the second) the intro sits within each video file.
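For reference, that extraction step is a single ffmpeg call (the matching itself happens inside SoundFingerprinting, which is a .NET library). Here’s a minimal Python sketch of the export, with the filenames as illustrative assumptions rather than [jg]’s actual script:

    # Sketch: export the first 15 minutes of audio as a WAV for fingerprinting.
    # Filenames are hypothetical; only the ffmpeg flags matter here.
    import subprocess

    def extract_audio(video_path: str, wav_path: str, seconds: int = 900) -> None:
        subprocess.run([
            "ffmpeg",
            "-i", video_path,        # input video
            "-t", str(seconds),      # stop after the first 15 minutes
            "-vn",                   # drop the video stream
            "-ac", "1",              # mono is plenty for fingerprinting
            "-acodec", "pcm_s16le",  # uncompressed 16-bit PCM
            "-ar", "44100",          # 44.1 kHz sample rate
            wav_path,
        ], check=True)

    extract_audio("episode01.avi", "episode01.wav")

From there, the WAV gets trimmed down to just the 52-second intro in an audio editor before being registered as the signature.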

Marking out chapter breaks using that information is conceptually simple, but it ends up being a bit roundabout because .avi files don’t have a simple way to encode chapters. However, .mkv files are another matter. To get around this, [jg] first converts each .avi to .mkv using ffmpeg, then splices in the chapter breaks with mkvmerge. One important element is that the change of container from .avi to .mkv is done without re-encoding the video itself, so it’s a quick process. The result is a bunch of .mkv files with chapter breaks around the intro, wherever it may be!
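Assuming the intro’s start time has already come out of the fingerprinting step, the remux-and-tag step for one episode could look something like this sketch. The paths, chapter names, and OGM-style chapter file are illustrative guesses, not [jg]’s actual commands:

    # Sketch: remux .avi -> .mkv without re-encoding, then add chapters.
    # Not [jg]'s script; filenames and chapter names are placeholders.
    import subprocess

    def ts(t: float) -> str:
        # Format seconds as HH:MM:SS.mmm for an OGM chapter file.
        h, rem = divmod(t, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

    def add_intro_chapters(avi_path, mkv_path, intro_start, intro_len=52.0):
        # Copy the streams into a Matroska container; no re-encode, so it's fast.
        subprocess.run(["ffmpeg", "-i", avi_path, "-c", "copy", "temp.mkv"],
                       check=True)
        # Simple OGM-style chapter file bracketing the intro.
        with open("chapters.txt", "w") as f:
            f.write("CHAPTER01=00:00:00.000\nCHAPTER01NAME=Cold open\n")
            f.write(f"CHAPTER02={ts(intro_start)}\nCHAPTER02NAME=Intro\n")
            f.write(f"CHAPTER03={ts(intro_start + intro_len)}\n"
                    "CHAPTER03NAME=Episode\n")
        # Splice the chapters into the final file.
        subprocess.run(["mkvmerge", "-o", mkv_path,
                        "--chapters", "chapters.txt", "temp.mkv"], check=True)

With the chapters in place, skipping the intro in any Matroska-aware player is a single next-chapter press.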

The script is available here for anyone to play with, and the project page is a good learning reference because [jg] kindly provides all the command-line options used for each tool. Interested in using audio fingerprinting in your own projects? Remember to also check out Olaf, the Overly Lightweight Acoustic Fingerprinting method that can be implemented in embedded systems and web browsers.

Community Rallies Behind Youtube-dl After DMCA Takedown

At this point, you’ve likely heard that the GitHub repository for youtube-dl was recently removed in response to a DMCA takedown notice filed by the Recording Industry Association of America (RIAA). As the name implies, this popular Python program allowed users to produce local copies of audio and video that had been uploaded to YouTube and other content hosting sites. It’s a critical tool for digital archivists, people with slow or unreliable Internet connections, and more than a few Hackaday writers.

It will probably come as no surprise to hear that the DMCA takedown and subsequent removal of the youtube-dl repository has utterly failed to contain the spread of the program. In fact, you could easily argue that it’s done the opposite. The developers could never have afforded the amount of publicity the project is currently enjoying, and as the code is licensed as public domain, users are free to share it however they see fit. This is one genie that absolutely won’t be going back into its bottle.

In true hacker spirit, we’ve started to see some rather inventive ways of spreading the outlawed tool. A Twitter user by the name of [GalacticFurball] came up with a way to convert the program into a pair of densely packed rainbow images that can be shared online. After downloading the PNG files, a command-line ImageMagick incantation turns the images back into a compressed tarball of the source code. A similar trick was one of the ways used to distribute the DeCSS DVD decryption code back in 2000, though unfortunately, we doubt anyone is going to get the ~14,000 lines of Python that make up youtube-dl printed on any t-shirts.
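The underlying trick is simple enough to sketch: the raw bytes of a tarball become pixel values, and the decode step just reads them back out. This isn’t [GalacticFurball]’s actual incantation; it’s a minimal round-trip in Python with Pillow to show the idea, with all filenames hypothetical:

    # Sketch: smuggle arbitrary bytes through a PNG (not the original commands).
    import math
    from PIL import Image

    def pack(payload: bytes, png_path: str) -> None:
        # Prefix a 4-byte length so the padding can be stripped on the way out.
        data = len(payload).to_bytes(4, "big") + payload
        side = math.isqrt(-(-len(data) // 3))   # 3 bytes per RGB pixel
        if side * side * 3 < len(data):
            side += 1
        data += b"\x00" * (side * side * 3 - len(data))
        Image.frombytes("RGB", (side, side), data).save(png_path)

    def unpack(png_path: str) -> bytes:
        raw = Image.open(png_path).convert("RGB").tobytes()
        return raw[4:4 + int.from_bytes(raw[:4], "big")]

    pack(open("youtube-dl.tar.gz", "rb").read(), "rainbow.png")
    open("restored.tar.gz", "wb").write(unpack("rainbow.png"))

PNG being lossless is the whole reason this works; try the same with JPEG and the payload comes back corrupted.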

Screenshot of the tweet sharing the youtube-dl repository as two images

It’s worth noting that GitHub has officially distanced itself from the RIAA’s position. The company was forced to remove the repo when it received the DMCA takedown notice, but CEO Nat Friedman dropped into the project’s IRC channel with a promise that efforts were being made to rectify the situation as quickly as possible. In a recent interview with TorrentFreak, Friedman said the removal of youtube-dl from GitHub was at odds with the company’s own internal archival efforts and financial support for the Internet Archive.

But as it turns out, some changes will be necessary before the repository can be brought back online. While there’s certainly some debate to be had about the overall validity of the RIAA’s claim, it isn’t completely without merit. As pointed out in the DMCA notice, the project made use of several automated tests that ran the code against copyrighted works from artists such as Taylor Swift and Justin Timberlake. While these were admittedly very poor choices to use as official test cases, the RIAA’s assertion that the entire project exists solely to download copyrighted music has no basis in reality.

[Ed Note: This is only about GitHub. You can still get the code directly from the source.]

Super-Simple VGA Adapter Sports Low-Res Output With Only Four TTL Chips

Here at Hackaday we cast a wary eye at tips that come in with superlative claims. Generally, if we post something that claims to be the fastest or the smallest of all time, we immediately get slapped down in the comments by someone who has done it faster or smaller. So we present the simplest TTL video card ever knowing the same thing will happen, but eager to see how anyone might scale things down.

To be fair, [George Foot] does qualify his claim to the simplest usable VGA adapter, and he does note that it descends from [Ben Eater]’s “world’s worst video card”, which [Ben] uses for his 6502 breadboard computer. But where [Ben]’s VGA adapter uses about 20 TTL chips and an EEPROM, [George] has managed to cut the BOM down to just four TTL chips along with the memory and a crystal oscillator. This required a fair number of compromises, of course; the color depth is fairly low, as is the resolution. Each pixel appears as a thin horizontal bar rather than a small square, smearing the images out across the screen. They’re still surprisingly viewable, though, which probably says more about the quality of the pattern-recognition wetware between our ears than anything about the quality of the adapter. [George] gives a tour of the circuit in the brief video below.

It looks like [George] has posted a few improvements to the project since we first spotted it, so we’re looking forward to seeing how much the parts count went up. We’re also keen to see if anyone can outdo the simplicity of this effort — be sure to let us know if you give it a shot.

Light Fields: Missing Ingredient For Immersive 3D Video Gets Improved

46 time-synchronized action cameras make up the guts of the capture device.

3D video content has a significant limitation, one that is not trivial to solve. Video captured by a camera — even one with high resolution and a very wide field of view — still records a scene as a flat plane, from a fixed point of view. The limitation this brings will be familiar to anyone who has watched a 3D video (or “360 video”) in VR and moved their head the wrong way. In these videos one is free to look around, but may not change the position of their head in the process. Put another way, pivoting one’s head to look up, down, left, or right is fine. Moving one’s head higher, lower, closer, further, or to the side? None of that works. Natural movements like trying to peek over an object, or moving slightly to the side for a better view simply do not work.

Light field video changes that. It is captured using a device like the one in the image above, and Google has a resource page giving an excellent overview of what light field video is, what it can look like, and how they are doing it. That link covers recent improvements to their camera apparatus as well as to video encoding and rendering, but serves as a great show-and-tell of what light fields are and what they can do.

Escape To An Alternate Reality Anywhere With Port-A-Vid

There was a time when only the most expensive televisions could boast crystal clear pixels on a wall-mountable thin screen. What used to be the novelty of “High Definition Flat Screen Television” is now just “TV”, available everywhere. So as a change of pace from our modern pixel perfection, [Emily Velasco] built the Port-A-Vid as a relic from another timeline.

The centerpiece of any aesthetically focused video project is obviously the screen, and a CRT would be the first choice for a retro theme. Unfortunately, small CRTs have recently become scarce, and a real glass picture tube would not fit within the available space anyhow. Instead, we’re actually looking at a modern LCD sitting behind a big lens to give it an old school appearance.

The lens, harvested from a rear-projection TV, was chosen because it was a good size to replace the dial of a vacuum gauge. This project enclosure started life as a Snap-On Tools MT425 but had become just another piece of broken equipment at a salvage yard. The bottom section, formerly a storage bin for hoses and adapters, is now home to the battery and electronics. All original markings were removed from the hinged storage lid, which became the Port-A-Vid control panel.

A single press of the big green button triggers a video to play, randomly chosen from a collection of content [Emily] curated to fit with the aesthetic. We may get a clip from an old educational film, or something shot with a composite video camera. If any computer graphics pop up, they will be primitive vector graphics. This is not the place to seek ultra high definition content.

As a final nod to common artifacts of electronics history, [Emily] wrote a user’s manual for the Port-A-Vid. Naturally it’s not a downloadable PDF, but a stack of paper stapled together. Each page is written in the style of electronics manuals of yore, treated with the rough look of a multiple-generation photocopy rumpled with use.

If you have to ask “Why?” it is doubtful any explanation would suffice. This is a trait shared with many other eclectic projects from [Emily]. But if you are delighted by fantastical projects hailing from an imaginary past, [Emily] has also built an ASCII art cartridge for old parallel port printers.

Vizy “AI Camera” Wants To Make Machine Vision Less Complex

Vizy, a new machine vision camera from Charmed Labs, has blown through their crowdfunding goal on the promise of making machine vision projects both easier and simpler to deploy. The camera, which starts around $250, integrates a Raspberry Pi 4 with built-in power and shutdown management, and comes with a variety of pre-installed applications so one can dive right in.

The Sony IMX477 camera sensor is the same one found in the Raspberry Pi High Quality Camera, and supports capture rates of up to 300 frames per second (under the right conditions, anyway). Unlike the usual situation when a Raspberry Pi is involved, there’s no need to worry about adding a real-time clock, an enclosure, or ensuring shutdowns happen properly; it’s all taken care of.

The ‘Birdfeeder’ application can automatically identify and upload images of visitors.

Charmed Labs are the same folks behind the Pixy and Pixy 2 cameras, and Vizy goes further: everything required for a machine vision project has been put onboard and made easy to use and deploy. Even the vision processing functions work locally, with no need for a wireless data connection (though one is needed for things like automatic uploading or sharing). For outdoor or remote applications, there’s a weatherproof enclosure option, and wireless connectivity in areas with no WiFi can be had by plugging in a USB cellular modem.

A few of the more hacker-friendly hardware features are things like a high-current I/O header and support for both C/CS and M12 lenses for maximum flexibility. The IR filter can also be enabled or disabled via software, so there’s no more swapping camera modules for ones with the IR filter removed. On the software side, applications are all written in Python and use open-source software like TensorFlow and OpenCV for processing.

The feature list looks good, but Vizy also seems to have a clear focus. It looks best aimed at enabling projects with the following structure (see the sketch after the list):

  1. Detect Things (people, animals, cars, text, insects, and more) and/or Measure Things (size, speed, duration, color, count, angle, brightness, etc.)
  2. Perform an Action (for example, push a notification or enable a high-current I/O) and/or Record (save images, video, or other data locally or remotely).
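That detect-then-act shape maps naturally onto a short loop. Here’s a hypothetical sketch of the pattern in plain OpenCV; it is not Vizy’s actual API, just the same structure the applications follow:

    # Hypothetical detect -> act loop (plain OpenCV, not Vizy's own libraries).
    import cv2

    def on_detection(frame) -> None:
        # Perform an Action and/or Record: here, just save a snapshot.
        cv2.imwrite("visitor.jpg", frame)

    cap = cv2.VideoCapture(0)                  # any attached camera
    detector = cv2.createBackgroundSubtractorMOG2()

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Detect Things: flag frames with enough moving pixels.
        mask = detector.apply(frame)
        if cv2.countNonZero(mask) > 5000:      # arbitrary threshold
            on_detection(frame)

Swap the motion detector for a species classifier and the action for an upload, and you have the shape of the Birdfeeder application described below.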

The Motionscope application tracking balls on a pool table.

A good example of this structure is the Birdfeeder application which comes pre-installed. With the camera pointed toward a birdfeeder, animals coming for a snack are detected. If the visitor is a bird, Vizy identifies the species and uploads an image. If the animal is not a bird (for example, a squirrel) then Vizy can detect that as well and, using the I/O header, could briefly turn on a sprinkler to repel the hungry party-crasher. A sample Birdfeeder photo stream is here on Google Photos.

Motionscope is a more unusual but very interesting-looking application; its purpose is to capture moving objects and measure the position, velocity, and acceleration of each. A picture does a far better job of explaining what Motionscope does, so here is a screenshot of the results of watching some billiard balls.

3D Printed Video Terminal Dials C For Cyberpunk

Created for the Disobey 2020 hacker conference in Finland, this Blade Runner inspired communications terminal isn’t just for decoration. It was part of an interactive game that required attendees to physically connect their conference badges up and “call” different characters with the functional keypad on the front of the unit.

[Purkkaviritys] was in charge of designing the 3D printed enclosure for the device, which he says takes an entire 2 kg roll of filament to print. Unfortunately he wasn’t as involved in the electronics side of things, so we don’t have a whole lot of information about the internals beyond the fact that it’s powered by a Raspberry Pi 4, features a HyperPixel 4.0 display, and uses Power over Ethernet so it could be easily set up at the con with just a single cable run.

A look at the custom keypad PCB.

The keypad is a custom input device built around an Arduino Micro and Cherry MX Blue switches, with 3D printed keycaps to get that chunky payphone look and feel. [Purkkaviritys] mentions that the keypad is also responsible for controlling the RGB LED strips built into the sides of the terminal, and that the Raspberry Pi toggles the status of the Caps Lock, Scroll Lock, and Num Lock keys to select the different lighting patterns.
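The lock-key trick is a tidy one-way side channel: keyboard lock LEDs are host-controlled, so the Pi can set them and the keypad firmware can read three free bits out of them. The project’s own code isn’t published, but on Linux the Pi side could look something like this sketch using python-evdev, where the device path and bit assignments are guesses:

    # Hypothetical sketch: pick an RGB pattern by encoding 3 bits on the
    # Num/Caps/Scroll Lock LEDs. Device path and mapping are assumptions.
    from evdev import InputDevice, ecodes

    keypad = InputDevice("/dev/input/event0")   # the Arduino Micro keypad

    def set_pattern(pattern: int) -> None:
        keypad.set_led(ecodes.LED_NUML,     pattern & 0b001)
        keypad.set_led(ecodes.LED_CAPSL,   (pattern & 0b010) >> 1)
        keypad.set_led(ecodes.LED_SCROLLL, (pattern & 0b100) >> 2)

    set_pattern(5)  # keypad firmware decodes the bits into a lighting pattern

Three LEDs give eight selectable patterns without any extra wiring between the Pi and the keypad.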

Naturally we’d like to see more info on how this beauty was put together, but given that it was built for such a specific purpose, it’s not like you’d really need to duplicate the original configuration anyway. Thanks to [Purkkaviritys] you have the STL files to print off your own copy of the gloriously cyberpunk enclosure; all you’ve got to do now is figure out how to make video calls with it.
