History Of Closed Captions: The Analog Era

Closed captioning on television and subtitles on DVD, Blu-ray, and streaming media are taken for granted today. But it wasn’t always so. In fact, it was quite a struggle for captioning to become commonplace. Back in the early 2000s, I unexpectedly found myself involved in a variety of closed captioning projects, both designing hardware and consulting with engineering teams at various consumer electronics manufacturers. I may have been the last engineer working with analog captioning as everyone else moved on to digital.

But before digging in, there is a lot of confusing and imprecise language floating around on this topic. Let’s establish some definitions. I often use the word captioning, which encompasses both closed captions and subtitles:

Closed Captions: Transmitted in a non-visible manner as textual data. Usually they can be enabled or disabled by the user. In the NTSC system, they are often referred to as Line 21 captions, since the data was transmitted on video line number 21 in the Vertical Blanking Interval (VBI).
Subtitles: Rendered in a graphical format and overlaid onto the video / film. Usually they cannot be turned off. Also called open or hard captions.

The text contained in captions generally falls into one of three categories. Pure dialogue (nothing more) is often the style of captioning you see in subtitles on a DVD or Blu-ray. Ordinary captioning includes the dialogue, but with the addition of occasional cues for music or a non-visible event (a doorbell ringing, for example). Finally, “Subtitles for the Deaf or Hard-of-hearing” (SDH) is a more verbose style that adds even more descriptive information about the program, including the speaker’s name, off-camera events, etc.

Roughly speaking, closed captions are targeting the deaf and hard of hearing audience. Subtitles are targeting an audience who can hear the program but want to view the dialogue for some reason, like understanding a foreign movie or learning a new language.

Continue reading “History Of Closed Captions: The Analog Era”

Eyecam Is Watching You In Between Blinks

We will be the first to admit that it’s often hard to be productive while working from home, especially if no one’s ever really looking over your shoulder. Well, here is one creepy way to feel as though someone is keeping an eye on you, if that’s what gets you to straighten up and fly right. The Eyecam research project by [Marc Teyssier] et al. is a realistic, motorized eyeball that includes a camera and hangs out on top of your computer monitor. It aims to spark conversation about the sensors that are all around us already in various cold and clinical forms. It’s an open source project with a paper and a repo and a how-to video in the works.

The eyebrow-raising design pulls no punches in the uncanny department: the eye behaves as you’d expect (if you could have expected this) — it blinks, looks around, and can even waggle its brow. The eyeball, brow, and eyelids are actuated by a total of six servos that are controlled by an Arduino Nano.

Inside the eyeball is a Raspberry Pi camera connected to a Raspberry Pi Zero for the webcam portion of this intriguing horror show. Keep an eye out after the break for the Eyecam infomercial.

Creepy or fascinating, it succeeds in making people think about the vast number of sensors around us now, and what the future of them could look like. Would mimicking eye contact be an improvement over the standard black and gray oblong eye? Perhaps a pair of eyes would be less unsettling; we’re not really sure. But we are left to wonder what’s next, a microphone that looks like an ear? Probably. Will it have hair sprouting from it? Perhaps.

Yeah, it’s true; two eyes are more on the mesmerizing side, but still creepy, especially when they follow you around the room and can shoot frickin’ laser beams.

Continue reading “Eyecam Is Watching You In Between Blinks”

Learn Multirotors From First Principles

Multirotors, or drones as they’re popularly called, are so ubiquitous as to have become a $10 toy. They’re no less fun to fly for it though, and learning how they work is no less fascinating. It’s something [Science Buddies] has addressed in a series of videos examining them from first principles. They may be aimed at youngsters, but they’re still an entertaining enough watch for those of advancing years.

Instead of starting with a multirotor control board, the video takes four little DC motors and two popsicle sticks to make a rudimentary drone frame. Then with the help of dowels and springs it tethers the craft as the control mechanisms are explained bit by bit, from simple on-off motor control through proportional control to adding an Arduino and following through to how a multirotor stays in flight. It’s instructional and fun to watch, and maybe even for some of us, a chance to learn something.
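The jump from on-off to proportional control is easy to sketch in a few lines. This is only an illustration of the two ideas, not code from the video: the function names, the gain, and the idea of measuring the tethered craft's height are all our own assumptions.

```python
def on_off_control(target, measured, full_power=1.0):
    """Bang-bang control: full power below the target, nothing at or above it."""
    return full_power if measured < target else 0.0


def proportional_control(target, measured, gain=0.5, max_power=1.0):
    """Proportional control: power scales with the error, clamped to 0..max_power."""
    error = target - measured
    return max(0.0, min(max_power, gain * error))


if __name__ == "__main__":
    # Hypothetical heights (arbitrary units) as the craft climbs toward 10.
    for height in (0, 8, 9, 10, 12):
        print(height, on_off_control(10, height), proportional_control(10, height))
```

The on-off version slams between full throttle and nothing, while the proportional version eases off as the craft nears its target, which is why it is the usual next step before moving the logic onto a microcontroller.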

We’ve had multirotor projects aplenty here over the years, but how about something completely different made from popsicle sticks?

Continue reading “Learn Multirotors From First Principles”

AI Upscaling And The Future Of Content Delivery

The rumor mill has recently been buzzing about Nintendo’s plans to introduce a new version of their extremely popular Switch console in time for the holidays. A faster CPU, more RAM, and an improved OLED display are all pretty much a given, as you’d expect for a mid-generation refresh. Those upgraded specifications will almost certainly come with an inflated price tag as well, but given the incredible demand for the current Switch, a $50 or even $100 bump is unlikely to dissuade many prospective buyers.

But according to a report from Bloomberg, the new Switch might have a bit more going on under the hood than you’d expect from the technologically conservative Nintendo. Their sources claim the new system will utilize an NVIDIA chipset capable of Deep Learning Super Sampling (DLSS), a feature which is currently only available on high-end GeForce RTX 20 and GeForce RTX 30 series GPUs. The technology, which has already been employed by several notable PC games over the last few years, uses machine learning to upscale rendered images in real time. So rather than tasking the GPU with producing a native 4K image, the engine can render the game at a lower resolution and have DLSS make up the difference.
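To put rough numbers on how much "difference" there is to make up, here is a back-of-the-envelope sketch. The internal render resolutions are our own illustrative picks, not figures from the Bloomberg report or from NVIDIA.

```python
# Hypothetical internal render resolutions, chosen only for illustration.
TARGET = (3840, 2160)  # native 4K output
CANDIDATES = {"1080p": (1920, 1080), "720p": (1280, 720)}


def pixel_count(resolution):
    """Total number of pixels in a (width, height) frame."""
    width, height = resolution
    return width * height


def upscale_factor(render, target=TARGET):
    """Output pixels the upscaler must produce for every rendered pixel."""
    return pixel_count(target) / pixel_count(render)


if __name__ == "__main__":
    for name, res in CANDIDATES.items():
        print(f"{name}: {pixel_count(res):,} rendered pixels, "
              f"upscale factor {upscale_factor(res):.0f}")
```

A 4K frame holds 8,294,400 pixels, so rendering at 1080p means the GPU only has to shade a quarter of them, and at 720p a ninth. Everything else is left to the upscaler.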

The current model Nintendo Switch

The implications of this technology, especially on computationally limited devices, are immense. For the Switch, which doubles as a battery powered handheld when removed from its dock, the use of DLSS could allow it to produce visuals similar to the far larger and more expensive Xbox and PlayStation systems it’s in competition with. If Nintendo and NVIDIA can prove DLSS to be viable on something as small as the Switch, we’ll likely see the technology come to future smartphones and tablets to make up for their relatively limited GPUs.

But why stop there? If artificial intelligence systems like DLSS can scale up a video game, it stands to reason the same techniques could be applied to other forms of content. Rather than saturating your Internet connection with a 16K video stream, will TVs of the future simply make the best of what they have using a machine learning algorithm trained on popular shows and movies?

Continue reading “AI Upscaling And The Future Of Content Delivery”

Scanimate Analog Video Synths Produced Oceans Of Motion Graphics

Why doesn’t this kind of stuff ever happen to us? One lucky day back in high school, [Dave Sieg] stumbled upon a room full of new equipment and a guy standing there scratching his head. [Dave]’s curiosity about this fledgling television studio was rewarded when that guy asked [Dave] if he wanted to help set it up. From that point on, [Dave] had the video bug. The rest is analog television history.

Today, [Dave] is the proud owner and maintainer of two Scanimate machines: the first R&D prototype, and the last one of only eight ever produced. The Scanimate is essentially an analog synthesizer for video signals, and it made it possible to move words and pictures around on a screen much more easily than ever before. Any animated logo or graphics seen on TV from the mid-1970s to the mid-80s was likely done with one of these huge machines, and we would jump quite high at the chance to fiddle with one of them.

Analog television signals were continuously variable, and much like an analog music synthesizer, the changes imposed on the signal are immediately discernible. In the first video below, [Dave] introduces the Scanimate and plays around with the Viceland logo a bit.

Stick around for the second and third videos where he superimposes the Scanimate’s output onto the video he’s making, all the while twiddling knobs to add oscillators and thoroughly explaining what’s going on. If you’ve ever played around with Lissajous patterns on an oscilloscope, you’ll really have a feel for what’s happening here. In the fourth video, [Dave] dives deeper and dissects the analog circuits that make up this fantastic piece of equipment.
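For anyone who hasn't met Lissajous patterns, they are what you get when one sine wave drives the horizontal axis and another drives the vertical. The sketch below just generates the points. It is a generic illustration with parameter names of our own choosing, and has nothing to do with the Scanimate's actual circuitry.

```python
import math


def lissajous_points(a, b, delta, samples=200):
    """Return (x, y) points of x = sin(a*t + delta), y = sin(b*t) over one period of t."""
    points = []
    for n in range(samples):
        t = 2 * math.pi * n / samples
        points.append((math.sin(a * t + delta), math.sin(b * t)))
    return points


if __name__ == "__main__":
    # Equal frequencies and a 90 degree phase offset trace out a circle.
    for x, y in lissajous_points(1, 1, math.pi / 2, samples=8):
        print(f"{x:+.3f} {y:+.3f}")
```

Changing the frequency ratio or the phase offset changes the figure, which is roughly the sort of knob twiddling you'd do on a scope in X-Y mode.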

Here’s another way to play with scan lines: delay the output to some of them and you have a simple scrambler.
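The line-delay idea can be mocked up digitally in a few lines. This is a toy model under our own assumptions: a frame is a list of scan lines, each a list of samples, and a "delay" rotates the line so samples wrap around the end. A real analog delay would not wrap, so treat this as a sketch of the concept only.

```python
def scramble(frame, delays):
    """Delay each scan line by rotating it right by the matching number of samples."""
    out = []
    for line, delay in zip(frame, delays):
        d = delay % len(line)
        out.append(line[-d:] + line[:-d] if d else list(line))
    return out


def descramble(frame, delays):
    """Undo scramble() by applying the opposite delay to every line."""
    return scramble(frame, [-d for d in delays])


if __name__ == "__main__":
    picture = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    key = [1, 0, 2]  # hypothetical per-line delays, in samples
    mixed = scramble(picture, key)
    print(mixed)
    print(descramble(mixed, key) == picture)
```

Anyone who knows the per-line delays can put the picture back together, while everyone else sees lines sheared sideways.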

Continue reading “Scanimate Analog Video Synths Produced Oceans Of Motion Graphics”

Real Time Object Detection For $59

There was a time when making a machine to identify objects in a camera feed was difficult, even without trying to do it in real time. But now, you can do it with a Jetson Nano board for under $60. How well does it work? Watch [Murtaza’s] video below and see what you think.

The first few minutes of the video piqued our interest, and good thing, too, because the 50 lines of code get a 50-plus minute video! It is worth watching, though, because there’s a lot of good information about how to apply this technique in your own projects.

Continue reading “Real Time Object Detection For $59”

sample of automatically generated comics

Read Your Movies As Automatically Generated Comic Books

A research paper from Dalian University of Technology in China and City University of Hong Kong outlines a system that automatically generates comic books from videos. But how can an algorithm boil down video scenes to appropriately reflect the gravity of the scene in a still image? This impressive feat is accomplished by saving two still images per second, then segmenting the frames into scenes through analysis of region-of-interest and importance ranking.
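The "two still images per second" step is the easiest part to picture. A minimal sketch, assuming we simply pick evenly spaced frame indices from a clip with a known frame rate, might look like this. The function and parameter names are ours, and the paper's actual sampling code may well differ.

```python
def sample_frame_indices(total_frames, fps, samples_per_second=2):
    """Pick evenly spaced frame indices, `samples_per_second` for each second of video."""
    if fps <= 0 or samples_per_second <= 0:
        raise ValueError("fps and samples_per_second must be positive")
    step = fps / samples_per_second  # frames between consecutive samples
    indices = []
    i = 0
    while int(i * step) < total_frames:
        indices.append(int(i * step))
        i += 1
    return indices


if __name__ == "__main__":
    # Two seconds of hypothetical 24 fps footage.
    print(sample_frame_indices(48, 24))
```

The scene segmentation and importance ranking described above would then operate on this much smaller set of stills rather than on every frame.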

movie to comic book pipeline diagram

For its next trick, speech for each scene is processed by combining subtitle information with the audio track of the video. The audio is analyzed for emotion to determine the appropriate speech bubble type and size of the subtitle text. Frames are even analyzed to establish which person is speaking for proper placement of the bubbles. The system can then create layouts of the keyframes, determining panel sizes for each page based on the region-of-interest analysis.

The process is completed by stylizing the keyframes with flat color through quantization, for that classic cel shading look, and then populating the layouts with each frame and word balloon.
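Flat color through quantization boils down to snapping each color channel to a handful of allowed values. The sketch below uses plain uniform quantization on 8-bit RGB values. That choice, the number of levels, and the function names are our assumptions for illustration, and the paper may use a different scheme.

```python
def quantize_channel(value, levels=4):
    """Snap an 8-bit channel value to the centre of one of `levels` equal-width bins."""
    bin_width = 256 / levels
    index = min(int(value / bin_width), levels - 1)
    return int(index * bin_width + bin_width / 2)


def quantize_pixel(rgb, levels=4):
    """Quantize each channel of an (r, g, b) tuple independently."""
    return tuple(quantize_channel(channel, levels) for channel in rgb)


if __name__ == "__main__":
    # Nearby shades collapse to the same flat color.
    print(quantize_pixel((100, 110, 120)))
    print(quantize_pixel((70, 90, 127)))
```

With four levels per channel, smooth gradients collapse into at most 64 distinct colors, which gives the banded, cel-shaded look.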

The team conducted a study with 40 users, pitting their results against previous techniques which require more human intervention and still besting them in every measure. Like any great superhero, the team still sees room for improvement. In the future, they would like to improve the accuracy of keyframe selection and propose using a neural network to do so.

Thanks to [Qes] for the tip!