The recent flurry of videos and posts about the TVGuardian foul language filter brought back some fond memories. I was the chief engineer on this project for most of its lifespan. You’ve watched the teardowns, you’ve seen the reverse engineering, now here’s the inside scoop.
Gumby is Born
Back in 1999, my company took on a redesign project for the TVG product, a box that replaced curse words in closed-captioning with sanitized equivalents. Our first task was to take an existing design that had been produced in limited volumes and improve it to be more easily manufactured.
The original PCB used all thru-hole components and didn’t scale well to large quantity production. Replacing the parts with their surface mount equivalents resulted in Model 101, internally named Gumby for reasons long lost. If you have a sharp eye, you will have noticed something odd about two parts on the board as shown in [Ben Eater]’s video. The Microchip PIC and the Zilog OSD chip had two overlapping footprints, one for thru-hole and one for SMD. Even though we preferred SMD parts, sometimes there were supply issues. This was a technique we used on several designs in our company to hedge our bets. It also allowed us to use socketed ICs for testing and development.
To begin at the beginning, a couple of weeks ago [Alec] over at everyone’s favorite nerd hangout Technology Connections did a video on the TVGuardian, a device that attempted to clean up the language of live TV and recorded programming. Go watch that video for the details, but for a brief summary, TVGuardian worked by scanning the closed caption text for naughty words and phrases, muting the audio when something suggestive was found in a lookup table, and inserting a closed caption substitute for the offensive content. In his video, [Alec] pined for a way to look at the list of verboten words, and [Ben] accepted the challenge.
The naughty word list ended up living on a 93LC86 serial EEPROM, which [Ben] removed from his TVGuardian for further exploration. Rather than just plugging it into a programmer and dumping the contents, he decided to roll his own decoder with an Arduino, because that’s more fun. And can we just point out our ongoing amazement that [Ben] is able to make watching someone else code interesting?
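For a feel of what talking to such a chip involves, here is a minimal Python sketch of the bit sequence a host would clock into a Microwire-style 93-series EEPROM to start a read. It is not [Ben]’s Arduino code. The function names are made up for illustration, and the framing (a start bit, the two-bit READ opcode `10`, then an 11-bit address in x8 mode or a 10-bit address in x16 mode) is an assumption that should be checked against the 93LC86 datasheet before wiring anything up.

```python
# Assumed READ framing for a 93LC86-style Microwire EEPROM:
#   start bit (1), opcode (1, 0), then the address, MSB first.
# Verify against the datasheet; names here are hypothetical.
READ_OPCODE = (1, 0)


def read_command_bits(address, org16=False):
    """Return the list of bits to shift out to begin a READ."""
    width = 10 if org16 else 11  # x16 words vs. x8 bytes (assumed widths)
    if not 0 <= address < (1 << width):
        raise ValueError("address out of range")
    addr_bits = [(address >> i) & 1 for i in range(width - 1, -1, -1)]
    return [1, *READ_OPCODE, *addr_bits]


def bits_to_byte(bits):
    """Pack bits received MSB first into an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    return value
```

On real hardware each of these bits would be placed on the data-in pin and clocked with chip select held high, after which the data bits come back on the data-out pin and can be packed with `bits_to_byte`.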
The resulting NSFW word list is titillating, of course, and the video would be plenty satisfying if that’s where it ended. But [Ben] went further and figured out how the list is organized, how the dirty-to-clean substitutions are made, and even how certain words are whitelisted. That last bit resulted in the revelation that Hollywood legend [Dick Van Dyke] gets a special whitelisting, lest his name become sanitized to a hilarious [Jerk Van Gay].
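The general idea of a lookup table with a whitelist can be sketched in a few lines of Python. This is only an illustration of the concept, not the TVGuardian firmware or the layout of its EEPROM: the table contents come from the example above, and the names `SUBSTITUTIONS`, `WHITELIST`, and `filter_caption` are hypothetical.

```python
import re

# Hypothetical tables for illustration; the real device keeps its list
# in EEPROM in a format that is not reproduced here.
SUBSTITUTIONS = {"dick": "jerk", "dyke": "gay"}
WHITELIST = ["dick van dyke"]


def filter_caption(text):
    """Return (filtered_text, muted) for one line of caption text."""
    lowered = text.lower()

    # Record the character ranges covered by whitelisted phrases.
    protected = []
    for phrase in WHITELIST:
        for m in re.finditer(re.escape(phrase), lowered):
            protected.append((m.start(), m.end()))

    muted = False

    def replace(m):
        nonlocal muted
        word = m.group(0)
        # Leave words alone when they sit inside a whitelisted phrase.
        if any(s <= m.start() and m.end() <= e for s, e in protected):
            return word
        clean = SUBSTITUTIONS.get(word.lower())
        if clean is None:
            return word
        muted = True  # a hit would also mute the audio
        return clean.upper() if word.isupper() else clean

    return re.sub(r"[A-Za-z']+", replace, text), muted
```

With these tables, `filter_caption("DICK VAN DYKE STARS TONIGHT")` leaves the line untouched and reports no mute, while a line containing one of the listed words outside the whitelisted phrase gets the substitute and sets the mute flag.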
Hats off to [Alec] for inspiring [Ben]’s fascinating reverse engineering effort here.
Closed captioning on television and subtitles on DVD, Blu-ray, and streaming media are taken for granted today. But it wasn’t always so. In fact, it was quite a struggle for captioning to become commonplace. Back in the early 2000s, I unexpectedly found myself involved in a variety of closed captioning projects, both designing hardware and consulting with engineering teams at various consumer electronics manufacturers. I may have been the last engineer working with analog captioning as everyone else moved on to digital.
But before digging in, a note: there is a lot of confusing and imprecise language floating around on this topic, so let’s establish some definitions. I often use the word captioning, which encompasses both closed captions and subtitles:
Closed Captions: Transmitted in a non-visible manner as textual data. Usually they can be enabled or disabled by the user. In the NTSC system, it’s often referred to as Line 21, since it was transmitted on video line number 21 in the Vertical Blanking Interval (VBI).
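To make “textual data” a little more concrete, here is a small Python sketch of how caption characters are commonly packed for Line 21: each character is a 7-bit code with an odd parity bit on top, and two such bytes travel on each line. The helper names are hypothetical, the sketch assumes the basic characters match ASCII (mostly but not entirely true), and it ignores the control codes a real decoder needs before it will display anything.

```python
def with_odd_parity(code):
    """Set bit 7 so the byte has an odd number of 1 bits."""
    code &= 0x7F
    ones = bin(code).count("1")
    return code | (0x80 if ones % 2 == 0 else 0x00)


def encode_pairs(text):
    """Pack text into two-byte pairs, one pair per Line 21 (simplified)."""
    data = [with_odd_parity(ord(c)) for c in text]
    if len(data) % 2:
        data.append(with_odd_parity(0x00))  # pad with a null byte
    return [tuple(data[i:i + 2]) for i in range(0, len(data), 2)]
```

For example, `encode_pairs("HI!")` yields two pairs, the second padded with a null byte.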
Subtitles: Rendered in a graphical format and overlaid onto the video / film. Usually they cannot be turned off. Also called open or hard captions.
The text contained in captions generally falls into one of three categories. Pure dialogue (nothing more) is often the style of captioning you see in subtitles on a DVD or Blu-ray. Ordinary captioning includes the dialogue, but with the addition of occasional cues for music or a non-visible event (a doorbell ringing, for example). Finally, “Subtitles for the Deaf or Hard-of-hearing” (SDH) is a more verbose style that adds even more descriptive information about the program, including the speaker’s name, off-camera events, etc.
Roughly speaking, closed captions are targeting the deaf and hard of hearing audience. Subtitles are targeting an audience who can hear the program but want to view the dialogue for some reason, like understanding a foreign movie or learning a new language.
If you don’t have hearing loss, it is easy to forget just how much you depend on your ears. Hearing aids are great if you can afford them, but they aren’t like glasses, which immediately improve your sense in almost every way. In addition to having to get used to a hearing aid, you’ll often find increased noise and even feedback. If you’ve been to a theater lately, you may have noticed a closed caption display system somewhere nearby that you can sit within visual range of should you be hard of hearing. That limits your seat choices though, and requires you to split your attention between the stage and the device. The National Theatre of London is using Epson smart glasses to put the captions right in your individual line of vision (see video below).
The Epson glasses are similar to the Google Glass that caused such a stir a few years ago, and it seems like such a great application that we are surprised it has taken this long to be created. We were also surprised to hear about the length of the project: amazingly, it took four years. The Epson glasses can take HDMI or USB-C inputs, so it seems as though a Raspberry Pi, a battery, and the glasses could have made this a weekend project.
[CNLohr] has made a habit of using ATtiny microcontrollers for everything, and one of his most popular projects is using an ATtiny85 to generate NTSC video. With a $2 microcontroller and eight pins, [CNLohr] can put text and simple graphics on any TV. He’s back at it again, only this time the microcontroller isn’t plugged into the TV.
The ATtiny in this project is overclocked to 30MHz or so using the on-chip PLL. That, plus a few wires of sufficient length means this chip can generate and broadcast NTSC video.
[CNLohr] mentions that it should be possible to use this board to transmit closed captioning directly to a TV. If you’re looking for the simplest way to display text on a monitor with an AVR, there ya go: a microcontroller and two wires. He’s unable to actually test this, as he lost the remote for his tiny TV from the turn of the millennium. Because there’s no way for [CNLohr] to enable closed captioning on his TV, he can’t build the obvious application for this circuit – a closed caption Twitter bot. That doesn’t mean you can’t.