Video Voice Visualization

For their ECE 4760 final project at Cornell, [Varun, Hyun, and Madhuri] created a real-time sound spectrogram that displays the frequency content of audio, such as voice patterns and bird songs, as gray-scale video on any NTSC television with no noticeable delay.

The system can take input from either the on-board microphone element or the 3.5 mm audio jack. One ATmega1284 microcontroller handles the audio processing and FFT stage, while a second ’1284 converts the result to video for NTSC output. The mic and line audio inputs are amplified individually with LM358 op-amps. Since the audio is sampled at 8 kHz, a low-pass filter gets rid of frequencies above 4 kHz, the Nyquist limit, which would otherwise alias down into the displayed band.
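To see why that filter matters, here is a minimal Python sketch of a single spectrogram column. It is not the team's firmware (that runs in C on the ATmega1284): the 64-sample frame size and all function names are assumptions made for illustration, and a plain DFT stands in for a real FFT to keep the code short. Only the 8 kHz sample rate comes from the write-up.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000   # sample rate stated in the write-up
FRAME_SIZE = 64         # assumption: samples per spectrogram column


def sine_frame(freq_hz, n=FRAME_SIZE, sample_rate=SAMPLE_RATE_HZ):
    """Return n samples of a unit sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0 .. n/2 - 1 (the band below Nyquist)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def bin_frequency_hz(k, n=FRAME_SIZE, sample_rate=SAMPLE_RATE_HZ):
    """Center frequency of DFT bin k."""
    return k * sample_rate / n


def peak_bin(mags):
    """Index of the strongest bin."""
    return max(range(len(mags)), key=lambda k: mags[k])


# A 1 kHz tone lands in the bin centered on 1 kHz.
in_band = peak_bin(magnitude_spectrum(sine_frame(1000)))

# A 5 kHz tone is above the 4 kHz Nyquist limit; once sampled it is
# indistinguishable from a 3 kHz tone, so it shows up in the 3 kHz bin.
aliased = peak_bin(magnitude_spectrum(sine_frame(5000)))

print(bin_frequency_hz(in_band), bin_frequency_hz(aliased))
```

With these assumed numbers each bin is 125 Hz wide, and the unfiltered 5 kHz tone is drawn as if it were 3 kHz. That is the kind of false trace the analog low-pass filter keeps off the screen.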

After the break, you can see the team demonstrate their project by speaking and whistling bird calls into the microphone as well as feeding recorded bird calls through the line input. They built three controls into the project to freeze the video, slow it down by a factor of two, and convert between linear and logarithmic scales. There are also short clips of the recorded bird call visualization and an old-timey dial-up modem.
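The linear/logarithmic switch can be pictured with a small Python sketch. The write-up does not say whether the switch rescales brightness or the frequency axis, so this assumes it applies to the magnitude-to-brightness mapping; the 16 gray levels, the compression constant, and the function name `to_gray` are likewise hypothetical.

```python
import math

GRAY_LEVELS = 16      # assumption: number of gray shades the video stage can draw
LOG_COMPRESSION = 255 # assumption: how strongly the log curve lifts quiet bins


def to_gray(magnitude, max_magnitude, log_scale=False, levels=GRAY_LEVELS):
    """Map a spectrum magnitude to a gray level in 0 .. levels - 1."""
    if max_magnitude <= 0:
        return 0
    # Normalize to 0..1 and clamp out-of-range inputs.
    ratio = min(max(magnitude / max_magnitude, 0.0), 1.0)
    if log_scale:
        # Logarithmic curve that still maps 0 to 0 and 1 to 1.
        ratio = math.log1p(ratio * LOG_COMPRESSION) / math.log1p(LOG_COMPRESSION)
    return min(int(ratio * levels), levels - 1)


# A bin at 1% of full scale is black on the linear scale
# but clearly visible on the logarithmic one.
print(to_gray(1, 100), to_gray(1, 100, log_scale=True))
```

Under these assumptions, the logarithmic setting brings out faint harmonics that the linear setting would leave black, at the cost of making loud and very loud components look similar.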


Vocal mouse control

Absolutely fascinating. The University of Washington is developing a vocally controlled mouse interface. We’ve seen vocal control of the computer before, but it usually responds to specific commands and words to carry out tasks such as opening files. This system uses different vowels and sounds to create cursor motion. You can see the same system used in the video above to control a robot arm as well.

[via BotJunkie]