Video Voice Visualization

For their ECE 4760 final project at Cornell, [Varun, Hyun, and Madhuri] created a real-time sound spectrogram that visually outputs audio frequencies such as voice patterns and bird songs in gray-scale video to any NTSC television with no noticeable delay.

The system can take input from either the on-board microphone element or the 3.5mm audio jack. One ATMega1284 microcontroller is used for the audio processing and FFT stage, while a second ‘1284 converts the signal to video for NTSC output. The mic and line audio inputs are amplified individually with LM358 op-amps. Since the audio is sampled at 8KHz, a low-pass filter gets rid of frequencies above 4KHz.

After the break, you can see the team demonstrate their project by speaking and whistling bird calls into the microphone as well as feeding recorded bird calls through the line input. They built three controls into the project to freeze the video, slow it down by a factor of two, and convert between linear and logarithmic scales. There are also short clips of the recorded bird call visualization and an old-timey dial-up modem.

 

Bird Songs

 

Dial-up modem

 

5 thoughts on “Video Voice Visualization

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.