Remove A Speaker’s Voice From A Recording Using Ultrasound

What if you could prevent someone from recording your voice? That is the focus of a study by Guo et al. (2022) at Michigan State University, which uses a dynamically calculated audio signal to cancel out a speaker’s voice at the recording device. The technique relies on an interesting property of certain micro-electro-mechanical system (MEMS) microphones, which are commonly used in smartphones and other recording devices.

Pressure sensitivity of a MEMS microphone. (credit: Brian R. Elbing)

A specially crafted ultrasound signal, sent to the same microphone that is recording one’s voice, can leave that voice absent from the final recording. The authors’ approach uses a neural network trained on voice samples of the person (“Bob”) whose voice is to be cancelled. After recording Bob’s voice during a conversation, the creatively named Neurally Enhanced Cancellation (NEC) system determines the ultrasound signal to send to the target recording device. Meanwhile, the person holding the recording device (“Alice”) still perceives Bob’s voice normally.
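To make the cancellation idea concrete, here is a minimal sketch in Python. It is not the authors’ NEC implementation. The article does not spell out which microphone property is exploited, so the sketch assumes a mechanism commonly described for ultrasound injection: a slightly non-linear microphone response, modelled here as a linear term plus a small squared term. The gains `A1` and `A2`, the single 400 Hz tone standing in for Bob’s voice, and the 25 kHz carrier are all illustrative assumptions, and the “shadow” is computed directly from the known tone rather than predicted by a neural network.

```python
import cmath
import math

FS = 192_000          # sample rate in Hz, high enough to represent the ultrasound
N = 19_200            # 0.1 s of samples, giving exactly 10 Hz per DFT bin
F_VOICE = 400.0       # stand-in for Bob's voice: one audible tone (assumption)
F_CARRIER = 25_000.0  # ultrasonic carrier, above human hearing (assumption)
A1, A2 = 1.0, 0.1     # assumed linear and quadratic microphone gains
VOICE_AMP = 0.01      # amplitude of the voice tone at the microphone


def microphone(samples):
    """Toy non-linear microphone: output = A1*x + A2*x^2 (an assumed model)."""
    return [A1 * s + A2 * s * s for s in samples]


def tone_amplitude(signal, freq):
    """Amplitude of one frequency component, via a single-bin DFT."""
    n_samples = len(signal)
    k = round(freq * n_samples / FS)
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / n_samples)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / n_samples


t = [n / FS for n in range(N)]
voice = [VOICE_AMP * math.sin(2 * math.pi * F_VOICE * ti) for ti in t]

# "Shadow" signal: an amplitude-modulated ultrasonic carrier whose envelope
# carries a scaled, inverted copy of the voice. The squared term of the
# microphone demodulates the envelope back into the audible band, where it
# lands with the opposite sign to the voice picked up by the linear term.
shadow = [(1 - (A1 / A2) * v) * math.cos(2 * math.pi * F_CARRIER * ti)
          for v, ti in zip(voice, t)]

plain = microphone(voice)
jammed = microphone([v + u for v, u in zip(voice, shadow)])

amp_plain = tone_amplitude(plain, F_VOICE)
amp_jammed = tone_amplitude(jammed, F_VOICE)
print(f"400 Hz level without ultrasound: {amp_plain:.6f}")
print(f"400 Hz level with ultrasound:    {amp_jammed:.6f}")
```

In this toy model the 400 Hz component in the microphone output drops from about 0.01 to essentially zero once the shadow is added, while the air itself carries only the unchanged voice plus inaudible ultrasound, which is why a human listener would hear nothing unusual. The squared term also leaves some residue at other frequencies, and a real voice is far more complex than a single tone, which is presumably part of why the authors turn to a trained neural network.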

As ultrasound is highly directional, the system can only jam the specific microphone it is aimed at and would not affect hidden microphones elsewhere in a room. As the authors note, general microphone jamming is possible with other systems but is legally problematic, a concern that should not apply to their NEC system.

Thanks to [JohnU] for the tip!