Gesture Recognition Using Ultrasound


You’d be hard pressed to find a public restroom that wasn’t packed full of hands free technology these days. From the toilets to the sinks and paper towel dispensers, hands free tech is everywhere in modern public restrooms.

The idea is to cut down on the spread of germs.  However, as we all know too well, this technology is not perfect. We’ve all gone from sink to sink in search of one that actually worked. Most of us have waved our hands wildly in the air to get a paper towel dispenser to dispense, creating new kung-fu moves in the process. IR simply has its limitations.

What if there was a better way? Check out [Ackerley] and [Lydia’s] work on gesture recognition using ultrasound. Such technology is cheap and could easily be implemented in countless applications where hands free control of our world is desired. Indeed, the free market has already been developing this technology for use in smart phones and tablets.

Where a video camera will use upwards of 1 watt of power to record video, an ultrasound device will use only microwatts. IR can still be used to detect gestures, as in this gesture-based security lock, but lacks the resolution that can be obtained by ultrasound. So let us delve deep into the details of [Ackerley] and [Lydia’s] ultrasound version of a gesture recognizer, so that we might understand just how it all works, and you too can implement your own ultrasound gesture recognition system.

Most of us are aware of the Doppler Effect – the compressing and stretching of waveforms as the source moves toward or away from a point. Consider a device that consists of a tone generator above the 20kHz human ear threshold (ultrasound) and a microphone transducer that would react to reflections of the ultrasound waveform. If an object, such as a hand, were moving toward the device, the reflected waveform would experience a Doppler shift. Such a shift would be seen by the microphone. The same would happen if the object were moving away from the device. This frequency shift can be calculated by:
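For reference, here is a standard form of the Doppler shift for an echo off a moving object when the emitter and microphone sit side by side and stay still. This is our own sketch of the textbook relation, not necessarily the exact expression used in the linked write-up; the symbols (f_0 for the transmitted frequency, v for the hand's speed toward the device, c for the speed of sound, taken as roughly 343 m/s in room-temperature air) are assumptions introduced here.

```latex
% Echo off a reflector approaching at speed v (v > 0 means toward the device)
f_r = f_0 \, \frac{c + v}{c - v}
\qquad\Longrightarrow\qquad
\Delta f = f_r - f_0 \approx \frac{2 v f_0}{c} \quad (v \ll c)

% Worked example: f_0 = 24\,\text{kHz}, \Delta f = 100\,\text{Hz}, c \approx 343\,\text{m/s}
v \approx \frac{\Delta f \, c}{2 f_0}
  = \frac{100 \times 343}{2 \times 24000}
  \approx 0.71\,\text{m/s}
```

The factor of 2 appears because the sound is shifted once on the way to the moving hand and again on the way back. A 100Hz shift on a 24kHz tone therefore corresponds to a hand moving at well under a meter per second, which is a comfortable gesture speed.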




In order to determine if an object is moving toward or away from the device, you must compare the outgoing and incoming frequencies. [Ackerley] and [Lydia] decided to use a Fast Fourier Transform to do this – the same technique used by Microsoft’s Sound Wave, which inspired their project. Unfortunately their assigned processor, the Atmel ATmega1284P, could not handle the Fast Fourier Transform AND signal acquisition at the same time. It was just not fast enough. With the pair stumped, their instructor suggested a clever idea, one that opens up gesture recognition via ultrasound to the world of the 8 bit micro controller. You see, instead of doing the frequency comparison on the resource limited digital side, do it on the analog side with an AD633 Analog Multiplier IC (pdf warning).

It turns out that if you multiply two sine waves, the product contains two new frequency components. One sits at the difference of the two input frequencies and the other at their sum. There is beauty in this. Our paradigm has shifted. This single 8 pin IC can determine the difference in frequency between the incoming and outgoing signals. Consider an outgoing frequency of 24kHz. Now consider a hand moving toward the device creating a Doppler shifted frequency of 24.1kHz. The output of the AD633 would contain components at 100Hz and 48.1kHz. The 48.1kHz component is easily filtered away and you are left with the 100Hz, the difference between the incoming and outgoing frequencies, which an 8 bit micro controller can easily sample.
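The trick rests on the product-to-sum identity sin(a)·sin(b) = ½[cos(a − b) − cos(a + b)]. Below is a minimal Python sketch that checks the 24kHz / 24.1kHz example numerically. It is a simulation under our own assumptions, not the authors' firmware: the 400kHz simulation rate, the moving-average filter (standing in for the analog low-pass after the multiplier), and the zero-crossing counter are all illustrative choices, and every function name is hypothetical.

```python
import math

FS = 400_000      # simulation sample rate in Hz (assumed; well above 48.1 kHz)
F_OUT = 24_000.0  # transmitted tone, from the article's example
F_IN = 24_100.0   # Doppler-shifted echo, from the article's example
N = 40_000        # 0.1 s of samples, so 100 Hz fits a whole number of cycles


def mix(f_a, f_b, fs=FS, n=N):
    """Multiply two unit-amplitude sine waves sample by sample."""
    return [
        math.sin(2 * math.pi * f_a * i / fs) * math.sin(2 * math.pi * f_b * i / fs)
        for i in range(n)
    ]


def amplitude_at(signal, freq, fs=FS):
    """Amplitude of a single frequency component, via a one-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


def moving_average(signal, width):
    """Crude low-pass filter: mean of the last `width` samples."""
    out, acc = [], 0.0
    for i, s in enumerate(signal):
        acc += s
        if i >= width:
            acc -= signal[i - width]  # drop the sample leaving the window
        out.append(acc / min(i + 1, width))
    return out


def zero_crossing_freq(signal, fs=FS):
    """Estimate frequency by counting sign changes (two per cycle)."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return crossings * fs / (2 * len(signal))


if __name__ == "__main__":
    mixed = mix(F_OUT, F_IN)
    for f in (100.0, 24_000.0, 48_100.0):
        print(f"{f:>8.0f} Hz component amplitude: {amplitude_at(mixed, f):.3f}")

    # Two passes of an 83-sample average (about ten cycles of 48.1 kHz)
    # suppress the sum component and leave the slow difference tone.
    smooth = moving_average(moving_average(mixed, 83), 83)
    print(f"estimated difference frequency: {zero_crossing_freq(smooth):.1f} Hz")
```

Note that cos(a − b) equals cos(b − a), so the difference component looks the same whether the echo is 100Hz above or 100Hz below the outgoing tone. The multiplier output on its own gives the size of the shift, not its sign.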

Now a keen eye will see that the difference frequency only reveals the magnitude of the Doppler shift, and not its direction. [Ackerley] and [Lydia] solve this problem by observing subtle changes in amplitude of the difference frequency. Many more details of how this is done can be found in the linked article. The image below shows their algorithm in the Atmel detecting a “pull” motion.




The genius of this project is that a viable gesture recognition system can be implemented with cheap components. The approach of doing a similar system with a PC or smart device would be different. We would like to see the microcontroller side pushed further. Imagine a system in an elevator where the passenger could “draw” the number of the floor he or she wanted to go to. Or a paper towel system that would dispense towels as we twirled our hand, and stop when we stopped twirling. Or a sink that could change water temperature with a simple gesture. Such systems, using the technology designed by [Ackerley] and [Lydia], should be possible.

27 thoughts on “Gesture Recognition Using Ultrasound”

  1. > hands free tech is everywhere in modern public restrooms.

    Ummm, really? Apart from the air hand dryer, I don’t recall anything ever being hands free in any toilets I’ve had reason to use, public or not.

        1. A hands-free toilet detects the presence of a person using it (usually with an IR proximity sensor) and flushes shortly after the person leaves. They usually have a button/switch for manual flushing in case of sensor malfunction.

        1. Probably an NZ thing. It would make sense if there’s not a local company who makes automatic plumbing fixtures, because then you’d have to pay to have them shipped from elsewhere instead, and shipping to New Zealand tends to be expensive because it’s kind of remote.

          1. I’m from the Czech Republic and there are usually hands-free hand dryers and taps but manual soap dispensers and paper towel boxes. I don’t think the bacteria are that much of a problem, because if you compare one touch of a hand you wash right away to just breathing in air, you find out bacteria are all around, and people who are scared of them can actually get way more vulnerable to bacterial infections. Also see

    1. Perhaps you could modulate a pattern into the transmitter with a swept signal to obtain the distance and relative speed change.
      Or multiple transmitters or multiple receivers operating at different frequency ranges to obtain a more detailed position.

  2. And then you have to grab a dirty doorknob or pull bar to leave without a paper towel to use. That leaves only the stool paper to use, they will e-that as well! Toto toilets!

    1. Bathroom doors that open -in- are a huge pet-peeve of mine.
      Air dryers, except for the high volume ones, are less enviro-friendly than recycled paper towel. All air dryers are less sanitary. High volume air blasting fecal coliform into the air *shudder*. Refillable soap dispensers, where the soap container is refilled not where you put in a new soap ‘cartridge’, can be germ harbors too.

      1. I HATE the recent “innovation” that is those air-blade driers. They’re supposed to solve all the problems etc of hand-drying in a filthy pit of strangers’ faecal matter. They work by directing the air in a certain way or something, so “no touch”. Except they require you to place your hands in a curved gulley only millimetres wider than your hand.

        Like playing one of those loop-over-wire buzzer games. Only instead of a buzzer, you get to touch a surface every other filthy hand has smeared across. So in the time-honoured fashion I’ll just flick the water off into the air and finish drying on my trousers. Stupid, stupid, STUPID invention!

  3. I see no reason why the Doppler shift wouldn’t indicate direction unless the chip they’re using for frequency comparison only gives the magnitude of the difference between the two frequencies.

  4. “We’ve all gone from sink to sink in search of one that actually worked.”

    Actually, nope. I’ve encountered these quite often, but apart from a rare handwaving ritual, they always worked.

  5. You could download free software to do this on a laptop years ago, using just the internal speakers and microphone. These days, we can do MUCH better:
    echolocation algorithm maps cathedral in 3D to the millimetre:

    [quote]Computer scientists have developed an algorithm that uses echolocation to build an accurate 3D replica of any space with just four microphones. … “Our software can build a 3D map of a simple, convex room with a precision of a few millimetres” … What makes the algorithm design compelling is that it only needs data from four microphones, and these microphones can be placed anywhere in the space to get an accurate reading. It filters out the early, stronger echoes, from the latter weaker ones, building up an image as it goes along. … “Each microphone picks up the direct sound from the source, as well as the echoes arriving from various walls,” Dokmanić says. “The algorithm then compares the signal from each microphone. The infinitesimal lags that appear in the signals are used to calculate not only the distance between the microphones, but also the distance from each microphone to the walls and the sound source.” … The algorithm was successfully tested in a room where the wall was moved, but more impressively in Lausanne Cathedral’s side portal, which is as intricate as a building comes. The region was accurately mapped in 3D … The idea behind the technology is not to help train us all in echolocation — which researchers believe can be done — but apply it to forensic science, architecture planning and even the consumer space. … the team wants to build a smartphone app that uses inaudible ultrasound to map interiors, say in a shopping centre, to tell consumers exactly where they are …[/quote]

    1. Okay, reading the linked page, it says “inspired by Microsoft’s Sound Wave” (one of the apps for a laptop I had referred to above).

      This one uses a microprocessor and other required components, rather than a laptop, so could be cheaper for “one-off” hardware apps.

      I would think a piezo emitter may be better for this app than a speaker, for such a custom hardware solution. Just use frequencies that work well with such an emitter.

  6. Hehe, I’m always amazed to see how using “low-tech” could appear clever in today’s Arduino/ARM/FPGA/uP-only designs.

    I remember years ago, when as a student I designed a card for the classic line-following robot. Input was an analog cam, output were bytes indicating line path at three distances in front of the robot, all of this using only an LM3881, 74** logic chips and a single clock source.

    Digital high-speed computation isn’t always the best or right way to solve a problem, sometimes thinking with some basic, well-known simple principles saves the day !
