Voice Without Sound

March 15, 2023 by Bryan Cockfield 14 Comments

Voice recognition is becoming more and more common, but anyone who’s ever used a smart device can attest that they aren’t exactly fool-proof. They can activate seemingly at random, don’t activate when called or, most annoyingly, completely fail to understand the voice commands. Thankfully, researchers from the University of Tokyo are looking to improve the performance of devices like these by attempting to use them without any spoken voice at all.

The project is called SottoVoce and uses an ultrasound imaging probe placed under the user’s jaw to detect internal movements in the speaker’s larynx. The imaging generated from the probe is fed into a series of neural networks, trained with hundreds of speech patterns from the researchers themselves. The neural networks then piece together the likely sounds being made and generate an audio waveform which is played to an unmodified Alexa device. Obviously a few improvements would need to be made to the ultrasonic imaging device to make this usable in real-world situations, but it is interesting from a research perspective nonetheless.

The research paper with all the details is also available (PDF warning). It’s an intriguing approach to improving the performance or quality of voice, especially in situations where the voice may be muffled, non-existent, or overlaid with a lot of background noise. Machine learning like this seems to be one of the more powerful tools for improving speech recognition, as we saw with this robot that can walk across town and order food for you using voice commands only.

Continue reading “Voice Without Sound” →

Computer Vision Extracts Lightning From Footage

July 1, 2022 by Bryan Cockfield 16 Comments

Lightning is one of the more mysterious and fascinating phenomena on the planet. It is extremely powerful, but each strike on average only has enough energy to power an incandescent bulb for an hour. The exact mechanism that starts a lightning strike is still not well understood. Yet it happens 45 times per second somewhere on the planet. While we may not gain a deeper scientific appreciation of lightning anytime soon, we can capture it on camera thanks to this project, which leverages computer vision to pull out the best frames of lightning.

The project’s creator, [Liam], built this as a tool for stormchasers and photographers so that they can film for long stretches and not have to go back through their footage manually to pull out the frames with lightning strikes. The project borrows from a similar project, but this one adds Python 3 capabilities and runs on a tiny netbook for easier field deployment. It uses OpenCV for object recognition, using video files as the source data, and features different modes to recognize different types of lightning.
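To give a feel for how frames with lightning might be picked out of footage, here is a minimal sketch. It is not [Liam]’s code and it does not use OpenCV: it swaps in a much simpler technique, flagging any frame whose average brightness jumps sharply compared to the frame before it. The function names, the toy frames, and the `jump_threshold` value are all assumptions made for illustration.

```python
from statistics import mean


def mean_brightness(frame):
    """Average pixel value of a grayscale frame given as a list of rows."""
    return mean(pixel for row in frame for pixel in row)


def find_flash_frames(frames, jump_threshold=40.0):
    """Return indices of frames much brighter than the frame before them.

    jump_threshold is a hypothetical tuning value on a 0-255 scale.
    """
    flashes = []
    previous = None
    for index, frame in enumerate(frames):
        current = mean_brightness(frame)
        # A sudden rise in brightness is treated as a possible strike.
        if previous is not None and current - previous > jump_threshold:
            flashes.append(index)
        previous = current
    return flashes


# Toy 2x2 "frames": three dark ones and a single bright flash.
dark = [[10, 12], [11, 9]]
flash = [[200, 220], [180, 240]]
print(find_flash_frames([dark, dark, flash, dark]))  # [2]
```

In a real tool the frames would come from a video decoder, and a single global threshold would likely struggle with things like headlights or camera exposure changes, which is presumably where the project’s different detection modes come in.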

The software is free and open source, and releases are supported for both Windows and Linux. So far, [Liam] has been able to capture all kinds of electrical atmospheric phenomena with it including lightning, red sprites, and elves. We don’t see too many projects involving lightning around here, partly because humans can only generate a fraction of the voltage potential needed for the average lightning strike.

What Is It, R2? Have Something To Share?

October 27, 2017 by James Hobson 5 Comments

Sometimes great projects keep evolving. [Bithead942] built himself an R2-D2 to accompany him when he goes a-trooping — but something didn’t feel quite right. Turns out, R2 was missing its signature beeping banter, so he made it more contextually responsive by implementing a few voice commands.

[Bithead942]’s main costume is that of an X-Wing pilot, and the replica helmet works perfectly; it already has a fake microphone — easily replaced with a working model — and the perfect niche to stash the electronics in the ‘mohawk.’

Even though the helmet has the perfect hiding spot for a circuit, space is still at a premium. Services like Alexa tend to be pretty accurate, but require WiFi access, which is not a guarantee on the convention floor. Instead, [bithead942] found that the EasyVR Shield 3.0 voice recognition board provided a suitable stand-in. It needs a bit of training to work properly (cue the montage!), but in the end it compares fresh audio commands to the ‘training’ files it has stored, and if there’s a match, triggers a corresponding output on the serial port. It’s not perfect, but it most certainly works!
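The compare-and-trigger idea can be sketched in a few lines. This is not how the EasyVR board works internally, which the post doesn’t describe; it is a generic nearest-template match with a rejection threshold. The feature vectors, the command names, the byte values, and `max_distance` are all hypothetical.

```python
import math


def match_command(features, templates, max_distance=1.0):
    """Return the name of the closest stored template, or None if none is close.

    max_distance is a hypothetical rejection threshold; anything farther
    away than this is treated as "no match".
    """
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        d = math.dist(features, template)  # Euclidean distance
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_distance else None


# Hypothetical "training" data: one feature vector per spoken command.
templates = {"beep": [0.9, 0.1, 0.3], "whistle": [0.2, 0.8, 0.5]}
# Hypothetical bytes that would be written to the serial port on a match.
COMMAND_BYTES = {"beep": b"B", "whistle": b"W"}

heard = [0.85, 0.15, 0.35]
name = match_command(heard, templates)
if name is not None:
    print(name, COMMAND_BYTES[name])  # beep b'B'
```

The rejection threshold is the part that keeps convention-floor noise from setting R2 off: an utterance that isn’t close to any stored template simply produces no output.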

Continue reading “What Is It, R2? Have Something To Share?” →

RadarCat Gives Computers A Sense Of Touch

October 21, 2016 by James Hobson 18 Comments

So far, humans have had the edge in the ability to identify objects by touch, but not for long. Using Google’s Project Soli, a miniature radar that detects the subtlest of gesture inputs, the [St. Andrews Computer Human Interaction group (SACHI)] at the University of St. Andrews have developed a new platform, named RadarCat, that uses the chip to identify materials, as if by touch.

Realizing that different materials return unique radar signals to the chip, the [SACHI] team combined it with their recognition software and machine learning processes that enable RadarCat to identify a range of materials accurately and in real time! It can also display additional information about the object, such as nutritional information in the case of food, or product information for consumer electronics. The video shows how RadarCat has already learned an impressive range of materials, and even specific body parts. Can Skynet be far behind?

Continue reading “RadarCat Gives Computers A Sense Of Touch” →

Impedance Tomography Is The New X-Ray Machine

November 12, 2015 by Bryan Cockfield 43 Comments

Seeing what’s going on inside a human body is pretty difficult. Unless you’re Superman and you have X-ray vision, you’ll need a large, expensive piece of medical equipment. And even then, X-rays are a harmful part of the electromagnetic spectrum. Rather than using a large machine or questionable Kryptonian ionizing radiation vision, there’s another option now: electrical impedance tomography.

[Chris Harrison] and the rest of a research team at Carnegie Mellon University have come up with a way to use electrical excitation to view internal impedance cross-sections of an arm. While this doesn’t have the resolution of an X-ray or CT, there’s still a large amount of information that can be gathered from using this method. Different structures in the body, like bone, will have a different impedance than muscle or other tissues. Even flexed muscle changes its impedance from its resting state, and the team have used their sensor as proof-of-concept for hand gesture recognition.

This device is small, low power, and low-cost, and we could easily see it being the “next thing” in smart watch features. Gesture recognition at this level would open up a whole world of possibilities, especially if you don’t have to rely on any non-wearable hardware like ultrasound or LIDAR.

Voice Controlled RGB LED Lamp

August 17, 2014 by Rick Osgood 2 Comments

[Saurabh] wanted a quick project to demonstrate how easy it can be to build devices that are voice controlled. His latest Instructable does just that using an Arduino and Visual Basic .Net.

[Saurabh] decided to build a voice controlled lamp. He knew he wanted it to change colors as well as be energy-efficient. It also had to be easy to control. The obvious choice was to use an RGB LED. The LED on its own wouldn’t be very interesting. He needed something to diffuse the light, like a lampshade. [Saurabh] decided to start with an empty glass jar. He filled the jar with gel wax, which provides a nice surface to diffuse the light.

The RGB LED was mounted underneath the jar’s screw-on cover. [Saurabh] soldered a 220 ohm current limiting resistor to each of the three anodes of the LED. A hole was drilled in the cap so he’d have a place to run the wires. The LED was then hooked up to an Arduino Leonardo.
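For a rough idea of what those 220 ohm resistors do, the current through each LED channel is roughly (supply voltage − forward voltage) / resistance. The quick calculation below assumes the Leonardo’s 5 V output pins and typical forward voltages for each color; the forward voltages are assumptions, not figures from [Saurabh]’s build, so check the datasheet for a real part.

```python
SUPPLY_VOLTS = 5.0      # Arduino Leonardo I/O pins run at 5 V
RESISTOR_OHMS = 220.0   # value given in the write-up

# Assumed typical forward voltages; real LEDs vary by part.
FORWARD_VOLTS = {"red": 2.0, "green": 3.2, "blue": 3.2}


def led_current_ma(supply, forward, resistance):
    """Ohm's law across the resistor, with the result in milliamps."""
    return (supply - forward) / resistance * 1000.0


for color, vf in FORWARD_VOLTS.items():
    current = led_current_ma(SUPPLY_VOLTS, vf, RESISTOR_OHMS)
    print(f"{color}: {current:.1f} mA")
# red: 13.6 mA
# green: 8.2 mA
# blue: 8.2 mA
```

Under these assumptions every channel stays comfortably below the 20 mA that many indicator LEDs are rated for, though the red channel will run noticeably harder than the other two.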

The Arduino sketch has several built-in functions to set all of the colors, and also fade. [Saurabh] then wrote a control interface using Visual Basic .Net. The interface allows you to directly manipulate the lamp, but it also has built-in voice recognition functionality. This allows [Saurabh] to use his voice to change the color of the lamp, turn it off, or initiate a fading routine. You can watch a video demonstration of the voice controls below.

Continue reading “Voice Controlled RGB LED Lamp” →

Cat Door Unlocks Via Facial Recognition

May 14, 2010 by Mike Szczys 50 Comments

Faced with critters trying to get in and a cat that loved to show them her latest kill, the folks at Quantum Picture came up with a system that unlocks the cat door based on image recognition. As you can see above, it uses a camera to capture the profile of anything approaching the cat door. That image is compared to stored positive identification sets, making up a feline positive identification protocol. Don’t think this is necessary? In the writeup there are a couple of images showing the outline of a skunk. Sounds like this system is a necessity.

We wonder if this lucky cat also has an Internet enabled cat feeder?

[Thanks Stephen]