Learn Sign Language Using Machine Vision

Learning a new language is a great way to exercise the mind and learn about different cultures, and having a native speaker around improves the experience. Without one, it’s still possible to learn from videos, books, and software. The task gets much more complicated when the language isn’t spoken, though, as with American Sign Language (ASL). This project lets users learn the ASL alphabet with the help of computer vision and machine learning.

The build uses MobileNetV2, a computer vision model, trained to recognize each sign in the ASL alphabet. A picture of a sign is shown to the user on a screen, and the user needs to demonstrate that sign to the computer in order to progress. To check the attempt, OpenCV running on a Raspberry Pi with a PiCamera analyzes frames of the user in real time, and the user is rewarded when the correct sign is made.
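
To make the quiz loop concrete, here is a minimal Python sketch of the progression logic only. It assumes some classifier (MobileNetV2 in the build) already yields a (label, confidence) pair for each camera frame. The class name SignQuiz, the 0.8 confidence threshold, and the five-frame streak are hypothetical choices for illustration, not details from the project.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical values: the source does not state a confidence threshold
# or how many frames must agree before a sign is accepted.
CONFIDENCE_THRESHOLD = 0.8
FRAMES_REQUIRED = 5


@dataclass
class SignQuiz:
    """Tracks progress through a sequence of target signs."""
    targets: str       # e.g. "ABC", one letter per prompt
    position: int = 0  # index of the sign currently being asked for
    streak: int = 0    # consecutive frames that matched the target

    @property
    def finished(self) -> bool:
        return self.position >= len(self.targets)

    def update(self, label: str, confidence: float) -> bool:
        """Feed one per-frame prediction; return True when a sign is accepted."""
        if self.finished:
            return False
        target = self.targets[self.position]
        if label == target and confidence >= CONFIDENCE_THRESHOLD:
            self.streak += 1
        else:
            self.streak = 0  # any miss resets the streak
        if self.streak >= FRAMES_REQUIRED:
            self.position += 1  # move on to the next sign
            self.streak = 0
            return True
        return False


def run_quiz(quiz: SignQuiz, predictions: Iterable[Tuple[str, float]]) -> int:
    """Consume (label, confidence) pairs and return how many signs were accepted."""
    accepted = 0
    for label, confidence in predictions:
        if quiz.update(label, confidence):
            accepted += 1
    return accepted
```

Requiring several consecutive matching frames is one simple way to keep a single noisy prediction from advancing the quiz.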

While this only works for alphabet signs in ASL currently, the team at the University of Glasgow that built this project is planning on expanding it to include other signs as well. We have seen other machines built to teach ASL in the past, like this one which relies on a specialized glove rather than computer vision.

Training Doppler Radar With Smart Watch IMUs Data For Activity Recognition

When it comes to interpreting sensor data automatically, it helps to have a large data set for validation, and for training when machine learning (ML) is involved. Creating such a data set with carefully tagged and categorized information is a long and tedious process, which is where the idea of cross-domain translation comes into play. One example is the IMU2Doppler project at Smash Lab of Carnegie Mellon University, which uses millimeter wave (mmWave) radar sensors to recognize the activity of, for example, building occupants.

The sensors most commonly used to classify human motion are inertial measurement units (IMUs) such as accelerometers and gyroscopes, which are found in everything from smartphones to smart watches and fitness bands. For these devices it’s common to classify measurement patterns as matching a particular activity, such as walking, jogging, or brushing one’s teeth. This makes IMU data both well-defined and very accessible.

The reasons to prefer a mmWave-based Doppler radar for monitoring, for example, building occupants are privacy compared to cameras and the inconvenience of equipping people with a body-worn IMU. With Doppler radar it would theoretically be possible for people to track activities within their own home, for medical staff to ensure patients are safe, or for a gym to track performance or equipment usage, all without cameras or personal sensors. In the past, we’ve seen a similar approach that used targeted laser beams.
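
For a rough feel of why body motion shows up in Doppler radar data at all, the sketch below applies the standard two-way Doppler relation, shift = 2 × radial velocity × carrier frequency / c, to velocities integrated from accelerometer samples. This is not the IMU2Doppler method, and the 60 GHz carrier, the function names, and the sample values are assumptions chosen purely for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift_hz(radial_velocity_mps: float, carrier_hz: float) -> float:
    """Two-way Doppler shift seen by a radar for a target moving radially."""
    return 2.0 * radial_velocity_mps * carrier_hz / SPEED_OF_LIGHT


def velocities_from_acceleration(accel_mps2, dt_s, v0=0.0):
    """Integrate acceleration samples into velocity samples (trapezoidal rule).

    Returns one velocity per input sample, starting from v0.
    """
    if not accel_mps2:
        return []
    velocities = [v0]
    for prev, cur in zip(accel_mps2, accel_mps2[1:]):
        velocities.append(velocities[-1] + 0.5 * (prev + cur) * dt_s)
    return velocities


if __name__ == "__main__":
    # Hypothetical wrist acceleration along the radar's line of sight, 10 Hz.
    accel = [0.0, 2.0, 4.0, 2.0, 0.0]
    carrier = 60e9  # assumed carrier frequency, not taken from the project
    for v in velocities_from_acceleration(accel, dt_s=0.1):
        print(f"{v:.2f} m/s -> {doppler_shift_hz(v, carrier):.1f} Hz")
```

At an assumed 60 GHz, a limb moving at 1 m/s toward the sensor shifts the return by roughly 400 Hz, which is why even small gestures leave a measurable signature.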

As promising as this sounds, at this point in time the number of activities that are recognized with reasonable accuracy (~70%) is limited to ten types. Depending on the intended application this may already be sufficient, though as the published paper notes, there is still a lot of room for growth.

Machine Learning Helps You Get In Shape While Working A Desk Job

Humans weren’t made to sit in front of a computer all day, yet for many of us that’s how we spend a large part of our lives. Of course we all know that it’s important to get up and move around every now and then to stretch our muscles and get our blood flowing, but it’s easy to forget if you’re working towards a deadline. [Victor Sonck] thought he needed some reminders — as well as some not-so-gentle nudging — to get into the habit of doing a quick workout a few times a day.

To this end, he designed a piece of software that would lock his computer’s screen and only unlock it if he performed five push-ups. Locking the screen on his Linux box was as easy as sending a command through the network, but recognizing push-ups was a harder task for which [Victor] decided to employ machine learning. A Raspberry Pi with a webcam attached could do the trick, but the limited processing power of the Pi’s CPU might prove insufficient for processing lots of raw image data.

[Victor] therefore decided on using a Luxonis OAK-1, which is a 4K camera with a built-in machine-learning processor. It can run various kinds of image recognition systems including Blazepose, a pre-trained model that can recognize a person’s pose from an image. The OAK-1 uses this to send out a set of coordinates that describe the position of a person’s head, torso and limbs to the Raspberry Pi through a USB interface. A second machine-learning model running on the Pi then analyzes this dataset to recognize push-ups.
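
As a rough illustration of turning pose keypoints into a push-up count, the sketch below swaps the project’s second machine-learning model for a much simpler hand-written rule: compute the elbow angle from shoulder, elbow, and wrist coordinates, then count a repetition each time the angle drops below one threshold and rises back above another. The function names and the 90 and 150 degree thresholds are assumptions, not values from [Victor]’s code.

```python
import math

# Hypothetical thresholds; using two of them (hysteresis) keeps jitter
# around a single cut-off from being counted as extra repetitions.
DOWN_BELOW_DEG = 90.0
UP_ABOVE_DEG = 150.0


def joint_angle_deg(a, b, c):
    """Angle at point b, in degrees, between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must not coincide")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def count_pushups(elbow_angles):
    """Count down-then-up cycles in a sequence of elbow angles (degrees)."""
    count = 0
    is_down = False
    for angle in elbow_angles:
        if not is_down and angle < DOWN_BELOW_DEG:
            is_down = True       # arms bent: bottom of the push-up
        elif is_down and angle > UP_ABOVE_DEG:
            is_down = False      # arms extended again: one full repetition
            count += 1
    return count
```

In practice, joint_angle_deg would be fed the (x, y) coordinates of the shoulder, elbow, and wrist from each pose estimate, and the resulting angles passed to count_pushups.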

[Victor]’s video (embedded below) is an entertaining introduction to the world of machine-learning systems for video processing, as well as a good hands-on example of a project that results in a useful tool. If you’re interested in learning more about machine learning on small platforms, check out this 2020 Remoticon talk on machine learning on microcontrollers, or this 2019 Supercon talk about implementing machine vision on a Raspberry Pi.

Using Statistics Instead Of Sensors

Statistics often gets a bad rap in mathematics circles for being less than concrete at best, and downright misleading at worst. While these sentiments might ring true for things like political polling, they hide the fact that statistical methods can be put to good use in engineering systems with fantastic results. [Mark Smith], for example, has been working on an espresso machine which can make the perfect shot of coffee, and turned to one of the tools in the statistics toolbox to solve a problem rather than adding another sensor to his complex coffee-brewing machine.

To make espresso, hot water is forced under pressure through finely ground coffee. [Mark] found that his espresso machine was often pouring too much or too little coffee, and to improve its accuracy he turned to the linear regression parameter R², also known as the coefficient of determination. By using a machine learning algorithm tuned to this value, which measures how much of the variation in a data set a model can predict, a computer can more easily tell when coffee begins pouring out of the portafilter and into the espresso cup based on the pressure and water flow in the machine itself, rather than on some other input such as the weight of the cup.
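
For reference, the coefficient of determination for a straight-line fit is 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations from the mean. The sketch below computes it in plain Python for an ordinary least-squares fit. How [Mark] actually applies the value to his pressure and flow data is not shown here, and the function name is our own.

```python
def linear_fit_r_squared(xs, ys):
    """Fit y = slope * x + intercept by least squares and return R squared."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

    if ss_tot == 0:
        return 1.0  # constant ys are fitted exactly by a horizontal line
    return 1.0 - ss_res / ss_tot
```

A value near 1 means the line explains almost all of the variation in the samples, while a value near 0 means it explains almost none.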

We have seen in the past how seriously [Mark] takes his coffee-making, and this is another step in a series of improvements he has made to his equipment. In this iteration, he has additionally produced a simulation in JupyterLab to better assist him in modeling the system and making even more accurate predictions. It’s quite a bit more effort than adding sensors, but since his espresso machine already included quite a bit of computing power it’s not too big a leap for him to make.

AI-Generated Sleep Podcast Urges You To Imagine Pleasant Nonsense

[Stavros Korokithakis] finds the experience of falling asleep to fairy tales soothing, and this has resulted in a fascinating project that indulges this desire by using machine learning to generate mildly incoherent fairy tales and read them aloud. The result is a fantastic sort of automated, machine-generated audible sleep aid. Even the logo is machine-generated!

The project leverages the natural language generation abilities of OpenAI’s GPT-3 to create fairytale-style content that is just coherent enough to sound natural, but not quite coherent enough to make a sensible plotline. The quasi-lucid, dreamlike result is perfect for urging listeners to imagine pleasant nonsense (thanks to Nathan W Pyle for that term) as they drift off to sleep.

We especially loved reading about the methods and challenges [Stavros] encountered while creating this project. For example, he talks about how there is more to a good-sounding narration than just pointing a text-to-speech engine at a wall of text and mashing “GO”. A good episode has things like strategic pauses, background music, and audio fades. That’s where pydub — a Python library for manipulating audio — came in handy. As for the speech, text-to-speech quality is beyond what it was even just a few years ago (and certainly leaps beyond machine-generated speech in the 80s) but it still took some work to settle on a voice that best suited the content, and the project gradually saw improvement.

Deep Dreams Podcast has a GitLab repository if you want to see the code that drives it all, and you can go to the podcast itself to give it a listen.

OpenCV Brings Pinch To Zoom Into The Real World

Gesture controls arrived in the public consciousness a little over a decade ago as touchpads and touchscreens became more popular. The main limitation of gesture controls, at least as far as [Norbert] is concerned, is that they can only control objects in a virtual space. He was hoping to use gestures to control a real-world object instead, and created this device which uses gestures to control an actual picture.

In this unique augmented reality device, not only is the object being controlled in the real world, but the gestures are monitored there as well, thanks to a computer vision system running OpenCV that watches his hand. The position data is fed into an algorithm which controls a physical picture mounted on a slender robotic arm. Now, when [Norbert] “pinches to zoom”, the servo attached to the picture physically brings it closer to or further from his field of view. He can also use other gestures to move the picture around.
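
The mapping from gesture to motion can be sketched in a few lines. The example below assumes a hand tracker already provides pixel coordinates for the thumb and index fingertips, and linearly maps the distance between them onto a servo angle. The pixel range, the 0 to 180 degree servo range, and the function names are hypothetical; [Norbert]’s actual algorithm may differ.

```python
import math

# Hypothetical calibration values, not taken from the project.
MIN_PINCH_PX = 20.0    # fingers together
MAX_PINCH_PX = 200.0   # fingers fully spread
SERVO_MIN_DEG = 0.0
SERVO_MAX_DEG = 180.0


def pinch_distance(thumb_tip, index_tip):
    """Euclidean distance in pixels between two (x, y) fingertip positions."""
    return math.dist(thumb_tip, index_tip)


def pinch_to_servo_angle(distance_px):
    """Linearly map a pinch distance onto the servo range, clamping the ends."""
    clamped = max(MIN_PINCH_PX, min(MAX_PINCH_PX, distance_px))
    fraction = (clamped - MIN_PINCH_PX) / (MAX_PINCH_PX - MIN_PINCH_PX)
    return SERVO_MIN_DEG + fraction * (SERVO_MAX_DEG - SERVO_MIN_DEG)
```

Clamping keeps a tracking glitch, such as a fingertip briefly detected far off-frame, from commanding the servo beyond its travel.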

While this gesture-controlled machine is certainly a proof-of-concept, there are plenty of other uses for gesture controls of real-world objects. Any robotics platform could benefit from an interface like this, or even something slightly more mundane like an office PowerPoint presentation. Opportunity abounds, but if you need a primer for OpenCV take a look at this build which tracks a hand in minute detail.

A Soft Thumb-Sized Vision-Based Touch Sensor

A team from the Max Planck Institute for Intelligent Systems in Germany have developed a novel thumb-shaped touch sensor capable of resolving the force of a contact, as well as its direction, over the whole surface of the structure. Intended for dexterous manipulation systems, the sensor is constructed from easily sourced components, so it should scale up to larger assemblies without breaking the bank. The first step is to place a soft and compliant outer skin over a rigid metallic skeleton, which is then illuminated internally using structured light techniques. From there, machine learning can be used to estimate the shear and normal force components of the contact with the skin, over the entire surface, by observing how the internal envelope distorts the structured illumination.

The novelty here is the way they combine photometric stereo processing with other structured light techniques, using only a single camera. The camera image is fed straight into a pre-trained machine learning system (details on this part of the system are unfortunately a bit scarce) which directly outputs an estimate of the contact shape and force distribution, with reported spatial accuracy better than 1 mm and force resolution down to 30 millinewtons. By directly estimating normal and shear force components, the direction of the contact could be resolved to 5 degrees. The system is so sensitive that it can reportedly detect its own posture by observing the deformation of the skin due to its own weight alone!
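
To see how normal and shear components give a contact direction, the sketch below projects a 3D force vector onto a surface normal and reports the angle between the force and that normal. It is a geometric illustration only, not the team’s learned model, and the function names and example values are assumptions.

```python
import math


def decompose_force(force, normal):
    """Split a 3D force into its signed normal magnitude and shear magnitude.

    `normal` is the surface normal at the contact point; it does not need
    to be unit length.
    """
    n_len = math.sqrt(sum(c * c for c in normal))
    if n_len == 0:
        raise ValueError("surface normal must be non-zero")
    unit = [c / n_len for c in normal]

    # Component along the normal (negative means pushing into the surface
    # if the normal points outward).
    normal_mag = sum(f * u for f, u in zip(force, unit))

    # Whatever is left after removing the normal part lies in the surface.
    shear_vec = [f - normal_mag * u for f, u in zip(force, unit)]
    shear_mag = math.sqrt(sum(c * c for c in shear_vec))
    return normal_mag, shear_mag


def contact_angle_deg(normal_mag, shear_mag):
    """Angle between the force and the surface normal, in degrees."""
    return math.degrees(math.atan2(shear_mag, abs(normal_mag)))
```

A purely perpendicular press gives 0 degrees, and the angle grows toward 90 degrees as the contact becomes more of a sideways drag.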

We’ve not covered all that many optical sensing projects, but here’s one using a linear CIS sensor to turn any TV into a touch screen. And whilst we’re talking about using cameras as sensors, here’s a neat way to use optical fibers to read multiple light-gates with a single camera and OpenCV.
