Speech Recognition On An Arduino Nano?

Like most of us, [Peter] had a bit of extra time on his hands during quarantine and decided to take a look back at speech recognition technology in the 1970s. Quickly, he started thinking to himself, “Hmm…I wonder if I could do this with an Arduino Nano?” We’ve all probably had similar thoughts, but [Peter] really put his theory to the test.

The hardware itself is pretty straightforward: an Arduino Nano runs the speech recognition algorithm, and a MAX9814 microphone amplifier captures the voice commands. The real beauty of [Peter’s] approach lies in his software, which splits the work between a custom PC program he wrote and the Nano itself. The learning phase of his algorithm runs on the PC, while recognition runs in real time on the Arduino Nano, a typical split for machine learning deployed on a microcontroller. To capture sample audio commands, or utterances, [Peter] first had to optimize the Nano’s ADC so he could get sufficient sample rates for speech processing. With a bit of low-level programming, he achieved a sample rate of 9 ksps, which is plenty fast for audio processing.
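To give a feel for the sort of low-level tweak involved (this is a generic ATmega328 trick, not necessarily [Peter]’s exact code), shrinking the ADC prescaler shortens each conversion so the Nano has CPU time left over for the filtering work:

```cpp
// Minimal sketch, assuming an ATmega328-based Nano with the MAX9814 on A0:
// drop the ADC prescaler from 128 to 16 so each conversion finishes quickly
// (at a small cost in accuracy), then pace samples at roughly 9 ksps.
const int MIC_PIN = A0;

void setup() {
  ADCSRA = (ADCSRA & ~0x07) | 0x04;   // ADPS2:0 = 100 -> ADC clock = 16 MHz / 16
}

void loop() {
  static unsigned long nextSample = micros();
  nextSample += 111;                            // ~111 us per sample, about 9 ksps
  while ((long)(micros() - nextSample) < 0) {}  // crude pacing loop
  int sample = analogRead(MIC_PIN);
  // ...feed 'sample' into the segmenting and band-filtering code...
}
```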

To analyze the utterances, he first divided each sample utterance into 50 ms segments. Think of dividing a single spoken word into its syllables, like analyzing the “se-” in “seven” separately from the “-ven.” 50 ms might be too long or too short to capture each syllable cleanly, but hopefully that gives you a good mental picture of what [Peter’s] program is doing. He then calculated the energy in five different frequency bands for every segment of every utterance. Normally that’s done with a Fourier transform, but the Nano doesn’t have enough processing power to compute one in real time, so [Peter] took a different approach. Instead, he implemented five sets of digital bandpass filters, allowing him to more easily compute the energy of the signal in each frequency band.
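As a rough sketch of that filter-bank idea (the coefficients, segment length, and structure here are placeholders, not [Peter]’s actual filter design), each sample runs through one band-pass filter per band and the squared output is accumulated over a 50 ms segment:

```cpp
// Sketch of the band-energy idea: one biquad band-pass per band, with the
// filter output squared and summed over each 50 ms segment.
struct Biquad {
  float b0 = 0, b1 = 0, b2 = 0, a1 = 0, a2 = 0;  // fill in per-band coefficients
  float x1 = 0, x2 = 0, y1 = 0, y2 = 0;          // filter state
  float process(float x) {
    float y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x; y2 = y1; y1 = y;
    return y;
  }
};

const int NUM_BANDS = 5;
const int SAMPLES_PER_SEGMENT = 450;   // 50 ms at 9 ksps
Biquad bands[NUM_BANDS];
float energy[NUM_BANDS];

void processSegment(const int *samples) {
  for (int b = 0; b < NUM_BANDS; b++) energy[b] = 0;
  for (int i = 0; i < SAMPLES_PER_SEGMENT; i++) {
    float x = samples[i] - 512.0f;     // remove the DC offset of the 10-bit ADC
    for (int b = 0; b < NUM_BANDS; b++) {
      float y = bands[b].process(x);
      energy[b] += y * y;              // accumulate this band's energy
    }
  }
}
```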

The energy of each frequency band for every segment is then sent to a PC, where a custom-written program creates “templates” from the sample utterances he recorded. The crux of his algorithm is comparing how closely the band energies of each segment of an incoming utterance match each template. The PC program produces a .h file that can be compiled directly onto the Nano. His example recognizes the digits 0-9, but you could swap those commands for “start” or “stop,” for example, if you’d like.
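The exact template format lives in that generated .h file, but the comparison step presumably boils down to a nearest-template search; a hypothetical version (sizes and scoring are our guesses, not [Peter]’s) might look like this:

```cpp
// Hypothetical recognition step: sizes, names, and the squared-difference
// scoring are our guesses, not necessarily how the generated .h file works.
const int NUM_WORDS = 10;      // the digits 0-9 in [Peter]'s example
const int NUM_SEGMENTS = 13;   // 50 ms slices per word (made-up figure)
const int NUM_BANDS = 5;

// In the real project this table comes from the PC-generated .h file.
const float templates[NUM_WORDS][NUM_SEGMENTS][NUM_BANDS] = {};

int bestMatch(const float utterance[NUM_SEGMENTS][NUM_BANDS]) {
  int best = -1;
  float bestScore = 3.4e38f;   // start from "worst possible" score
  for (int w = 0; w < NUM_WORDS; w++) {
    float score = 0;
    for (int s = 0; s < NUM_SEGMENTS; s++) {
      for (int b = 0; b < NUM_BANDS; b++) {
        float d = utterance[s][b] - templates[w][s][b];
        score += d * d;        // sum of squared differences
      }
    }
    if (score < bestScore) { bestScore = score; best = w; }
  }
  return best;                 // index of the closest template
}
```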

[Peter] admits that you can’t implement the type of speech recognition on an Arduino Nano that we’ve come to expect from those covert listening devices, but he mentions that small, hands-free gadgets like a head-mounted multimeter could benefit from single-word or single-phrase voice commands. And maybe it could put your mind at ease knowing everything you say isn’t immediately getting beamed into the cloud and handed to our AI overlords. Or maybe we’re all starting to get used to this. Whatever your position on the current state of AI, hopefully you’ve gained some inspiration for your next project.

Self-Driving Or Mind Control? Which Do You Prefer?

We know you love a good biohack as much as we do, so we thought you would like [Tony’s] brainwave-controlled RC truck. Instead of building his own electroencephalogram (EEG), he opted for NeuroSky’s MindWave. EEGs are pretty complex, multi-frequency waves that require some fairly sophisticated circuitry and even more sophisticated signal processing to interpret. So [Tony] thought it would be nice to offload a bit of that heavy lifting, and luckily for him, the MindWave headset is fairly hacker-friendly.

EEGs are a very active area of research, so some of the finer details of the signal are still being debated. However, it appears that attention can be quantified by measuring alpha waves, the EEG content between 8 and 10 Hz, and it seems that eye blinks can be picked out of the EEG as well. Conveniently, the MindWave exports these energy levels to an accompanying smartphone application, which [Tony] then links to his Arduino over Bluetooth using the ever-so-popular HC-05 module.
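We can’t vouch for the exact message format [Tony]’s phone app sends over that link, but assuming it forwards simple lines such as "ATT:63", reading the attention level on the Arduino side is about this involved:

```cpp
// Hypothetical wire format ("ATT:<0-100>" lines): not [Tony]'s actual protocol,
// just a sketch of reading forwarded MindWave values through an HC-05.
#include <SoftwareSerial.h>

SoftwareSerial bt(10, 11);   // RX, TX wired to the HC-05 (assumed pins)
int attention = 0;           // latest attention level, 0-100

void setup() {
  bt.begin(9600);            // HC-05 default baud rate
}

void loop() {
  if (bt.available()) {
    String line = bt.readStringUntil('\n');
    if (line.startsWith("ATT:")) attention = line.substring(4).toInt();
  }
  // ...use 'attention' to decide steering and throttle...
}
```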

To control the car, he utilized the existing remote control instead of making his own. Like most people, [Tony] first thought about wiring the Arduino pins straight to the buttons on the remote, bypassing the physical buttons, but the buttons were a bit smaller than he was comfortable soldering to and he didn’t want to risk damaging the circuit board. [Tony’s] RC truck has a pistol-grip transmitter, which inspired a slightly different approach. He mounted a servo onto the controller’s wheel mechanism, letting him steer the truck by rotating the wheel with the servo, and fitted another servo to the transmitter so it can depress the throttle trigger as it rotates. We thought that was a pretty nifty workaround.
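On the Arduino side, the actuation end can be as simple as two hobby servos driven from the parsed MindWave values; the mapping below (pins, angles, and thresholds included) is our own illustration rather than [Tony]’s code:

```cpp
// Our own illustrative mapping, not [Tony]'s code: pin numbers, angles, and
// the attention threshold are all assumptions.
#include <Servo.h>

Servo steeringServo;   // bolted to the pistol-grip transmitter's wheel
Servo throttleServo;   // rotates against the throttle trigger

void setup() {
  steeringServo.attach(5);
  throttleServo.attach(6);
}

void drive(int attention, bool blinkDetected) {
  static int steerAngle = 90;                       // start centered
  if (blinkDetected)                                // a blink toggles turn direction
    steerAngle = (steerAngle == 60) ? 120 : 60;
  steeringServo.write(steerAngle);
  throttleServo.write(attention > 60 ? 140 : 90);   // press or release the trigger
}

void loop() {
  // ...call drive() with the values parsed from the MindWave app...
}
```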

Cool project, [Tony]! We’ve seen some cool EEG Hackaday Prize entries before. Maybe this could be the next big one.

Continue reading “Self-Driving Or Mind Control? Which Do You Prefer?”

AI Makes Linux Do What You Mean, Not What You Say

We are always envious of the Star Trek Enterprise computers. You can just sort of ask them a hazy question and they will — usually — figure out what you want. Even the automatic doors seemed to know the difference between someone walking into a turbolift and someone being thrown into the door during a fight. [River] decided to try his new API keys for the private beta of an AI service to generate Linux commands based on a description. How does it work? Watch the video below and find out.

Some examples work fairly well. In response to “email the Rickroll video to Jeff Bezos,” the system produced a curl command and an e-mail to what we assume is the right place. “Find all files in the current directory bigger than 1 GB” works, too.

Continue reading “AI Makes Linux Do What You Mean, Not What You Say”

Death Of The Turing Test In An Age Of Successful AIs

IBM has come up with an automatic debating system called Project Debater that researches a topic, presents an argument, listens to a human rebuttal and formulates its own rebuttal. But does it pass the Turing test? Or does the Turing test matter anymore?

The Turing test was first introduced in 1950, often cited as year one for AI research. It asks, “Can machines think?” Today we’re more interested in machines that can intelligently make restaurant recommendations, drive our car along the tedious highway to and from work, or identify the surprising-looking flower we just stumbled upon. These all fit the definition of AI as a machine that can perform a task normally requiring the intelligence of a human. Though as you’ll see below, Turing’s test wasn’t really a test of intelligence or even of thinking, but rather a way to determine a test subject’s sex.

Continue reading “Death Of The Turing Test In An Age Of Successful AIs”

An AI-Free Way To Catch Wildlife On Camera

Judging by the over-representation of the term “AI” in our news feeds these days, we’re clearly in the exponential phase of the artificial intelligence hype cycle, and very nearly at the dreaded “Peak of Inflated Expectations.” It seems like there’s nothing that AI can’t do, and nowhere that its principles can’t be applied to virtuous — and profitable — effect.

We don’t deny that AI has massive potential, but we strongly suspect that there will soon come a day when eyes will roll and stomachs will turn at yet another AI application that could have been addressed with something far simpler. A case in point is this non-AI wildlife photo trap, cobbled together by [Sebastian] to capture pictures of some camera-shy squirrels. Rather than train an AI on gigabytes of squirrel images, he instead relies on his old Sony Alpha camera, which has built-in WiFi. A Python script connects to the camera, which is trained on a feeder box and set to a very narrow depth of field. That keeps most of the scene out of focus until a squirrel or other animal comes along looking for treats. The script uses a Laplacian operator in OpenCV to detect the increased area of the scene that is suddenly in focus, and triggers the camera shutter. [Sebastian] ended up with some wonderful shots of the shy squirrels using this scheme; the video below describes the setup in more detail.
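The heart of that trick fits in a few lines. Here is the focus-detection idea sketched with OpenCV’s C++ API (the actual project is a Python script that also handles the Sony camera’s WiFi remote API, and the thresholds below are our own guesses):

```cpp
// Focus-trigger idea only: run a Laplacian over each preview frame and count
// how many pixels show strong edges, i.e. how much of the scene is in focus.
#include <opencv2/opencv.hpp>

bool somethingInFocus(const cv::Mat &frameBgr, double edgeThresh = 40.0,
                      double areaFrac = 0.02) {
  cv::Mat gray, lap, mask;
  cv::cvtColor(frameBgr, gray, cv::COLOR_BGR2GRAY);
  cv::Laplacian(gray, lap, CV_16S, 3);   // second derivative highlights sharp edges
  cv::convertScaleAbs(lap, lap);         // |Laplacian| as an 8-bit image
  cv::threshold(lap, mask, edgeThresh, 255, cv::THRESH_BINARY);
  double inFocusFrac = cv::countNonZero(mask) / double(mask.total());
  return inFocusFrac > areaFrac;         // enough sharp area -> fire the shutter
}
```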

It’s not the first time we’ve seen the Laplacian used to gauge image sharpness, of course, but we really like the approach [Sebastian] took here for its simplicity. The squirrels are cute too.

Continue reading “An AI-Free Way To Catch Wildlife On Camera”

An ESP Will Read Your Meter For You

As home automation starts to live up to its glossy sci-fi promise, there remains a deficiency when it comes to interfacing between the newer computerised components and legacy items from a previous age. A frequent example in projects on Hackaday is the reading of utility meters, and in that arena [jomjol] has a very neat solution involving an ESP32 camera module and a software neural network to identify meter readings directly.

The ESP and camera sit at the top of a 3D-printed housing that fits over the meter. The clever trick is that each photo’s orientation is determined first, and then not only are the digits read by OCR, but readings are also derived from the small dials and other indicators on the meter face. It’s a very well-thought-out system, with a web-based configuration tool that allows full customisation of the readable zones and how they should be treated.
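The project’s own firmware does the recognition with a neural network on the ESP32 itself; purely as an illustration of the zone-based approach (with a hypothetical classifyDigit() standing in for that network), the read step amounts to straightening the photo and cropping each configured region:

```cpp
// Illustration only: rotate the snapshot by the configured angle, then crop
// each user-defined digit zone and hand it to a classifier. classifyDigit()
// is a stand-in for the project's neural network, not a real API.
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

int classifyDigit(const cv::Mat &roi) { /* stand-in for the neural network */ return 0; }

std::string readMeter(const cv::Mat &photo, double angleDeg,
                      const std::vector<cv::Rect> &digitZones) {
  // Undo the camera's mounting rotation so the configured zones line up.
  cv::Point2f center(photo.cols / 2.0f, photo.rows / 2.0f);
  cv::Mat rot = cv::getRotationMatrix2D(center, angleDeg, 1.0);
  cv::Mat aligned;
  cv::warpAffine(photo, aligned, rot, photo.size());

  std::string reading;
  for (const cv::Rect &zone : digitZones)
    reading += std::to_string(classifyDigit(aligned(zone)));  // one digit per zone
  return reading;
}
```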

This project makes full use of the ESP32’s capabilities, and the attention to detail that has gone into making it usable is particularly impressive. It certainly raises the bar over previous OCR meter-reading projects.

[Thanks for the tip, Sascha!]

AI Learns To Drive Trackmania

Machine learning has long been a topic of interest for humanity, but only in recent years have we had broad access to enough computing power to enable the average person to dive in. [Yosh] recently decided to put an AI to work learning how to race in Trackmania.

After early experiments with supervised learning, [Yosh] decided to implement a genetic algorithm to produce an AI to drive in the game. The AI takes distances from the track walls as inputs and produces steering and accelerator values as outputs. Starting with 100 AIs in generation 1, [Yosh] iterated by choosing the AIs that covered the longest distance in 13 seconds. Once the AIs started to get the hang of the first few corners, he changed the training to instead prioritize the lowest time taken to reach each of the checkpoints along the track.
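The survival-of-the-fittest loop at the core of an approach like this is pleasantly simple; a toy version (our own sketch, not [Yosh]’s code or the actual Trackmania interface) might look like:

```cpp
// Toy genetic-algorithm step: each genome holds the weights of a small net
// mapping wall distances to steer/throttle. Keep the fittest drivers and
// mutate copies of them to form the next generation.
#include <algorithm>
#include <random>
#include <vector>

struct Driver {
  std::vector<float> weights;   // net weights: wall distances -> steer/throttle
  float fitness = 0;            // distance covered (later: checkpoint times)
};

std::vector<Driver> nextGeneration(std::vector<Driver> pop, std::mt19937 &rng) {
  // Rank by fitness and keep the top 10% as parents.
  std::sort(pop.begin(), pop.end(),
            [](const Driver &a, const Driver &b) { return a.fitness > b.fitness; });
  size_t parents = std::max<size_t>(1, pop.size() / 10);
  std::normal_distribution<float> noise(0.0f, 0.1f);
  std::uniform_int_distribution<size_t> pick(0, parents - 1);

  std::vector<Driver> next;
  for (size_t i = 0; i < pop.size(); i++) {
    Driver child = pop[pick(rng)];                   // copy a surviving parent
    for (float &w : child.weights) w += noise(rng);  // small random mutation
    child.fitness = 0;                               // fitness re-measured next run
    next.push_back(child);
  }
  return next;
}
```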

The AI improved over time, and over 100 generations, got down to a 23.48s time on the test track, versus 19.63s for [Trabadia], a talented human. We’d love to see how much better the AI could do with more training. [Yosh] is trying more experiments, like providing extra feedback in the AI fitness function to keep it from hitting the walls. It’s not the first time we’ve seen a genetic algorithm used to train a racing AI, either. Video after the break.

Continue reading “AI Learns To Drive Trackmania”