Speech Recognition On An Arduino Nano?

Like most of us, [Peter] had a bit of extra time on his hands during quarantine and decided to take a look back at speech recognition technology in the 1970s. Quickly, he started thinking to himself, “Hmm…I wonder if I could do this with an Arduino Nano?” We’ve all probably had similar thoughts, but [Peter] really put his theory to the test.

The hardware itself is pretty straightforward. There is an Arduino Nano to run the speech recognition algorithm and a MAX9814 microphone amplifier to capture the voice commands. However, the beauty of [Peter’s] approach, lies in his software implementation. [Peter] has a bit of an interplay between a custom PC program he wrote and the Arduino Nano. The learning aspect of his algorithm is done on a PC, but the implementation is done in real-time on the Arduino Nano, a typical approach for really any machine learning algorithm deployed on a microcontroller. To capture sample audio commands, or utterances, [Peter] first had to optimize the Nano’s ADC so he could get sufficient sample rates for speech processing. Doing a bit of low-level programming, he achieved a sample rate of 9ksps, which is plenty fast for audio processing.

To analyze the utterances, he first divided each sample utterance into 50 ms segments. Think of dividing a single spoken word into its different syllables. Like analyzing the “se-” in “seven” separate from the “-ven.” 50 ms might be too long or too short to capture each syllable cleanly, but hopefully, that gives you a good mental picture of what [Peter’s] program is doing. He then calculated the energy of 5 different frequency bands, for every segment of every utterance. Normally that’s done using a Fourier transform, but the Nano doesn’t have enough processing power to compute the Fourier transform in real-time, so Peter tried a different approach. Instead, he implemented 5 sets of digital bandpass filters, allowing him to more easily compute the energy of the signal in each frequency band.

The energy of each frequency band for every segment is then sent to a PC where a custom-written program creates “templates” based on the sample utterances he generates. The crux of his algorithm is comparing how closely the energy of each frequency band for each utterance (and for each segment) is to the template. The PC program produces a .h file that can be compiled directly on the Nano. He uses the example of being able to recognize the numbers 0-9, but you could change those commands to “start” or “stop,” for example, if you would like to.

[Peter] admits that you can’t implement the type of speech recognition on an Arduino Nano that we’ve come to expect from those covert listening devices, but he mentions small, hands-free devices like a head-mounted multimeter could benefit from a single word or single phrase voice command. And maybe it could put your mind at ease knowing everything you say isn’t immediately getting beamed into the cloud and given to our AI overlords. Or maybe we’re all starting to get used to this. Whatever your position is on the current state of AI, hopefully, you’ve gained some inspiration for your next project.

FIR Filters For Xilinx

Digital filters are always an interesting topic, and they are especially attractive with FPGAs. [Pabolo] has been working with them in a series of blog posts. The latest covers an 8th order FIR filter in Verilog.  He covers some math, which you can find in many places, but he also shows how an implementation maps to DSP slices in a device. Then to reduce the number of slices, he illustrates folding which trades delay time for slice usage.

Folding takes a multi-stage parallel multiplication and breaks it into fewer multiplications done over a longer period of time. This reuses slices to reduce the number required for high-order filters.

Continue reading “FIR Filters For Xilinx”

DSP Spreadsheet: The Goertzel Algorithm Is Fourier’s Simpler Cousin

You probably have at least a nodding familiarity with the Fourier transform, a mathematical process for transforming a time-domain signal into a frequency domain signal. In particular, for computers, we don’t really have a nice equation so we use the discrete version of the transform which takes a series of measurements at regular intervals. If you need to understand the entire frequency spectrum of a signal or you want to filter portions of the signal, this is definitely the tool for the job. However, sometimes it is more than you need.

For example, consider tuning a guitar string. You only need to know if one frequency is present or if it isn’t. If you are decoding TouchTones, you only need to know if two of eight frequencies are present. You don’t care about anything else.

A Fourier transform can do either of those jobs. But if you go that route you are going to do a lot of math to compute things you don’t care about just so you can pick out the one or two pieces you do care about. That’s the idea behind the Goertzel. It is essentially a fast Fourier transform algorithm stripped down to compute just one frequency band of interest.  The math is much easier and you can usually implement it faster and smaller than a full transform, even on small CPUs.

Continue reading “DSP Spreadsheet: The Goertzel Algorithm Is Fourier’s Simpler Cousin”

Learning SDR And DSP Hack Chat

Join us on Wednesday, November 11th at noon Pacific for Learning SDR and DSP Hack Chat with Marc Lichtman!

“Revolution” is a term thrown about with a lot less care than it probably should be, especially in fields like electronics. It’s understandable, though — the changes to society that have resulted from the “Transistor Revolution” or the “PC Revolution” or more recently, the “AI Revolution” have been transformative, often for good and sometimes for ill. The common thread, though, is that once these revolutions came about, nothing was ever the same afterward.

Such is the case with software-defined radio (SDR) and digital signal processing (DSP). These two related fields may not seem as transformative as some of the other electronic revolutions, but when you think about it, they really have transformed the world of radio communications. SDR means that complex radio transmitters and receivers, no longer have to be implemented strictly in hardware as a collection of filters, mixers, detectors, and amplifiers; instead, they can be reduced to a series of algorithms running on a computer.

Teamed with DSP, SDR has resulted in massive shifts in the RF field, with powerful, high-bandwidth radio links being built into devices almost as an afterthought. But the concepts can be difficult to wrap one’s head around, at least when digging beyond the basics and really trying to learn how SDR and DSP work. Thankfully, Dr. Marc Lichtman, an Adjunct Professor at the University of Maryland, literally wrote the book on the subject. “PySDR: A Guide to SDR and DSP using Python” is a fantastic introduction to SDR and DSP that’s geared toward those looking to learn how to put SDR and DSP to work in practical systems. Dr. Lichtman will stop by the Hack Chat to talk about his textbook, to answer your questions on how best to learn about SDR and DSP, and to discuss what the next steps are once you conquer the basics.

join-hack-chatOur Hack Chats are live community events in the Hackaday.io Hack Chat group messaging. This week we’ll be sitting down on Wednesday, November 11 at 12:00 PM Pacific time. If time zones baffle you as much as us, we have a handy time zone converter.

Click that speech bubble to the right, and you’ll be taken directly to the Hack Chat group on Hackaday.io. You don’t have to wait until Wednesday; join whenever you want and you can see what the community is talking about.

[Banner image credit: Dsimic, CC BY-SA 4.0, via Wikimedia Commons]

Continue reading “Learning SDR And DSP Hack Chat”

Intuition About Signals And Systems

Signals and systems theory is a tough topic. Terms like convolution and impulse response can be hard to understand on a visceral level and most books that talk about these things emphasize math over intuition. [Discretised] has a YouTube channel that already has several videos that promise to tackle these topics with “minimum maths, maximum intuition.” We particularly noticed the talks on convolution and impulse response.

We think that often math and intuition don’t always come together. It is one thing, for example, to know that E=I times R, and power is I times E, but it is another to realize that a half-watt transmitter delivers 5V into a 50Ω load and that one watt will take just over 7V into that same load.

The example used is computing how much smoke you can expect to create by setting off fireworks. We presume the math models are notional since we imagine a real model would be pretty complex and involve things like wind data. But it still makes a nice example.

If you don’t know anything about the topic, these might not be the right ones to try to learn the basics. But we do applaud people sharing their intuition on these complex subjects.

Continue reading “Intuition About Signals And Systems”

DSP Spreadsheet: Talking To Yourself Using IQ

We’ve done quite a bit with Google Sheets and signal processing: we’ve generated signals, created filters, and computed quadrature signals. We can pull all that together into an educational model for two SDRs talking to each other, but it’s going to require two parts: modulation and demodulation. Guess what? We can do that with a spreadsheet.

The first step is to generate a reference clock for the carrier. You’ll need a cosine wave (I) and sine wave (Q). Of course, you also need the time base. That’s columns A-C in the spreadsheet and works like other signal generation we’ve seen.

Continue reading “DSP Spreadsheet: Talking To Yourself Using IQ”

DSP Spreadsheet: IQ Diagrams

In previous installments of DSP Spreadsheet, we’ve looked at generating signals, mixing them, and filtering them. If you start trying to work with DSP, though, you’ll find a topic that always rears its head: IQ signals. It turns out, these aren’t as hard as they appear at first and, as usual, we’ll tackle them in a spreadsheet.

What does IQ stand for? The I stands for “in phase” and the Q stands for quadrature. By convention, the I signal is a cosine wave and the Q signal is a sine wave. Another way to say that is that the I and Q signals are 90 degrees out of phase. By manipulating the amplitude of I and Q, you can create complex modulation or, conversely, demodulate signals. We’ll see a spreadsheet that shows that completely next time.

Continue reading “DSP Spreadsheet: IQ Diagrams”