Making Linux Offline Voice Recognition Easier

September 25, 2021

For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn’t mean it’s always easy to get everything working, and speech recognition is one of those difficult setups.

A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.

The software can integrate with several backends to do offline speech recognition including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only does the software recognize the words “garage door,” it also understands whether you asked for that door to be opened or closed.

In addition, the tools are all made to work in Unix-style pipelines, which is refreshing. Here’s an example configuration from the project’s website:

[GarageDoor]
open the garage door
close the garage door

[LightState]
turn on the living room lamp
turn off the living room lamp
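
To give a feel for how intents like these might be consumed at the far end of a pipeline, here is a minimal Python sketch. It assumes each input line is a JSON object carrying a "text" field and an "intent" object with a "name" field, which is our understanding of what the intent recognizer emits; check the project documentation before relying on those field names. The handler functions and the strings they return are hypothetical.

```python
import json

def handle_garage(event):
    # Hypothetical handler: pick the action from the recognized words.
    words = event.get("text", "").split()
    return "garage door: " + ("open" if "open" in words else "close")

def handle_lamp(event):
    # Hypothetical handler: "on" or "off" is taken from the recognized words.
    words = event.get("text", "").split()
    return "living room lamp: " + ("on" if "on" in words else "off")

# Intent names match the section headers in the example configuration.
HANDLERS = {"GarageDoor": handle_garage, "LightState": handle_lamp}

def dispatch(line):
    """Parse one JSON line and route it to a handler; None if unrecognized."""
    event = json.loads(line)
    name = (event.get("intent") or {}).get("name")
    handler = HANDLERS.get(name)
    return handler(event) if handler else None

def run(lines):
    """Process an iterable of JSON lines, skipping blanks and unknown intents."""
    results = []
    for line in lines:
        if line.strip():
            result = dispatch(line)
            if result is not None:
                results.append(result)
    return results
```

In a real pipeline you would likely feed `run(sys.stdin)` from the output of the recognizer and print each result, so the script slots in as just another filter.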

There are templating features so you can specify optional words and alternative words in a single rule. Other features include mapping a spoken phrase such as “living room lamp” to something more computer-friendly.
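
To illustrate what a rule with optional and alternative words buys you, here is a toy Python expander. It assumes square brackets mark optional words and parentheses with a vertical bar mark alternatives; treat that syntax as an assumption and consult the project’s documentation for the real template language. This is not the project’s parser, only a sketch of the idea that one rule can stand in for many sentences.

```python
import re

# Matches an innermost [optional] or (alt | alt) group with no nested groups.
GROUP = re.compile(r"\[([^\[\]()]*)\]|\(([^\[\]()]*)\)")

def expand(template):
    """Return every sentence the template can produce, without duplicates."""
    match = GROUP.search(template)
    if match is None:
        # No groups left: collapse repeated spaces and return the sentence.
        return [" ".join(template.split())]
    optional, alternatives = match.group(1), match.group(2)
    if optional is not None:
        # An optional group may be omitted entirely.
        choices = [""] + optional.split("|")
    else:
        choices = alternatives.split("|")
    results = []
    for choice in choices:
        candidate = template[:match.start()] + choice + template[match.end():]
        for sentence in expand(candidate):
            if sentence not in results:
                results.append(sentence)
    return results

if __name__ == "__main__":
    for sentence in expand("turn (on | off) [the] living room lamp"):
        print(sentence)
```

With this sketch, the two LightState lines from the configuration above collapse into a single rule that also tolerates a dropped “the.”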

Overall, this looks like a fun tool to have in your kit. If you do something interesting with it, be sure to drop us a tip so we can cover it. Meanwhile, we’ve been watching Linux speech for quite a while. Of course, what we really want is speech commands like those on the USS Enterprise, and we have to admit it is getting closer.

10 thoughts on “Making Linux Offline Voice Recognition Easier”

Martin says:

September 25, 2021 at 4:20 pm

Can anybody recommend a good offline speech recognition engine that has pretrained models for German?

1. Humpelstilzchen says:
  
  September 25, 2021 at 11:27 pm
  
  Try Vosk (https://alphacephei.com/vosk/), which for unknown reasons is missing from the above list. One of the languages it supports is German. So far I have only tried the English model, but I had good results with it:
  
  https://hackaday.io/project/25406-wild-thumper-based-ros-robot/log/190631-higher-accuracy-speech-recognition-in-ros-with-vosk
  
Drone says:

September 25, 2021 at 8:25 pm

@Martin said: “Can anybody recommend a good offline speech recognition engine that has pretrained models for German?”

Maybe…

—-[Mozilla DeepSpeech & German]—-

* Search for: “Mozilla DeepSpeech German” without the ” “s:

https://duckduckgo.com/?q=Mozilla+DeepSpeech+German&t=ffab&ia=web

* DeepSpeech for German Language

https://discourse.mozilla.org/t/deepspeech-for-german-language/36527/14

* German End-to-end Speech Recognition based on DeepSpeech

https://www.researchgate.net/publication/336532830_German_End-to-end_Speech_Recognition_based_on_DeepSpeech

* AASHISHAG / deepspeech-german

https://github.com/AASHISHAG/deepspeech-german

* ynop / deepspeech-german

—-[Mozilla DeepSpeech in-General]—-

* mozilla / DeepSpeech

https://github.com/mozilla/DeepSpeech

* Welcome to DeepSpeech’s documentation!

https://deepspeech.readthedocs.io/en/r0.9/?badge=latest

Metal3d says:

September 25, 2021 at 11:31 pm

For speech recognition with an easy approach in Python, you can use the Vosk API, which uses Kaldi. It’s offline and it works great.

1. rpavlik says:
  
  September 26, 2021 at 3:54 am
  
  And for dictation, the best I’ve found is nerd-dictation, which builds on Vosk.
  
ncrmnt says:

September 25, 2021 at 11:31 pm

The best thing for building assistants I’ve found so far is Rhasspy. I plan on doing a proper writeup of setting up my instance once I’m finished. I still have to make that reaspeakerd work on recent Debian (without DOA, its performance is sad) and figure out wakeword training.

Jacob Soby says:

September 26, 2021 at 2:05 pm

“Illuminate”
“deluminate”

1. Menga says:
  
  September 26, 2021 at 8:03 pm
  
  I always wonder why this epic voice command wasn’t included with Alexa or Google Home.
  
Gio says:

September 27, 2021 at 5:08 am

What about Simon & Blather? Haven’t tried them yet but Simon, for instance, seems pretty good and simple enough to use. Although, I’m not sure it’s still developed. The last version I could find was 0.4x, released in 2017 but it was supposed to lead to v0.5, which I can’t find.

Miles says:

July 17, 2022 at 8:23 am

I have been wanting to dive into a voice recognition project for home use for many years now. However, the only thing I can find that seems to work uses a third party for the translating. What I would like is something that is considered private (NO GOOGLE). I have been using Linux for several years now and had lots of fun learning from it, so I am not exactly a beginner, but after several hours of trying to understand how to install and set up such a machine, it seems to be quite difficult. Nothing I have found on the internet explains the steps needed to set up such a machine; the guides always end or start with creating a Google account. I am not worried about how complex the setup might be, but more about basic instructions that do not require a degree in rocket science. So if someone were able to write a human-readable tutorial on any one of the popular open source projects, I am sure that project would become quite popular overall, as I have tried to read and understand the jargon (quickly getting lost in web searches). It seems there is definitely a need to simplify the installation process.


Hackaday