Talking Star Trek

June 8, 2016

Speech generation and recognition have come a long way. It wasn’t that long ago that we were in a breakfast place and endured 30 minutes of a teenaged girl screaming “CALL JUSTIN TAYLOR!” into her phone repeatedly, with no results. Now speech on phones is good enough that you might never use the keyboard unless you want privacy. Every time we ask Google or Siri a question and get an answer, it makes us feel like we are living in Star Trek.

[Smcameron] probably feels the same way. He’s been working on a Star Trek-inspired bridge simulator called “Space Nerds in Space” for some time. He decided to test out the current state of Linux speech support by adding speech commands and response to it. You can see the results in the video below.

For speech output, he used pico2wave and espeak. There’s also Festival, but he couldn’t get that one working. He also used PocketSphinx for speech recognition and explains how to pretrain the system on specific words so that it responds better. In the video, you can see that it isn’t perfect, but it is pretty good.

The other part of the equation is recognizing natural language and [Smcameron] discusses that, as well. If this were a real starship, you might need to do a little work on the user interface to be sure you heard the right thing before taking some drastic action (“I said: ‘blow warp drive’ not ‘go warp five’!”).
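To make the "be sure you heard the right thing" idea concrete, here is a minimal sketch in C of a confirmation gate. It is not taken from Space Nerds in Space: the command table, the `drastic` flag, and the `decide()` function are all hypothetical names invented for illustration. The idea is simply that a recognized phrase flagged as drastic is read back for confirmation instead of being executed right away.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command table; none of these entries come from the game. */
struct voice_command {
    const char *phrase;
    bool drastic; /* true: ask for a spoken confirmation before acting */
};

static const struct voice_command commands[] = {
    { "go warp five", false },
    { "raise shields", false },
    { "blow warp drive", true },
};

enum action { ACTION_UNKNOWN, ACTION_EXECUTE, ACTION_ASK_CONFIRM };

/*
 * Decide what to do with the text the recognizer produced.
 * `confirmed` is true once the crew has answered "yes" to a read-back
 * of the command.
 */
static enum action decide(const char *heard, bool confirmed)
{
    for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
        if (strcmp(heard, commands[i].phrase) == 0) {
            if (commands[i].drastic && !confirmed)
                return ACTION_ASK_CONFIRM;
            return ACTION_EXECUTE;
        }
    }
    return ACTION_UNKNOWN;
}
```

With this table, `decide("go warp five", false)` executes immediately, while `decide("blow warp drive", false)` asks for confirmation first. A real system would probably match fuzzily rather than with `strcmp`, which is exactly why the read-back step matters.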

If your system isn’t as powerful as a full Linux box, consider uSpeech for the Arduino. You might also check out Jasper.

34 thoughts on “Talking Star Trek”

DainBramage says:

June 8, 2016 at 4:27 pm

Fascinating…

1. Shannon says:
  
  June 9, 2016 at 3:08 am
  
  Indeed
  
Bear Naff says:

June 8, 2016 at 5:36 pm

I love the “kitchen sink” approach that Stephen has decided to take with this project. I admit that I kinda miss the primitive wireframe graphics that the game used in early stages, but it’s fun to watch unusual and unique features make it into production.

kris says:

June 8, 2016 at 7:21 pm

Good to see the state of the open source art…it is coming along

not ready for prime time, or my time for that matter, just yet

localroger says:

June 8, 2016 at 7:25 pm

Speech-recognitionwise, that was less than impressive.

1. PurpleTurdBracket says:
  
  June 8, 2016 at 8:02 pm
  
  Sorry to hear that you are disappointed. Would you like your money back?
  
  1. notarealemail says:
    
    June 9, 2016 at 1:23 am
    
    [https://youtube.com/watch?v=BF1wVv8OnfE]
    Life is a lemon…
    
bthy says:

June 8, 2016 at 8:35 pm

i love humor in code, lolled when i saw this in the server main()

take_your_locale_and_shove_it();

1. smcameron says:
  
  June 8, 2016 at 10:59 pm
  
  Ha. smcameron here. This was to work around a problem I ran into reading, iirc, ascii wavefront obj files. Those files contain floats. I was parsing them with sscanf. sscanf expects different things for floats depending on locale, sometimes periods for decimals, sometimes commas, which is kind of crazy since the data files I am reading come with the program, and I really don’t want per-locale files. This is a starship simulator, we’re post-national here, right? No problem, though, just set the locale with setlocale() at the beginning of your program, right? Well, no, because various libraries seem to keep calling setlocale(), many many times, and undoing your setlocale() (I’m looking at you, gtk). I got fed up with this nonsense, why is sscanf behaving so stupidly anyway? Why is gtk behaving so stupidly? Fine, children, you want to call setlocale? Fine. So I override it once and for all, and anybody who calls setlocale will get a locale for sure, and that locale will be “C”, and all will be right with the world.
  
  1. bthy says:
    
    June 9, 2016 at 9:23 am
    
    oh i’ve been on that ride before, that’s why i immediately lolled. Thanks for replying, you have a really cool project going on, i hope you continue to develop it. I was looking over the server code after reading the contribution.md, and i would love to contribute at some point, i might join up later.
    
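For readers who have not hit the locale problem smcameron describes in this thread, here is a minimal sketch in C of one way to keep `sscanf` parsing floats with a period as the decimal separator. It is not his actual fix: he says he overrides `setlocale()` itself, whereas this sketch just switches `LC_NUMERIC` to "C" around a single parse and then restores it. The function name `parse_obj_vertex_c_locale` and the 128-byte buffer size are assumptions made for illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one Wavefront OBJ vertex line ("v x y z")
 * with the numeric locale forced to "C", so "1.5" is always read as
 * one and a half regardless of what other libraries set earlier.
 * Returns 1 if three floats were parsed, 0 otherwise.
 */
static int parse_obj_vertex_c_locale(const char *line,
                                     float *x, float *y, float *z)
{
    char saved[128];
    int have_saved = 0;
    int ok;
    const char *current;

    assert(line && x && y && z);

    /* Copy the current locale name; later setlocale() calls may
     * overwrite the string the library returned. */
    current = setlocale(LC_NUMERIC, NULL);
    if (current != NULL && strlen(current) < sizeof(saved)) {
        strcpy(saved, current);
        have_saved = 1;
    }

    setlocale(LC_NUMERIC, "C");
    ok = (sscanf(line, "v %f %f %f", x, y, z) == 3);

    /* Put back whatever the rest of the program expected. */
    if (have_saved)
        setlocale(LC_NUMERIC, saved);
    return ok;
}
```

Note that `setlocale()` changes process-wide state and is not safe to call while other threads are using locale-dependent functions, so this approach only suits single-threaded loading code. That limitation may be part of why a blanket override looked attractive.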
bthy says:

June 8, 2016 at 8:46 pm

lots of treasures in smcameron’s repositories, like this procedural spaceship generator that uses OpenScad
https://www.youtube.com/watch?v=VnyerXljmrQ

daid303 says:

June 8, 2016 at 10:30 pm

Shameless plug, we (some guys at Ultimaker) built our own Star Trek-like bridge simulator as well. It’s FOSS, and can be found at: http://emptyepsilon.org/ and we suck at putting movies online :-)
It’s 2 years in the making. And, it has an HTTP API, which can be used for all kinds of things.

I tried the same thing as SNIS did, voice control, but with the Google Chrome voice API, as the HTTP API is easy to access from JavaScript. However, and I’m not sure if it’s my Dutch accent, I found the voice recognition results to be quite bad.

1. Leithoa says:
  
  June 9, 2016 at 5:36 am
  
  Don’t feel bad. Chekov had the same issues.
  
RoGeorge says:

June 8, 2016 at 11:17 pm

Voice recognition running on a local machine is a joke.
I tried it 15 years ago, then again last year. Almost no difference. Unusable for a real world application.

Cloud processing for voice recognition is a little better, but still unreliable.

So, for the moment, I am stuck with touch screens everywhere. Oh, how I miss physical buttons…
But there is hope with Soli: https://www.youtube.com/watch?v=0QNiZfSsPc0

:o)

1. Dan#1438459043 says:
  
  June 9, 2016 at 12:15 am
  
  Yes that project was amazing, what happened to it?
  
  As for local voice recognition it works much better if there is a limited vocabulary and the system has been trained/biased accordingly.
  
  My biggest problem with voice is that the environment here is sometimes very noisy and chaotic with machines and multiple people; it is even hard to have a phone conversation at times. So I agree sophisticated gesture input would be very nice, and I expect that complex hand signatures could serve as a form of password too. Just sign your name in mid air, or whatever.
  
  1. Bear Naff says:
    
    June 9, 2016 at 5:46 am
    
    It appears to still be demo-ing: https://www.cnet.com/videos/project-soli-the-best-thing-we-saw-at-google-io-2016/
    
2. Elliot Williams says:
  
  June 9, 2016 at 2:23 am
  
  My keyboard never misunderstands me.
  
  1. Jerry says:
    
    June 9, 2016 at 8:24 am
    
    My kybard constnatly misiunderstands me…
    
    1. notarealemail says:
      
      June 10, 2016 at 1:09 am
      
      May speak wreck a nation his tear able. Eye knead two train hit butter.
      
3. lwatcdr says:
  
  June 9, 2016 at 6:04 am
  
  Simple: use local speech recognition to trigger the device, like saying Hello Siri, Okay Google, or Hello Jane, to activate the cloud-based speech recognition. That way you can have your own always-listening speech recognition system.
  Of course once you have the speech to text you might want to use something like https://adapt.mycroft.ai/ or https://opennlp.apache.org/ and maybe throw in http://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html
  just for fun.
  
AltMarcxs says:

June 9, 2016 at 7:17 am

Commercial products work quite well, especially when trained, and I’m far from a native English speaker.
Just tried the standard OSX mt-lion ASR on the content of the video, via internal mic, and it works almost better than the video itself (“warp” isn’t a word)!
The big trouble for Open Source is the lack of enough voice profiles, and that is where cloud systems win, because their voice profiles grow with each sentence recognized. But privacy…
What I hope for is a system that can talk and sound like me, so I could loop its output with the original speech to get way better recognition. Or scam Google, Siri… with boring cooking recipes or Marxist books between my personal requests.

1. chronoglass says:
  
  June 9, 2016 at 8:26 am
  
  exactly, it’s all about the training. I’m actually sort of surprised there isn’t a proper open source ASR “cloud” option yet. Of course I say that as a person who would USE it, not MAINTAIN it.. so yeah
  
  1. AltMarcxs says:
    
    June 9, 2016 at 9:54 am
    
    A GOOD headset is a must too. About open source -> get involved: http://www.voxforge.org/home
    
Miroslav says:

June 9, 2016 at 9:16 am

So how does the cloud recognition work? Some guys in India listening to what you say? =ducks =

1. Leithoa says:
  
  June 9, 2016 at 10:03 am
  
  No, they’re in Virginia.
  
  1. Miroslav says:
    
    June 9, 2016 at 1:06 pm
    
    ROFL :)
    
Whatnot says:

June 9, 2016 at 3:13 pm

“Every time we ask Google or Siri a question and get an answer it makes us feel like we are living in Star Trek.”

I think there is an episode in the original series where they go into a parallel universe where everybody is a fascist mean duplicate.
That’s probably the place where they would have Google/Siri/Cortana then.

1. PurpleTurdBracket says:
  
  June 9, 2016 at 3:19 pm
  
  Try again, in English.
  
  1. Whatnot says:
    
    June 9, 2016 at 5:12 pm
    
    “Every time we ask Google or Siri a question and get an answer it makes us feel like we are living in Star Trek.”
    
    Creo que hay un episodio de la serie original donde van en un universo paralelo donde todo el mundo es un medio fascista duplicado.
    Eso es probablemente el lugar donde tendrían que Google / Siri / Cortana a continuación.
    
    1. PurpleTurdBracket says:
      
      June 9, 2016 at 5:15 pm
      
      Spanish gibberish is still gibberish.
      
      1. Whatnot says:
        
        June 9, 2016 at 5:17 pm
        
        If you are unable to read plain English perhaps some type of professional help like a teacher can aid you.
        Perhaps that is also better than to complain to others about your inabilities.
        
      2. PurpleTurdBracket says:
        
        June 9, 2016 at 5:20 pm
        
        If you are unable to write plain English perhaps some type of professional help like a teacher can aid you.
        Perhaps that is also better than to inflict onto others your inabilities.
        
      3. dontfeedthetroll says:
        
        June 10, 2016 at 1:03 am
        
        This troll is purple.
        
    2. Whatnot says:
      
      June 9, 2016 at 5:15 pm
      
      Addendum:
      I’m just trying a type of English you might mean. If it’s another type you need please use an online translator yourself.
      
      Complaints about it to complaints@google.com please, thanks.
      
