IEEE Spectrum had an interesting post covering several companies trying to sell voice programming interfaces. Not programming APIs for speech recognition, but the replacement of the traditional text editor to produce programs.
The companies, Serenade and Talon, have very different styles. Serenade has fairly normal-sounding language, whereas Talon has you use very specific phrases and can even use eye tracking to figure out what you are looking at when you issue a command. There’s also mention of two open-source products (Aenae and Caster) that require you to use a third-party speech engine.
For an example of Talon’s input, imagine you want this line of code in your program:
You’d say this out loud: “Phrase name op equals snake extract word paren mad.” Not exactly how Star Trek envisioned voice programming.
For accessibility, this might be workable. It is hard for us to imagine a room full of developers all talking to make their computers enter C or Python code. Until we can say, “Computer, build a graphic using the data in file hackaday-27,” we think this is not going to go mainstream.
The actual speech recognition part is pretty much a commodity now. Making a reasonable set of guesses about what people will say and what they mean by it is something else. It seems like this works best when you have a very specific and limited vocabulary, like operating a 3D printer.