For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn’t mean it’s always easy to get it working, and speech recognition is just one of those difficult setups.
A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn’t provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.
The software can integrate with several backends to do offline speech recognition including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only do you know there is a garage door, but you gain an understanding of the opening and closing of the garage door.
One of the nice things about the Unix philosophy that Linux inherited is that the filesystem is very modular. That’s good, too, because a typical system might want a choice of filesystems like ext4, reiserfs, btrfs, and even network systems like nfs. Besides that, there are fake file systems like /sys and /dev that help Linux make everything look like a file. The downside is that building a filesystem required changing the kernel or, at least, writing a loadable module. That’s not as hard as it sounds, but it is a little more difficult than writing a normal program. Then came FUSE — file system in user space. This is a single file system module that allows you to create new file systems by writing ordinary code.
Back in 2019 we first came across the mutantC, an open source 3D printable Raspberry Pi handheld created by [rahmanshaber] that took more than a little inspiration from Sony’s VAIO ultra-mobile PCs (UMPCs) from the early 2000s. It was an impressive first effort, but it clearly had a long way to go before it could really be a practical mobile device.
Well after two years of development and three iterative versions of this Linux powered QWERTY slider, [rahmanshaber] is ready to show off the new and improved mutantC_v4. Outwardly it looks quite similar to the original version, with the notable addition of a tiny thumbstick and a pair of programmable buttons on the right side that can be used for input in addition to the touch screen. But inside it’s a whole other story, with so many changes and improvements that we hardly even know where to start.
Inside the mutantC_v4, showing off the ESP32-S2
Probably the most notable improvement is the addition of an ESP32-S2, specifically a bare ESP-12K module, to the main PCB. Previous versions of the hardware used an Arduino Pro Micro to interface with all the hardware, but the added horsepower of the ESP32 should come in handy with the array of sensors, controls, and NeoPixels that [rahmanshaber] has tasked the chip with. There’s even a buzzer and a coin-style vibration motor in there to provide some feedback to the user. While the board has changed significantly, it still retains compatibility with the Pi Zero, 2, 3, and 4.
Another notable addition is the expansion connector on the bottom of the handheld that has pins for I2C, UART, and 3.3 V. In the video below, [rahmanshaber] mentions that this feature was previously implemented with a standard 2×6 female header block, but is now using a far slimmer female USB-C port. We do wonder if it’s not a bit confusing to have this faux-USB port right next to the real one that’s actually used to charge the system, but with such cramped quarters occasionally you’ve got to make some tough decisions like that.
If you ever think about it, computers are exceedingly stupid. Even the most powerful CPU can’t do very much. However, it can do what it does very rapidly and repeatably. Computers are so fast, they can appear to do a lot of things at once, too and modern computers have multiple CPUs to further enhance their multitasking abilities. However, we often don’t write programs or shell scripts to take advantage of this. However, there’s no reason for this, as you’ll see.
Bash Support
It is surprisingly easy to get multiple processes running under Bash. For one thing, processes on either side of a pipe run together, and that’s probably the most common way shell scripts using multiprogramming. In other words, think about what happens if you write ls | more.
Under the old MSDOS system, the first program would run to completion, spooling its output to a temporary file. Then the second program will run, reading input from the same file. With Linux and most other modern operating systems, both programs will run together with the input of the second program connected to the first program’s output. Continue reading “Linux Fu: Walk, Chew Gum”→
These days we expect even the cheapest of burner smartphones to feature a multi-core processor, at least a gigabyte of RAM, and a Linux-based operating system. But obviously those sort of specs are unnecessary for an old school POTS desktop phone. Well, that’s what we thought. Then [Josh Max] wrote in to tell us about his adventures in hacking the CaptionCall, and now we’re eager to see what the community can do with root access on a surprisingly powerful Linux phone.
As the names implies, the CaptionCall is a desk phone with an LCD above the keypad that shows real-time captions. Anyone in the United States with hearing loss can get one of these phones for free from the government, so naturally they sell for peanuts on the second hand market. Well, at least they did. Then [Josh] had to go ahead and crack the root password for the ARMv7 i.MX6 powered phone, started poking around inside of its 4 GB of onboard NAND, and got the thing running DOOM.
Tapping into the serial port.
If you’re interested in the technical details, [Josh] has done a great job taking us step by step through his process. It’s a story that will be at least somewhat familiar to anyone who’s played around with embedded Linux devices, and unsurprisingly, starts with locating a serial port header on the PCB.
Finding the environment variables to pretty tightly locked down, he took the slow-route and dumped the phone’s firmware 80 characters at a time with U-Boot’s “memory display” command. Passing the recovered firmware image through binwalk and a password cracker got him the root credentials in short order, and from there, that serial port got a whole lot more useful.
[Josh] kicked the phone’s original UI to the curb, set up an ARM Debian Jessie chroot, and started working his way towards a fully functional Linux environment. With audio, video, and even keypad support secured, he was ready to boot up everyone’s favorite 1993 shooter. He’s been kind enough to share his work in a GitHub repository, and while it might not be a turn-key experience, all the pieces are here to fully bend the hardware to your will.
Historically, running DOOM on a new piece of hardware has been the harbinger of bigger and better things to come. With unfettered access to its Linux operating system up for grabs, we predict the CaptionCall is going to become a popular hacking target going forward, and we can’t wait to see it.
The life of a Linux user can be a bit difficult. Sometimes you have to — or want to — run Windows. Why Windows? Sometimes you have a work computer or a laptop that Linux doesn’t support well. Or it might be software. Although there are plenty of programs that can edit, say, Word documents, there’s always that one document that doesn’t quite translate correctly. Things like videoconferencing software sometimes works on Linux but might have fewer features.
So what do you do? You can dual boot, of course, but that’s not very handy. You can run Windows in a virtual machine if you have enough horsepower. There’s also Wine, but that often has its own set of problems with features and stability of complex programs. However, recent versions of Windows provide the Windows Subsystem for Linux (WSL).
With WSL, you can have most of what you like about Linux inside your Windows session. You just have to know how to set it up, and I’m going to show you one way that works for me with reasonably stable versions of Windows 10. Continue reading “Linux Fu: The Windows X11 Connection”→
The SMART Response XE is a handheld computer that was originally sold for use in the classroom as a terminal for pupils taking tests. It’s now cheap enough on the surplus market to have become a target for experimenters, and we’ve seen them with a variety of cool hacks. We particularly like what [chmod775] has done with it, putting a VT100 terminal emulator on the device and hiding a NanoPi Neo Air single board computer in the battery bay. Powered from a USB battery bank, it gives a fully-featured Linux terminal in the palm of the hand. We see it running an Ubuntu LTS version, and it’s clear that it’s a functional and usable device.
This raises a more abstract question though: We’d guess comparatively few of us write software through an old-style dumb terminal, instead we’re more likely to get our terminal experience at a much more accomplished command line with all the conveniences of a modern desktop surrounding it. How many of us could comfortably return to the limited confines of a VT100 emulator on an odd-sized LCD display? We’d be interested to hear [chmod755]’s experiences using it, because if it retains usability it’s a device we wouldn’t mind having ourselves.