There was an all-too-brief period in history where typewriters went from clunky, purely mechanical beasts to streamlined, portable electromechanical devices. But when the 80s came around and the PC revolution started, the typewriting was on the wall for these machines, and by the 90s everyone had a PC, a printer, and Microsoft Word. And thus the little daisy-wheel typewriters began to populate thrift shops all over the world.
That’s fine with us, because it gave [Arvind Sanjeev] a chance to build “Ghostwriter”, an AI-powered automatic typewriter. The donor machine was a clapped-out Brother electronic typewriter, which needed a bit of TLC to bring it back to working condition. From there, [Arvind] worked out the keyboard matrix and programmed an Arduino to drive the typewriter, both read and write. A Raspberry Pi running the OpenAI Python API for GPT-3 talks to the Arduino over serial, which basically means you can enter a GPT writing prompt with the keyboard and have the machine spit out a dead-tree version of the results.
To mix things up a bit, [Arvind] added a pair of pots to control the creativity and length of the response, plus an OLED screen which seems only to provide some cute animations, which we don’t hate. We also don’t hate the new paint job the typewriter got, but the jury is still out on the “poetry” that it typed up. Eye of the beholder, we suppose.
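For the curious, here is a minimal sketch of how two pot readings might be turned into request parameters. We are assuming a 10-bit ADC reading (0 to 1023, as on a typical Arduino) and assuming that "creativity" maps to a sampling temperature and "length" to a token limit; the function names and ranges are ours for illustration, not taken from the Ghostwriter firmware.

```python
def scale(reading, out_min, out_max, adc_max=1023):
    """Linearly map a raw ADC reading (0..adc_max) onto [out_min, out_max]."""
    reading = max(0, min(adc_max, reading))  # clamp noisy or out-of-range readings
    return out_min + (out_max - out_min) * reading / adc_max


def pots_to_params(creativity_raw, length_raw):
    """Turn two raw pot readings into hypothetical text-generation settings."""
    return {
        # Assumption: the "creativity" knob drives sampling temperature (0.0 to 1.0).
        "temperature": round(scale(creativity_raw, 0.0, 1.0), 2),
        # Assumption: the "length" knob drives a token limit (16 to 256).
        "max_tokens": int(scale(length_raw, 16, 256)),
    }


params = pots_to_params(512, 1023)
print(params)
```

The clamp is there because real pots are noisy at the ends of their travel, and a reading that strays outside the expected range should not produce an out-of-range setting.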
Whatever you think of GPT’s capabilities, this is still a neat build and a nice reuse of otherwise dead-end electronics. Need a bit more help building natural language AI into your next project? Our own [Donald Papp] will get you up to speed on that.
It should be no surprise that running untrusted code in a GitHub Actions workflow can have unintended consequences. It’s a killer feature, to automatically run through a code test suite whenever a pull request is opened. But that pull request is run in some part of the target’s development environment, and there have been a few clever attacks found over the years that take advantage of that. There’s now another one, which Legit Security calls GitHub Environment Injection, and there were some big-name organizations vulnerable to it.
The crux of the issue is the $GITHUB_ENV file, which contains environment variables to be set in the Actions environment. Individual variables get added to this file as part of the automated action, and that process needs to include some sanitization of data. Otherwise, an attacker can send a value that includes a newline followed by a completely unintended environment variable. And an unintended, arbitrary environment variable is game over for the security of the workflow. The example uses the NODE_OPTIONS variable to dump the entire environment to an accessible output. Any API keys or other secrets are revealed.
This particular attack was reported to GitHub, but there isn’t a practical way to fix it architecturally. So it’s up to individual projects to be very careful about writing untrusted data into the $GITHUB_ENV file.
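What might "being careful" look like? Here is a minimal sketch, assuming a workflow step written in Python that appends NAME=value lines to the file. The helper name and the reject-on-newline policy are our own choices for illustration; in a real workflow the file path comes from the GITHUB_ENV environment variable, while the demo below writes to a temporary file instead.

```python
import os
import re
import tempfile

_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def append_env_var(env_file, name, value):
    """Append NAME=value to a GITHUB_ENV-style file, refusing multi-line values."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid variable name: {name!r}")
    # A newline in the value would start a new line in the file, and with it
    # a second variable chosen by whoever supplied the value.
    if "\n" in value or "\r" in value:
        raise ValueError("refusing to write a multi-line value")
    with open(env_file, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def demo():
    """Write one benign value, then try an injection attempt."""
    with tempfile.TemporaryDirectory() as tmp:
        env_file = os.path.join(tmp, "github_env")
        append_env_var(env_file, "PR_TITLE", "Fix typo in README")
        try:
            append_env_var(env_file, "PR_TITLE", "x\nNODE_OPTIONS=--require ./evil.js")
            rejected = False
        except ValueError:
            rejected = True
        with open(env_file, encoding="utf-8") as fh:
            return rejected, fh.read()


rejected, contents = demo()
print(rejected)
print(contents, end="")
```

Rejecting outright is the blunt option. It has the advantage that there is no escaping logic to get subtly wrong.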
[Georgi Gerganov] recently shared a great resource for running high-quality AI-driven speech recognition in a plain C/C++ implementation on a variety of platforms. The automatic speech recognition (ASR) model is fully implemented using only two source files and requires no dependencies. As a result, the high-quality speech recognition doesn’t involve calling remote APIs, and can run locally on different devices in a fairly straightforward manner. The image above shows it running locally on an iPhone 13, but it can do more than that.
[Georgi]’s work is a port of OpenAI’s Whisper model, a remarkably robust piece of software that does a truly impressive job of turning human speech into text. Whisper is easy to set up and play with, but this port makes it easier to get the system working in other ways. Having such a lightweight implementation of the model means it can be more easily integrated into a variety of different platforms and projects.
The usual way that OpenAI’s Whisper works is to feed it an audio file, and it spits out a transcription. But [Georgi] shows off something else that might start giving hackers ideas: a simple real-time audio input example.
By using a tool to stream audio and feed it to the system every half-second, one can obtain pretty good (sort of) real-time results! This of course isn’t an ideal method, but the robustness and accuracy of Whisper are such that the results look pretty great nevertheless.
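The half-second feeding idea can be sketched as a rolling window over incoming samples. This is not the project's actual code: the window length, the function names, and the stand-in transcriber are all our assumptions. The 16 kHz sample rate is the rate Whisper works with.

```python
from collections import deque

SAMPLE_RATE = 16_000   # samples per second
STEP_SECONDS = 0.5     # how often a window is handed to the recognizer


def rolling_windows(samples, window_seconds=5.0,
                    step_seconds=STEP_SECONDS, sample_rate=SAMPLE_RATE):
    """Yield the most recent window of audio once per step of new samples."""
    step = int(step_seconds * sample_rate)
    window = deque(maxlen=int(window_seconds * sample_rate))
    pending = 0
    for sample in samples:
        window.append(sample)          # the oldest samples fall off automatically
        pending += 1
        if pending == step:
            pending = 0
            yield list(window)


def fake_transcribe(window):
    """Hypothetical stand-in for a real speech-to-text call."""
    return f"<{len(window)} samples>"


# Two seconds of dummy audio, one-second window, half-second steps.
windows = list(rolling_windows(range(SAMPLE_RATE * 2), window_seconds=1.0))
for w in windows:
    print(fake_transcribe(w))
```

Each window overlaps the previous one, so the recognizer keeps re-hearing recent audio with a bit more context. That is a large part of why this approach is only "sort of" real time.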
You can watch a quick demo of that in the video just under the page break. If it gives you some ideas, head over to the project’s GitHub repository and get hackin’!
Should you wish to try high-quality voice recognition without buying something, good luck. Sure, you can borrow the speech recognition on your phone or coerce some virtual assistants on a Raspberry Pi to handle the processing for you, but those aren’t good for major work, or for anything you don’t want tied to some closed-source solution. OpenAI has introduced Whisper, which they claim is an open source neural net that “approaches human level robustness and accuracy on English speech recognition.” It appears to work on at least some other languages, too.
If you try the demonstrations, you’ll see that talking fast or with a lovely accent doesn’t seem to affect the results. The post mentions it was trained on 680,000 hours of supervised data. If you were to talk that much to an AI, it would take you 77 years without sleep!
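The back-of-the-envelope math is easy to check, assuming 24 hours of talking per day and 365 days per year:

```python
TRAINING_HOURS = 680_000
HOURS_PER_YEAR = 24 * 365  # no sleep, no leap years

years = TRAINING_HOURS / HOURS_PER_YEAR
print(f"{years:.1f} years")
```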
[Max Woolf] sometimes struggles to create ideal headlines for his blog posts, and decided to apply his experience with machine learning to the problem. He asked: could an AI be trained to optimize his blog titles? It is a fascinating application of natural language processing, and [Max] explains all about what it does and how it works.
The model [Max] uses is GPT-3, a language model that works with natural-seeming human language and can be tweaked in different ways. [Max] uses OpenAI’s GPT-3 API (which, by the way, is much easier to experiment with than one might think) and here is the basic workflow for his title optimizer:
The optimizer takes as input a blog post title to optimize.
OpenAI’s pre-trained GPT-3 engine is used to generate six alternate titles.
For each of those alternate titles, a fine-tuned version of GPT-3 is consulted to judge how “good” they are based on custom training data. (“Good” in this context means “similar to titles of successful submissions on Hacker News”, but more on that in a moment.)
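The three steps above can be sketched as a generate-then-rank loop. Everything here is a stand-in: fake_generate and fake_score are hypothetical placeholders for the two GPT-3 calls, and the toy scoring rule (shorter is better) has nothing to do with what the fine-tuned model actually learned.

```python
def optimize_title(title, generate, score, n=6):
    """Generate n alternate titles and return (score, title) pairs, best first."""
    candidates = generate(title, n)
    scored = [(score(candidate), candidate) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def fake_generate(title, n):
    """Hypothetical stand-in for the pre-trained model's title generation."""
    templates = [
        "{t}",
        "A Closer Look at {t}",
        "Everything About {t}",
        "{t}, Explained",
        "The Case for {t}",
        "Notes on {t}",
    ]
    return [template.format(t=title) for template in templates[:n]]


def fake_score(title):
    """Hypothetical stand-in for the fine-tuned judge; toy rule only."""
    return 1.0 / len(title)


ranked = optimize_title("Typewriter Robots", fake_generate, fake_score)
for value, candidate in ranked:
    print(f"{value:.3f}  {candidate}")
```

Passing the two models in as plain functions keeps the workflow itself easy to test without an API key.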
Codex is an AI intended to translate natural language into code. [Alexander] had slightly different ideas, however, and created codex_py2cpp as a way to play with the idea of automagically converting Python into C++. It’s not really intended to create robust code conversions, but as far as experiments go, it’s pretty neat.
The program works by reading a Python script as an input file, setting up a few parameters, then making a request to OpenAI’s Codex API for the conversion. It then attempts to compile the result. If compilation is successful, then hopefully the resulting executable actually works the same way the input file did. If not? Well, learning is fun, too. If you give it a shot, maybe start simple and don’t throw it too many curveballs.
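The read, convert, compile loop described above might look something like this sketch. It is not the codex_py2cpp source: fake_translate is a hypothetical stand-in for the Codex request (it ignores its input and returns a fixed program), and g++ is our assumed compiler.

```python
import pathlib
import shutil
import subprocess
import tempfile


def fake_translate(python_source):
    """Hypothetical stand-in for the Codex API request."""
    return (
        "#include <iostream>\n"
        "int main() { std::cout << \"hello\" << std::endl; return 0; }\n"
    )


def write_cpp(python_source, translate, workdir):
    """Convert the Python source and save the result as out.cpp."""
    cpp_path = pathlib.Path(workdir) / "out.cpp"
    cpp_path.write_text(translate(python_source), encoding="utf-8")
    return cpp_path


def compile_command(cpp_path, compiler="g++"):
    """Build the compiler invocation for a generated source file."""
    return [compiler, str(cpp_path), "-o", str(cpp_path.with_suffix(""))]


def try_compile(cpp_path, compiler="g++"):
    """Return True if the file compiles; False on errors or a missing compiler."""
    if shutil.which(compiler) is None:
        return False
    result = subprocess.run(compile_command(cpp_path, compiler),
                            capture_output=True, text=True)
    return result.returncode == 0


workdir = tempfile.mkdtemp()
cpp_path = write_cpp("print('hello')", fake_translate, workdir)
cmd = compile_command(cpp_path)
print(" ".join(cmd))
```

Calling try_compile(cpp_path) afterwards reports whether the generated file builds. A successful build only shows that the output is valid C++, not that it behaves like the original script.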
Want your next project to trash talk? Dynamically rewrite boring log messages as sci-fi technobabble? Happily (or grudgingly) answer questions? That sort of thing and more can be done with OpenAI’s GPT-3, a natural language prediction model with an API that is probably a lot easier to use than you might think.
In fact, if you have basic Python coding skills, or even just the ability to craft a curl command, you have just about everything you need to add this ability to your next project. It’s not free in the long run, although initial use is free on signup, but for personal projects the costs will be very small.
OpenAI has an API that provides access to GPT-3, a machine learning model with the ability to perform just about any task that involves understanding or generating natural-sounding language.
OpenAI provides some excellent documentation as well as a web tool through which one can experiment interactively. First, however, one must create an account and receive an API key. After that is done, the doors are open.
Creating an account also gives one a number of free credits that can be used to experiment with ideas. Once the free trial is used up or expires, using the API will cost money. How much? Not a lot, frankly. Everything sent to (and received from) the API is broken into tokens, and pricing is from $0.0008 to $0.06 per thousand tokens. A thousand tokens is roughly 750 words, so small projects are really not a big financial commitment. My free trial came with 18 USD of credits, of which I have so far barely managed to spend 5%.
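To put those numbers in perspective, here is a small sketch that turns word counts into rough costs using only the figures quoted above. The 750-words-per-thousand-tokens ratio is an approximation, so treat the results as ballpark estimates.

```python
WORDS_PER_1K_TOKENS = 750   # rough ratio quoted above
PRICE_LOW = 0.0008          # USD per thousand tokens, cheapest tier
PRICE_HIGH = 0.06           # USD per thousand tokens, priciest tier


def estimate_cost(words, price_per_1k_tokens):
    """Estimate the USD cost of sending or receiving a given number of words."""
    thousands_of_tokens = words / WORDS_PER_1K_TOKENS
    return thousands_of_tokens * price_per_1k_tokens


def words_for_budget(budget_usd, price_per_1k_tokens):
    """Estimate how many words a budget covers at a given price."""
    return budget_usd / price_per_1k_tokens * WORDS_PER_1K_TOKENS


spent = 18 * 0.05  # 5% of the 18 USD free trial
print(f"Spent so far: ${spent:.2f}")
print(f"Cost of 750 words at the top rate: ${estimate_cost(750, PRICE_HIGH):.2f}")
print(f"Words covered by that spend at the top rate: {words_for_budget(spent, PRICE_HIGH):,.0f}")
```

Even at the most expensive rate, the portion of the trial spent so far covers on the order of ten thousand words.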
Let’s take a closer look at how it works, and what can be done with it!
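As a first taste, here is a sketch that builds (but does not send) a text completion request using only the Python standard library. The endpoint URL and JSON field names follow OpenAI's completions API as we understand it. The model name is a placeholder and YOUR_API_KEY is obviously not a real key, so check the current documentation before relying on any of it.

```python
import json
import urllib.request

# Assumed endpoint; confirm against OpenAI's documentation.
API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(api_key, prompt, model,
                             max_tokens=64, temperature=0.7):
    """Build a POST request for a text completion without sending it."""
    body = {
        "model": model,              # placeholder; pick one from the docs
        "prompt": prompt,
        "max_tokens": max_tokens,    # upper bound on the response length
        "temperature": temperature,  # higher values give less predictable text
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_completion_request(
    "YOUR_API_KEY",
    "Rewrite this log message as sci-fi technobabble: disk almost full",
    model="MODEL_NAME_HERE",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it and spend tokens, which is why the sketch stops short of that.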