Natural Language AI In Your Next Project? It’s Easier Than You Think

Want your next project to trash talk? Dynamically rewrite boring log messages as sci-fi technobabble? Happily (or grudgingly) answer questions? You can do that sort of thing and more with OpenAI’s GPT-3, a natural language prediction model with an API that is probably a lot easier to use than you might think.

In fact, if you have basic Python coding skills, or even just the ability to craft a curl statement, you have just about everything you need to add this ability to your next project. Initial use is free on signup, and while it isn’t free in the long run, for personal projects the costs will be very small.

Basic Concepts

OpenAI has an API that provides access to GPT-3, a machine learning model with the ability to perform just about any task that involves understanding or generating natural-sounding language.

OpenAI provides some excellent documentation as well as a web tool through which one can experiment interactively. First, however, one must create an account and receive an API key. After that is done, the doors are open.

Creating an account also gives one a number of free credits that can be used to experiment with ideas. Once the free trial is used up or expires, using the API will cost money. How much? Not a lot, frankly. Everything sent to (and received from) the API is broken into tokens, and pricing is from $0.0008 to $0.06 per thousand tokens. A thousand tokens is roughly 750 words, so small projects are really not a big financial commitment. My free trial came with 18 USD of credits, of which I have so far barely managed to spend 5%.
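To give a sense of just how little code is involved, here is a minimal sketch in Python that asks the API to rewrite a log message as technobabble. It uses the requests library against the completions endpoint; the model name, prompt, and parameters below are just examples, so check the current documentation for what fits your project (and budget) best.

```python
import os
import requests

API_URL = "https://api.openai.com/v1/completions"
API_KEY = os.environ["OPENAI_API_KEY"]  # the secret key from your OpenAI account page

def technobabble(log_line):
    """Ask GPT-3 to rewrite a boring log message as sci-fi technobabble."""
    payload = {
        "model": "text-davinci-002",  # example model; pick one that suits your budget
        "prompt": f"Rewrite this log message as dramatic sci-fi technobabble:\n{log_line}\n",
        "max_tokens": 60,
        "temperature": 0.8,           # higher values give more creative output
    }
    headers = {"Authorization": f"Bearer {API_KEY}"}
    response = requests.post(API_URL, json=payload, headers=headers, timeout=30)
    response.raise_for_status()
    return response.json()["choices"][0]["text"].strip()

print(technobabble("ERROR: disk usage at 91%"))
```

A curl one-liner against the same endpoint works just as well, which is all some projects will ever need.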

Let’s take a closer look at how it works, and what can be done with it!

Continue reading “Natural Language AI In Your Next Project? It’s Easier Than You Think”

TapType: AI-Assisted Hand Motion Tracking Using Only Accelerometers

The team from the Sensing, Interaction & Perception Lab at ETH Zürich, Switzerland have come up with TapType, an interesting text input method that relies purely on a pair of wrist-worn devices that sense acceleration values when the wearer types on any old surface. The acceleration values from a pair of sensors on each wrist are fed into a Bayesian-inference neural network classifier, which in turn feeds a traditional probabilistic language model (predictive text, to you and me), and the resulting text can be input at up to 19 WPM with 0.6% average error. Expert TapTypers report speeds of up to 25 WPM, which could be quite usable.
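We don’t have the team’s model to poke at, but the general idea of fusing a noisy motion classifier with a language model is easy to sketch. In the toy Python below, every probability is invented purely for illustration: each tap comes with a guess from a pretend accelerometer classifier, a tiny character bigram table stands in for the predictive-text stage, and a greedy decoder picks whichever character best balances the two.

```python
import math

# Toy output from the motion classifier: for each tap, the probability of each
# candidate character (all numbers invented for illustration).
tap_probs = [
    {"h": 0.5, "j": 0.3, "n": 0.2},
    {"i": 0.6, "k": 0.4},
]

# Toy character bigram model standing in for the predictive-text stage.
bigram = {
    ("<s>", "h"): 0.20, ("<s>", "j"): 0.05, ("<s>", "n"): 0.10,
    ("h", "i"): 0.30, ("h", "k"): 0.01,
    ("j", "i"): 0.10, ("j", "k"): 0.02,
    ("n", "i"): 0.15, ("n", "k"): 0.02,
}

def decode(taps, lm, floor=1e-6):
    """Greedily pick the character that best balances sensor and language evidence."""
    prev, out = "<s>", []
    for dist in taps:
        best = max(dist, key=lambda c: math.log(dist[c]) + math.log(lm.get((prev, c), floor)))
        out.append(best)
        prev = best
    return "".join(out)

print(decode(tap_probs, bigram))  # prints "hi"
```

The real system presumably does something considerably smarter, but the division of labor is the same: the sensors narrow the possibilities down, and the language model breaks the ties.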

Details are a little scarce (it is a research project, after all), but the actual hardware seems simple enough, based around the Dialog DA14695, a nice Cortex-M33-based Bluetooth Low Energy SoC. This is an interesting device in its own right, containing a “sensor node controller” block that is capable of handling sensor devices connected to its interfaces, independent of the main CPU. The sensor device used is the Bosch BMA456 3-axis accelerometer, which is notable for its low current consumption of a mere 150 μA.

Users can “type” on any convenient surface.

The wristband units themselves appear to be a combination of a main PCB hosting the BLE chip and supporting circuitry, connected to a flex PCB with an accelerometer at each end. The assembly was then slipped into a flexible wristband, likely constructed from 3D-printed TPU, but we’re just guessing really, as the progression from the first embedded platform to the wearable prototype is unclear.

What is clear is that the wristband itself is just a dumb data-streaming device, and all the clever processing is performed on the connected device. Training of the system (and subsequent selection of the most accurate classifier architecture) was performed by having volunteers “type” on an A3-sized keyboard image, their finger movements tracked with a motion-capture camera whilst the acceleration data streams from both wrists were recorded. There are a few more details in the published paper for those interested in digging into this research a little deeper.

The eagle-eyed may remember something similar from last year, from the same team, which correlated bone-conduction sensing with VR type hand tracking to generate input events inside a VR environment.

Continue reading “TapType: AI-Assisted Hand Motion Tracking Using Only Accelerometers”

Audio Eavesdropping Exploit Might Make That Clicky Keyboard Less Cool

Despite their claims of innocence, we all know that the big tech firms are listening to us. How else to explain the sudden appearance of ads related to something we’ve only ever spoken about, seemingly in private but always in range of a phone or smart speaker? And don’t give us any of that fancy “confirmation bias” talk — we all know what’s really going on.

And now, to make matters worse, it turns out that just listening to your keyboard clicks could be enough to decode what’s being typed. To be clear, [Georgi Gerganov]’s “KeyTap3” exploit does not use any of the usual RF-based methods we’ve seen for exfiltrating data from keyboards on air-gapped machines. Rather, it uses just a standard microphone to capture audio while typing, grouping similar-sounding clicks into clusters. By analyzing the clusters against the statistical likelihood of certain sequences of characters appearing together — the algorithm currently assumes standard English, and works best on clicky mechanical keyboards — a reasonable approximation of the original keypresses can be reconstructed.
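[Georgi]’s code is linked above if you want the real thing; just to make that final step concrete, here’s a deliberately dumbed-down Python sketch. It skips the audio processing entirely and assumes the keystrokes have already been grouped into clusters, then assigns letters by matching cluster frequency against English letter frequency. The actual exploit works on n-gram statistics over character sequences, which is considerably more powerful, but the flavor is the same.

```python
from collections import Counter

# English letters ordered roughly from most to least frequent -- the statistical
# prior that lets frequency analysis work at all.
ENGLISH_FREQ_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

def guess_mapping(cluster_ids):
    """Map the most frequent click-cluster to 'e', the next to 't', and so on."""
    ranked = [cid for cid, _ in Counter(cluster_ids).most_common()]
    return {cid: ENGLISH_FREQ_ORDER[i] for i, cid in enumerate(ranked)}

# Pretend the audio front end already grouped twelve keystrokes into clusters 0-3.
clusters = [0, 1, 2, 0, 3, 0, 1, 0, 2, 1, 0, 3]
mapping = guess_mapping(clusters)
print("".join(mapping[c] for c in clusters))
```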

If you’d like to see it in action, check out the video below, which shows the algorithm doing a pretty good job decoding text typed on an unplugged keyboard. Or, try it yourself — the link above implements KeyTap3 in-browser. We gave it a shot, but as a member of the non-mechanical keyboard underclass, it couldn’t make sense of the mushy sounds it heard. Then again, our keyboard inferiority affords us some level of protection from the exploit, so there’s that.

Editor’s Note: We just tried it on a mechanical keyboard with Cherry MX Blue switches and it couldn’t make heads or tails of what was typed, so your mileage may vary. Let us know in the comments if it worked for you.

What strikes us about this is that it would be super simple to deploy an exploit like this. Most side-channel attacks require such a contrived scenario for installing the exploit that just breaking in and stealing the computer would be easier. All KeyTap needs is a covert audio recording, and the deed is done.

Continue reading “Audio Eavesdropping Exploit Might Make That Clicky Keyboard Less Cool”

Review: Vizy Linux-Powered AI Camera

Vizy is a Linux-based “AI camera” based on the Raspberry Pi 4 that uses machine learning and machine vision to pull off some neat tricks, and it has a design centered around hackability. I found it ridiculously simple to get up and running, and it was just as easy to make changes of my own and start getting ideas.

Person and cat with machine-generated tags identifying them
Out of the box, Vizy is only a couple lines of Python away from being a functional Cat Detector project.

I was running pre-installed examples written in Python within minutes, and editing that very same code in about 30 seconds more. Even better, I did it all without installing a development environment, or even leaving my web browser, for that matter. I have to say, it made for a very hacker-friendly experience.
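For a sense of scale, a “couple lines of Python” tweak to a detector-driven example generally looks something like the sketch below. To be clear, this is not Vizy’s actual API; the detection dictionaries and the snap_photo() helper are hypothetical stand-ins. It’s just the shape of the change: take whatever labels the detector hands you and react to the ones you care about.

```python
# Purely illustrative -- not Vizy's actual API. The detection dicts and the
# snap_photo() helper are hypothetical stand-ins for what a real example provides.
def snap_photo():
    print("(pretend we saved the current frame to disk here)")

def handle_detections(detections):
    """React whenever the detector reports a cat in frame."""
    for det in detections:
        if det["label"] == "cat" and det["confidence"] > 0.7:
            snap_photo()
            print(f"Cat detected ({det['confidence']:.0%} confidence)")

# Fake detector output, shaped like a typical object-detection result list.
handle_detections([
    {"label": "person", "confidence": 0.92},
    {"label": "cat", "confidence": 0.88},
])
```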

Vizy comes from the folks at Charmed Labs; this isn’t their first stab at smart cameras, and it shows. They also created the Pixy and Pixy 2 cameras, of which I happen to own several. I have always devoured anything that makes machine vision more accessible and easier to integrate into projects, so when Charmed Labs kindly offered to send me one of their newest devices, I was eager to see what was new.

I found Vizy to be a highly-polished platform with a number of truly useful hardware and software features, and a focus on accessibility and ease of use that I really hope to see more of in future embedded products. Let’s take a closer look.

Continue reading “Review: Vizy Linux-Powered AI Camera”

Amazing “Connect Fore!” Robot Challenges Your Putting Practice

We’ve just come across [Bithead]’s amazing, robotically-automated mashup of miniature golf and Connect Four, which also includes an AI opponent who pulls no punches in its drive to win. Connect Fore! celebrates Scotland — the birthplace of golf, after all — and looks absolutely fantastic.

Scotty, the AI opponent, uses this robotic turret to make its moves in a game of Connect Fore!

The way it works is this: players take turns putting colored balls into one of seven different holes at the far end of the table. Each hole feeds into a clear tube — visible in the middle of the table — that represents one of the columns in a game of Connect Four.

Each player attempts to stack balls in such a way that they create an unbroken line of four in their color, either horizontally, vertically, or diagonally. In a one-player game, a human player faces off against “Scotty”, the computer program that chooses its moves with intelligence and fires balls from a robotic turret.
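None of this is [Bithead]’s code, but for anyone wondering what Scotty actually has to keep track of, the bookkeeping under the physical spectacle is plain old Connect Four: seven columns that balls stack into, and a four-in-a-row check after every drop. A minimal Python sketch of that core logic:

```python
COLS, ROWS = 7, 6  # a standard Connect Four grid

def drop(board, col, color):
    """Drop a ball into a column; returns the row it lands in."""
    if len(board[col]) >= ROWS:
        raise ValueError("column is full")
    board[col].append(color)
    return len(board[col]) - 1

def wins(board, col, row, color):
    """Check whether the ball just dropped at (col, row) completes four in a row."""
    def cell(c, r):
        return board[c][r] if 0 <= c < COLS and 0 <= r < len(board[c]) else None

    for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):  # horizontal, vertical, diagonals
        count = 1
        for sign in (1, -1):  # walk outward in both directions along the line
            c, r = col + sign * dc, row + sign * dr
            while cell(c, r) == color:
                count += 1
                c, r = c + sign * dc, r + sign * dr
        if count >= 4:
            return True
    return False

board = [[] for _ in range(COLS)]
for c in (2, 3, 4):
    drop(board, c, "red")
row = drop(board, 5, "red")
print(wins(board, 5, row, "red"))  # True: four reds across columns 2-5
```

Move selection for an AI opponent typically sits on top of logic like this: simulate a drop in each open column, score the result, and pick the best, though we’ll leave it to the write-up to explain exactly how Scotty does it.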

[Bithead] started this project as a learning experience, and since it is such a complex build, the write-up is extensive. We really recommend reading through the whole thing if you are at all interested in what goes into making such a project work.

What’s particularly interesting is all of the ways in which things nearly worked, or needed nudging or fine adjustment. One might think that reliably getting a ball to enter a hole and roll down a PVC tube wouldn’t be a particularly finicky task, but it turns out that all kinds of things can go wrong.

Even finding the right play surface was a challenge. [Bithead]’s first purchase from Amazon was a total waste: it looked bad, smelled bad, and balls didn’t roll well on it. There are high-quality artificial turfs out there, but the good stuff gets shockingly expensive, and such a small project pretty much pigeonholes one as a nuisance customer when it comes to vendors. The challenges [Bithead] overcame serve as a reminder to keep the 80/20 rule (or Pareto principle) in mind when estimating what will get a project to the finish line.

Right under the page break below is a brief video tour of the completed table, and after that, you can watch a game in action as [Bithead] faces off against Scotty the AI. Curious about the inner workings? The last video has some build details that fill in a few blanks from the write-up.

We’ve seen an automated chess table before, but this is an entirely different, utterly fantastic level of work.
Continue reading “Amazing “Connect Fore!” Robot Challenges Your Putting Practice”

AI-Generated Sleep Podcast Urges You To Imagine Pleasant Nonsense

[Stavros Korokithakis] finds the experience of falling asleep to fairy tales soothing, and this has resulted in a fascinating project that indulges this desire by using machine learning to generate mildly incoherent fairy tales and read them aloud. The result is a fantastic sort of automated, machine-generated audible sleep aid. Even the logo is machine-generated!

The Deep Dreams Podcast is entirely machine-generated, including the logo.

The project leverages the natural language generation abilities of OpenAI’s GPT-3 to create fairytale-style content that is just coherent enough to sound natural, but not quite coherent enough to make a sensible plotline. The quasi-lucid, dreamlike result is perfect for urging listeners to imagine pleasant nonsense (thanks to Nathan W Pyle for that term) as they drift off to sleep.

We especially loved reading about the methods and challenges [Stavros] encountered while creating this project. For example, he talks about how there is more to a good-sounding narration than just pointing a text-to-speech engine at a wall of text and mashing “GO”. A good episode has things like strategic pauses, background music, and audio fades. That’s where pydub — a Python library for manipulating audio — came in handy. As for the speech, text-to-speech quality is leaps and bounds beyond what it was even just a few years ago (and certainly beyond the machine-generated speech of the ’80s), but it still took some work to settle on a voice that best suited the content, and the quality gradually improved as the project went on.
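We don’t know exactly how the episode assembly is wired up, but the touches [Stavros] describes (pauses, a music bed, fades) map neatly onto pydub’s API. A rough sketch, assuming you already have a narration track and a music track on disk, plus ffmpeg installed for the MP3 handling:

```python
from pydub import AudioSegment

# Assumed input files -- swap in your own paths.
narration = AudioSegment.from_file("narration.wav")
music = AudioSegment.from_file("background_music.mp3") - 18  # duck the music by 18 dB

# A strategic pause 10 seconds in, between the intro and the story proper.
pause = AudioSegment.silent(duration=1500)  # milliseconds
episode = narration[:10_000] + pause + narration[10_000:]

# Loop the music bed to cover the whole episode, overlay it, and fade in/out.
bed = (music * (len(episode) // len(music) + 1))[:len(episode)]
episode = episode.overlay(bed).fade_in(2000).fade_out(4000)

episode.export("episode.mp3", format="mp3")
```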

Deep Dreams Podcast has a GitLab repository if you want to see the code that drives it all, and you can go to the podcast itself to give it a listen.

Hackaday Links: March 13, 2022

As Russia’s war on Ukraine drags on, its knock-on effects are being felt far beyond the Eastern European theater. And perhaps nowhere is this felt more acutely than in the space launch industry, seeing that until recently, Russia was pretty much everyone’s go-to ride to orbit. All that has changed now, at least temporarily, and the fallout has expanded to include a halt on sales of rocket engines used in other nations’ launch vehicles. Specifically, Roscosmos has put an end to exports of the RD-180 engine used in the US Atlas V launch vehicle, along with the RD-181 thrusters found in the Antares rocket. The loss of these engines may be more symbolic than practical, at least for the RD-180 — United Launch Alliance stopped selling launches on Atlas V last year, and had already secured the engines it needed for the 29 flights it has booked. Still, there’s some irony in the fact that the Atlas V, which started life as an ICBM aimed at the USSR in the 1950s, has lost its Russian-made engines.

Bad news for Jan Mrázek’s popular open-source parametric search utility, which made JLCPCB’s component library easier to use. We wrote about it back in 2020, and things seemed to be going fine up until this week, when Jan got a takedown request for his service. When we first heard about this, we checked the application’s web page, which bore a big red banner quoting some apparently unpleasant accusations Jan had received, including the words “reptile” and “parasitic.” The banner is still there, but the text has changed to a more hopeful tone, noting that LCSC, the component supplier for JLC’s assembly service, objected to the way Jan was pulling component data, and that they are now working together on something that everyone can be happy with. Here’s hoping that the service is back in action again soon.

Good news, everyone: Epson is getting into the 3D printer business. Eager to add a dimension to the planar printing world they’ve mostly worked in, they’ve announced that they’ll be launching a direct-extrusion printer sometime soon. Aimed at the industrial market, the printer will use a “flat screw extruder,” which is supposed to be similar to what the company uses on its injection molding machines. We sure didn’t know Epson was in the injection molding market, so it’ll be interesting to see if expertise there results in innovation in 3D printing, especially if it trickles down to the consumer printing market. Just as long as they don’t try to DRM the pellets, of course.

You can’t judge a book by its cover, but it turns out that there’s a lot you can tell about a person’s genetics just by looking at their face. At least that’s according to an AI startup called FDNA, which makes an app called “Face2Gene” that the company claims can identify 300 genetic disorders by analyzing photos of someone’s face. Some genetic disorders, like Down syndrome, produce easily recognizable facial features, but other changes are far more subtle and hard to recognize. We had heard of cases where photos of toddlers posted on social media were used to diagnose retinoblastoma, a rare cancer of the retina. But this is on another level entirely.

And finally, working in an Amazon warehouse has got to be a tough gig, and if some of the stories are to be believed, it borders on being a horror show. But one Amazonian recently shared a video that showed what it’s like to get trapped by his robotic coworkers. The warehouse employee somehow managed to get stuck in a maze created by Amazon’s pods, which are stacks of shelves that hold merchandise and are moved around the warehouse floor by what amounts to robotic pallet jacks. Apparently, the robots know enough to not collide with their meat-based colleagues, but not enough to not box them in. To be fair, the human eventually found a way out, but it was a long search and it seems like another pod could have moved into position to block the exit at any time. You could see it as a scary example of human-robot interaction gone awry, but we prefer to look at it as the robots giving their friend a little unscheduled break away from the prying eyes of his supervisor.