Natural Language AI In Your Next Project? It’s Easier Than You Think

Want your next project to trash talk? Dynamically rewrite boring log messages as sci-fi technobabble? Happily (or grudgingly) answer questions? That sort of thing and more can be done with OpenAI’s GPT-3, a natural language prediction model with an API that is probably a lot easier to use than you might think.

In fact, if you have basic Python coding skills, or even just the ability to craft a curl statement, you have just about everything you need to add this ability to your next project. It’s not free in the long run, although signing up does come with some free credits to get started, but for personal projects the costs will be very small.

Basic Concepts

OpenAI has an API that provides access to GPT-3, a machine learning model with the ability to perform just about any task that involves understanding or generating natural-sounding language.

OpenAI provides some excellent documentation as well as a web tool through which one can experiment interactively. First, however, one must create an account and receive an API key. After that is done, the doors are open.

Creating an account also gives one a number of free credits that can be used to experiment with ideas. Once the free trial is used up or expires, using the API will cost money. How much? Not a lot, frankly. Everything sent to (and received from) the API is broken into tokens, and pricing is from $0.0008 to $0.06 per thousand tokens. A thousand tokens is roughly 750 words, so small projects are really not a big financial commitment. My free trial came with 18 USD of credits, of which I have so far barely managed to spend 5%.
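To give a sense of how simple the interface is, here is a minimal sketch of a single completion request from Python. It assumes the openai package as it existed in the GPT-3 era and a model name like text-davinci-002; both the library and the available models change over time, so treat the specifics as illustrative rather than gospel.

import os
import openai

# Assumes your API key is stored in an environment variable rather than hard-coded.
openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-002",    # illustrative model name; pick whatever is current
    prompt="Rewrite this log line as sci-fi technobabble: 'disk usage at 91%'",
    max_tokens=60,               # tokens are also how you keep a lid on spending
    temperature=0.8,             # higher values trade predictability for creativity
)

print(response.choices[0].text.strip())

The equivalent curl statement is little more than a POST to the completions endpoint with the same JSON fields and your API key in an Authorization header.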

Let’s take a closer look at how it works, and what can be done with it!

Continue reading “Natural Language AI In Your Next Project? It’s Easier Than You Think”

Need A Snack From Across Town? Send Spot!

[Dave Niewinski] clearly knows a thing or two about robots, judging from his YouTube channel. Usually the projects involve robot arms mounted on some sort of wheeled platform, but this time it’s to the tune of some pretty famous yellow robot legs, in the shape of Spot from Boston Dynamics. The premise is simple: tell the robot what snacks you want, entirely by voice command, and off he goes to fetch them. But we’re not talking about navigating to the fridge in the same room. We’re talking about trotting out the front door, down the street, and across roads to visit a favorite restaurant. Spot will order the snacks and bring them back, fully autonomously.

Spot’s depth cameras provide localized navigation and object avoidance information
Local AI vision system handles avoiding those pesky moving objects

There are multiple things going on here, all of which are pretty big computational tasks. Firstly, there is no cloud-based voice control, à la Google Assistant or Alexa. The robot works on the premise of full autonomy, which means no internet connectivity for any aspect. All voice recognition, speech-to-text, and speech synthesis are performed locally using the NVIDIA Riva GPU-based AI speech SDK, running on the NVIDIA Jetson AGX Orin carried on Spot’s back. A front-facing webcam supplies the audio feed. The voice recognition application listens for the wake phrase, then turns the snack order into text for later replay once the robot reaches its destination.

Navigation is taken care of with a MicroStrain RTK GNSS module, which has all the needed robustness, such as dual antennas and inertial fallback for regions with a spotty signal. GNSS alone is no use out in the real world, though, which is where Spot’s depth cameras come in. These enable local obstacle avoidance, as per the usual Spot behavior we’ve all seen before. But what about crossing the road without getting tens of thousands of dollars of someone else’s hardware crushed by a passing truck? Spot’s onboard streaming cameras are fed into NVIDIA’s DashCamNet AI model, which enables real-time recognition of moving obstacles such as cars, humans, and anything else that might be wandering around and get in the way.

All in all, it’s a cool project showing the future potential of AI in robotics for important tasks, like fetching me a beer when I most need it, even if it comes from the local corner shop.
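For flavor, here is a very rough sketch of that wake-phrase flow. The real build does its listening and talking through NVIDIA Riva, whose API is not reproduced here; transcribe() and speak() below are hypothetical stand-ins for those calls.

WAKE_PHRASE = "hey spot"

def take_order(transcribe, speak):
    """Wait for the wake phrase, then capture the snack order as plain text."""
    while True:
        heard = transcribe().lower()      # one utterance from the webcam microphone
        if WAKE_PHRASE in heard:
            speak("What would you like?")
            return transcribe()           # keep the order as text for later

def replay_order(order, speak):
    """Replay the stored order via speech synthesis once Spot reaches the shop."""
    speak(f"Hello, I would like to order {order}, please.")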

We love robots around here. Robots can mow your lawn, navigate inside your house with a little help from invisible QR codes, even help out with growing your food. The robot-assisted future, long promised, may now be looking more like the present.

Continue reading “Need A Snack From Across Town? Send Spot!”

A Baudot Code Speaking Chatterbot With A Freakish Twist

[Sam Battle], known on YouTube as [Look Mum No Computer], is primarily a musical artist, but lately seems to have taken a bit of a shine to retro telecoms gear. This latest foray is into the realm of the Minicom TTY device, which was a lifeline for those not blessed with the ability to hear well enough to communicate via telephone. Since, in this modern era of chatting via the internet, it is becoming much harder to actually find another user with a Minicom, [Sam] decided to take the human out of the loop entirely and have the Minicom user talk instead to a Raspberry Pi running an instance of MegaHAL, a 1990s-era chatterbot.

The idea of this build (which became an exhibit in [Sam]’s This Museum Is Not Obsolete) was to have a number of Minicom terminals around the room connected via the internal telephone network (and the retro telephone exchange [Sam] maintains) to a line interface module based upon the Mitel MH88422 chip. This handy device allows a Raspberry Pi to interface to the telephone line and answer calls, with all the usual handshaking taken care of. The audio signal from the Mitel interface is fed to the Pi via a USB audio interface module (since the Pi has no audio input).
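MegaHAL’s real model is considerably more elaborate than this, but the basic flavor of a Markov-chain chatterbot fits in a few lines of Python. To be clear, this is a toy sketch, not the code running in the exhibit.

import random
from collections import defaultdict

def build_model(text, order=2):
    """Map each pair of consecutive words to the words seen following that pair."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def babble(model, length=30):
    """Generate a reply by walking the chain from a random starting state."""
    state = random.choice(list(model.keys()))
    out = list(state)
    for _ in range(length):
        choices = model.get(tuple(out[-len(state):]))
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

corpus = "the quick brown fox jumps over the lazy dog and the lazy dog sleeps all day"
print(babble(build_model(corpus)))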

Continue reading “A Baudot Code Speaking Chatterbot With A Freakish Twist”


Review: Vizy Linux-Powered AI Camera

Vizy is a Linux-based “AI camera” based on the Raspberry Pi 4 that uses machine learning and machine vision to pull off some neat tricks, and has a design centered around hackability. I found it ridiculously simple to get up and running, and it was just as easy to make changes of my own, and start getting ideas.

Out of the box, Vizy is only a couple lines of Python away from being a functional Cat Detector project.

I was running pre-installed examples written in Python within minutes, and editing that very same code in about 30 seconds more. Even better, I did it all without installing a development environment, or even leaving my web browser, for that matter. I have to say, it made for a very hacker-friendly experience.
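To give a flavor of what “a couple lines of Python” means in practice, here is a hypothetical sketch of the sort of filter you might bolt onto a detector’s output to get that Cat Detector going. It deliberately avoids Vizy’s own library (which I’m not reproducing from memory); the point is just how little glue code a project like this needs.

def cats_only(detections, threshold=0.5):
    """Keep only detections labelled 'cat' above a confidence threshold."""
    return [d for d in detections if d["label"] == "cat" and d["score"] >= threshold]

# Hypothetical detector output: label, confidence score, and bounding box.
detections = [
    {"label": "person", "score": 0.98, "box": (12, 40, 200, 380)},
    {"label": "cat",    "score": 0.91, "box": (220, 160, 330, 300)},
]

for cat in cats_only(detections):
    print("Cat spotted with confidence", cat["score"])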

Vizy comes from the folks at Charmed Labs; this isn’t their first stab at smart cameras, and it shows. They also created the Pixy and Pixy 2 cameras, of which I happen to own several. I have always devoured anything that makes machine vision more accessible and easier to integrate into projects, so when Charmed Labs kindly offered to send me one of their newest devices, I was eager to see what was new.

I found Vizy to be a highly-polished platform with a number of truly useful hardware and software features, and a focus on accessibility and ease of use that I really hope to see more of in future embedded products. Let’s take a closer look.

Continue reading “Review: Vizy Linux-Powered AI Camera”

Amazing “Connect Fore!” Robot Challenges Your Putting Practice

We’ve just come across [Bithead]’s amazing, robotically-automated mashup of miniature golf and Connect Four, which also includes an AI opponent who pulls no punches in its drive to win. Connect Fore! celebrates Scotland — the birthplace of golf, after all — and looks absolutely fantastic.

Scotty the AI opponent uses this robotic turret to make its moves in a game of Connect Fore!

The way it works is this: players take turns putting colored balls into one of seven different holes at the far end of the table. Each hole feeds a clear tube, visible in the middle of the table, which represents one of the columns in a game of Connect Four.

Each player attempts to stack balls in such a way that they create an unbroken line of four in their color, either horizontally, vertically, or diagonally. In a one-player game, a human player faces off against “Scotty”, the computer program that chooses its moves with intelligence and fires balls from a robotic turret.
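To make the win condition concrete, here is a minimal, generic four-in-a-row check in Python. It isn’t [Bithead]’s code, just an illustration of the test Scotty (or any Connect Four program) has to run after every ball drops.

ROWS, COLS = 6, 7

def four_in_a_row(board, color):
    """Scan a 6x7 board (board[row][col] holds 'R', 'Y', or None) for four
    of `color` in an unbroken line: horizontal, vertical, or diagonal."""
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(ROWS):
        for c in range(COLS):
            if board[r][c] != color:
                continue
            for dr, dc in directions:
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS and board[rr][cc] == color
                       for rr, cc in cells):
                    return True
    return False

# Example: a vertical stack of four red balls in column 3 wins.
board = [[None] * COLS for _ in range(ROWS)]
for r in range(4):
    board[r][3] = "R"
print(four_in_a_row(board, "R"))   # True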

[Bithead] started this project as a learning experience, and since it is such a complex undertaking, the write-up is extensive. We really recommend reading through the whole thing if you are at all interested in what goes into making such a project work.

What’s particularly interesting is all of the ways in which things nearly worked, or needed nudging or fine adjustment. One might think that reliably getting a ball to enter a hole and roll down a PVC tube wouldn’t be a particularly finicky task, but it turns out that all kinds of things can go wrong.

Even finding the right play surface was a challenge. [Bithead]’s first purchase from Amazon was a total waste: it looked bad, smelled bad, and balls didn’t roll well on it. There are high-quality artificial turfs out there, but the good stuff gets shockingly expensive, and such a small project pretty much pigeonholes one as a nuisance customer when it comes to vendors. The challenges [Bithead] overcame serve as a reminder to keep the 80/20 rule (or Pareto principle) in mind when estimating what will get a project to the finish line.

Right under the page break below is a brief video tour of the completed table, and after that, you can watch a game in action as [Bithead] faces off against Scotty the AI. Curious about the inner workings? The last video has some build details that fill in a few blanks from the write-up.

We’ve seen an automated chess table before, but this is an entirely different, utterly fantastic level of work.
Continue reading “Amazing “Connect Fore!” Robot Challenges Your Putting Practice”

AI-Generated Sleep Podcast Urges You To Imagine Pleasant Nonsense

[Stavros Korokithakis] finds the experience of falling asleep to fairy tales soothing, and this has resulted in a fascinating project that indulges this desire by using machine learning to generate mildly incoherent fairy tales and read them aloud. The result is a fantastic sort of automated, machine-generated audible sleep aid. Even the logo is machine-generated!

The Deep Dreams Podcast is entirely machine-generated, including the logo.

The project leverages the natural language generation abilities of OpenAI’s GPT-3 to create fairytale-style content that is just coherent enough to sound natural, but not quite coherent enough to make a sensible plotline. The quasi-lucid, dreamlike result is perfect for urging listeners to imagine pleasant nonsense (thanks to Nathan W Pyle for that term) as they drift off to sleep.

We especially loved reading about the methods and challenges [Stavros] encountered while creating this project. For example, he talks about how there is more to a good-sounding narration than just pointing a text-to-speech engine at a wall of text and mashing “GO”. A good episode has things like strategic pauses, background music, and audio fades. That’s where pydub — a Python library for manipulating audio — came in handy. As for the speech, text-to-speech quality is beyond what it was even just a few years ago (and certainly leaps beyond machine-generated speech in the 80s) but it still took some work to settle on a voice that best suited the content, and the project gradually saw improvement.
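For a rough idea of what that pydub glue looks like, here is a minimal sketch that stitches two narration clips together with a strategic pause and lays a faded music bed underneath. The file names are made up; the real episode assembly lives in the project’s repository.

from pydub import AudioSegment

narration = AudioSegment.from_file("chapter1.wav")
pause = AudioSegment.silent(duration=2000)              # a two-second strategic pause
episode = narration + pause + AudioSegment.from_file("chapter2.wav")

music = AudioSegment.from_file("background.mp3") - 18   # duck the music by 18 dB
bed = (music * (len(episode) // len(music) + 1))[:len(episode)]  # loop and trim to fit

episode = episode.overlay(bed).fade_in(3000).fade_out(5000)
episode.export("episode.mp3", format="mp3")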

Deep Dreams Podcast has a GitLab repository if you want to see the code that drives it all, and you can go to the podcast itself to give it a listen.

NVIDIA Unveils Jetson AGX Orin Developer Kit

When you think of high-performance computing powered by NVIDIA hardware, you probably think of applications leveraging the capabilities of the company’s graphics cards. In many cases, you’d be right. But naturally there are situations where the traditional combination of x86 computer and bolt-on GPU simply isn’t going to cut it; try packing a modern gaming computer onto a quadcopter and let us know how it goes.

For these so-called “edge computing” situations, NVIDIA offers the Jetson line of ARM single-board computers, which include a scaled-down GPU that gives them vastly better performance on machine learning workloads than something like the Raspberry Pi. Today during their annual GPU Technology Conference (GTC), NVIDIA announced the immediate availability of the Jetson AGX Orin Developer Kit, which the company promises can deliver “server-class AI performance” in a package small enough for use in IoT or robotics.

As with the earlier Jetsons, the palm-sized development kit acts as a sort of breakout board for the far smaller module slotted into it. This gives developers access to the full suite of connectivity and I/O options offered by the Jetson module in a desktop-friendly form that makes prototyping the software side of things much easier. Once the code is working as intended, you can simply pop the Jetson module out of the development kit and install it in your final hardware.

NVIDIA is offering the Orin module in a range of configurations, depending on your computational needs and budget. At the high end is the AGX Orin 64 GB at $1,599 USD, which offers a 12-core ARM Cortex-A78AE processor, 32 GB of DDR5 RAM, 64 GB of onboard flash, and an Ampere GPU with 2048 CUDA cores and 64 Tensor cores; all told, it can perform an incredible 275 trillion operations per second (TOPS).

At the other end of the spectrum is the Orin NX 8 GB, a SO-DIMM module that delivers 70 TOPS for $399. It’s worth noting that even this low-end flavor of the Orin is capable of more than double the operations per second of 2018’s Jetson AGX Xavier, which until now was the most powerful entry in the product line.

The Jetson AGX Orin Developer Kit is available for $1,999 USD, and includes the AGX Orin 64 GB module. Interestingly, NVIDIA says the onboard software is able to emulate any of the lower-tier modules, so you won’t necessarily have to swap out the internal module if your final hardware ends up using one of the cheaper options. Of course, the inverse is that even folks who only planned on using the more budget-friendly units will either have to shell out for an expensive dev kit or try to spin up their own breakout board.

While the $50 USD Jetson Nano is far more likely to be on the workbench of the average Hackaday reader, we have to admit that the specs of these new Orin modules are very exciting. Then again, we’ve covered several projects that used the previously top-of-the-line Jetson Xavier, so we don’t doubt one of you is already reaching for their wallet to pick up this latest entry into NVIDIA’s line of diminutive powerhouses.