ChatGPT, Bing, And The Upcoming Security Apocalypse

March 4, 2023 by Elliot Williams 46 Comments

Most security professionals will tell you that it’s a lot easier to attack code systems than it is to defend them, and that this is especially true for large systems. The white hat’s job is to secure each and every point of contact, while the black hat’s goal is to find just one that’s insecure.

Whether black hat or white hat, it also helps a lot to know how the system works and exactly what it’s doing. When you’ve got the source code, either because it’s open-source, or because you’re working inside the company that makes the software, you’ve got a huge advantage both in finding bugs and in fixing them. In the case of closed-source software, the white hats arguably have the offsetting advantage that they at least can see the source code, and peek inside the black box, while the attackers cannot.

Still, if you look at the number of security issues raised weekly, it’s clear that even in the case of closed-source software, where the defenders should have the largest advantage, offense is a lot easier than defense.

So now put yourself in the shoes of the poor folks who are going to try to secure large language models like ChatGPT, the new Bing, or Google’s soon-to-be-released Bard. They don’t understand their machines. Of course they know how they work inside, in the sense of cross multiplying tensors and updating weights based on training sets and so on. But because the billions of internal parameters interact in incomprehensible ways, almost all researchers refer to large language models’ inner workings as a black box.

And they haven’t even begun to consider security yet. They’re still worried about how to construct obscure background prompts that prevent their machines from spewing hate speech or pornographic novels. But as soon as the machines start doing something more interesting than just providing you plain text, the black hats will take notice, and someone will have to figure out defense.

Indeed, this week, we saw the first real shot across the bow: a hack to make Bing direct users to arbitrary (bad) webpages. The Bing hack requires the user to already be on a compromised website, so it’s maybe not very threatening, but it points out a possible real security difference between Bing and ChatGPT: Bing gives you links to follow, and that makes it a juicy target.

We’re right on the edge of a new security landscape, because even the white hats are facing a black box in the AI. So far, what ChatGPT and Codex and other large language models are doing is trivially secure – putting out plain text – but Bing is taking the first dangerous steps into doing something more useful, both for users and black hats. Given the ease with which people have undone OpenAI’s attempts to keep ChatGPT in its comfort zone, my guess is that the white hats will have their hands full, and the black-box nature of the model deprives them of their best hope. Buckle your seatbelts.

Norm Abram Is Back, And Thanks To AI, Now In HD

March 2, 2023 by Tom Nardi 53 Comments

We’ve said many times that while woodworking is a bit outside our wheelhouse, we have immense respect for those with the skill and patience to turn dead trees into practical objects. Among such artisans, few are better known than the legendary Norm Abram — host of The New Yankee Workshop from 1989 to 2009 on PBS.

So we were pleased when the official YouTube channel for The New Yankee Workshop started uploading full episodes of the classic DIY show a few months back for a whole new generation to enjoy. The online availability of this valuable resource is noteworthy enough, but we were particularly impressed to see the channel start experimenting with AI enhanced versions of the program recently.

Note AI Norm’s somewhat cartoon-like appearance.

Originally broadcast in January of 1992, the “Child’s Wagon” episode of Yankee Workshop was previously only available in standard definition. Further, as it was a relatively low-budget PBS production, it would have been taped rather than filmed — meaning there’s no negative to go back and digitize at a higher resolution. But thanks to modern image enhancement techniques, the original video could be sharpened and scaled up to 1080p with fairly impressive results.

That said, the technology isn’t perfect, and the new HD release isn’t without a few “uncanny valley” moments. It’s particularly noticeable with human faces, but as the camera almost exclusively focuses on the work, this doesn’t come up often. There’s also a tendency for surfaces to look smoother and more uniform than they should, and reflective objects can exhibit some unusual visual artifacts.

Even with these quirks, this version makes for a far more comfortable viewing experience on today’s devices. It’s worth noting that so far only a couple of episodes have been enhanced, each with an “AI HD” icon on the thumbnail image to denote them as such. Given the computational demands of this kind of enhancement, we expect it will be used only on a case-by-case basis for now. Still, it’s exciting to see this technology enter the mainstream, especially when it’s used on such culturally valuable content.

A Milliwatt Of DOOM

February 26, 2023 by Jenny List 36 Comments

The seminal 1993 first-person shooter from id Software, DOOM, has become well-known as a test of small computer platforms. We’ve seen it on embedded systems far and wide, but we doubt we’ve ever seen it consume as little power as it does on a specialized neural network processor. The chip in question is a Syntiant NDP200, and it’s designed to be the always-on component listening for the wake word or other trigger in an AI-enabled IoT device.

DOOM running on as little as a milliwatt of power makes for an impressive PR stunt at a trade show, but perhaps more interesting is that the chip isn’t simply running the game, it’s also playing it. As a neural network processor it contains the required smarts to learn how to play the game, and in the simple circular level it’s soon picking off the targets with ease.

We’ve not seen any projects using these chips as yet, which is hardly surprising given their niche marketplace. It is however worth noting that there is a development board for the lower-range sibling chip NDP101, which sells for around $35 USD. Super-low-power AI is within reach.

Teaching A Robot To Hallucinate

February 26, 2023 by Donald Papp 8 Comments

Training robots to execute tasks in the real world requires data — the more, the better. The problem is that creating these datasets takes a lot of time and effort, and methods don’t scale well. That’s where Robot Learning with Semantically Imagined Experience (ROSIE) comes in.

The basic concept is straightforward: enhance training data with hallucinated elements to change details, add variations, or introduce novel distractions. Studies show a robot additionally trained on this data performs tasks better than one without.

This robot is able to deposit an object into a metal sink it has never seen before, thanks to hallucinating a sink in place of an open drawer in its original training data.

Suppose one has a dataset consisting of a robot arm picking up a coke can and placing it into an orange lunchbox. That training data is used to teach the arm how to do the task. But in the real world, maybe there is distracting clutter on the countertop. Or, the lunchbox in the training data was empty, but the one on the counter right now already has a sandwich inside it. The further a real-world task differs from the training dataset, the less capable and accurate the robot becomes.

ROSIE aims to alleviate this problem by using image diffusion models (such as Imagen) to enhance the training data in targeted and direct ways. In one example, a robot has been trained to deposit an object into a drawer. ROSIE augments this training by inpainting the drawer in the training data, replacing it with a metal sink. A robot trained on both datasets competently performs the task of placing an object into a metal sink, despite the fact that a sink never actually appears in the original training data, nor has the robot ever seen this particular real-world sink. A robot without the benefit of ROSIE fails the task.
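The augmentation step can be pictured as a masked replacement applied to each training frame. The sketch below is only an illustration of that idea, not the ROSIE implementation: `augment_frame`, `placeholder_sink`, and the tiny grayscale "image" are hypothetical names and data, and the flat gray fill stands in for what a diffusion model such as Imagen would actually generate.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of pixel values
Mask = List[List[bool]]  # True where the region should be replaced


def augment_frame(frame: Image, mask: Mask, fill: Callable[[int, int], int]) -> Image:
    """Return a copy of `frame` with every masked pixel replaced by fill(row, col)."""
    return [
        [fill(r, c) if mask[r][c] else pixel for c, pixel in enumerate(row)]
        for r, row in enumerate(frame)
    ]


def placeholder_sink(row: int, col: int) -> int:
    # Hypothetical stand-in for generated content: a flat gray patch.
    return 128


# A 4x4 all-black "frame" whose central 2x2 block plays the role of the drawer.
original = [[0] * 4 for _ in range(4)]
drawer_mask = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(4)] for r in range(4)]

augmented = augment_frame(original, drawer_mask, placeholder_sink)

# The robot is then trained on both the original and the augmented frames.
training_set = [original, augmented]
```

Only the pixels change here; the recorded robot actions for the frame would be reused as-is, which is why the augmented data is cheap to produce.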

Here is a link to the team’s paper, and embedded below is a video demonstrating ROSIE both in concept and in action. It is also a bit reminiscent of a plug-in we recently saw for Blender, which uses an AI image generator to texture entire 3D scenes with a simple text prompt.


This Camera Produces A Picture, Using The Scene Before It

February 25, 2023 by Jenny List 23 Comments

It’s the most basic of functions for a camera: when you point it at a scene, it produces a photograph of what it sees. [Jasper van Loenen] has created a camera that does just that, but perhaps not in the way we might expect. Instead of committing pixels to memory, it takes a picture, uses AI to generate a text description of what is in the picture, and then uses another AI to generate an image from that description. It’s a curiously beautiful artwork as well as an ultimate expression of the current obsession with the technology, and we rather like it.
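The camera's two-stage flow, photo to text description to new image, can be sketched as a small pipeline. This is an assumption-laden outline rather than Jasper's actual code: `reimagine`, `stub_describe`, and `stub_generate` are hypothetical names, and the stubs stand in for whichever captioning and image-generation services the real device calls.

```python
from typing import Callable


def reimagine(
    photo: bytes,
    describe: Callable[[bytes], str],
    generate: Callable[[str], bytes],
) -> bytes:
    """Turn a photo into a caption, then turn the caption into a new image."""
    caption = describe(photo)
    return generate(caption)


def stub_describe(photo: bytes) -> str:
    # Hypothetical stand-in for an image-captioning service.
    return f"a scene captured in {len(photo)} bytes"


def stub_generate(prompt: str) -> bytes:
    # Hypothetical stand-in for a text-to-image service.
    return prompt.upper().encode("utf-8")


result = reimagine(b"\x00\x01\x02", stub_describe, stub_generate)
```

Passing the two services in as callables keeps the pipeline independent of any particular provider, and makes it clear that the original pixels never reach the output; only the caption does.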

The camera itself is a black box with a simple twin-lens reflex viewfinder. Inside is a Raspberry Pi that takes the photo and sends it through the various AI services, and a Fuji Instax Mini printer. The connection to the printer may be of particular interest to quite a few others: he’s reverse engineered the Bluetooth protocols it uses and created Python code allowing easy printing. The images it produces are like so many such AI-generated pieces of content: pretty to look at but otherworldly, and weird parallels of the scenes they represent.

It’s inevitable that consumer cameras will before long offer AI augmentation features for less-competent photographers. Meanwhile, we’re pleased to see Jasper getting there first.

Tiny Machine Learning On As Little As 2 KB Of RAM

February 24, 2023 by Al Williams 6 Comments

All of the machine learning stuff coming out lately doesn’t affect you if you are developing with embedded microcontrollers, right? Perhaps not. Microsoft Research India wants you to use their EdgeML tool to do machine learning tasks such as gesture recognition in tiny devices like an Arduino Uno. According to the developers, you might need as little as 2 KB of RAM. There’s no network connection required and the work uses TensorFlow underneath, so it is compatible with much of what you’ll find for bigger computers.

If you add processing power, you can get more capability. For example, one of the demonstrations is a wake-word recognizer on a Raspberry Pi Zero (although the page for that demo seems to be missing at the moment; try the GesturePod, instead).

The system generally uses Python, but there are efficient C++ implementations for selected algorithms. The code lives on GitHub. There are also a number of research papers about each tool that you can find on the GitHub page. There’s also a recent paper on MinUn, an attempt to make things even more efficient for ARM microcontrollers. In particular, MinUn can store approximate numbers to save space, allows for variable precision of tensors, and tries to reduce memory fragmentation, an important feature for CPUs that don’t have memory management units.
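To see why storing approximate numbers at variable precision saves space, consider a generic fixed-point quantization sketch. This is not MinUn's algorithm or EdgeML's API: `quantize` and `dequantize` are hypothetical helpers that simply map floats to small signed integers plus one shared scale factor.

```python
def quantize(values, bits=8):
    """Map floats to signed integers of the given bit width plus one shared scale."""
    max_int = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = peak / max_int
    return [round(v / scale) for v in values], scale


def dequantize(ints, scale):
    """Recover approximate floats from the stored integers."""
    return [i * scale for i in ints]


weights = [0.5, -1.0, 0.25, 0.0]

# Fewer bits per value means less memory but a coarser approximation.
q8, s8 = quantize(weights, bits=8)
q4, s4 = quantize(weights, bits=4)

err8 = max(abs(w - a) for w, a in zip(weights, dequantize(q8, s8)))
err4 = max(abs(w - a) for w, a in zip(weights, dequantize(q4, s4)))
```

A 32-bit float weight shrinks to 8 or even 4 bits this way, and the worst-case error grows as the bit width drops, which is the trade-off a variable-precision scheme has to manage per tensor.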

If you haven’t studied TensorFlow yet, start here. Why use something like this with a microcontroller? How about smarter robots?

How To Roll Your Own Custom Object Detection Neural Network

February 13, 2023 by Donald Papp 2 Comments

Real-time object detection, which uses neural networks and deep learning to rapidly identify and tag objects of interest in a video feed, is a handy feature with great hacker potential. Happily, it’s also possible to make customized CNNs (convolutional neural networks) tailored for one’s own needs, and that process just got easier thanks to some new documentation for the Vizy “AI camera” by Charmed Labs.

Charmed Labs has been making hacker-friendly machine vision devices for a long time, and the Vizy camera impressed us mightily when we checked it out last year. Out of the box, Vizy has a perfectly functional object detector application that runs locally on the device, and can detect and tag many common everyday objects in real time. But what if that default application doesn’t quite meet one’s project needs? Good news, because it’s possible to create a custom-trained CNN, and that process got a lot more accessible thanks to step-by-step examples of training a model to recognize hands doing rock-paper-scissors.

Person and cat with machine-generated tags identifying them — Default object detection works well, but sometimes one needs custom results.

The basic process is this: Start with a variety of images that show the item of interest. Then identify and label the item of interest in each photo. These photos (a “training set”) are then sent to Google Colab, which will be used to generate a neural network. The resulting CNN model can then be downloaded and used, to see how well it performs.
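The labeling step boils down to pairing each photo with named bounding boxes, and it is common to hold some labeled photos back for checking accuracy. The sketch below shows one plausible shape for that data; it is not the Vizy or Colab format, and `LabeledBox`, `TrainingImage`, `split_dataset`, and the file names are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LabeledBox:
    label: str  # e.g. "rock", "paper", or "scissors"
    x_min: int
    y_min: int
    x_max: int
    y_max: int


@dataclass
class TrainingImage:
    filename: str
    boxes: List[LabeledBox]


def split_dataset(
    images: List[TrainingImage], validation_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[TrainingImage], List[TrainingImage]]:
    """Shuffle reproducibly and hold back a fraction of images for validation."""
    shuffled = images[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


labels = ("rock", "paper", "scissors")
dataset = [
    TrainingImage(f"hand_{i:03d}.jpg", [LabeledBox(labels[i % 3], 10, 10, 100, 100)])
    for i in range(10)
]
train, val = split_dataset(dataset)
```

Keeping a validation split separate from the training images gives an honest measure of how well the downloaded model performs before any refinement begins.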

Of course things rarely work perfectly the first time around, so at this point it’s pretty common for some refinement to be needed to increase accuracy. Luckily there are a number of tools to help do this without creating a new model from scratch, so it’s just a matter of tweaking until things perform acceptably.

Google Colab is free and the resulting CNNs are implemented in the TensorFlow Lite framework, meaning it’s possible to use them elsewhere. So if custom object detection has been holding up a project idea of yours, this might be what gets you over that hump.