Creating A Twisted Grid Image Illusion With A Diffusion Model

Images that can be interpreted in more than one way have existed for many decades, with the classical example being Rubin’s vase, which some viewers see as a vase and others as a pair of human faces.

When the duck becomes a bunny, if you ignore the graphical glitches that used to be part of the duck. (Credit: Steve Mould, YouTube)

Things get trickier if you want to create an image that changes into something else, and still looks realistic, when you rotate each section of it within a 3×3 grid. In a video, [Steve Mould] explains how this can be accomplished by using a diffusion model to identify shared characteristics of two images and to create an output image that effectively contains the essential features of both.

Naturally, this process can be done by hand too, with the goal always being to create a plausible image in either orientation, one with just enough detail to trick the brain into filling in the rest and to send the viewer down the path of interpreting what the eye sees as a duck, a bunny, a vase, or the outline of two faces.

Using a diffusion model to create such illusions is quite a natural fit, as these models work by refining noise until a plausible image begins to appear. Of course, whether the result is a viable illusion is ultimately determined not by the model but by the viewer: humans are susceptible to these illusions, while machine vision still struggles to distinguish a cat from a loaf and a raisin bun from a spotted dog. The imperfections of diffusion models would seem to be a benefit here, as the model will happily churn through abstractions and iterations with no understanding or interpretive bias of its own, while the human can steer it towards a viable interpretation.
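One published way to generate such illusions with a diffusion model, as in the “visual anagrams” line of work, is to estimate the noise for both readings of the image at every denoising step and average the two estimates. The following is a minimal sketch of that idea for the 3×3 twisted-grid case, not [Steve Mould]’s actual pipeline: `denoise` is a hypothetical stand-in for whatever noise-prediction call a real diffusion model exposes, and the helper assumes a square image whose side is divisible by three.

```python
import torch

def twist_grid(img: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Rotate every tile of a 3x3 grid by k * 90 degrees. img is (C, H, W) and square."""
    c, h, w = img.shape
    th, tw = h // 3, w // 3
    out = img.clone()
    for i in range(3):
        for j in range(3):
            tile = img[:, i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            out[:, i * th:(i + 1) * th, j * tw:(j + 1) * tw] = torch.rot90(tile, k, dims=(1, 2))
    return out

def illusion_noise(denoise, x, t, prompt_a, prompt_b):
    """One noise estimate that respects both orientations of the image.

    `denoise(x, t, prompt)` is a hypothetical stand-in for the model's
    noise-prediction call; it is not a real library function.
    """
    eps_a = denoise(x, t, prompt_a)                                 # noise as seen in view A
    eps_b = twist_grid(denoise(twist_grid(x, 1), t, prompt_b), -1)  # view B, mapped back to A
    return 0.5 * (eps_a + eps_b)                                    # compromise between both readings
```

Each reverse-diffusion step would then use `illusion_noise` in place of the usual single noise estimate, so the finished image has to look plausible in both orientations at once.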


Large Language Models On Small Computers

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case, the computer of choice is an ESP32, so the model had to shrink from the trillions of parameters of something like GPT-4, or even the hundreds of billions of GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, chosen for its comparatively large amount of RAM, since even this small model needs a minimum of 1 MB to run. It also has two cores, which both work as hard as possible under (relatively) heavy loads like this, and the CPU clock can be maxed out at around 240 MHz.
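For a sense of scale, a quick back-of-envelope calculation (ours, not [DaveBben]’s) shows why even this tiny model brushes up against the ESP32’s memory limits:

```python
# Rough weight-storage estimate for the ~260,000 parameter tinyllamas model.
N_PARAMS = 260_000   # approximate parameter count quoted above

def weight_megabytes(n_params: int, bytes_per_param: int) -> float:
    """Raw storage needed for the weights alone, in megabytes."""
    return n_params * bytes_per_param / 1e6

print(f"fp32 weights: {weight_megabytes(N_PARAMS, 4):.2f} MB")  # ~1.04 MB
print(f"int8 weights: {weight_megabytes(N_PARAMS, 1):.2f} MB")  # ~0.26 MB
```

On top of the weights themselves, llama2.c also needs activation buffers and a key/value cache, so the 1 MB floor mentioned above is no surprise.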

Admittedly, [DaveBben] is mostly doing this just to see if it can be done since even the most powerful of ESP32 processors won’t be able to do much useful work with a large language model. It does turn out to be possible, though, and somewhat impressive, considering the ESP32 has about as much processing capability as a 486 or maybe an early Pentium chip, to put things in perspective. If you’re willing to devote a few more resources to an LLM, though, you can self-host it and use it in much the same way as an online model such as ChatGPT.

DIY Rabbit R1 Clone Could Be Neat With More Hardware

The Teenage Engineering badging usually appears on some cool gear that almost always costs a great deal of money. One such example is the Rabbit R1, an AI-powered personal assistant that retails for $199. It was also revealed that it’s basically a small device running a simple Android app. That raises the question: could you build your own dupe for $20? That’s what [Thomas the Maker] did.

Meet Rappit. It’s basically [Thomas]’s take on an AI friend that doesn’t break the bank. It runs on a Raspberry Pi Zero 2W, which has the benefit of integrated wireless connectivity on board. It’s powered by rechargeable AA batteries or a USB power bank to keep things simple. [Thomas] then wrapped it all up in a cute 3D printed enclosure to give it some charm.

It’s the software that makes the Rappit what it is. Rather than including a screen, microphone, or speakers on the device itself, [Thomas] interacts with the Pi-based device via smartphone. That makes it a less convincing dupe of the self-contained Rabbit R1, but the basic concept is the same. [Thomas] can send queries to the Rappit via a simple Android or iOS app he created called “Comfyspace,” and the Rappit responds with the aid of Google’s Gemini AI.
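The write-up doesn’t spell out how the Comfyspace app talks to the Pi, but the Gemini side of such a build could be as simple as the following sketch using Google’s `google-generativeai` Python package. The model name and plain-text API key are placeholders, and this is not [Thomas]’s actual code.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")      # placeholder: load this from somewhere safer
model = genai.GenerativeModel("gemini-1.5-flash")   # assumed model name; any available Gemini model works

def answer_query(question: str) -> str:
    """Take a text query relayed from the companion app and return Gemini's reply."""
    response = model.generate_content(question)
    return response.text

if __name__ == "__main__":
    print(answer_query("Give me a one-line, weather-appropriate outfit suggestion."))
```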

If you’re really trying to duplicate the AI assistant trend, you need standalone hardware. To that end, the Rappit design could benefit from a screen, microphone, speaker, and speech synth. Honestly, though, that would only take a few hours of extra work compared to what [Thomas] has already done here. As it is, [Thomas] could simply throw away the Raspberry Pi and use the smartphone with Gemini directly, right? But he chose the route of using the smartphone as an interface to keep costs down by minimizing the hardware outlay.

If you want a real Rabbit R1, you can order one here. We’ve discussed controversy around the device before, too. Video after the break.


Taco Bell To Bring Voice AI Ordering To Hundreds Of US Drive-Throughs

Drive-throughs are a popular feature at fast-food places, where you can get some fast grub without even leaving your car. For the fast-food companies running them, they are also a big focus of automation, with the ideal being a voice assistant that can take orders and pass them on to the (still human) staff. This is presumably in lieu of being able to make customers use the touchscreen-equipped order kiosks that are common these days. Now pushing for this drive-through automation is Taco Bell, or more specifically its parent company, Yum Brands.

Interestingly enough, this comes shortly after McDonald’s deemed its own drive-through voice assistant a failure and removed it. Meanwhile, multiple Taco Bell locations across 13 US states and five KFC restaurants in Australia are trialing the system, with results apparently encouraging enough to start expanding it. Company officials are cited as saying it has ‘improved order accuracy’, ‘decreased wait times’ and ‘increased profits’. Considering that the McDonald’s experience was pretty much the exact opposite in all of these categories, we will wait with bated breath. Feel free to share your Taco Bell or other voice-AI-enabled drive-through experiences in the comments. Maybe whoever Yum Brands contracted for their voice assistant did a surprisingly decent job, which would be a pleasant change.

Top image: Taco Bell – Vadnais Heights, MN (Credit: Gabriel Vanslette, Wikimedia)

AI Image Generator Twists In Response To MIDI Dials, In Real-time

MIDI isn’t just about music, as [Johannes Stelzer] shows by using dials to adjust AI-generated imagery in real-time. The results are wild, with an interactivity to them that we don’t normally see in such things.

[Johannes] uses Stable Diffusion’s SDXL Turbo to create a baseline image of “photo of a red brick house, blue sky”. The hardware dials act as manual controls for applying different embeddings to this baseline, such as “coral”, “moss”, “fire”, “ice”, “sand”, “rusty steel” and “cookie”.

By adjusting the dials, those embeddings are applied to the base image in varying strengths. The results are generated on the fly and are pretty neat to see, especially since there is no appreciable amount of processing time required.
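Conceptually, the control loop could look something like the sketch below: read MIDI control-change messages with `mido`, turn each dial position into a weight, and blend the modifier embeddings into the base prompt embedding accordingly. Note that `encode_prompt`, `generate_from_embedding`, and `show` are hypothetical stand-ins for whatever the actual pipeline (SDXL Turbo here) exposes, the CC-number mapping is assumed, and the simple weighted sum is just one way the blending might be done; this is not [Johannes]’s code.

```python
import mido

MODIFIERS = ["coral", "moss", "fire", "ice", "sand", "rusty steel", "cookie"]
dial_values = {cc: 0.0 for cc in range(len(MODIFIERS))}       # assumes the dials send CC 0..6

base_emb = encode_prompt("photo of a red brick house, blue sky")   # hypothetical helper
mod_embs = [encode_prompt(m) for m in MODIFIERS]                   # hypothetical helper

with mido.open_input() as port:                                # default MIDI input device
    for msg in port:
        if msg.type == "control_change" and msg.control in dial_values:
            dial_values[msg.control] = msg.value / 127.0       # normalize the 0..127 CC range
        # Weighted sum: the base embedding plus each modifier scaled by its dial.
        emb = base_emb + sum(w * e for w, e in zip(dial_values.values(), mod_embs))
        show(generate_from_embedding(emb))                     # SDXL Turbo is fast enough to keep up
```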

The MIDI controller is integrated with the help of lunar_tools, a software toolkit on GitHub to facilitate creating interactive exhibits. As for the image end of things, we’ve previously covered how AI image generators work.

Analyzing Feature Learning In Artificial Neural Networks And Neural Collapse

Artificial Neural Networks (ANNs) are commonly used for machine vision purposes, where they are tasked with object recognition. This is accomplished by taking a multi-layer network and using a training data set to configure the weights associated with each ‘neuron’. Due to the complexity of these ANNs for non-trivial data sets, it’s often hard to make heads or tails of what the network is actually matching in a given (non-training) input. In a March 2024 study in Science (preprint) by [A. Radhakrishnan] and colleagues, an approach is provided to elucidate and diagnose this mystery somewhat, using what they call the average gradient outer product (AGOP).

Defined as the uncentered covariance matrix of the ANN’s input-output gradients, averaged over the training dataset, this property provides information on which features of the data set are actually used for predictions. These turn out to be strongly correlated with repetitive information, such as the presence of eyes when recognizing whether lipstick is being worn, or star patterns in a car-and-truck data set rather than anything to do with the (highly variable) vehicles themselves. None of this was perhaps too surprising, but a number of the same researchers also used the AGOP to elucidate the mechanism behind neural collapse (NC) in ANNs.
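To make that definition concrete, here is a minimal sketch of computing the AGOP with PyTorch’s autograd: for every training input, take the Jacobian of the network’s outputs with respect to that input and average the resulting input-by-input outer products. The tiny network and random data are placeholders to keep the example self-contained, not anything from the papers.

```python
import torch

def agop(model: torch.nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Average gradient outer product: mean of J(x)^T J(x) over the inputs.

    inputs: (n, d) batch; returns a (d, d) matrix whose dominant directions
    indicate which input features the trained network relies on.
    """
    d = inputs.shape[1]
    total = torch.zeros(d, d)
    for x in inputs:
        J = torch.autograd.functional.jacobian(model, x)   # shape: (num_outputs, d)
        total += J.T @ J
    return total / inputs.shape[0]

# Toy usage: a small random network and random "training" inputs.
model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 3))
X = torch.randn(100, 8)
print(agop(model, X).shape)   # torch.Size([8, 8])
```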

NC occurs when an overparameterized ANN is trained well past the point of zero training error. In a preprint paper by [D. Beaglehole] et al., the AGOP is used to provide evidence for the mechanism behind NC during feature learning. Perhaps the biggest takeaway from these papers is that while ANNs can be useful, they’re also incredibly complex and poorly understood. The more we learn about their properties, the more appropriately we can use them.

Credit: Daniel Baxter

Mechanical Intelligence And Counterfeit Humanity

It would seem fair to say that the second half of the last century up to the present day has been firmly shaped by our relationship with technology, and with computers in particular. From the hulking behemoths at universities, to microcomputers at home, to today’s smartphones, smart homes and ever-looming compute cloud, we all have a relationship with computers in some form. One aspect of computers that has become increasingly underappreciated, however, is that the less we see them as physical objects, the more we seem inclined to accept them as humans. This is the point which [Harry R. Lewis] argues in a recent article in Harvard Magazine.

Born in 1947, [Harry R. Lewis] found himself at the forefront of what would become computer science and related disciplines, with some of his students being well-known to the average Hackaday reader, such as [Bill Gates] and [Mark Zuckerberg]. Suffice it to say, he has seen every attempt to ‘humanize’ computers, ranging from ELIZA to today’s ChatGPT. During this time, the line between humans and computers has become blurred, with computer systems becoming increasingly competent at imitating human interactions even as they vanished into the background of daily life.

These counterfeit ‘humans’ are not capable of learning, feeling, and experiencing the way that humans can, being at most a facsimile of a human, lacking what is often referred to as ‘the human experience’. More and more of us communicate these days via smartphone and computer screens, with little idea of, or regard for, whether we are talking to a real person or not. Ironically, it seems that by anthropomorphizing these counterfeit humans, we risk becoming less human in the process, while also opening the floodgates for blaming AI when the blame lies squarely with the humans behind it, as in the recent Air Canada chatbot case. Equally ridiculous, [Lewis] argues, is the notion that we could create a ‘superintelligence’ by training an ‘AI’ on nothing but data scraped off the internet, as there are many things in life which cannot be understood simply by reading about them.

Ultimately, the argument is made that humanistic learning should be the focal point of artificial intelligence, as only in this way could we create AIs that might truly be seen as our equals, and as beneficial to the future of all.