AI Pet Door Rejects Dead Mice

If you have a pet with a little access door to the outside world, and that pet happens to be a cat, you’re likely on the receiving end of all kinds of lifeless little lagniappes. Don’t worry, it’s CES season out in Las Vegas and a company called Flappie has the solution: an AI-powered cat door that rejects dead mice and other would-be offerings.

Image by Nathan Ingraham via Engadget

It works about like you might expect — there’s a motion sensor and a night-vision camera on the exterior side of the door. Using Flappie’s “unique and proprietary” dataset, the door distinguishes between Tom and Jerry and keeps out unwanted guests with more than 90% accuracy. To do this, Flappie collected video of a lot of cats and prey in a variety of lighting conditions. There’s even a chip detection system that will reject all other cats.

Thankfully, it’s not all automation. The prey detection system can be turned off entirely, and there are manual switches on the inside for locking and unlocking the door at will. You don’t even have to hook it up to the Internet, it seems.
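
The decision the door has to make can be written down in a few lines. What follows is only a sketch of the behavior described above, not Flappie's firmware: the names `DoorSettings`, `should_unlock`, and `manual_override` are made up for illustration, and the assumption that the manual switches take priority over the automation is ours.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DoorSettings:
    # All field names are hypothetical, chosen for illustration only.
    known_chips: frozenset = frozenset()    # chips of the cats allowed inside
    prey_detection_on: bool = True          # prey detection can be switched off
    manual_override: Optional[bool] = None  # None = automatic, True = unlocked, False = locked


def should_unlock(settings, chip_id, prey_seen):
    """Return True if the flap should open for the animal at the door."""
    if settings.manual_override is not None:
        return settings.manual_override     # assumed: the manual switch wins
    if chip_id not in settings.known_chips:
        return False                        # unknown or missing chip: someone else's cat
    if settings.prey_detection_on and prey_seen:
        return False                        # right cat, but it brought a "gift"
    return True
```

With prey detection switched off, the same function reduces to a plain chip-controlled cat flap.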

Americans will have to wait a while, as the company is rolling out the door in Switzerland and Germany first. No word on when the US launch will take place, but interested parties can expect to pay around $399.

Of course, this problem can be solved without AI as long as you’re willing to review the situation and unlock the door yourself.

Audio Synthesizer Hooked Up With ChatGPT Interface

ChatGPT is being asked to handle all kinds of weird tasks, from determining whether written text was created by an AI, to answering homework questions, and much more. It’s good at some of these tasks, and absolutely incapable of others. [Filipe dos Santos Branco] and [Edward Gu] had an out-of-the-box idea, though. What if ChatGPT could do something musical?

They built a system that, at the press of a button, would query ChatGPT for a 10-note melody in a given musical key. Once the note sequence is generated by the large language model, it’s played out by a PWM-based synthesizer running on a Raspberry Pi Pico.

Ultimately, ChatGPT is no musical genius. It’s simply picking a bunch of notes from a list that are known to work together melodically; that’s the whole point of musical keys. It would have been wild if it generated some riffs on the level of Stairway to Heaven or Spontaneous Devotion, but that might be asking for too much.
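
To see how little magic is involved in "picking notes from a key", here is a minimal sketch that does the same job with a random number generator. It is not the students' code: the function names are ours, it assumes a major key spelled with sharps only, and the frequency helper uses standard equal temperament with A4 = 440 Hz, since a synthesizer ultimately needs a pitch in hertz.

```python
import random

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole/half-step pattern of a major scale


def major_scale(root):
    """Return the seven note names of the major key starting on `root`."""
    idx = NOTE_NAMES.index(root)
    scale = [root]
    for step in MAJOR_STEPS[:-1]:      # the last step just returns to the root
        idx = (idx + step) % 12
        scale.append(NOTE_NAMES[idx])
    return scale


def note_frequency(name, octave=4):
    """Equal-temperament frequency in Hz, with A4 = 440 Hz."""
    semitones_from_a4 = (NOTE_NAMES.index(name) - NOTE_NAMES.index("A")
                         + 12 * (octave - 4))
    return 440.0 * 2 ** (semitones_from_a4 / 12)


def random_melody(root, length=10, seed=None):
    """Pick `length` notes at random from the major key of `root`."""
    rng = random.Random(seed)
    scale = major_scale(root)
    return [rng.choice(scale) for _ in range(length)]
```

Calling `random_melody("G")` returns ten note names that all belong to G major, which is roughly the guarantee a musical key gives you, with or without a language model in the loop.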

Here’s the question, though. If you trained a large language model, but got it to digest sheet music instead of written texts… could it learn to write music in various genres and styles? If someone isn’t working on that already, there’s surely an entire PhD you could get out of that idea alone. We should talk!

In any case, it’s one of the more creative projects from the ever-popular ECE 4760 class at Cornell. We’ve featured a bunch of projects from the class over the years, and noted how the course now runs on the RP2040.

Multi-View Wire Art Meets Generative AI

DreamWire is a system for generating multi-view wire art using machine learning techniques to help generate the patterns required.

The 3-dimensional wire pattern in the center creates images of Einstein, Turing, and Newton depending on viewing angle.

What’s wire art? It’s a three-dimensional twisted mass of lines which, when viewed from a certain perspective, yields an image. Multi-view wire art produces different images from the same mass depending on the viewing angle, and as one can imagine, such things get very complex, very quickly.
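
A toy example makes the idea concrete. The sketch below is not DreamWire's method; it only assumes an orthographic view and a viewer who walks around the vertical axis, and the `project` helper is our own name. The same three points read as a diagonal line from the front and as a vertical line from the side.

```python
import math


def project(points, angle_deg):
    """Orthographic 2D view of 3D points, with the viewer rotated
    `angle_deg` degrees around the vertical (y) axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    # Rounding hides floating-point dust such as cos(90 degrees) = 6e-17.
    return [(round(x * c + z * s, 6), round(y, 6)) for x, y, z in points]


# A tiny "wire": three points in the plane z = 0.
wire = [(0, 0, 0), (1, 1, 0), (2, 2, 0)]

front = project(wire, 0)   # diagonal line
side = project(wire, 90)   # collapses to a vertical line
```

Real wire art has to satisfy several such projections at once with a single curve, which is where the complexity comes from.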

A recently-released paper explains how the system works, including why generative AI is uniquely suited to creating meaningful intersections between multiple inputs. There’s also a video that showcases many of the results the researchers obtained.

The GitHub repository for the project doesn’t have much in it yet, but it’s a good place to keep an eye on if you’re interested in what comes next.

We’ve seen generative AI applied in a similarly novel way to help create visual anagrams, or 2D patterns that can be interpreted differently based on a variety of orientations and permutations. These sorts of systems still need to be guided by a human, but having machine learning do the heavy lifting allows just about anybody to explore their creativity.


Can Google’s New AI Read Your Datasheets For You?

We’ve seen a lot of AI tools lately, and, of course, we know they aren’t really smart, but they sure fool people into thinking they are. These programs can only pick through their training, and a lot depends on what they are trained on. When you use something like ChatGPT, for example, you assume it was trained on reasonable data. Sure, it might get things wrong anyway, but there’s also the danger that it simply doesn’t know what you are talking about. It would be like calling your company’s help desk and asking where you left your socks: they simply don’t know.

We’ve seen attempts to have AI “read” web pages or documents of your choice and then be able to answer questions about them. The latest is from Google with NotebookLM. It integrates a workspace where you can make notes, ask questions, and provide sources. The sources can be text snippets, documents from Google Drive, or PDF files you upload.

You can’t ask questions until you upload something, and we presume the AI restricts its answers to what’s in the documents you provide. It still won’t be perfect, but at least it won’t just give you bad information from an unknown source.

Making Visual Anagrams, With Help From Machine Learning

[Daniel Geng] and others have an interesting system for generating multi-view optical illusions, or visual anagrams. Such images have more than one “correct” view and visual interpretation.

What’s more, there are quite a few different methods on display: 90 degree flips and other (orthogonal) image rotations, color inversions, jigsaw permutations, and more. The project page has a generous number of examples, so go check them out!
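
Each of those "views" is just a rearrangement of pixels that can be undone exactly. Here is a minimal sketch on tiny grids of numbers, in plain Python; the function names are ours and none of this comes from the project's code.

```python
def rotate_90(img):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def invert(img, max_value=255):
    """Color inversion: bright pixels become dark and vice versa."""
    return [[max_value - p for p in row] for row in img]


def permute_tiles(tiles, order):
    """Jigsaw-style view: position i receives the tile at index order[i]."""
    return [tiles[i] for i in order]


def inverse_order(order):
    """Return the order that undoes permute_tiles(tiles, order)."""
    inv = [0] * len(order)
    for pos, src in enumerate(order):
        inv[src] = pos
    return inv


img = [[1, 2],
       [3, 4]]
rotated = rotate_90(img)  # [[3, 1], [4, 2]]
```

Four quarter turns, two inversions, or a shuffle followed by its inverse all give back the original image, so no information is lost in any of these views.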

The team’s method uses pre-trained diffusion models (more commonly known as the secret sauce inside image-generating AIs) to evaluate an image under each of its intended views, then combines those evaluations so that the model settles on a single image that looks good from every view. While conceptually straightforward, this process wasn’t really something that could work without diffusion models driven by modern machine learning techniques.
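
One plausible reading of that combining step is a plain average: ask the model for an estimate under each view, map every estimate back to a common orientation, and take the mean. The sketch below illustrates only that idea on a 1D list. The `estimate` argument is a stand-in for a real diffusion model, all names are hypothetical, and the paper and repository remain the authority on what the team actually does.

```python
def identity(x):
    return list(x)


def flip(x):
    """Reverse a 1D 'image'; flipping twice restores it, so flip is its own inverse."""
    return list(reversed(x))


def combine_estimates(noisy, views, estimate):
    """Average per-view estimates in a common orientation.

    views:    list of (view, inverse_view, prompt) tuples
    estimate: stand-in for a model call, estimate(image, prompt) -> list
    """
    aligned = []
    for view, inverse_view, prompt in views:
        est = estimate(view(noisy), prompt)  # model sees the transformed image
        aligned.append(inverse_view(est))    # map the result back
    n = len(aligned)
    return [sum(vals) / n for vals in zip(*aligned)]
```

The only requirement this places on a view is that it can be inverted, which is why rotations, flips, and permutations are natural candidates.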

The visual_anagrams GitHub repository has code, and the research paper goes into detail on implementation and limitations, and gives guidance on obtaining good results. Image generation is just one rapidly-evolving area of machine learning, and it’s always interesting to see unusual applications like this one.

Mozilla Lets Folks Turn AI LLMs Into Single-File Executables

LLMs (Large Language Models) for local use are usually distributed as a set of weights in a multi-gigabyte file. These cannot be directly used on their own, which generally makes them harder to distribute and run compared to other software. A given model can also have undergone changes and tweaks, leading to different results if different versions are used.

To help with that, Mozilla’s innovation group have released llamafile, an open source method of turning a set of weights into a single binary that runs on six different OSes (macOS, Windows, Linux, FreeBSD, OpenBSD, and NetBSD) without needing to be installed. This makes it dramatically easier to distribute and run LLMs, as well as ensuring that a particular version of an LLM remains consistent and reproducible, forever.

This wouldn’t be possible without the work of [Justine Tunney], creator of Cosmopolitan, a build-once-run-anywhere framework. The other main part is llama.cpp, and we’ve covered why it is such a big deal when it comes to running self-hosted LLMs.

There are some sample binaries available using the Mistral-7B, WizardCoder-Python-13B, and LLaVA 1.5 LLMs. Just keep in mind that if you’re on a Windows platform, only LLaVA 1.5 will run, because it’s the only one that squeaks under the 4 GB limit on executable files that Windows has. If you run into issues, check out the gotchas list for troubleshooting tips.
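
If you want to check a download before copying it to a Windows machine, the arithmetic is simple. This sketch assumes "4 GB" means 4 GiB (2^32 bytes) and that a file must be strictly smaller than that; both are our assumptions, and the helper names are made up.

```python
import os

# Assumption: the "4 GB" limit is taken as 4 GiB, i.e. 2**32 bytes.
WINDOWS_EXE_LIMIT_BYTES = 4 * 1024**3


def fits_windows_limit(size_bytes):
    """True if a file of this size squeaks under the assumed limit."""
    return size_bytes < WINDOWS_EXE_LIMIT_BYTES


def file_fits_windows_limit(path):
    """Same check, reading the size of a file on disk."""
    return fits_windows_limit(os.path.getsize(path))
```

Pass `file_fits_windows_limit` the path of whichever llamafile you downloaded to see which side of the line it falls on.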

How Do You Prove An AI Didn’t Make Your Art?

In the world of digital art, distinguishing between AI-generated and human-made creations has become a significant challenge. Almost overnight, tool sets for generating AI artworks became commonly available to the public, and suddenly, every digital art competition had to contend with potential AI-generated submissions. Some have welcomed AI, while others demand competitors create artworks by their own hand and no other.

The problem facing artists and judges alike is just how to determine whether an artwork was created by a human or an AI. So what can be done?
