[Johannes] uses Stable Diffusion’s SDXL Turbo to create a baseline image of “photo of a red brick house, blue sky”. The hardware dials act as manual controls for applying different embeddings to this baseline, such as “coral”, “moss”, “fire”, “ice”, “sand”, “rusty steel” and “cookie”.
By adjusting the dials, those embeddings are applied to the base image in varying strengths. The results are generated on the fly and are pretty neat to see, especially since there is no appreciable amount of processing time required.
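To make the dial idea concrete, here is a minimal sketch of one way dial positions could steer a blend of embeddings. It is not [Johannes]’s code: the function names, the three-element toy vectors, and the linear interpolation scheme are all assumptions for illustration, and real prompt embeddings are far larger.

```python
def midi_to_strength(value, max_value=127):
    """Map a 7-bit MIDI controller value (0-127) to a strength in [0, 1]."""
    clamped = max(0, min(max_value, value))
    return clamped / max_value


def blend_embeddings(base, modifiers, dials):
    """Move the base embedding toward each modifier by its dial strength.

    Computes base + sum_i strength_i * (modifier_i - base), element-wise.
    This linear scheme is an assumption, not the project's actual method.
    """
    blended = list(base)
    for name, strength in dials.items():
        target = modifiers[name]
        for i, (b, t) in enumerate(zip(base, target)):
            blended[i] += strength * (t - b)
    return blended


# Toy vectors standing in for real prompt embeddings (hypothetical values).
base = [1.0, 0.0, 0.0]
modifiers = {"moss": [0.0, 1.0, 0.0], "ice": [0.0, 0.0, 1.0]}

# "moss" dial fully up, "ice" dial fully down.
dials = {"moss": midi_to_strength(127), "ice": midi_to_strength(0)}
result = blend_embeddings(base, modifiers, dials)
```

With one dial at maximum and the other at zero, the result lands exactly on the “moss” vector. Intermediate dial positions give a mix of the two.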
The MIDI controller is integrated with the help of lunar_tools, a software toolkit on GitHub to facilitate creating interactive exhibits. As for the image end of things, we’ve previously covered how AI image generators work.
One of the cringier aspects of AI as we know it today has been the proliferation of deepfake technology to make nude photos of anyone you want. What if you took away the abstraction and put the faker and subject in the same space? That’s the question the NUCA camera was designed to explore. [via 404 Media]
[Mathias Vef] and [Benedikt Groß] designed the NUCA camera “with the intention of critiquing the current trajectory of AI image generation.” The camera itself is a fairly unassuming device, a 3D-printed digital camera (19.5 × 6 × 1.5 cm) with a 37 mm lens. When the camera shutter button is pressed, a nude image is generated of the subject.
The final image is generated using a mixture of the picture taken of the subject, pose data, and facial landmarks. The photo is run through a classifier which identifies features such as age, gender, body type, etc. and then uses those to generate a text prompt for Stable Diffusion. The original face of the subject is then stitched onto the nude image and aligned with the estimated pose. Many of the sample images on the project’s website show the bias toward certain beauty ideals from AI datasets.
Looking for more ways to use AI with cameras? How about this one that uses GPS to imagine a scene instead? Prefer to keep AI out of your endeavors to invade personal space? How about building your own TSA body scanner?
We once thought that the best houses on Halloween were the ones that gave out full-size candy bars. While that’s still true, these days we’d rather see a cool display of some kind on the porch. Although some might consider this a trick, gaze into [Tim]’s mirror and you’ll be treated to a spooky version of yourself.
Here’s how it works: At the heart of this build is a webcam, OpenCV, and a computer that’s running the Stable Diffusion AI image generator. The image is shown on a monitor that sits behind 2-way mirrored glass.
We really like the frame that [Tim] built for this. Unable to find something both suitable and affordable, they built one out of wood molding and aged it appropriately.
We also like the ping pong ball vanity globe lights and the lighting effect itself. Not only is it spooky, it lets the viewer know that something is happening in the background. All the code and the schematic are available if you’d like to give this a go.
At this point, you gotta figure that you’re at least being listened to almost everywhere you go, whether it be a home assistant or your very own phone. So why not roll with the punches and turn lemons into something like a still life of lemons that’s a bit wonky? What we mean is, why not take our conversations and use AI to turn them into art? That’s the idea behind this next-generation digital photo frame created by [TheMorehavoc].
Essentially, it uses a Raspberry Pi and a Respeaker four-mic array to listen to conversations in the room. It records 15 to 20 seconds of audio at a time and sends each clip to OpenAI’s Whisper API to generate a transcript.
This repeats until five minutes of audio is collected, then the entire transcript is sent through GPT-4 to extract an image prompt from a single topic in the conversation. Then, that prompt is shipped off to Stable Diffusion to get an image to be displayed on the screen. As you can imagine, the images generated run the gamut from really weird to really awesome.
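The chunk-then-summarize loop described above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not [TheMorehavoc]’s code: the `TranscriptWindow` class and its method names are hypothetical, and the calls to the transcription, GPT-4, and Stable Diffusion services are deliberately left out.

```python
WINDOW_SECONDS = 5 * 60  # five minutes of audio per image, per the write-up


class TranscriptWindow:
    """Collect short transcribed clips until a full window has been heard."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.elapsed = 0
        self.parts = []

    def add_chunk(self, text, seconds):
        """Add one transcribed clip.

        Returns the joined transcript once the window is full (and resets
        the window), otherwise returns None.
        """
        self.parts.append(text)
        self.elapsed += seconds
        if self.elapsed < self.window_seconds:
            return None
        transcript = " ".join(self.parts)
        self.parts = []
        self.elapsed = 0
        return transcript


# Shortened 30-second window so the example finishes after two clips.
window = TranscriptWindow(window_seconds=30)
first = window.add_chunk("we should repaint the shed", 15)
second = window.add_chunk("maybe bright yellow", 15)
```

Here `first` is `None` because the window is not yet full, while `second` holds the joined transcript that would be handed to the language model to pick a topic.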
The natural lulls in conversation presented a bit of a problem in that the transcription was still generating during silences, presumably because of ambient noise. The answer was in voice activity detection software that gives a probability that a voice is present.
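A gate of that sort might look something like the sketch below. The function name, the per-frame probability list, and both threshold values are assumptions for illustration. The write-up does not say which voice activity detector was used or how its output was thresholded.

```python
def should_transcribe(frame_probabilities, threshold=0.5, min_voiced_ratio=0.3):
    """Decide whether a clip contains enough speech to be worth transcribing.

    frame_probabilities: per-frame voice probabilities in [0, 1], as a
    voice activity detector might report them (hypothetical input format).
    A frame counts as voiced when its probability reaches `threshold`; the
    clip passes when at least `min_voiced_ratio` of its frames are voiced.
    """
    if not frame_probabilities:
        return False
    voiced = sum(1 for p in frame_probabilities if p >= threshold)
    return voiced / len(frame_probabilities) >= min_voiced_ratio


# Half the frames are confidently speech, so this clip goes to transcription.
chatty_clip = should_transcribe([0.9, 0.8, 0.1, 0.2])

# Mostly ambient noise with one blip, so this clip is skipped.
quiet_clip = should_transcribe([0.1, 0.05, 0.2, 0.6])
```

Clips that fail the check are simply dropped, so silence and background hum never reach the transcription step.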
Naturally, people were curious about the prompts for the images, so [TheMorehavoc] made a little gallery sign with a MagTag that uses Adafruit.io as the MQTT broker. Build video is up after the break, and you can check out the images here (warning, some are NSFW).
It’s hard to read the headlines today without feeling like the world couldn’t possibly get much worse. And then tomorrow rolls around, and a fresh set of headlines puts the lie to that thought. On a macro level, there’s not much that you can do about that, but on a personal level, illustrating your news feed with mostly wrong, AI-generated images might take the edge off things a little.
Let us explain. [Roy van der Veen] liked the idea of an e-paper display newsfeed, but the crushing weight of the headlines was a little too much to bear. To lighten things up, he decided to employ Stable Diffusion to illustrate his feed, displaying both the headline and a generated image on a 7.3″ Inky 7-color e-paper display. Every five hours, a script running on a Raspberry Pi Zero 2W fetches a headline from a random source — we’re pleased the list includes Hackaday — and composes a prompt for Stable Diffusion based on the headline, adding on a randomly selected prefix and suffix to spice things up. For example, a prompt might look like, “Gothic painting of (Driving a Motor with an Audio Amp Chip). Gloomy, dramatic, stunning, dreamy.” You can imagine the results.
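The prompt recipe is simple enough to sketch in a few lines. This is a guess at the shape of it rather than [Roy]’s actual script: only the first prefix and first suffix below come from the example above, and the other list entries and the `compose_prompt` name are made up for illustration.

```python
import random

# Only the first entry of each list appears in the example prompt;
# the rest are hypothetical stand-ins.
PREFIXES = ["Gothic painting of", "Watercolor sketch of", "Retro poster of"]
SUFFIXES = [
    "Gloomy, dramatic, stunning, dreamy.",
    "Bright, cheerful, whimsical.",
    "Muted, foggy, quiet.",
]


def compose_prompt(headline, rng=random):
    """Wrap a headline in a randomly chosen style prefix and mood suffix."""
    prefix = rng.choice(PREFIXES)
    suffix = rng.choice(SUFFIXES)
    return f"{prefix} ({headline}). {suffix}"


prompt = compose_prompt("Driving a Motor with an Audio Amp Chip")
```

Passing in a seeded `random.Random` instance as `rng` makes the choice repeatable, which is handy when checking that the output matches the format of the example prompt.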
We have to say, from the examples [Roy] shows, the idea pretty much works — sometimes the images are so far off the mark that just figuring out how Stable Diffusion came up with them is enough to soften the blow. We’d have preferred if the news of the floods in Libya had been buffered by a slightly less dismal scene, but finding out that what was thought to be a “ritual mass murder” was really only a yoga class was certainly heartening.
While we tend to think of Amazon’s e-paper Kindles as more or less single-purpose devices (which to be fair, is how they’re advertised), there’s actually a full-featured Linux computer running behind that simple interface, just waiting to be put to work. Given how cheaply you can get old Kindles on the secondhand market, this has always struck us as something of a wasted opportunity.
This is why we love to see projects like Kindlefusion from [Diggedypomme]. It turns the Kindle into a picture frame to show off the latest in machine learning art thanks to Stable Diffusion. Just connect your browser to the web-based control interface running on the Kindle, give it a prompt, and away it goes. There are also functions to recall previously generated images, and if you’re connecting from a mobile device, support for creating images from voice prompts.
All you need is a Kindle that can be jailbroken, though technically the software has only been tested against older third- and fourth-generation hardware. From there you install a few required packages as listed in the project documentation, including Python 3. Then you just move the Kindlefusion package over via either USB or SSH, and do a little final housekeeping before starting it up and letting it take over the Kindle’s normal UI.
Given the somewhat niche nature of Kindle hacking, we’re particularly glad to see that [Diggedypomme] went through the trouble of explaining the nuances of getting the e-reader ready to run your own code. While it’s not difficult to do, there are plenty of pitfalls if you’ve never done it before, so a concise guide is a nice thing to have. Unfortunately, it seems like Amazon has recently gone on the offensive, with firmware updates blocking the exploits the community was using for jailbreaking on all but the older models that are no longer officially supported.
While it’s a shame you can’t just pick up a new Kindle and start hacking (at least, for now), there are still millions of older devices floating around that could be put to good use. Hopefully, projects like this can help inspire others to pick one up and start experimenting with what’s possible.
Many AI systems require huge training datasets in order to achieve their impressive feats. This applies whether you’re talking about an AI that works with images, natural language, or just about anything else. AI developers are starting to come under scrutiny for where they’re sourcing their datasets. Unsurprisingly, stock photo site Getty Images is at the forefront of this, and is now suing the creators of Stable Diffusion over the matter, as reported by The Verge.
Stability AI, the company behind Stable Diffusion, is the target of the lawsuit for one good reason: there’s compelling evidence the company used Getty Images content without permission. The Stable Diffusion AI has been seen to generate output images that actually include blurry approximations of the Getty Images watermark. This is somewhat of a smoking gun to suggest that Stability AI may have scraped Getty Images content for use as training material.
The copyright implications are unclear, but using any imagery from a stock photo database without permission is always asking for trouble. Various arguments will likely play out in court. Stability AI may claim that its activity falls under fair use guidelines, while Getty Images may claim that the appearance of distorted versions of its watermark breaks trademark rules. The lawsuit could have serious implications for AI image generators worldwide, and is sure to be watched closely by the nascent AI industry. As with any legal matter, just don’t expect a quick answer from the courts.