AI Image Generator Twists In Response To MIDI Dials, In Real-time

MIDI isn’t just about music, as [Johannes Stelzer] shows by using dials to adjust AI-generated imagery in real-time. The results are wild, with an interactivity to them that we don’t normally see in such things.

[Johannes] uses Stable Diffusion’s SDXL Turbo to create a baseline image of “photo of a red brick house, blue sky”. The hardware dials act as manual controls for applying different embeddings to this baseline, such as “coral”, “moss”, “fire”, “ice”, “sand”, “rusty steel” and “cookie”.

By adjusting the dials, those embeddings are applied to the base image in varying strengths. The results are generated on the fly and are pretty neat to see, especially since there is no appreciable amount of processing time required.
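For readers wondering what “applied in varying strengths” could look like numerically, here is a minimal sketch. It is not [Johannes]’s code: it assumes each prompt has already been turned into an embedding vector, and it assumes a simple linear blend in which each dial pulls the base vector toward one concept. The function name `blend_embeddings` and the tiny three-element vectors are made up for illustration.

```python
from typing import Dict, List

Vector = List[float]


def blend_embeddings(
    base: Vector,
    concepts: Dict[str, Vector],
    strengths: Dict[str, float],
) -> Vector:
    """Shift the base embedding toward each concept by its dial strength.

    Assumption: a plain linear blend, base + sum(w * (concept - base)).
    The real project may combine embeddings differently.
    """
    blended = list(base)
    for name, vec in concepts.items():
        if len(vec) != len(base):
            raise ValueError(f"embedding for {name!r} has the wrong length")
        w = strengths.get(name, 0.0)  # dials that were never touched count as zero
        for i, (b, c) in enumerate(zip(base, vec)):
            blended[i] += w * (c - b)
    return blended


# Toy vectors standing in for real text embeddings (hypothetical values).
base = [1.0, 0.0, 0.0]
concepts = {"moss": [0.0, 1.0, 0.0], "ice": [0.0, 0.0, 1.0]}

# "moss" dial at half strength, "ice" dial at zero.
print(blend_embeddings(base, concepts, {"moss": 0.5, "ice": 0.0}))
```

With every dial at zero the result is just the base embedding, which matches the idea of an unmodified baseline image.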

The MIDI controller is integrated with the help of lunar_tools, a software toolkit on GitHub to facilitate creating interactive exhibits. As for the image end of things, we’ve previously covered how AI image generators work.
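As a rough idea of what a dial actually sends, a MIDI Control Change message is three bytes: a status byte (`0xB0` plus the channel), a controller number, and a value from 0 to 127. The sketch below decodes such a message and scales the value to a 0.0 to 1.0 strength. It deliberately does not use lunar_tools, whose API is not described here; `parse_control_change` is a hypothetical helper written only for illustration.

```python
from typing import Optional, Tuple


def parse_control_change(message: bytes) -> Optional[Tuple[int, int, float]]:
    """Decode a 3-byte MIDI Control Change message.

    Returns (channel, controller, strength) with strength scaled to 0.0..1.0,
    or None if the bytes are not a Control Change message.
    """
    if len(message) != 3:
        return None
    status, controller, value = message
    if status & 0xF0 != 0xB0:  # upper nibble 0xB marks Control Change
        return None
    if controller > 127 or value > 127:  # data bytes are 7-bit
        return None
    channel = status & 0x0F  # lower nibble is the channel, 0..15
    return channel, controller, value / 127.0


# A dial on channel 0, controller 21, turned all the way up.
print(parse_control_change(bytes([0xB0, 21, 127])))
```

Which controller number maps to which concept depends on the controller hardware and is left to the reader.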

29 thoughts on “AI Image Generator Twists In Response To MIDI Dials, In Real-time”

    1. They have no one to blame but themselves. When the Devs BLATANTLY lie about how the models are trained, what they use to train them, keeping lists of artists and entire catalogs of works. THEN Discord leaks pretty much prove the entire team lied through their teeth and KNEW it was lies. I’ve got zero sympathy for them. They should have been transparent and taken whatever regulations may come, but they didn’t.

    1. Yeah, I was thinking about that too. I wonder if the outputs were pre-baked to make it more responsive, which would make it prohibitive to bake every possible combination of dial positions. Or maybe they just forgot to film the coolest part; this happens more often than you’d think.

    1. They used a “distilled” Model, which is a Model that tries to condense what a set of bigger models does while being smaller and faster. Essentially it is a Model of other Models. It’s as crazy as it sounds.

      It can enable stupidly fast generation (this one does it within a single step), but as seen in the demo, the accuracy and ability to deviate take a nose-dive. That makes it more something for experimental showcases like this.

  1. Can’t deny that Generative AI used for such interactive show-cases shows potential.

    Just wish it didn’t involve Stability AI. They are definitely one of the more ethically dubious of the bunch.

  2. Very cool. I’m glad they used a self-hostable image generator and not some API a corporation could take away on a whim. I did something similar (without a physical interface) for a puzzle in a table-top RPG – the device had dials corresponding to the classical elements that changed the overall environment, and switches to toggle on or off specific elements. The players had to use it to match descriptions from an NPC’s journal.

  3. Now, apply it to a photo of a face, with variables like “ear size” and “hair color”. We’ve long seen this with selections from discrete images, but it would be a lot more fun with continuous variation.

    1. Probably the developer chose MIDI because you can buy a box with a bunch of knobs on it, plus a CPU that encodes them and sends messages to an interface, really cheap if you use MIDI as the interface. Otherwise you have to build your own, which is expensive and takes a long time, and the developer wasn’t interested in hardware anyway.

  4. This reminds me so much of a video I saw of a talk called “Inventing on Principle” by Bret Victor. As a means of illustrating his point, he talks about his own guiding principle of immediate feedback in creative endeavors. Sure, the relationship between AI/ML/what-have-you and human creativity is absolute flame war fodder, but I think this is a fantastic ‘fuzzy’ way to interact with the ‘fuzzy’ black-box logic of AI image generation engines. Also, would 100% recommend looking up the above video on YouTube. Well worth your 55 minutes.
