3D Printed Triptych Shows Trio Of AI-Generated Images

Fascinated by art generated by deep learning systems such as DALL-E and Stable Diffusion? Then perhaps a wall installation like this phenomenal e-paper Triptych created by [Zach Archer] is in your future.

The three interlocking frames were printed out of “Walnut Wood” HTPLA from ProtoPasta, and hold a pair of 5.79 inch red/black/white displays along with a single 7.3 inch red/yellow/black/white panel from Waveshare. There are e-paper panels out there with more colors available if you wanted to go that route, but judging by the striking images [Zach] has posted, the relatively limited color palettes available on these displays don’t seem to be a hindrance.

Note the clever S-shaped brackets holding in the displays.

To create the images themselves, [Zach] wrote a script that would generate endless customized portraits using Stable Diffusion v1.4, then manually selected the best ones to copy over to a 32 GB micro SD card. The side images were generated on the dreamstudio.ai website and also dumped onto the card.
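[Zach]’s generation script isn’t published in the post, but the gist is easy to sketch with the Hugging Face diffusers library; the prompt, loop count, and file naming below are illustrative placeholders rather than his actual code.

```python
# Minimal sketch of a batch portrait generator using Stable Diffusion v1.4 via
# the diffusers library. Prompt and output names are assumptions for the example.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "portrait of a woman, oil painting, dramatic lighting"  # example prompt

# Churn out candidates; the keepers get picked by hand and copied to the SD card.
for i in range(1000):
    image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
    image.save(f"portrait_{i:04d}.png")
```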

Every 12 hours a TinyPico ESP32 development board in the frame picks some images from the card, applies the necessary dithering and color adjustments to make them look good on the e-paper, and then updates the displays.
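The dithering itself happens in the frame’s firmware, which isn’t shown here, but the palette-reduction step is easy to preview on a desktop with Pillow. Everything below, including the panel resolution, is an assumption for illustration rather than [Zach]’s actual code.

```python
# Preview how a full-color image dithers down to a red/black/white e-paper
# palette using Floyd-Steinberg dithering in Pillow.
from PIL import Image

# Three-color palette for a red/black/white panel.
palette = [0, 0, 0,  255, 255, 255,  255, 0, 0]  # black, white, red
pal_img = Image.new("P", (1, 1))
pal_img.putpalette(palette + [0, 0, 0] * (256 - 3))

src = Image.open("portrait_0001.png").convert("RGB")
src = src.resize((792, 272))  # assumed panel resolution -- check the datasheet

# Map the image onto the 3-color palette with error-diffusion dithering.
dithered = src.quantize(palette=pal_img, dither=Image.Dither.FLOYDSTEINBERG)
dithered.convert("RGB").save("portrait_0001_epaper.png")
```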

Giving Stable Diffusion Some Depth

You’ve likely heard quite a bit of buzz over the last few months about Stable Diffusion. The new version (v2) has come out, and in addition to the standard text-to-image and image-to-image modes, it also has a depth-image-to-image mode that can be incredibly useful. [Andrew] has a write-up that walks you through using this mode.

The basic idea is that you feed both an image and a depth map into the model, which lets you control what ends up where. Stable Diffusion can be a bit confusing, but we already have some great resources to help you wrap your head around it. For input, you can use a depth map from a camera with lidar (many recent phones include one) or have another model (such as MiDaS) estimate it from a 2D picture. This becomes powerful when you want to preserve a specific composition, such as an iconic scene from a well-known movie: you can keep the characters’ poses on screen but transform the style of the scene into whatever you wish (as seen above).
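If you want to try the mode yourself, a minimal sketch using the Hugging Face diffusers library and the official stable-diffusion-2-depth weights looks something like this; the prompt and input photo are placeholders, and when no depth map is passed in, the pipeline estimates one with MiDaS internally.

```python
# Depth-guided image-to-image with Stable Diffusion v2's depth model.
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("movie_still.png").convert("RGB")  # example input

# The depth map locks in the composition; the prompt restyles the scene.
result = pipe(
    prompt="the same scene as a watercolor painting",
    image=init_image,
    negative_prompt="blurry, low quality",
    strength=0.7,
).images[0]
result.save("restyled.png")
```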

We have already covered a technique for generating textures right in Blender, and this new depth information has already been put to work there to make the generated textures line up more accurately with the underlying geometry.

[Justin Alvey] used it to create architectural photos from dollhouse furniture. Using the MiDaS model, he estimated the depth, then threw away the RGB information by setting the denoising strength to its maximum. The simplified dollhouse furniture was still easily recognizable to the model, which helped produce great results.
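His exact tooling isn’t spelled out in the write-up, but the same trick maps onto the depth2img pipeline from the sketch above: push the denoising strength to 1.0 and the photo’s colors are discarded entirely, leaving only the MiDaS-estimated depth to steer the new image. The prompt and file names here are again illustrative.

```python
# "Throw away the RGB": at strength 1.0 only the depth map survives, so the
# dollhouse photo sets the geometry while the prompt supplies everything else.
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

dollhouse = Image.open("dollhouse_room.jpg").convert("RGB")  # example input

photo = pipe(
    prompt="architectural interior photo, mid-century living room, natural light",
    image=dollhouse,
    strength=1.0,  # maximum denoising: the original colors are fully replaced
).images[0]
photo.save("fullsize_interior.png")
```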

However, the only downside is that the perspective produces a rather dollhouse feel. Changing the focal length and moving farther away helps. Overall, it’s a clever use of what the new AI model can do. It’s a fast-moving space, so this will likely be out of date in a few months.