Make 3D Scenes With A Holodeck-Like Voice Interface

The voice interface for the holodeck in Star Trek had users create objects by saying things like “create a table” and “now make it a metal table” and so forth, all with immediate feedback. This kind of interface may have been pure fantasy at the time of airing, but with the advent of AI and LLMs (large language models) this kind of natural language interface is coming together almost by itself.

A fun demonstration of that is [Dominic Pajak]’s demo project called VoxelAstra. This is a WebXR demo that works both in the Meta Quest 3 VR headset (just go to the demo page in the headset’s web browser) as well as on desktop.

The catch is that since the program uses OpenAI APIs on the back end, one must provide a working OpenAI API key. Otherwise, the demo won’t be able to do anything. Providing one’s API key to someone’s web page isn’t terribly good security practice, but there’s also the option of running the demo locally.

Either way, once the demo is up and running the user simply tells the system what to create. Just keep it simple. It’s a fun and educational demo more than anything and will try to do its work with primitive shapes like spheres, cubes, and cylinders. “Build a snowman” is suggested as a good starting point.

Intrigued by what you see and getting ideas of your own? WebXR can be a great way to give those ideas some life and looking at how someone else did something similar is a fine way to begin. Check out another of [Dominic]’s WebXR projects: a simulated BBC Micro, in VR.

AI Can Now Compress Text

There are many claims in the air about the capabilities of AI systems, as the technology continues to ascend the dizzy heights of the hype cycle. Some of them are true, others stretch definitions a little, while yet more cross the line into the definitely bogus. [J] has one that is backed up by real code though, a compression scheme for text using an AI, and while there may be limitations in its approach, it demonstrates an interesting feature of large language models.

The compression works by assuming that for a sufficiently large model, it’s likely that many source texts will exist somewhere in the training. Using llama.cpp it’s possible to extract the tokenization information of a piece of text contained in its training data and store that as the compressed output. The decompressor can then use that tokenization data as a series of keys to reassemble the original from its training. We’re not AI experts but we are guessing that a source text which has little in common with any training text would fare badly, and we expect that the same model would have to be used on both compression and decompression. It remains a worthy technique though, and no doubt because it has AI pixie dust, somewhere there’s a hype-blinded venture capitalist who would pay millions for it. What a world we live in!

Oddly this isn’t the first time we’ve looked at AI text compression.

Train A GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older but is perfectly relevant, being the granddaddy of modern LLMs (large language models) with a clear heritage to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on PyTorch deep learning library, itself written in Python. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is a highly elegant process that does the same thing the bigger, clunkier methods accomplish. It can run entirely on a CPU, or it can take advantage of GPU acceleration, where available.

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.

AI + LEGO = A Brickton Of Ideas

What if there was some magic device that could somehow scan all your LEGO and tell you what you can make with it? It’s a childhood dream come true, right? Well, that device is in your pocket. Just dump out your LEGO stash on the carpet, spread it out so there’s only one layer, scan it with your phone, and after a short wait, you get a list of all the the fun things you can make. With building instructions. And oh yeah, it shows you where each brick is in the pile.

We are talking about the BrickIt app, which is available for Android and Apple. Check it out in the short demo after the break. Having personally tried the app, we can say it does what it says it does and is in fact quite cool.

As much as it may pain you to have to pick up all those bricks when you’re finished, it really does work better against a neutral background like light-colored carpet. In an attempt to keep the bricks corralled, we tried a wooden tray, and it didn’t seem to be working as well as it probably could have — it didn’t hold that many bricks, and they couldn’t be spread out that far.

And the only real downside is that results are limited because there’s a paid version. And the app is kind of constantly reminding you of what you’re missing out on. But it’s still really, really cool, so check it out.

We don’t have to tell you how versatile LEGO is. But have you seen this keyboard stand, or this PCB vise?

Continue reading “AI + LEGO = A Brickton Of Ideas”

A pair of hands holds a digital camera. "NUCA" is written in the hood above the lens and a black grip is on the right hand side of the device (left side of image). The camera body is off-white 3D printed plastic. The background is a pastel yellow.

AI Camera Only Takes Nudes

One of the cringier aspects of AI as we know it today has been the proliferation of deepfake technology to make nude photos of anyone you want. What if you took away the abstraction and put the faker and subject in the same space? That’s the question the NUCA camera was designed to explore. [via 404 Media]

[Mathias Vef] and [Benedikt Groß] designed the NUCA camera “with the intention of critiquing the current trajectory of AI image generation.” The camera itself is a fairly unassuming device, a 3D-printed digital camera (19.5 × 6 × 1.5 cm) with a 37 mm lens. When the camera shutter button is pressed, a nude image is generated of the subject.

The final image is generated using a mixture of the picture taken of the subject, pose data, and facial landmarks. The photo is run through a classifier which identifies features such as age, gender, body type, etc. and then uses those to generate a text prompt for Stable Diffusion. The original face of the subject is then stitched onto the nude image and aligned with the estimated pose. Many of the sample images on the project’s website show the bias toward certain beauty ideals from AI datasets.

Looking for more ways to use AI with cameras? How about this one that uses GPS to imagine a scene instead. Prefer to keep AI out of your endeavors to invade personal space? How about building your own TSA body scanner?

 

Dump A Code Repository As A Text File, For Easier Sharing With Chatbots

Some LLMs (Large Language Models) can act as useful programming assistants when provided with a project’s source code, but experimenting with this can get a little tricky if the chatbot has no way to download from the internet. In such cases, the code must be provided by either pasting it into the prompt or uploading a file manually. That’s acceptable for simple things, but for more complex projects, it gets awkward quickly.

To make this easier, [Eric Hartford] created github2file, a Python script that outputs a single text file containing the combined source code of a specified repository. This text file can be uploaded (or its contents pasted into the prompt) making it much easier to share code with chatbots.

Continue reading “Dump A Code Repository As A Text File, For Easier Sharing With Chatbots”

A Slew Of AI Courses To Get Yourself Up To Speed

When there’s a new technology, there’s always a slew of people who want to educate you about it. Some want to teach you to use their tools, some want you to pay for training, and others will use free training to entice you to buy further training. Since AI is the new hot buzzword, there are plenty of free classes from reputable sources. The nice thing about a free class is that if you find it isn’t doing it for you, there’s no penalty to just quit.

We noticed NVIDIA — one of the companies that has most profited from the AI boom — has some courses (not all free, though). Generative AI Explained, and Augment your LLM Using Retrieval Augmented Generation caught our eye. There’s also Building a Brain in 10 Minutes, and Introduction to Physics-informed Machine Learning with Modulus. These are all quite short, though.

Continue reading “A Slew Of AI Courses To Get Yourself Up To Speed”