Raspberry Pi Narrates (And Tattles On) Your Cat, Nature Documentary Style

Detecting a cat with a Raspberry Pi and camera is one thing, but [Yoko Li]’s AI Raspberry Pi Cat Detection takes things to another level entirely by narrating your feline’s activities, nature documentary style.

The project is ostensibly aimed at tattling on the housecats by detecting forbidden behavior such as trespassing on the kitchen counter. But we daresay that’s overshadowed by the verbose image analysis, which describes the scene in its best David Attenborough impression.

This feline exemplifies both the beauty and the peaceful nature of its kind. No email will be sent as the cat is not on the kitchen counter.

Hard to believe that just a few years ago this cat detector tool was the bee’s knees in cat detection technology. Things have certainly come a long way. Interested? The GitHub repository has everything needed to roll your own, and we highly recommend watching it in action in the video embedded below.
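If you’re wondering what rolling your own involves, the core loop is short enough to sketch. What follows is our own minimal illustration, not [Yoko Li]’s actual code (the repository is the authority on that): grab a frame, hand it to a vision-capable model with a suitably Attenborough-flavored prompt, and act on the answer. It assumes a USB webcam, the openai Python package, and an OPENAI_API_KEY in the environment.

```python
# Minimal sketch of a detect-and-narrate loop; NOT the project's actual code.
# Assumes a webcam, `pip install opencv-python openai`, and OPENAI_API_KEY set.
import base64
import cv2
from openai import OpenAI

client = OpenAI()
cam = cv2.VideoCapture(0)

ok, frame = cam.read()
if ok:
    _, jpeg = cv2.imencode(".jpg", frame)
    b64 = base64.b64encode(jpeg.tobytes()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model works here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text":
                    "Narrate this scene like a nature documentary. "
                    "End your reply with ALERT if a cat is on the kitchen counter."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    narration = resp.choices[0].message.content
    print(narration)
    if "ALERT" in narration:
        print("Cat on the counter: this is where the tattling email would go.")
cam.release()
```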


AI Helps Make Web Scraping Faster And Easier

Web scraping is usually only a first step towards extracting meaningful data. Once you’ve got everything pulled down, you’ve still got to process it into something useful. Here to assist with that is Scrapegraph-ai, a Python tool that promises to automate the process using a selection of large language models (LLMs).

Scrapegraph-ai accepts a URL along with a prompt, which is a plain-English instruction on what to do with the data. Examples include summarizing, describing images, and more. In other words, gathering the data and analyzing or formatting it can now be done in a single step.

The project is pretty flexible in terms of the AI back end. It can work with locally installed AI tools (via Ollama) or with API keys for services like OpenAI and more. If you have an OpenAI API key, there’s an online demo that shows off the capabilities pretty effectively. Otherwise, local installation is only a few operations away.
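For a taste of what that looks like in practice, here’s a minimal sketch built around the project’s SmartScraperGraph class. Treat it as an approximation: the exact config keys have shifted between releases, so check the README for your version.

```python
# Minimal sketch using Scrapegraph-ai's SmartScraperGraph.
# Config keys vary between releases; consult the project's README.
from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "api_key": "sk-...",       # your OpenAI key
        "model": "gpt-3.5-turbo",  # or a local model served via Ollama
    },
}

scraper = SmartScraperGraph(
    prompt="List every article title on the page with a one-line summary.",
    source="https://hackaday.com",
    config=graph_config,
)

# Fetches the page, feeds it through the LLM pipeline, returns structured data.
print(scraper.run())
```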

This isn’t the first time we have seen the flexibility of AI tools like large language models leveraged to ease the notoriously fiddly task of web scraping, and it’s great to see the results have only gotten better.

Mitre Wants The Feds To Play In Its Sandbox

If you haven’t worked with the US government, you might not know Mitre, a non-profit research organization formed in 1958 by the U.S. Air Force to guide work on the SAGE computer. Its experts often oversee government contracts or evaluate proposals. Now it is building a $20 million “AI Sandbox” in which the federal government can build AI prototypes.

In partnership with NVIDIA, Mitre will build the sandbox around an NVIDIA DGX SuperPOD system capable of an exaFLOP of 8-bit AI computation. Mitre reports this will increase its compute power for AI by two orders of magnitude.


Make 3D Scenes With A Holodeck-Like Voice Interface

The voice interface for the holodeck in Star Trek had users create objects by saying things like “create a table” and “now make it a metal table” and so forth, all with immediate feedback. This kind of interface may have been pure fantasy at the time of airing, but with the advent of AI and LLMs (large language models), such a natural language interface is coming together almost by itself.

A fun demonstration of that is [Dominic Pajak]’s demo project called VoxelAstra. This is a WebXR demo that works both in the Meta Quest 3 VR headset (just go to the demo page in the headset’s web browser) and on desktop.

The catch is that since the program uses OpenAI APIs on the back end, one must provide a working OpenAI API key. Otherwise, the demo won’t be able to do anything. Providing one’s API key to someone’s web page isn’t terribly good security practice, but there’s also the option of running the demo locally.

Either way, once the demo is up and running, the user simply tells the system what to create. Just keep it simple: it’s a fun and educational demo more than anything, and it does its work with primitive shapes like spheres, cubes, and cylinders. “Build a snowman” is suggested as a good starting point.
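The general trick behind demos like this is to ask the LLM for structured data instead of prose, then hand the result to the renderer. VoxelAstra itself is WebXR and JavaScript, and we haven’t cribbed from its source, so take the following Python as a hedged sketch of the pattern rather than how [Dominic] actually did it.

```python
# Toy sketch of the natural-language-to-primitives step; this is NOT
# VoxelAstra's actual code. Assumes the openai package and an API key.
import json
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    'Turn the user\'s request into JSON: {"primitives": [{"shape": '
    '"sphere"|"cube"|"cylinder", "position": [x, y, z], "size": s, '
    '"color": "#rrggbb"}, ...]}. Respond with JSON only.'
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},  # force parseable output
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "build a snowman"},
    ],
)

scene = json.loads(resp.choices[0].message.content)
for prim in scene.get("primitives", []):
    # A real renderer would instantiate a mesh here; we just list the parts.
    print(prim["shape"], "at", prim["position"])
```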

Intrigued by what you see and getting ideas of your own? WebXR can be a great way to give those ideas some life, and looking at how someone else did something similar is a fine way to begin. Check out another of [Dominic]’s WebXR projects: a simulated BBC Micro, in VR.

AI Can Now Compress Text

There are many claims in the air about the capabilities of AI systems, as the technology continues to ascend the dizzy heights of the hype cycle. Some of them are true, others stretch definitions a little, while yet more cross the line into the definitely bogus. [J] has one that is backed up by real code though, a compression scheme for text using an AI, and while there may be limitations in its approach, it demonstrates an interesting feature of large language models.

The compression works by assuming that, for a sufficiently large model, many source texts will already exist somewhere in its training data. Using llama.cpp, it’s possible to extract the tokenization of a piece of text and store those token IDs as the compressed output. The decompressor can then use the token IDs as a series of keys to reassemble the original text through the same model’s vocabulary. We’re not AI experts, but we’d guess that a source text with little in common with any training text would fare badly, and we expect the same model would have to be used for both compression and decompression. It remains a worthy technique though, and no doubt because it has AI pixie dust, somewhere there’s a hype-blinded venture capitalist who would pay millions for it. What a world we live in!
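To make that concrete, here’s our own rough sketch of the round trip using the llama-cpp-python bindings, not [J]’s code. With a 32,000-entry vocabulary each token ID fits in two bytes, while the average token covers several characters, which is where the savings would come from.

```python
# Rough sketch of the tokenize-and-store round trip; NOT [J]'s code.
# Any GGUF model file works here, since only its vocabulary is needed.
import struct
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(model_path="model.gguf", vocab_only=True)

text = b"The quick brown fox jumps over the lazy dog."
tokens = llm.tokenize(text, add_bos=False)

# "Compress": pack each token ID into two bytes (fine for a 32k vocabulary).
blob = struct.pack(f"<{len(tokens)}H", *tokens)
print(f"{len(text)} bytes in, {len(blob)} bytes out")

# "Decompress": unpack the IDs and map them back through the same vocabulary.
# Note: some vocabularies re-insert a leading space on detokenize.
restored = llm.detokenize(list(struct.unpack(f"<{len(tokens)}H", blob)))
print(restored.decode(errors="replace"))
```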

Oddly enough, this isn’t the first time we’ve looked at AI text compression.

Train A GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be old, but it is perfectly relevant, being the granddaddy of modern LLMs (large language models) with a clear line of descent to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on the PyTorch deep learning library, itself written in Python. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is an elegantly spare process that accomplishes the same thing the bigger, clunkier methods do. It can run entirely on a CPU, or it can take advantage of GPU acceleration where available.
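To give a flavor of what “implementing the training directly” means, here’s a toy illustration in Python of the same skeleton llm.c builds out in C: forward pass, cross-entropy loss, hand-derived backprop, and a parameter update. It trains a bigram table rather than a transformer and uses plain SGD where llm.c uses AdamW, so treat it as the shape of the thing, not a substitute.

```python
# Toy, framework-free training loop: forward pass, cross-entropy loss,
# hand-derived backprop, parameter update. llm.c does this for a full
# GPT-2; here it's just a bigram table.
import numpy as np

vocab = 16
data = np.arange(10_000) % vocab         # toy stream: each token predicts the next
rng = np.random.default_rng(0)
W = rng.normal(0, 0.02, (vocab, vocab))  # row = logits for the next token
lr = 0.5

x, y = data[:-1], data[1:]
for step in range(201):
    logits = W[x]                                   # forward: (N, vocab)
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(y)), y]).mean()

    dlogits = probs                                 # backward: dloss/dlogits
    dlogits[np.arange(len(y)), y] -= 1
    dlogits /= len(y)
    dW = np.zeros_like(W)
    np.add.at(dW, x, dlogits)                       # scatter grads into rows of W
    W -= lr * dW                                    # SGD update (llm.c uses AdamW)

    if step % 50 == 0:
        print(f"step {step:3d}  loss {loss:.3f}")
```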

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.

AI + LEGO = A Brickton Of Ideas

What if there was some magic device that could somehow scan all your LEGO and tell you what you can make with it? It’s a childhood dream come true, right? Well, that device is in your pocket. Just dump out your LEGO stash on the carpet, spread it out so there’s only one layer, scan it with your phone, and after a short wait, you get a list of all the fun things you can make. With building instructions. And oh yeah, it shows you where each brick is in the pile.

We are talking about the BrickIt app, which is available for Android and iOS. Check it out in the short demo after the break. Having personally tried the app, we can say it does what it says it does, and it is in fact quite cool.

As much as it may pain you to have to pick up all those bricks when you’re finished, it really does work better against a neutral background like light-colored carpet. In an attempt to keep the bricks corralled, we tried a wooden tray, and it didn’t work as well as it probably could have: it didn’t hold many bricks, and they couldn’t be spread out very far.

The only real downside is that results are limited unless you spring for the paid version, and the app is constantly reminding you of what you’re missing out on. But it’s still really, really cool, so check it out.

We don’t have to tell you how versatile LEGO is. But have you seen this keyboard stand, or this PCB vise?
