Christmas Comes Early With AI Santa Demo

May 18, 2025 by Tyler August 6 Comments

With only two hundred odd days ’til Christmas, you just know we’re already feeling the season’s magic. Well, maybe not, but [Sean Dubois] has decided to give us a head start with this WebRTC demo built into a Santa stuffie.

The details are a little bit sparse (hopefully he finishes the documentation on GitHub by the time this goes out) but the project is really neat. Hardware-wise, it’s an audio-enabled ESP32-S3 dev board living inside Santa, running the OpenAI’s OpenRealtime Embedded SDK (as implemented by ExpressIf), with some customization by [Sean]. Looks like the audio is going through the newest version of LibPeer and the heavy lifting is all happening in the cloud, as you’d expect with this SDK. (A key is required, but hey! It’s all open source; if you have an AI that can do the job locally-hosted, you can probably figure out how to connect to it instead.)

This speech-to-speech AI doesn’t need to emulate Santa Claus, of course; you can prime the AI with any instructions you’d like. If you want to delight children, though, its hard to beat the Jolly Old Elf, and you certainly have time to get it ready for Christmas. Thanks to [Sean] for sending in the tip.

If you like this project but want to avoid paying OpenAI API fees, here’s a speech-to-text model to get you started.We covered this AI speech generator last year to handle the talky bit. If you put them together and make your own Santa Claus (or perhaps something more seasonal to this time of year), don’t forget to drop us a tip!

Hackaday Links: September 22, 2024

September 22, 2024 by Dan Maloney 13 Comments

Thanks a lot, Elon. Or maybe not, depending on how this report that China used Starlink signals to detect low-observable targets pans out. There aren’t a lot of details, and we couldn’t find anything approximating a primary source, but it seems like the idea is based on forward scatter, which is when waves striking an object are deflected only a little bit. The test setup for this experiment was a ground-based receiver listening to the downlink signal from a Starlink satellite while a DJI Phantom 4 Pro drone was flown into the signal path. The drone was chosen because nobody had a spare F-22 or F-35 lying around, and its radar cross-section is about that of one of these stealth fighters. They claim that this passive detection method was able to make out details about the drone, but as with most reporting these days, this needs to be taken with an ample pinch of salt. Still, it’s an interesting development that may change things up in the stealth superiority field.

Continue reading “Hackaday Links: September 22, 2024” →

Hackaday Links: September 1, 2024

September 1, 2024 by Dan Maloney 21 Comments

Why is it always a helium leak? It seems whenever there’s a scrubbed launch or a narrowly averted disaster, space exploration just can’t get past the problems of helium plumbing. We’ve had a bunch of helium problems lately, most famously with the leaks in Starliner’s thruster system that have prevented astronauts Butch Wilmore and Suni Williams from returning to Earth in the spacecraft, leaving them on an extended mission to the ISS. Ironically, the launch itself was troubled by a helium leak before the rocket ever left the ground. More recently, the Polaris Dawn mission, which is supposed to feature the first spacewalk by a private crew, was scrubbed by SpaceX due to a helium leak on the launch tower. And to round out the helium woes, we now have news that the Peregrine mission, which was supposed to carry the first commercial lander to the lunar surface but instead ended up burning up in the atmosphere and crashing into the Pacific, failed due to — you guessed it — a helium leak.
Continue reading “Hackaday Links: September 1, 2024” →

Bringing The Voice Assistant Home

January 14, 2024 by Matthew Carlson 22 Comments

For many, the voice assistants are helpful listeners. Just shout to the void, and a timer will be set, or Led Zepplin will start playing. For some, the lack of flexibility and reliance on cloud services is a severe drawback. [John Karabudak] is one of those people, and he runs his own voice assistant with an LLM (large language model) brain.

In the mid-2010’s, it seemed like voice assistants would take over the world, and all interfaces were going to NLP (natural language processing). Cracks started to show as these assistants ran into the limits of what NLP could reasonably handle. However, LLMs have breathed some new life into the idea as they can easily handle much more complex ideas and commands. However, running one locally is easier said than done.

A firewall with some muscle (Protectli Vault VP2420) runs a VLAN and NIPS to expose the service to the wider internet. For actually running the LLM, two RTX 4060 Ti cards provide the large VRAM needed to load a decent-sized model at a cheap price point. The AI engine (vLLM) supports dozens of models, but [John] chose a quantized version of Mixtral to fit in the 32GB of VRAM he had available.

Continue reading “Bringing The Voice Assistant Home” →

A Hacker-Friendly Software Package For Your Next AI Project

August 31, 2023 by Donald Papp 13 Comments

If you’re interested in using Large Language Models (LLM) in a project, but aren’t plugged directly into the fast-developing world of artificial intelligence (AI), knowing what tool or software to use can be daunting. Luckily, [Max Woolf] created simpleaichat, which is complete with examples and documentation and minimal code complexity.

As [Max] puts it, the main motivations behind the project are to provide useful tools while making it easier for non-engineers to peer through the breathless hyperbole and see just how AI-based apps actually work. This project was directly inspired by [Max]’s own real-world software experiences in this area, particularly his frustrations with popular and much-hyped frameworks in which “Hello World” feels a lot more like Hell World.

simpleaichat is a Python package that provides easy and powerful ways to interface with the OpenAI API, makers of ChatGPT. Now, it is true that OpenAI’s models are not open source and access is not free, but they are easily one of the most capable and cost-effective services of their kind.

Prefer something a little more open, and a lot more private? There’s always the option to run an LLM locally on your own machine, possibly with the help of a tool like text-generation-webui or gpt4all. Running an LLM locally will not have the quality of OpenAI’s offerings, but it can still do the job. It’s also possible to give these local LLMs an interface that mimics OpenAI’s API, so there are loads of possibilities.

Are you getting ideas yet? Share them in the comments, or keep them to yourselves and submit a tip once your project is off the ground!

Bridging A Gap Between LLMs And Programming With TypeChat

July 22, 2023 by Bryan Cockfield 33 Comments

By now, large language models (LLMs) like OpenAI’s ChatGPT are old news. While not perfect, they can assist with all kinds of tasks like creating efficient Excel spreadsheets, writing cover letters, asking for music references, and putting together functional computer programs in a variety of languages. One thing these LLMs don’t do yet though is integrate well with existing app interfaces. However, that’s where the TypeChat library comes in, bridging the gap between LLMs and programming.

TypeChat is an experimental MIT-licensed library from Microsoft which sits in between a user and a LLM and formats responses from the AI that are type-safe so that they can easily be plugged back in to the original interface. It does this by generating JSON responses based on user input, making it easier to take the user input directly, run it through the LLM, and then use the output directly in another piece of code. It can be used for things like prototyping prompts, validating responses, and handling errors. It’s also not limited to a single LLM and can be fairly easily modified to work with many of the existing models.

The software is still in its infancy but does hope to make it somewhat easier to work between user inputs within existing pieces of software and LLMs which have quickly become all the rage in the computer science world. We expect to see plenty more tools like this become available as more people take up using these new tools, which have plenty of applications beyond just writing code.

3D Design With Text-Based AI

May 12, 2023 by Bryan Cockfield 8 Comments

Generative AI is the new thing right now, proving to be a useful tool both for professional programmers, writers of high school essays and all kinds of other applications in between. It’s also been shown to be effective in generating images, as the DALL-E program has demonstrated with its impressive image-creating abilities. It should surprise no one as this type of AI continues to make in-roads into other areas, this time with a program from OpenAI called Shap-E which can render 3D images.

Like most of OpenAI’s offerings, this takes plain language as its input and can generate relatively simple 3D models with this text. The examples given by OpenAI include some bizarre models using text prompts such as a chair shaped like an avocado or an airplane that looks like a banana. It can generate textured meshes and neural radiance fields, both of which have various advantages when it comes to available computing power, training methods, and other considerations. The 3D models that it is able to generate have a Super Nintendo-style feel to them but we can only expect this technology to grow exponentially like other AI has been doing lately.

For those wondering about the name, it’s apparently a play on the 2D rendering program DALL-E which is itself a combination of the names of the famous robot WALL-E and the famous artist Salvador Dali. The Shap-E program is available for anyone to use from this GitHub page. Even though this code comes from OpenAI themselves, plenty are speculating that the AI revolution to come will largely come from open-source sources rather than OpenAI or Google, something for which the future is somewhat hazy.