A Robot Meant For Humans

Although humanity hoped for a more optimistic robotic future in the post-war era, with media like The Jetsons or Lost in Space reflecting that sentiment, our collective consciousness has shifted (for good reason) toward a Black Mirror/Terminator future, as real-world companies like Boston Dynamics build those sorts of machines instead of helpful Rosies. But this future isn’t guaranteed, and a PhD researcher is hoping to reclaim a more hopeful outlook with a robot called Blossom, built specifically to investigate how humans interact with robots.

As a platform the robot isn’t too complex, consisting of an accessible frame that can be laser-cut from wood, with only a few moving parts controlled by servos. It isn’t too large, either, and can sit on a desk to serve as a telepresence robot. But since Blossom’s creator [Michael] wanted it to help us understand how humans interact with robots, the latest version is outfitted not only with a large language model and text-to-speech capabilities, but also with a compelling backstory, lore, and a voice derived from Animal Crossing that’s neither human nor recognizably synthetic, all in an effort to make the device more approachable.
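The post doesn’t detail the exact software stack, but the general shape of such a setup is simple: keep the backstory in a system prompt, send whatever a visitor says to the language model, and pipe the reply through a text-to-speech stage tuned to the non-human voice. The sketch below is only a minimal illustration of that flow, assuming hypothetical `query_llm` and `synthesize_voice` helpers rather than [Michael]’s actual code:

```python
# Minimal sketch of an LLM-backed robot persona (all names hypothetical).
# The backstory lives in a system prompt; every utterance is answered in
# character and then handed to a text-to-speech stage for playback.

PERSONA = (
    "You are Blossom, a small, soft robot that sits on a desk. "  # hypothetical backstory text
    "You are curious and gentle, and you speak in short sentences."
)

def query_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for whatever LLM API the robot actually uses."""
    raise NotImplementedError

def synthesize_voice(text: str) -> bytes:
    """Placeholder for a TTS engine tuned to a non-human, Animal Crossing style delivery."""
    raise NotImplementedError

def respond(user_text: str) -> bytes:
    reply = query_llm(PERSONA, user_text)  # answer in character
    return synthesize_voice(reply)         # audio for the robot's speaker
```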

To that end, [Michael] set the robot up at a Maker Faire to see what sorts of interactions Blossom would have with passersby. While most were interested in the robot’s web-based control system, a few came by and had conversations with it. It’s certainly an interesting project, and it reminds us a bit of this other piece of research from MIT that looked at how humans and robots can work productively alongside one another.

Using AI To Help With Assembly

Although generative AI and large language models have been pushed as direct replacements for certain kinds of workers, plenty of businesses that have actually tried this have found the new technology causes more problems than it solves when given free rein over tasks. While that might not be true indefinitely, the real use case for these tools right now is as an assistant for certain kinds of work, and in that role they can be incredibly powerful, as [Ricardo] demonstrates here by using Amazon Q to help with game development on the Commodore 64.

The first step was to generate code that would show a sprite moving across the screen. The AI initially produced code in all caps, as was the style in the C64 era, but this caused major problems in [Ricardo]’s development environment, so the code was converted to lowercase. A more impressive conversion came next, since the program needed to take advantage of the optimizations that assembly language makes possible. With the code converted to 6502 assembly that runs on the virtual Commodore, and after several iterations with the AI, [Ricardo] was eventually able to show four sprites moving across the screen and change their style to arbitrary designs.
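[Ricardo]’s process, as described, is essentially a loop: ask the assistant for code, try it out, and feed the failures back as the next prompt. Here is a hedged sketch of that loop, with `ask_assistant` standing in for Amazon Q and `run_in_emulator` for an assembler-plus-emulator toolchain; both are hypothetical placeholders, not anything from the original post:

```python
# Sketch of the iterate-with-the-assistant workflow described above.
# ask_assistant() and run_in_emulator() are hypothetical stand-ins for
# Amazon Q and a C64 assembler/emulator pipeline, respectively.

def ask_assistant(prompt: str) -> str:
    """Return source code from the AI assistant (placeholder)."""
    raise NotImplementedError

def run_in_emulator(source: str) -> tuple[bool, str]:
    """Assemble and run the code, returning (success, log) (placeholder)."""
    raise NotImplementedError

def develop(goal: str, max_iterations: int = 5) -> str:
    prompt = goal
    source = ""
    for _ in range(max_iterations):
        source = ask_assistant(prompt)
        ok, log = run_in_emulator(source)
        if ok:
            break  # the sprites move as intended
        # feed the failure back as the next prompt, much as [Ricardo] did by hand
        prompt = f"{goal}\nThe previous attempt failed:\n{log}\nPlease fix it."
    return source

# e.g. develop("6502 assembly for the C64 that moves four sprites across the screen")
```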

Although the post is a bit over-optimistic about Amazon Q as a tool specifically for developers, it might have some advantages over other generative AIs, especially if it’s adept at the chore of programming in assembly language. We’d love to hear from anyone with real-world experience whether it’s truly worth the extra cost over something like Copilot or GPT-4. For any of these generative AI models, though, it’s probably worth trying them out while they’re still in their early stages, and keep in mind that programming is far from the only thing some of them can do.

Large Language Models On Small Computers

As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processor power, faster speeds, greater memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) like GPT are running out of data to train on and having difficulty scaling up, [DaveBben] is experimenting with scaling down instead, running an LLM on the smallest computer that could reasonably run one.

Of course, some concessions have to be made to get an LLM running on underpowered hardware. In this case the computer of choice is an ESP32, so the model was shrunk from the trillions of parameters of something like GPT-4, or even the hundreds of billions in GPT-3, down to only 260,000. The weights come from the tinyllamas checkpoint, and llama2.c is the implementation [DaveBben] chose for this setup, as it can be streamlined to run a bit better on something like the ESP32. The specific chip is the ESP32-S3FH4R2, chosen for its large amount of RAM compared to other versions, since even this tiny model needs at least 1 MB to run. It also has two cores, both of which work as hard as possible under (relatively) heavy loads like these, and its CPU clock can be maxed out at around 240 MHz.
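That 1 MB figure is easy to sanity-check: stock llama2.c keeps its weights as 32-bit floats, so a 260,000-parameter model already occupies about a megabyte before you count activations and the key-value cache. A quick back-of-the-envelope check, assuming plain fp32 weights:

```python
# Back-of-the-envelope memory estimate for the 260K-parameter model,
# assuming weights are stored as 32-bit floats as in stock llama2.c.
params = 260_000
bytes_per_weight = 4  # fp32
weights_mb = params * bytes_per_weight / 1_000_000
print(f"weights alone: ~{weights_mb:.2f} MB")  # ~1.04 MB, before activations and KV cache
```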

Admittedly, [DaveBben] is mostly doing this to see if it can be done, since even the most powerful ESP32 processors won’t do much useful work with a large language model. It does turn out to be possible, though, and somewhat impressive, considering that the ESP32 has about as much processing capability as a 486 or maybe an early Pentium chip. If you’re willing to devote a few more resources to an LLM, you can self-host it and use it in much the same way as an online model such as ChatGPT.

Peering Into The Black Box Of Large Language Models

Large Language Models (LLMs) can produce extremely human-like communication, but their inner workings are something of a mystery. Not a mystery in the sense that we don’t know how an LLM works, but in the sense that the exact process of turning a particular input into a particular output is a black box.

This “black box” trait is common to neural networks in general, and LLMs are very deep neural networks. It is not really possible to explain precisely why a specific input produces a particular output, and not something else.

Why? Because neural networks are neither databases nor lookup tables. In a neural network, the activations of individual neurons cannot be meaningfully mapped to specific concepts or words. The connections are complex, numerous, and multidimensional to the point that trying to tease out their relationships in any straightforward way simply does not make sense.

Continue reading “Peering Into The Black Box Of Large Language Models”

Testing Large Language Models For Circuit Board Design Aid

Beyond bothering large language models (LLMs) with funny questions, there’s the general idea that they can act as supporting tools. Theoretically they should be able to assist with parsing and summarizing documents, as well as answering questions about, say, electronic design. To test this assumption, [Duncan Haldane] employed three of the more highly praised LLMs to assist with circuit board design: GPT-4o (OpenAI), Claude 3 Opus (Anthropic), and Gemini 1.5 (Google).

The tasks ranged from ‘stupid questions’, like asking for the delay per unit length of a trace on a PCB, to finding parts for a design, to designing an entire circuit. Of these, only the datasheet-parsing task could be considered successful. This involved uploading the datasheet for a component (the nRF5340) and asking the LLM to make a symbol and footprint, in this case for the text-centric JITX format, though KiCad or Altium formats should be possible too. Even this required a few passes, as there were glitches and omissions in the generated footprint.
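The datasheet-parsing task boils down to handing the model the relevant pin and package tables and asking for a structured description that a symbol/footprint generator could consume. A rough sketch of that workflow is shown below; `ask_llm` is a hypothetical stand-in for whichever of the three models is being tested, and the JSON request is ours, not [Duncan Haldane]’s:

```python
# Sketch of the datasheet-parsing task: hand the model the pin and package
# tables and ask for a machine-readable description that a symbol/footprint
# generator (JITX, KiCad, ...) could consume. ask_llm() is a placeholder.

import json

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to GPT-4o, Claude 3 Opus, or Gemini 1.5."""
    raise NotImplementedError

def extract_part_description(datasheet_text: str, part: str = "nRF5340") -> dict:
    prompt = (
        f"From the {part} datasheet excerpt below, list every pin with its "
        "number, name, and function, plus the package dimensions, as JSON.\n\n"
        + datasheet_text
    )
    reply = ask_llm(prompt)
    # Expect glitches and omissions: validate every pin against the PDF by hand.
    return json.loads(reply)
```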

When it came to picking components for a design, it was clear that you’re out of luck unless you’re trying to create a design that a million others have made before you in exactly the same way. The problem got worse when trying to design a circuit and ultimately spit out a netlist, with the best-performing LLM (Claude 3 Opus) giving nonsensical suggestions for filter designs and mucking up even basic amplifier designs, sticking decoupling capacitors and random resistors just about everywhere.

Effectively, as a text-searching tool LLMs can have some use for engineers who are tired of digging through yet another few hundred pages of poorly formatted and non-indexed PDF datasheets, but you still need to stay on your toes every step of the way, as the output will all too often be anywhere from slightly to hilariously wrong.

Generative AI Now Encroaching On Music

While it might not seem like it to a novice, music turns out to be a highly mathematical endeavor, with precise ratios between chords and notes as well as overall structure in rhythm and timing. This is especially true of popular music, which has even more recognizable repeating patterns and trends, unfortunately making it an easy target for modern generative AI, which can analyze huge amounts of data and produce arguably unique works. This one, called Suno, does just that, for better or worse.

Unlike other generative AI offerings currently available for creating music, this one not only generates the musical underpinnings of the song itself but can also add a layer of intelligible vocals. A deeper investigation of the technology by Rolling Stone found that the tool uses its own models to come up with the music, offloads the lyric writing to ChatGPT, and then turns those lyrics into fairly convincing vocals. Like the image and text generation models that have come out in the last few years, this has the potential to be significantly disruptive.

While we’re not particularly excited about living in a world where humans toil while the machines create art, rather than the other way around, at best we can hope for one where real musicians use these models as tools to enhance their creativity rather than as outright substitutes, much like ChatGPT currently is for programmers. That might be an overly optimistic view, though, and only time will tell.

Air Canada’s Chatbot: Why RAG Is Better Than An LLM For Facts

Recently Air Canada was in the news over the outcome of Moffatt v. Air Canada, in which the airline was forced to pay restitution to Mr. Moffatt after he was misled by advice from a chatbot on the Air Canada website regarding its bereavement fare policy. When Mr. Moffatt asked whether he could apply for the bereavement fare after returning from the flight, the chatbot said that he could, even though the link it provided to the official bereavement policy page said otherwise.

That last detail is by far the most interesting aspect of the case, as it raises many questions about the technical workings of the chatbot Air Canada had deployed on its website. Since the basic idea behind such a chatbot is that it draws on a curated source of (company) documentation and policies, the assumption many have made is that this particular chatbot instead used an LLM filled with more generic information, possibly sourced from many other public-facing policy pages.

Whatever the case may be, chatbots are increasingly used by companies, but instead of relying on a pure LLM they use what is called RAG: retrieval-augmented generation. Rather than trusting whatever the model absorbed during training, a RAG system first fetches the relevant factual information from a vetted source of documentation and hands it to the language model as context, so the generated answer is grounded in that curated material.
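As a rough illustration of the difference, a RAG pipeline splits the vetted policy documents into chunks, retrieves the chunks most relevant to the customer’s question, and only then asks the language model to answer using that text. The toy retriever below scores chunks by simple word overlap just to keep the sketch self-contained; a real deployment would use embeddings and a vector store, and `ask_llm` is again a hypothetical placeholder:

```python
# Toy retrieval-augmented generation sketch. The retriever is deliberately
# naive (word-overlap scoring); a production system would use embedding
# similarity and a vector store. ask_llm() is a hypothetical placeholder.

def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def ask_llm(prompt: str) -> str:
    """Placeholder for the generation step."""
    raise NotImplementedError

def answer(question: str, policy_chunks: list[str]) -> str:
    context = "\n\n".join(retrieve(question, policy_chunks))
    prompt = (
        "Answer the customer's question using ONLY the policy text below. "
        "If the policy does not cover it, say so.\n\n"
        f"POLICY:\n{context}\n\nQUESTION: {question}"
    )
    return ask_llm(prompt)
```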

Continue reading “Air Canada’s Chatbot: Why RAG Is Better Than An LLM For Facts”