Testing Large Language Models For Circuit Board Design Aid

Beyond bothering large language models (LLMs) with funny questions, there’s the general idea that they can act as supporting tools. Theoretically they should be able to assist with parsing and summarizing documents, as well as answer questions about topics like electronic design. To test this assumption, [Duncan Haldane] employed three of the more highly praised LLMs to assist with circuit board design: GPT-4o (OpenAI), Claude 3 Opus (Anthropic) and Gemini 1.5 (Google).

The tasks ranged from ‘stupid questions’, like asking for the delay per unit length of a trace on a PCB, to finding parts for a design, to designing an entire circuit. Of these tasks, only datasheet parsing could be considered a success. This involved uploading the datasheet for a component (the nRF5340) and asking the LLM to generate a symbol and footprint, in this case in the text-centric JITX format, though KiCad or Altium formats should be possible too. Even this required a few passes, as there were glitches and omissions in the generated footprint.
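
As an aside from us (not part of [Duncan]’s write-up), that particular ‘stupid question’ has a rule-of-thumb answer that’s easy to check by hand, since the delay per unit length depends only on the effective dielectric constant of the board material:

$$ t_{pd} = \frac{\sqrt{\varepsilon_{\mathrm{eff}}}}{c} $$

For an outer-layer (microstrip) trace on FR-4 with ε_eff ≈ 3 this works out to roughly 5.8 ps/mm (about 150 ps per inch), while an inner-layer (stripline) trace sees the full ε_r ≈ 4.3 and comes in closer to 7 ps/mm (about 175 ps per inch), which is a handy sanity check for whatever the LLM answers.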

When it came to picking components for a design, it’s clear that you’re out of luck unless you’re trying to create a design that a million others have made before you in exactly the same way. Things got worse when the LLMs were asked to design a circuit and ultimately spit out a netlist: the best of them (Claude 3 Opus) gave nonsensical suggestions for filter designs and mucked up even basic amplifier designs, sticking decoupling capacitors and random resistors just about everywhere.
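
For readers who haven’t bumped into one before, a netlist is just a textual list of components and the nodes they connect to, which is exactly why it looks like a tempting target for a text-generating model. A hand-written example of ours (not output from any of the LLMs tested), a first-order RC low-pass filter in SPICE notation, looks like this:

```
* RC low-pass filter, cutoff = 1/(2*pi*R1*C1), roughly 1.6 kHz
V1 in  0   AC 1
R1 in  out 1k
C1 out 0   100n
.ac dec 20 10 100k
.end
```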

Effectively, as a text-searching tool LLMs can have some use for engineers who are tired of digging through yet another few hundred pages of poorly formatted and non-indexed PDF datasheets, but you still need to be on your toes every step of the way, as the LLM’s output will all too often be slightly to hilariously wrong.

Uncovering ChatGPT Usage In Academic Papers Through Excess Vocabulary

Frequencies of PubMed abstracts containing certain words. Black lines show counterfactual extrapolations from 2021–22 to 2023–24. The first six words are affected by ChatGPT; the last three relate to major events that influenced scientific writing and are shown for comparison. (Credit: Kobak et al., 2024)

That students these days love to use ChatGPT for assistance with reports and other writing tasks is hardly a secret, but it’s becoming ever more prevalent in academia as well. This raises the question of whether ChatGPT-assisted academic writing can somehow be distinguished. According to [Dmitry Kobak] and colleagues it can, with a strong sign of ChatGPT use being the presence of a lot of flowery excess vocabulary in the text. As detailed in their prepublication paper, the frequency of certain style words shows a remarkable shift in the vocabulary of the published works examined.

For their study they looked at over 14 million biomedical abstracts from 2010 to 2024 obtained via PubMed. These abstracts were then analyzed for word usage and frequency, which revealed both natural increases in word frequency (e.g. from the SARS-CoV-2 pandemic and Ebola outbreak) and massive spikes in excess vocabulary that coincide with the public availability of ChatGPT and similar LLM-based tools.

In total 774 unique excess words were annotated. Here ‘excess’ means ‘outside of the norm’, following the pattern of ‘excess mortality’, where mortality during one period noticeably deviates from patterns established during previous periods. In this regard the bump in words like ‘respiratory’ is logical, but the surge in style words like ‘intricate’ and ‘notably’ would seem to be due to LLMs having a penchant for such flowery, overly dramatic language.
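
The counterfactual baseline in the figure above is conceptually simple: fit the pre-ChatGPT trend of a word’s frequency, extrapolate it forward, and see how far the observed frequency overshoots it. A schematic sketch of that idea (with made-up numbers; the authors’ actual analysis code is the one mentioned below) might look like this:

```python
# Schematic illustration of the "excess vocabulary" approach: fit a word's
# pre-ChatGPT frequency trend, extrapolate it as a counterfactual, and compare
# with what was actually observed. All numbers here are invented for the example.
import numpy as np

years = np.array([2021, 2022, 2023, 2024])
# Fraction of abstracts containing a given style word (illustrative values only).
observed = np.array([0.0002, 0.0003, 0.0020, 0.0060])

# Fit a line to the 2021-22 baseline and extrapolate it to 2023-24.
slope, intercept = np.polyfit(years[:2] - 2021, observed[:2], 1)
counterfactual = slope * (years - 2021) + intercept

for year, obs, cf in zip(years[2:], observed[2:], counterfactual[2:]):
    print(f"{year}: observed {obs:.4f} vs expected {cf:.4f} "
          f"-> excess {obs - cf:.4f} ({obs / cf:.1f}x)")
```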

The researchers have made the analysis code available for those interested in trying it on another corpus. The main author also addressed the question of whether ChatGPT might be influencing people to write more like an LLM. At this point it’s still an open question whether people will become more inclined to use ChatGPT-like vocabulary, or will actively seek to avoid sounding like an LLM.

McDonald’s Terminates Its Drive-Through Ordering AI Assistant

McDonald’s recently announced that it will be scrapping the voice assistant that it had installed at over 100 of its drive-throughs, after a two-year trial run. In the email that was sent to franchises, McDonald’s did say that it is still looking at voice-ordering solutions for automated order taking (AOT), but it appears that for now the test was a disappointment. Judging by the many viral videos of customers struggling to place an order through the AOT system, it’s not hard to see why.

This AOT attempt began in 2019, when McDonald’s acquired AI company Apprente to create its McD Tech Labs, only to sell the outfit to IBM, which was then contracted to create the technology for McDonald’s fast-food joints. When the trial launched in 2021, the expectation was that McDonald’s drive-through ordering lanes would eventually all be serviced by AOT, with an experience akin to the Alexa and Siri voice assistants that everyone knows and loves (to yell at).

With the demise of this test at McDonald’s, it would seem that the biggest change is likely to come instead from wider automation of fast-food preparation, with robots doing the burger flipping and freedom frying rather than a human. That said, would you prefer the McD voice assistant over a human voice when going through a Drive-Thru®?

EMO: Alibaba’s Diffusion Model-Based Talking Portrait Generator

Alibaba’s EMO (or Emote Portrait Alive) framework is a recent entry in a series of attempts to generate a talking head using existing audio (spoken word or vocal audio) and a reference portrait image as inputs. At its core it uses a diffusion model trained on 250 hours of video footage and over 150 million images. But unlike previous attempts, it adds what the researchers call a speed controller and a face region controller, which serve to stabilize the generated frames, along with an additional module that stops the diffusion model from outputting frames that stray too far from the reference image used as input.
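
To give a rough idea of where such conditioning signals plug in, here is a heavily simplified sketch of a conditioned diffusion sampling loop. Every name in it is our own invention for illustration; this is emphatically not Alibaba’s implementation, just the standard DDPM-style reverse process with placeholder inputs for the audio features, speed control, face-region mask and reference-identity embedding the paper describes.

```python
# Generic conditioned DDPM sampling loop (sketch only, not EMO's actual code).
import torch

def sample_frame(denoiser, audio_feats, speed, face_mask, identity, steps=50):
    """Reverse diffusion for one frame; `denoiser` predicts the added noise."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(1, 3, 256, 256)                    # start from pure noise
    for t in reversed(range(steps)):
        # The denoiser sees every conditioning signal on every step.
        eps = denoiser(x, t, audio_feats, speed, face_mask, identity)
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x

# Shape check with a dummy denoiser that predicts zero noise.
dummy = lambda x, t, a, s, m, i: torch.zeros_like(x)
frame = sample_frame(dummy, audio_feats=None, speed=1.0, face_mask=None, identity=None)
print(frame.shape)  # torch.Size([1, 3, 256, 256])
```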

In the related paper, [Linrui Tian] and colleagues show a number of comparisons between EMO and other frameworks, claiming significant improvements over them. The researchers also provide a number of examples of talking and singing heads generated using the framework, which gives some idea of what are probably the ‘best case’ outputs. In some examples, like [Leslie Cheung Kwok Wing] singing ‘Unconditional’, big glitches are obvious and there’s a definite mismatch between the vocal track and the facial motions. Despite this, it’s quite impressive, especially the fairly realistic movement of the head, including blinking of the eyes.

Meanwhile some seem extremely impressed, such as [Matthew Berman] in a recent video on EMO, where he states that Alibaba releasing this framework to the public might be ‘too dangerous’. The level-headed folks over at PetaPixel, however, note the obvious visual imperfections that are a dead giveaway for this kind of generative technology. Much like other diffusion model-based generators, it would seem that EMO is still very much stuck in the uncanny valley, with no clear path to becoming a real human yet.

What If The Matrix Was Made In The 1950s?

We’ve noticed a recent YouTube trend of producing trailers for shows and movies as if they had been made in the 1950s, even when they weren’t. The results are impressive and, as you might expect, leverage AI generation tools. While we enjoy watching them, we were especially interested in [Patrick Gibney’s] peek behind the curtain showing how he makes them, as you can see below. If you want to see an example of the result first, check out the second video, showing a 1950s-era The Matrix.

Of course, you could do some of it yourself, but if you want the full AI experience, [Patrick] suggests using ChatGPT to produce a script, though he admits that if he did that, he would tweak the results. Other AI tools create the pictures used and the announcer-style narration. Another tool produces cinematographic shots that include the motion of the “actors” and other things in the scene. More tools create the background music.
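
If you want to try that first step yourself, here’s a hedged example of asking an LLM for such a script through the OpenAI Python client. The model choice and prompt are our own guesses, not [Patrick]’s actual workflow, and you’d still want to tweak whatever comes back, just as he does.

```python
# Ask an LLM for a 1950s-style trailer script (illustrative prompt, not
# [Patrick]'s actual workflow). Requires the `openai` package and an
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Write a 60-second trailer script for The Matrix as if it were "
                   "a 1950s sci-fi B-movie: booming announcer, dramatic title "
                   "cards, and period-appropriate language.",
    }],
)
print(response.choices[0].message.content)
```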

Can You Hear Me Now? Try These Headphones

When you are young, you take it for granted that you can pick out a voice in a crowded room or on a factory floor. But as you get older, your hearing often gets to the point where a noisy room merges into a mishmash of sounds. University of Washington researchers have developed what they call Target Speech Hearing. In plain English, it’s a set of AI-powered headphones that lets you look at someone and pull their voice out of the chatter. For best results, however, you have to enroll their voice first, so it wouldn’t make a great eavesdropping device.

If you want to dive into the technical details, their paper goes into how it works. The prototype uses a Sony noise-cancelling headset, but since the system requires binaural microphones, additional microphones are attached to the outside of the headphones.
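
To illustrate why those binaural microphones matter (a toy example of ours, not the team’s actual pipeline): when you face a speaker, their voice arrives at the left and right microphones at nearly the same time, so the interaural time difference estimated by cross-correlation sits close to zero, which is the kind of cue that makes a ‘look at them to enroll them’ interface possible.

```python
# Toy illustration of the binaural cue: estimate the interaural time difference
# (ITD) between two microphone channels via cross-correlation. Not the actual
# Target Speech Hearing pipeline.
import numpy as np
from scipy.signal import correlate, correlation_lags

def interaural_delay(left, right, fs):
    """How much later (in seconds) the sound arrives at the right mic than the left."""
    corr = correlate(right, left, mode="full")
    lags = correlation_lags(len(right), len(left), mode="full")
    return lags[np.argmax(corr)] / fs

fs = 16_000
rng = np.random.default_rng(0)
voice = rng.standard_normal(fs)                     # one second of noise standing in for speech
shift = int(0.0003 * fs)                            # a few samples of delay: a talker off to one side
off_axis = np.concatenate([np.zeros(shift), voice[:-shift]])

print(f"talker straight ahead: {interaural_delay(voice, voice, fs) * 1e3:.2f} ms")
print(f"talker off to one side: {interaural_delay(voice, off_axis, fs) * 1e3:.2f} ms")
```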

Feast Your Eyes On These AI-Generated Sounds

The radio hackers in the audience will be familiar with a spectrogram display, but for the uninitiated, it’s basically a visual representation of how a range of frequencies changes over time. Usually such a display is used to identify a clear transmission in a sea of noise, but with the right software, it’s possible to generate a signal that shows up as text or an image when viewed as a spectrogram. Musicians even occasionally use the technique to hide images in their songs. Unfortunately, the audio side of such a trick generally sounds like gibberish to human ears.
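
If you’ve never poked at one, a spectrogram takes only a few lines to produce. Here’s a quick sketch of ours using SciPy that feeds in a frequency sweep, which shows up as a diagonal line on the plot.

```python
# Plot the spectrogram of a chirp: frequency on the vertical axis, time on the
# horizontal, brightness showing how much energy is present at each point.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import chirp, spectrogram

fs = 8000
t = np.linspace(0, 2, 2 * fs, endpoint=False)
audio = chirp(t, f0=200, f1=3000, t1=2, method="linear")   # 200 Hz to 3 kHz sweep

freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=256)
plt.pcolormesh(times, freqs, 10 * np.log10(Sxx + 1e-12), shading="gouraud")
plt.xlabel("Time [s]")
plt.ylabel("Frequency [Hz]")
plt.title("Spectrogram of a linear chirp")
plt.show()
```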

Or at least, it used to. Students from the University of Michigan have found a way to use diffusion models to not only create a spectrogram image for a given prompt, but to do it with audio that actually makes sense given what the image shows. So for example if you asked for a spectrogram of a race car, you might get an audio track that sounds like a revving engine.
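
However the spectrogram image is produced, whether drawn by hand or generated by a diffusion model, it still has to be turned back into a waveform, and since the image only holds magnitudes, the missing phase has to be estimated. One generic way to do that (not necessarily what the Michigan researchers use) is Griffin-Lim reconstruction; the filename below is just a placeholder.

```python
# Treat a grayscale picture as an STFT magnitude and reconstruct audio from it
# with Griffin-Lim phase estimation. Illustrative only; "picture.png" is a placeholder.
import numpy as np
import librosa
import soundfile as sf
from PIL import Image

img = Image.open("picture.png").convert("L").resize((512, 513))    # width x height
# 513 frequency bins x 512 frames, flipped so the bottom of the picture maps to low frequencies.
magnitude = np.asarray(img, dtype=np.float32)[::-1].copy() / 255.0

audio = librosa.griffinlim(magnitude, n_iter=64, hop_length=256)    # iterative phase estimation
sf.write("picture_as_audio.wav", audio, 22050)
```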
