An image of a light grey graphing calculator with a dark grey screen and key surround. The text on the monochrome LCD screen shows "Input: ENEB Result 1: BEEN Confidence 1: 14% [##] Result 2: Good Confidence 2: 12% [#] Press ENTER key..."

A Neural Net For A Graphing Calculator?

July 17, 2025 by Navarre Bartz 4 Comments

Machine learning and neural nets can be pretty handy, and people continue to push the envelope of what they can do both in high end server farms as well as slower systems. At the extreme end of the spectrum is [ExploratoryStudios]’s Hermes Optimus Neural Net for a TI-84 Plus Silver Edition.

This neural net is setup as an autocorrect system that can take four character inputs and match them to a library of twelve words. That’s not a lot, but we’re talking about a device with 24 kB of RAM, so the little machine is doing its best. Perhaps more interesting than any practical output is the puzzle solving involved in getting this to work within the memory constraints.

The neural net “employs a feedforward neural network with a precisely calibrated 4-60-12 architecture and sigmoid activation functions.” This leads to an approximate 85% accuracy being able to identify and correct the given target words. We appreciate the readout of the net’s confidence as well which is something that seems to have gone out the window with many newer “AI” systems.

We’ve seen another TI-84 neural net for handwriting recognition, but is the current crop of AI still headed in the wrong direction?

Continue reading “A Neural Net For A Graphing Calculator?” →

Measuring The Impact Of LLMs On Experienced Developer Productivity

July 11, 2025 by Maya Posch 74 Comments

Recently AI risk and benefit evaluation company METR ran a randomized control test (RCT) on a gaggle of experienced open source developers to gain objective data on how the use of LLMs affects their productivity. Their findings were that using LLM-based tools like Cursor Pro with Claude 3.5/3.7 Sonnet reduced productivity by about 19%, with the full study by [Joel Becker] et al. available as PDF.

This study was also intended to establish a methodology to assess the impact from introducing LLM-based tools in software development. In the RCT, 16 experienced open source software developers were given 246 tasks, after which their effective performance was evaluated.

A large focus of the methodology was on creating realistic scenarios instead of using canned benchmarks. This included adding features to code, bug fixes and refactoring, much as they would do in the work on their respective open source projects. The observed increase in the time it took to complete tasks with the LLM’s assistance was found to be likely due to a range of factors, including over-optimism about the LLM tool capabilities, LLMs interfering with existing knowledge on the codebase, poor LLM performance on large codebases, low reliability of the generated code and the LLM doing very poorly on using tacit knowledge and context.

Although METR suggests that this poor showing may improve over time, it seems fair to argue whether LLM coding tools are at all a useful coding partner.

How To Train A New Voice For Piper With Only A Single Phrase

July 9, 2025 by Dave Rowntree 7 Comments

[Cal Bryant] hacked together a home automation system years ago, which more recently utilizes Piper TTS (text-to-speech) voices for various undisclosed purposes. Not satisfied with the robotic-sounding standard voices available, [Cal] set about an experiment to fine-tune the Piper TTS AI voice model using a clone of a single phrase created by a commercial TTS voice as a starting point.

Before the release of Piper TTS in 2023, existing free-to-use TTS systems such as espeak and Festival sounded robotic and flat. Piper delivered much more natural-sounding output, without requiring massive resources to run. To change the voice style, the Piper AI model can be either retrained from scratch or fine-tuned with less effort. In the latter case, the problem to be solved first was how to generate the necessary volume of training phrases to run the fine-tuning of Piper’s AI model. This was solved using a heavyweight AI model, ChatterBox, which is capable of so-called zero-shot training. Check out the Chatterbox demo here.

As the loss function gets smaller, the model’s accuracy gets better

Training began with a corpus of test phrases in text format to ensure decent coverage of everyday English. [Cal] used ChatterBox to clone audio from a single test phrase generated by a ‘mystery TTS system’ and created 1,300 test phrases from this new voice. This audio set served as training data to fine-tune the Piper AI model on the lashed-up GPU rig.

To verify accuracy, [Cal] used OpenAI’s Whisper software to transcribe the audio back to text, in order to compare with the original text corpus. To overcome issues with punctuation and differences between US and UK English, the text was converted into phonemes using espeak-ng, resulting in a 98% phrase matching accuracy.

After down-sampling the training set using SoX, it was ready for the Piper TTS training system. Despite all the preparation, running the software felt anticlimactic. A few inconsistencies in the dataset necessitated the removal of some data points. After five days of training parked outside in the shade due to concerns about heat, TensorBoard indicated that the model’s loss function was converging. That’s AI-speak for: the model was tuned and ready for action! We think it sounds pretty slick.

If all this new-fangled AI speech synthesis is too complex and, well, a bit creepy for you, may we offer a more 1980s solution to making stuff talk? Finally, most people take the ability to speak for granted, until they can no longer do so. Here’s a team using cutting-edge AI to give people back that ability.

A man’s hand is visible holding a large, potato-shaped object in the foreground. A short, white, cylindrical structure is on the top of the potato, with black wires bending back into the potato. A smaller rectangular structure is to one side of it, and a red alligator clip connects to a nail protruding from the potato.

Building A Potato-based GLaDOS As An Introduction To AI

July 6, 2025 by Aaron Beckendorf 5 Comments

Although not nearly as intimidating as her ceiling-mounted hanging arm body, GLaDOS spent a significant portion of the Portal 2 game in a stripped-down computer powered by a potato battery. [Dave] had already made a version of her original body, but it was built around a robotic arm that was too expensive for the project to be really accessible. For his latest project, therefore, he’s created a AI-powered version of GLaDOS’s potato-based incarnation, which also serves as a fun introduction to building AI systems.

[Dave] wanted the system to work offline, so he needed a computer powerful enough to run all of his software locally. He chose an Nvidia Jetson Orin Nano, which was powerful enough to run a workable software system, albeit slowly and with some memory limitations. A potato cell unfortunately doesn’t generate enough power to run a Jetson, and it would be difficult to find a potato large enough to fit the Jetson inside. Instead, [Dave] 3D-printed and painted a potato-shaped enclosure for the Jetson, a microphone, a speaker, and some supplemental electronics.

A large language model handles interactions with the user, but most models were too large to fit on the Jetson. [Dave] eventually selected Llama 3.2, and used LlamaIndex to preprocess information from the Portal wiki for retrieval-augmented generation. The model’s prompt was a bit difficult, but after contacting a prompt engineer, [Dave] managed to get it to respond to the hapless user in an appropriately acerbic manner. For speech generation, [Dave] used Piper after training it on audio files from the Portal wiki, and for speech recognition used Vosk (a good programming exercise, Vosk being, in his words, “somewhat documented”). He’s made all of the final code available on GitHub under the fitting name of PotatOS.

The end result is a handheld device that sarcastically insults anyone seeking its guidance. At least Dave had the good sense not to give this pernicious potato control over his home.

Convert Any Book To A DIY Audiobook?

July 6, 2025 by Dave Rowntree 12 Comments

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. Any Pi with the ability to connect to a camera should also work, however, with just a little configuration.

On the software side, [Nick] pulls in the CV2 library (which is the interface to OpenCV) to handle the camera interfacing, programming it to full HD resolution. Google’s GenAI is used to interface the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.

Finally, the script hands that text over to Piper, which turns that into a speech file in WAV format. This can then be played to an audio device with a call out to the console aplay tool. It’s all very simple at this level of abstraction.

Continue reading “Convert Any Book To A DIY Audiobook?” →

AI Is Only Coming For Fun Jobs

July 5, 2025 by Bryan Cockfield 39 Comments

In the past few years, what marketers and venture capital firms term “artificial intelligence” but is more often an advanced predictive text model of some sort has started taking people’s jobs and threatening others. But not tedious jobs that society might like to have automated away in the first place. These AI tools have generally been taking rewarding or enjoyable jobs like artist, author, filmmaker, programmer, and composer. This project from a research team might soon be able to add astronaut to that list.

The team was working within the confines of the Kerbal Space Program Differential Game Challenge, an open-source plugin from MIT that allows developers to test various algorithms and artificial intelligences in simulated spacecraft situations. Generally, purpose-built models are used here with many rounds of refinement and testing, but since this process can be time consuming and costly the researchers on this team decided to hand over control to ChatGPT with only limited instructions. A translation layer built by the researchers allows generated text to be converted to spacecraft controls.

We’ll note that, at least as of right now, large language models haven’t taken the jobs of any actual astronauts yet. The game challenge is generally meant for non-manned spacecraft like orbital satellites which often need to make their own decisions to maintain orbits and avoid obstacles. This specific model was able to place second in a recent competition as well, although we’ll keep rooting for humans in certain situations like these.

Why GitHub Copilot Isn’t Your Coding Partner

July 4, 2025 by Maya Posch 80 Comments

These days ‘AI’ is everywhere, including in software development. Coming hot on the heels of approaches like eXtreme Programming and Pair Programming, there’s now a new kind of pair programming in town in the form of an LLM that’s been digesting millions of lines of code. Purportedly designed to help developers program faster and more efficiently, these ‘AI programming assistants’ have primarily led to heated debate and some interesting studies.

In the case of [Jj], their undiluted feelings towards programming assistants like GitHub Copilot burn as brightly as the fire of a thousand Suns, and not a happy kind of fire.

Whether it’s Copilot or ChatGPT or some other chatbot that may or may not be integrated into your IDE, the frustration with what often feels like StackOverflow-powered-autocomplete is something that many of us can likely sympathize with. Although [Jj] lists a few positives of using an LLM trained on codebases and documentation, their overall view is that using Copilot degrades a programmer, mostly because of how it takes critical thinking skills out of the loop.

Regardless of whether you agree with [Jj] or not, the research so far on using LLMs with software development and other tasks strongly suggests that they’re not a net positive for one’s mental faculties. It’s also important to note that at the end of the day it’s still you, the fleshy bag of mostly salty water, who has to justify the code during code review and when something catches on fire in production. Your ‘copilot’ meanwhile gets off easy.