Measuring The Impact Of LLMs On Experienced Developer Productivity

Recently AI risk and benefit evaluation company METR ran a randomized controlled trial (RCT) on a gaggle of experienced open source developers to gain objective data on how the use of LLMs affects their productivity. Their finding was that using LLM-based tools like Cursor Pro with Claude 3.5/3.7 Sonnet reduced productivity by about 19%, with the full study by [Joel Becker] et al. available as a PDF.

This study was also intended to establish a methodology for assessing the impact of introducing LLM-based tools in software development. In the RCT, 16 experienced open source software developers worked through 246 tasks, after which their effective performance was evaluated.

A large focus of the methodology was on creating realistic scenarios instead of relying on canned benchmarks. The tasks included adding features, fixing bugs, and refactoring, much as the developers would do in their normal work on their respective open source projects. The observed increase in the time it took to complete tasks with the LLM’s assistance was likely due to a range of factors, including over-optimism about the LLM tools’ capabilities, LLMs interfering with existing knowledge of the codebase, poor LLM performance on large codebases, low reliability of the generated code, and the LLMs making very poor use of tacit knowledge and context.

Although METR suggests that this poor showing may improve over time, it seems fair to question whether LLM coding tools are a useful coding partner at all.

Dearest C++, Let Me Count The Ways I Love/Hate Thee

My first encounter with C++ was way back in the 1990s, when it was one of the Real Programming Languages™ that I sometimes heard about as I was still splashing about in the kiddie pool with Visual Basic, PHP and JavaScript. The first formally standardized version of C++ is the ISO 1998 standard, but it had been making headway as a ‘better C’ for decades at that point, ever since Bjarne Stroustrup added that increment operator to C in 1979 and released C++ to the public in 1985.

Why did I pick C++ as my primary programming language? Mainly because it was well supported and came with free tooling: Borland’s free compiler or g++ on the GCC side. Alternatives like VB, Java, and D felt far too niche compared to established languages, whereas C++ gave you access to the lingua franca of C while adding many modern features like OOP and a more streamlined syntax, in addition to the Standard Template Library (STL) with gobs of useful building blocks.

Years later, as a grizzled senior C++ developer, I have come to embrace the notion that being good at a programming language also means having strong opinions on all that is wrong with the language. True to form, while C++ has many good points, there are still major warts and many heavily neglected aspects that get me and other C++ developers riled up.


Embedded USB Debug For Snapdragon

According to [Casey Connolly], Qualcomm’s release of instructions on how to interact with its embedded USB debugging (EUD) hardware is a big deal. If you haven’t heard of it, nearly all Qualcomm SoCs made since 2018 have a built-in debugger that connects to the onboard USB port. The details vary by chip, but you write to some registers and start up the USB phy. This gives you an oddball USB interface that looks like a seven-port hub with a single “EUD control interface” device.

So what do you do with that? You send a few USB commands, and you’ll get a second device. This one connects to an SWD interface. Of course, we have plenty of tools to debug using SWD.
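For the curious, the register-poking step from Linux userspace looks roughly like the sketch below. To be clear, EUD_BASE and EUD_EN_OFFSET are placeholders rather than real Qualcomm addresses; the actual block address, register layout, and enable sequence vary per SoC and come from Qualcomm’s released documentation.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Placeholders only: the real EUD block address and register offsets
 * are SoC-specific and documented by Qualcomm. */
#define EUD_BASE      0x0UL
#define EUD_EN_OFFSET 0x0UL

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open /dev/mem");
        return 1;
    }

    /* Map one page of the EUD register block into our address space */
    volatile uint32_t *eud = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, EUD_BASE);
    if (eud == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return 1;
    }

    /* Enable the EUD block so the seven-port hub shows up on the USB phy */
    eud[EUD_EN_OFFSET / 4] = 1;

    munmap((void *)eud, 0x1000);
    close(fd);
    return 0;
}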


Dithering With Quantization To Smooth Things Over

It should probably come as no surprise to anyone that the images which we look at every day – whether printed or on a display – are simply illusions. That cat picture isn’t actually a cat, but rather a collection of dots that, when looked at from far enough away, tricks our brain into thinking that we are indeed looking at a two-dimensional cat, with the brain happily filling in the blanks. These dots can use the full CMYK color model for prints, RGB(A) for digital images, or a limited color space such as greyscale.

Perhaps more interesting is the use of dithering to further trick the mind into seeing things that aren’t truly there by adding noise. Simply put, dithering is the process of adding noise to reduce quantization error, which in images shows up as artefacts like color banding. Within the field of digital audio, dithering is also used for similar reasons. Part of the process of going from an analog signal to a digital one involves throwing away data that falls outside the sampling rate and quantization depth.

By adding dithering noise these quantization errors are smoothed out, with the final effect depending on the dithering algorithm used.
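As a rough, purely illustrative sketch of the idea (not any particular algorithm from the article), the snippet below quantizes 8-bit greyscale values down to 4 bits twice: once plainly, and once with a little triangular (TPDF-style) noise added before rounding. Over a shallow gradient the plain version snaps neighbouring values to the same few levels, which is exactly the banding described above, while the dithered version spreads that error out as fine grain.

#include <stdio.h>
#include <stdlib.h>

/* Requantize an 8-bit value to 4 bits and expand it back to 8 bits */
static unsigned char quantize(double v)
{
    int q = (int)(v / 17.0 + 0.5);   /* 17 = 255 / 15, i.e. one 4-bit step */
    if (q < 0)  q = 0;
    if (q > 15) q = 15;
    return (unsigned char)(q * 17);
}

/* Triangular (TPDF) noise in the range (-1, 1) */
static double tpdf_noise(void)
{
    return (double)rand() / RAND_MAX - (double)rand() / RAND_MAX;
}

int main(void)
{
    for (int x = 0; x < 16; x++) {
        double in = 100.0 + x * 0.5;             /* a shallow gradient */
        unsigned char plain    = quantize(in);
        unsigned char dithered = quantize(in + 17.0 * tpdf_noise());
        printf("in=%5.1f  plain=%3u  dithered=%3u\n", in, plain, dithered);
    }
    return 0;
}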



Programming Like It’s 1986, For Fun And Zero Profit

Some people slander retrocomputing as an old man’s game, just because most of those involved are more ancient than the hardware they’re playing with. But there are veritable children involved too — take [ComputerSmith], who is recreating Conway’s Game of Life on a Macintosh Plus that could very well be as old as his parents. If there’s any nostalgia here, it’s at least a generation removed — thus proving to the haters that there’s more to exploring these ancient machines than a misplaced desire to relive one’s youth.

So what does a young person get out of programming on a 1980s Mac? Well, aside from internet clout and possible YouTube monetization, there’s the sheer intellectual challenge of the thing. You can’t go sniffing around StackExchange or LLMs for code to copy-paste when writing C for a 1986 machine, not if you’re going to be fully authentic. ANSI C only dates to 1989, after all, and figuring out the quirks and foibles of the specific C implementation is both half the fun and not easily outsourced. Object Pascal would also have been an option (and quite likely more straightforward — at least the language was clearly defined), but [ComputerSmith] seems to think the exercise will improve his chops with C, and he’s likely to be right.
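To give a flavor of what pre-ANSI means in practice, here’s a small sketch of the K&R-style C a compiler of that era would expect: no function prototypes, parameter types declared between the parameter list and the function body, and implicit int return types. The quirks of any specific 1986 compiler go well beyond this, and a modern compiler will grumble about (or outright reject) most of it.

#include <stdio.h>

/* K&R-style definition: no prototype, parameter types declared after
 * the parameter list, return type defaults to int. */
add(a, b)
int a;
int b;
{
    return a + b;
}

/* Callers get no argument checking at all; pass the wrong types and a
 * pre-ANSI compiler won't say a word. */
main()
{
    printf("%d\n", add(2, 3));
    return 0;
}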

Apparently [ComputerSmith] brought this project to VCF Southwest, so anyone who was there doesn’t have to wait for Part 2 of the video to see how this turns out, or to snag a copy of the code (which was apparently available on diskette). If you were there, let us know if you spotted the youngest Macintosh Plus programmer, and if you scored a disk from him.

If the idea of coding in this era tickles the dopamine receptors, check out this how-to for a prizewinning Amiga demo. If you think pre-ANSI C isn’t retro enough, perhaps you’d prefer programming by card?


Going To The (Parallel) Chapel

There is always the promise of using more computing power for a single task. Your computer has multiple CPUs now, surely. Your video card has even more. Your computer is probably networked to a slew of other computers. But how do you write software to take advantage of that? There are many complex systems, of course, but there’s also Chapel.

Chapel is a reasonably simple programming language, but it supports parallelism in various forms. The runtime controls how computers — whatever that means — communicate with one another. You can have code running on your local CPUs, your GPU, and other processing elements over the network without much work on your part.


Why GitHub Copilot Isn’t Your Coding Partner

These days ‘AI’ is everywhere, including in software development. Coming hot on the heels of approaches like eXtreme Programming and Pair Programming, there’s now a new kind of pair programming in town in the form of an LLM that’s been digesting millions of lines of code. Purportedly designed to help developers program faster and more efficiently, these ‘AI programming assistants’ have primarily led to heated debate and some interesting studies.

In the case of [Jj], their undiluted feelings towards programming assistants like GitHub Copilot burn as brightly as the fire of a thousand Suns, and not a happy kind of fire.

Whether it’s Copilot or ChatGPT or some other chatbot that may or may not be integrated into your IDE, the frustration with what often feels like StackOverflow-powered autocomplete is something that many of us can likely sympathize with. Although [Jj] lists a few positives of using an LLM trained on codebases and documentation, their overall view is that using Copilot degrades you as a programmer, mostly because it takes critical thinking skills out of the loop.

Regardless of whether you agree with [Jj] or not, the research so far on using LLMs with software development and other tasks strongly suggests that they’re not a net positive for one’s mental faculties. It’s also important to note that at the end of the day it’s still you, the fleshy bag of mostly salty water, who has to justify the code during code review and when something catches on fire in production. Your ‘copilot’ meanwhile gets off easy.