Measuring The Impact Of LLMs On Experienced Developer Productivity

Recently, AI risk and benefit evaluation company METR ran a randomized controlled trial (RCT) on a gaggle of experienced open source developers to gain objective data on how the use of LLMs affects their productivity. The finding was that using LLM-based tools like Cursor Pro with Claude 3.5/3.7 Sonnet reduced productivity by about 19%, with the full study by [Joel Becker] et al. available as a PDF.

This study was also intended to establish a methodology for assessing the impact of introducing LLM-based tools into software development. In the RCT, 16 experienced open source software developers were given 246 tasks, after which their effective performance was evaluated.

A large focus of the methodology was on creating realistic scenarios instead of using canned benchmarks. This included adding features, fixing bugs and refactoring, much as the developers would do in their regular work on their respective open source projects. The observed increase in the time it took to complete tasks with the LLM’s assistance was found to be likely due to a range of factors, including over-optimism about the LLM tooling’s capabilities, the LLM interfering with the developer’s existing knowledge of the codebase, poor LLM performance on large codebases, low reliability of the generated code, and the LLM doing very poorly at using tacit knowledge and context.

Although METR suggests that this poor showing may improve over time, it seems fair to question whether LLM coding tools are currently a useful coding partner at all.

Dearest C++, Let Me Count The Ways I Love/Hate Thee

My first encounter with C++ was way back in the 1990s, when it was one of the Real Programming Languages™ that I sometimes heard about while I was still splashing about in the kiddie pool with Visual Basic, PHP and JavaScript. The first formally standardized version of C++ is the ISO 1998 standard, but it had been making headway as a ‘better C’ for nearly two decades at that point, ever since Bjarne Stroustrup added that increment operator to C in 1979 and released C++ to the public in 1985.

Why did I pick C++ as my primary programming language? Mainly because it was well supported, with free tooling available: a free Borland compiler, or g++ on the GCC side. Alternatives like VB, Java, and D felt far too niche compared to established languages, while C++ gave you access to the lingua franca of C, adding many modern features like OOP and a more streamlined syntax, on top of the Standard Template Library (STL) with its gobs of useful building blocks.

Years later, as a grizzled senior C++ developer, I have come to embrace the notion that being good at a programming language also means having strong opinions on all that is wrong with the language. True to form, while C++ has many good points, there are still major warts and many heavily neglected aspects that get me and other C++ developers riled up.

Continue reading “Dearest C++, Let Me Count The Ways I Love/Hate Thee”

Dithering With Quantization To Smooth Things Over

It should probably come as no surprise to anyone that the images which we look at every day – whether printed or on a display – are simply illusions. That cat picture isn’t actually a cat, but rather a collection of dots that, when looked at from far enough away, tricks our brain into thinking that we are indeed looking at a two-dimensional cat as it happily fills in the blanks. These dots can use the full CMYK color model for prints, RGB(A) for digital images, or a more limited color space such as greyscale.

Perhaps more interesting is the use of dithering to further trick the mind into seeing things that aren’t truly there by adding noise. Simply put, dithering is the process of adding noise to reduce quantization error, which in images shows up as artefacts like color banding. Within the field of digital audio, dithering is used as well, for similar reasons: part of the process of going from an analog signal to a digital one involves throwing away data that falls outside the sampling rate and quantization depth.

By adding dithering noise, these quantization errors are smoothed out, with the final effect depending on the dithering algorithm used.
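As a rough illustration of the idea, here is a minimal sketch (not from the article) that quantizes 8-bit greyscale values down to a handful of levels, optionally adding triangular-PDF (TPDF) dither noise before rounding; the same principle applies to audio samples being reduced in bit depth. The function name and parameters are purely illustrative.

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Quantize 8-bit greyscale samples down to 'levels' values (levels >= 2),
// optionally adding triangular (TPDF) dither noise of roughly one
// quantization step before rounding. Without dither a smooth gradient
// collapses into visible bands; with dither the banding is traded for
// fine, far less objectionable noise.
std::vector<std::uint8_t> quantize(const std::vector<std::uint8_t> &samples,
                                   int levels, bool dither)
{
    std::mt19937 rng{42};
    std::uniform_real_distribution<double> uni(-0.5, 0.5);
    const double step = 255.0 / (levels - 1);

    std::vector<std::uint8_t> out;
    out.reserve(samples.size());
    for (std::uint8_t s : samples) {
        // TPDF noise: the sum of two uniform samples, spanning +/- one step.
        const double noise = dither ? (uni(rng) + uni(rng)) * step : 0.0;
        const double q = std::round((s + noise) / step) * step;
        out.push_back(static_cast<std::uint8_t>(std::clamp(q, 0.0, 255.0)));
    }
    return out;
}

Feeding a smooth 0–255 ramp through quantize(ramp, 4, false) produces four flat bands, while quantize(ramp, 4, true) yields pixels that flicker between adjacent levels in proportion to the original value, which the eye averages back into a gradient.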

Continue reading “Dithering With Quantization To Smooth Things Over”

PIC Burnout: Dumping Protected OTP Memory In Microchip PIC MCUs

Normally you can’t read out the One Time Programmable (OTP) memory in Microchip’s PIC MCUs that have code protection enabled, but an exploit has been found that gets around the copy protection in a range of PIC12, PIC14 and PIC16 MCUs.

This exploit is called PIC Burnout, and was developed by [Prehistoricman], with the cautionary note that although this process is non-invasive, it does damage the memory contents. This means that you will likely only get one shot at dumping the OTP data before the memory is ‘burned out’.

The copy protection normally returns scrambled OTP data; an example of PIC Burnout is provided for the PIC16LC63A. After entering programming mode by setting the ICSP CLK pin high, an excessively high programming voltage and pulse duration are applied repeatedly, while checking whether an area that normally reads as zero now reads back proper data. After this, the OTP should be read out repeatedly to ensure that the scrambling has indeed been circumvented.

The trick appears to be that while there are over-voltage and similar protections on much of the flash, this approach can still be used to affect an entire flash bit column. Suffice it to say that this method isn’t very kind to the flash memory cells and can take hours to get a good dump. Even after this you need to know the exact scrambling method used, which fortunately is often documented in Microchip datasheets.
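Loosely speaking, the published procedure boils down to a retry loop. The following is a minimal host-side sketch assuming a hypothetical Programmer interface; none of these names, calls or the canary-address detail come from [Prehistoricman]’s actual tooling, which documents the real voltages, timings and per-device quirks.

#include <cstdint>

// Hypothetical abstraction over whatever hardware drives the ICSP pins.
struct Programmer {
    virtual void enterProgrammingMode() = 0;   // e.g. with the ICSP CLK pin held high
    virtual void applyOverVoltagePulse() = 0;  // out-of-spec programming voltage and duration
    virtual std::uint16_t readWord(std::uint16_t addr) = 0;
    virtual ~Programmer() = default;
};

// Keep pulsing until a location that normally reads back as zero starts
// returning data, suggesting the affected flash bit column now leaks the
// real (still scrambled) OTP contents.
bool defeatReadProtection(Programmer &pgm, std::uint16_t canaryAddr, int maxAttempts)
{
    pgm.enterProgrammingMode();
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        pgm.applyOverVoltagePulse();
        if (pgm.readWord(canaryAddr) != 0x0000) {
            return true;  // next: dump the OTP repeatedly and apply the descrambling
        }
    }
    return false;
}

On success the dump still has to be descrambled using the device-specific scheme from the datasheet, and repeated read-outs should be compared to make sure the recovered data is stable.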

Thanks to [DjBiohazard] for the tip.

Turning PET Plastic Into Paracetamol With This One Bacterial Trick

Over the course of evolution, microorganisms have developed pathways to break down many materials. The challenge with the many materials that we humans have created over just the past few decades is that we cannot wait for evolution to catch up, ergo we have to develop such pathways ourselves. One such example is demonstrated by [Nick W. Johnson] et al. in a recent study in Nature Chemistry that explicitly targets PET plastic, which is very commonly used in plastic bottles.

The researchers modified regular E. coli bacteria to use PET plastic as an input via a Lossen rearrangement, which converts hydroxamate esters to isocyanates. At the end of the pathway this results in para-aminobenzoate (PABA), which is then turned via biosynthesis into paracetamol, the active ingredient in Tylenol. This new pathway is also completely harmless to the bacterium; toxicity to the host is always a potential pitfall with this kind of biological pathway engineering.

In addition to offering a potential way to convert PET bottles into paracetamol, the researchers note that their findings could be very beneficial to studies targeting other ‘waste’ products from biological pathways.

Thanks to [DjBiohazard] for the tip.

Diagnosing Whisker Failure Mode In AF114 And Similar Transistors

The inside of this AF117 transistor can was a thriving whisker ecosystem. (Credit: Anthony Francis-Jones)

AF114 germanium transistors and related ones like the AF115 through AF117 were quite popular during the 1960s, but they quickly developed a reputation for failure. Ironically, this is due to the very thing that should have made them more reliable: the can shielding the germanium transistor inside, which is connected to a fourth ‘screen’ pin. This failure mode is demonstrated in a video by [Anthony Francis-Jones], in which he tests a number of new-old-stock AF-series transistors, only for them all to test faulty and show clear whisker growth on the can’s exterior.

Naturally, the next step was to cut one of these defective transistors open to see whether the whiskers could be caught in the act. For this a pipe cutter was used on the fairly beefy can, which turned out to be rather effective and gave great access to the inside of these 1960s-era components. The insides of the cans were, as expected, bristling with whiskers.

The AF11x family of high-frequency PNP transistors saw frequent use in everything from consumer radios to just about anything else that did RF or audio. It’s worth noting that the material of the can is likely to be zinc rather than tin, so these would be zinc whiskers. Many metals like to grow such whiskers, including lead, so the end effect is often a thin conductive strand bridging things that shouldn’t be connected. Apparently the can itself wasn’t the only source of these whiskers, which adds to the fun.

In the rest of the video [Anthony] shows off the fascinating construction of these germanium transistors, as well as a potential repair: removing the whisker-induced shorts by melting them with a jolt of fairly high current from a capacitor. The good news is that this made the component tester see the AF114 as a transistor again, albeit as a rather confused NPN one. Clearly this isn’t an easy fix, and it would be temporary at best anyway, as the whiskers will never stop growing.

Continue reading “Diagnosing Whisker Failure Mode In AF114 And Similar Transistors”

Visiting Our Neighbor Sedna: Feasibility Study Of A Mission To This Planetoid

Image of Sedna, taken by the Hubble Space Telescope in 2004. (Credit: NASA)

While for most people Pluto is the most distant planet in the Solar System, things get a lot more fuzzy once you pass Neptune and enter the realm of trans-Neptunian objects (TNOs). Pluto is probably the most well-known of these, but there are at least a dozen more such dwarf planets among the TNOs, including 90377 Sedna.

This obviously invites the notion of sending an exploration mission to Sedna, much as was done with Pluto and the Kuiper belt object Arrokoth by the New Horizons spacecraft. How practical this would be is investigated in a recent study by [Elena Ancona] and colleagues.

The focus here is on advanced propulsion methods, including nuclear propulsion and solar sails. Although it’s definitely possible to use a similar mission profile as with the New Horizons mission, this would make it another decades-long affair. Instead, a minimally-equipped solar sail spacecraft could knock the trip down to about seven years, whereas the proposed Direct Fusion Drive (DFD) could do it in ten, but with a much larger payload and the ability to do an orbital insertion, which would obviously get much more science done.

As for the motivation for a mission to Sedna, its highly eccentric orbit, which takes it out past the heliopause, means that it spends relatively little time exposed to the Sun’s rays, which should have left intact much of the surface material that was present during the early formation of the Solar System. With our exploration of the Solar System taking us ever further beyond the reach of traditional means of space travel, a mission to Sedna might not only expand our horizons, but also provide a tantalizing way to bring much more of the Solar System, including the Kuiper belt, within easy reach.