The Staggering Complexity And Subtlety Of Concurrency

If you’re gonna be a hacker, eventually you’re gonna have to write code. And if you write code, eventually you’re gonna have to deal with concurrency. Concurrency is what we call it when parts of our program run at the same time. That could be because of something fairly straightforward, like multiple threads or multiple processes, or something a little more complicated, such as event loops, asynchronous or non-blocking I/O, interrupts and signal handlers, re-entrancy, co-routines / fibers / green threads, job queues, DMA and hardware-level concurrency, speculative or out-of-order execution at the CPU level, time-sharing on single-core systems, or parallel execution on multi-core systems. There are just so many ways to get tied up with concurrency.

In this video from [Core Dumped] we learn about The ’80s Algorithm to Avoid Race Conditions (and Why It Failed). The video explains what a race condition looks like, walks through what the critical section is, and covers approaches to protecting it. It introduces an old approach to protecting the critical section, invented in 1981 and known as Peterson’s solution, but then goes on to explain why Peterson’s solution is no longer reliable: much has changed since the 1980s, and in particular compilers will reorder instructions and CPUs may execute code out of order. So there is no free lunch, and if you have to deal with concurrency you’re going to want support for some kind of mutex. Your programming language and its standard library probably have various types of locks available, and if not you can use something like flock (also available as a syscall, complementing POSIX fcntl locking), if it’s available on your platform.
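
To make that concrete, below is a minimal sketch of Peterson’s solution for two threads in C. This is our illustration rather than anything from the video: it only behaves itself because C11 sequentially consistent atomics reintroduce the ordering guarantees that 1980s hardware provided implicitly. Swap the atomics for plain variables and compiler or CPU reordering can let both threads into the critical section at once, exactly the failure the video describes.

    /* Peterson's solution for two threads (IDs 0 and 1), a minimal sketch.
       The seq_cst atomics supply the memory ordering the 1981 algorithm
       silently assumed; with plain loads and stores this can fail on
       modern compilers and CPUs. Build with: gcc -pthread peterson.c */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static atomic_bool flag[2]; /* per-thread "I want in" announcements */
    static atomic_int turn;     /* which thread must wait on contention */
    static long counter;        /* shared state the critical section protects */

    static void lock(int self) {
        int other = 1 - self;
        atomic_store(&flag[self], true); /* announce intent to enter */
        atomic_store(&turn, other);      /* politely yield priority */
        while (atomic_load(&flag[other]) && atomic_load(&turn) == other)
            ; /* busy-wait until the other thread leaves or yields */
    }

    static void unlock(int self) {
        atomic_store(&flag[self], false);
    }

    static void *worker(void *arg) {
        int self = (int)(intptr_t)arg;
        for (int i = 0; i < 1000000; i++) {
            lock(self);
            counter++; /* the critical section */
            unlock(self);
        }
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, (void *)(intptr_t)0);
        pthread_create(&b, NULL, worker, (void *)(intptr_t)1);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld (expect 2000000)\n", counter);
        return 0;
    }

In practice you’d of course reach for pthread_mutex_t or your language’s own locks instead, which is rather the video’s point.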

If you’re interested in contemporary takes on concurrency you might like to read Amiga, Interrupted: A Fresh Take On Amiga OS or The Linux Scheduler And How It Handles More Cores.

GitHub Disables Rockchip’s Linux MPP Repository After DMCA Request

Recently GitHub disabled the Rockchip Linux MPP repository, following a DMCA takedown request from the FFmpeg team. As of this writing the affected repository remains unavailable. At the core of the issue is the Rockchip MPP framework, which provides hardware-accelerated video operations on Rockchip SoCs. Much of its code was lifted verbatim from FFmpeg, with the allegation being that this happened with the original copyright notices and author attributions removed. The Rockchip MPP framework was then also relicensed from LGPL 2.1 to the Apache license.

Perhaps most egregious of all is that the FFmpeg team privately contacted Rockchip about this nearly two years ago, with clearly no action taken since. Thus FFmpeg demands that Rockchip either undo these actions that violate the LGPL, or remove all infringing files.

This news and further context is also covered by [Brodie Robertson] in a video. What’s interesting is that Rockchip, in public communications and in GitHub issues, is clearly aware of this license issue, but seems to defer dealing with it until some undefined point in the future. Clearly that was the wrong choice by Rockchip, though what happens next remains a major question. [Brodie] speculates that Rockchip will keep ignoring the issue, but is hopeful that he’ll be proven wrong.

Unfortunately, this sort of long-standing license violation isn’t uncommon in the open source world.

It’s Time To Make A Major Change To D-Bus On Linux

Although flying well under the radar of the average Linux user, D-Bus has been an integral part of Linux distributions for nearly two decades and counting. Rather than using faster point-to-point interprocess communication via a Unix socket or similar, an IPC bus lets processes communicate in a bus-like manner, for convenience reasons. D-Bus replaced a few existing IPC buses in the Gnome and KDE desktop environments and has since become the de facto standard. Which isn’t to say that D-Bus is well-designed or devoid of flaws, hence the ire of people like [Vaxry], who recently wrote an article on why D-Bus should die and proposes using hyprwire instead.
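
For those who have never poked at it directly, here’s a small sketch in C of what talking to the bus looks like with the reference libdbus library. It connects to the session bus and asks the bus daemon itself (org.freedesktop.DBus) to list every name currently connected; this is our own minimal example, not code from either article.

    /* Minimal D-Bus client sketch using the reference libdbus API: call
       org.freedesktop.DBus.ListNames on the session bus and print every
       connected bus name. Build with:
       gcc dbus_list.c $(pkg-config --cflags --libs dbus-1) */
    #include <dbus/dbus.h>
    #include <stdio.h>

    int main(void) {
        DBusError err;
        dbus_error_init(&err);

        DBusConnection *conn = dbus_bus_get(DBUS_BUS_SESSION, &err);
        if (conn == NULL) {
            fprintf(stderr, "connect failed: %s\n", err.message);
            return 1;
        }

        /* A method call addresses a destination, object path, interface,
           and method name on the bus. */
        DBusMessage *msg = dbus_message_new_method_call(
            "org.freedesktop.DBus", "/org/freedesktop/DBus",
            "org.freedesktop.DBus", "ListNames");

        DBusMessage *reply = dbus_connection_send_with_reply_and_block(
            conn, msg, DBUS_TIMEOUT_USE_DEFAULT, &err);
        dbus_message_unref(msg);
        if (reply == NULL) {
            fprintf(stderr, "call failed: %s\n", err.message);
            return 1;
        }

        /* ListNames returns an array of strings; walk and print it. */
        DBusMessageIter iter, names;
        if (dbus_message_iter_init(reply, &iter) &&
            dbus_message_iter_get_arg_type(&iter) == DBUS_TYPE_ARRAY) {
            dbus_message_iter_recurse(&iter, &names);
            while (dbus_message_iter_get_arg_type(&names) == DBUS_TYPE_STRING) {
                const char *name;
                dbus_message_iter_get_basic(&names, &name);
                printf("%s\n", name);
                dbus_message_iter_next(&names);
            }
        }

        dbus_message_unref(reply);
        return 0;
    }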

The broader context is provided by [Brodie Robertson], whose video adds interesting details, such as Arch Linux opting for an alternative D-Bus implementation rather than the reference one. Then there’s CVE-2018-19358, pertaining to the security risk of using an unlocked keyring over D-Bus, as any application on said bus can read the contents. The response by the responsible Gnome developers was very Wayland-like, in that they dismissed the CVE as ‘works as designed’.

One reason why the proposed hyprwire/hyprtavern IPC bus would be better is that it has actual security permissions, real validation of messages, and purportedly solid documentation as well. Even after nearly twenty years, the documentation for D-Bus consists mostly of poorly documented code, lots of TODOs in ‘documentation’ files, and unfinished drafts. Although [Vaxry] isn’t expecting this hyprwire alternative to be picked up any time soon, the hope is that it’ll at least make some kind of improvement possible, rather than Linux limping on with D-Bus for another few decades.

Xcc700: Self-Hosted C Compiler For The ESP32/Xtensa

With two cores at 240 MHz and about 8.5 MB of non-banked RAM if you’re using the right ESP32-S3 version, this MCU looks, at least in terms of specifications, like quite the mini PC. Obviously this means that it should be capable of self-hosting its own compiler, which is exactly what [Valentyn Danylchuk] did with the xcc700 C compiler project.

Targeting the Xtensa LX7 ISA of the ESP32-S3, this is a minimal C compiler that outputs relocatable ELF binaries. These binaries can subsequently be run with, for example, the ESP-IDF-based elf_loader component. Obviously this is best done on an ESP32 platform that has PSRAM, unless your binary fits within the few hundred kB that’s left after all the housekeeping and communication stacks are loaded.

The xcc700 compiler is currently very minimalistic, omitting more complex loop types as well as long and floating point types, for starters. There’s no optimization of the final code either, but considering that it’s 700 lines of code just for a PoC, there still seems to be plenty of room for improvement.
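
To give a feel for that subset, the snippet below shows the sort of program that should fit within the stated limits: plain int arithmetic and a simple while loop, with no long and no floating point anywhere. This is an illustrative guess based on the description above, not a verified xcc700 test case.

    /* An illustrative guess at the xcc700 C subset described above:
       int-only arithmetic and a simple while loop. Not a verified
       xcc700 test case. */
    int fib(int n) {
        int a = 0;
        int b = 1;
        while (n > 0) {
            int t = a + b;
            a = b;
            b = t;
            n = n - 1;
        }
        return a;
    }

    int main(void) {
        return fib(10); /* 55 */
    }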

Abusing X86 SIMD Instructions To Optimize PlayStation 3 Emulation

Key to efficient hardware emulation is an efficient mapping of the emulated system’s instructions to the host CPU’s opcodes. Here one is free to target opcodes that may or may not have been imagined for that particular use. For emulators like the RPCS3 PlayStation 3 emulator this has led to some interesting mappings, as detailed in a video by [Whatcookie].

It’s important to remember here that the Cell processor in the PlayStation 3 is a bit of an odd duck, using a single regular PowerPC core (PPE) along with multiple much simpler co-processors called synergistic processing elements (SPEs), all connected by a high-speed bus. A lot of the focus with Cell was on floating point vector – i.e. SIMD – processing, which is part of why, for a while, the PlayStation 3 was not going to have a dedicated GPU.

As a result, it makes perfect sense to do creative mapping between the Cell’s SIMD instructions and those of e.g. SSE and AVX, even if Intel removing AVX-512 from its consumer CPUs for a while caused major headaches. Fortunately some of those instructions later reappeared in VEX-encoded form, as with AVX-VNNI, making them usable without full AVX-512 support.

The video goes through a whole range of Cell-specific instructions, how they work, which x86 SIMD instructions they were mapped to, and why. The SUMB byte-summing instruction, for example, is mapped to VPDPBUSD as well as VDBPSADBW in AVX-512, the latter of which mostly targets things like video encoding. In the end it’s the result that matters, even if it also shows why the Cell processor was so interesting for high-performance compute clusters back in the day.
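
To show the flavor of such abuse without reproducing RPCS3’s actual code, here’s a classic trick from the same family: PSADBW, the sum-of-absolute-differences instruction designed for video motion estimation, turns into a byte-summing instruction when fed a zero vector, since |x - 0| is simply x for unsigned bytes. That byte-sum primitive is the kind of thing an SPU instruction like SUMB needs.

    /* A sketch of the byte-summing abuse, not RPCS3's implementation:
       SAD against zero sums each group of eight unsigned bytes.
       Build on x86-64 with: gcc -O2 psadbw_demo.c */
    #include <emmintrin.h> /* SSE2 intrinsics */
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t bytes[16];
        for (int i = 0; i < 16; i++)
            bytes[i] = (uint8_t)(i + 1); /* 1..16 */

        __m128i v    = _mm_loadu_si128((const __m128i *)bytes);
        __m128i zero = _mm_setzero_si128();

        /* PSADBW: each 64-bit lane receives the sum of |a_i - b_i| over
           its eight byte pairs; with b = 0 that's just the byte sum. */
        __m128i sums = _mm_sad_epu8(v, zero);

        uint64_t lo = (uint64_t)_mm_cvtsi128_si64(sums);
        uint64_t hi =
            (uint64_t)_mm_cvtsi128_si64(_mm_unpackhi_epi64(sums, sums));
        printf("low half: %llu (expect 36), high half: %llu (expect 100)\n",
               (unsigned long long)lo, (unsigned long long)hi);
        return 0;
    }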

Surviving The RAM Apocalypse With Software Optimizations

To the surprise of almost nobody, the unprecedented build-out of datacenters and the equipping of them with servers for so-called ‘AI’ has led to a massive shortage of certain components. With random access memory (RAM) so far the most heavily affected, and storage in the form of HDDs and SSDs not far behind, this has led many to ask how we will survive the coming months, years, decades, or however long the current AI bubble lasts.

One thing is already certain: we will have to make our current computer systems last longer, and forgo simply tossing in more sticks of RAM in favor of doing more with less. This is easy to imagine for those of us who remember running a full-blown Windows desktop on a sub-GHz x86 system with less than a GB of RAM, but it might require some adjustment for everyone else.

In short, what can we software developers do differently to make a hundred MB of RAM stretch further, and make a GB of storage space look positively spacious again?
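
As one small illustration of the sort of frugality involved (our example, not one from the article): merely ordering struct fields from largest to smallest eliminates alignment padding, shrinking every single record you keep in memory.

    /* An illustrative memory-frugality trick, not from the article:
       compilers pad struct fields to their natural alignment, so a
       careless field order wastes bytes in every record. */
    #include <stdint.h>
    #include <stdio.h>

    struct wasteful {   /* typical x86-64 layout: 24 bytes */
        uint8_t  flag;  /* 1 byte, then 7 bytes of padding */
        uint64_t id;    /* 8 bytes */
        uint16_t kind;  /* 2 bytes, then 6 bytes of tail padding */
    };

    struct frugal {     /* same fields, largest first: 16 bytes */
        uint64_t id;
        uint16_t kind;
        uint8_t  flag;  /* 5 bytes of tail padding remain */
    };

    int main(void) {
        printf("wasteful: %zu bytes, frugal: %zu bytes\n",
               sizeof(struct wasteful), sizeof(struct frugal));
        /* Across ten million records that's 80 MB saved. */
        return 0;
    }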

Libxml2 Narrowly Avoids Becoming Unmaintained

In an excellent example of one of the most overused XKCD images, the libxml2 library was for a little while left without a maintainer, after [Nick Wellnhofer] made good on his plan to step down by the end of the year.

Modern-day infrastructure, as visualized by XKCD. (Credit: Randall Munroe)

While this might not sound like a big deal, the real scope of the problem is rather profound. Not only is libxml2 part of GNOME, it’s also used as a dependency by a huge number of projects, including web browsers and just about anything that processes XML or XSLT. Not having a maintainer when a fresh, high-risk CVE pops up would obviously be less than desirable.

As for why [Nick] stepped down, it’s a long story. It starts in the early 2000s, when the original author [Daniel Veillard] decided he no longer had time for the project and left [Nick] in charge. It should be said here that both of them worked on the project as volunteers, for no financial compensation, even as large companies began to use projects like libxml2 in their software and were happy to send in bug reports. Beyond a single donation from Google it was effectively unpaid work that required a lot of time spent researching and processing the potential security flaws that were sent in.

Of note is that when such a security report comes in, the expectation is that you, a volunteer software developer, drop everything you’re working on and figure out the cause, the fix, and the patched-by date, alongside filing a CVE. This is instead of being sent a merge request or similar with an accompanying test case. Obviously, these kinds of cases seem to have played a major role in burning [Nick] out on maintaining both libxml2 and libxslt.

Fortunately for the project, two new developers have stepped up to take over as maintainers, but it should be obvious that such churn is not a good sign. It also highlights the central problem of the conflicting expectations placed on open source software: that it be both totally free in a monetary sense and unburdened by critical bugs. This is unfortunately an issue without an easy solution, with e.g. software bounties mostly resulting in headaches.