Exploit The Stressed-out Package Maintainer, Exploit The Software Package

A recent security vulnerability — a potential ssh backdoor via the liblzma library in the xz package — is having a lot of analysis done on how the vulnerability was introduced, and [Rob Mensching] felt that it was important to highlight what he saw as step number zero of the whole process: exploit the fact that a stressed package maintainer has burned out. Apply pressure from multiple sources while the attacker is the only one stepping forward to help, then inherit the trust built up by the original maintainer. Sadly, [Rob] sees in these interactions a microcosm of what happens far too frequently in open source.

Maintaining open source projects can be a high stress activity. The pressure and expectations to continually provide timely interaction, support, and updates can easily end up being unhealthy. As [Rob] points out (and other developers have observed in different ways), this kind of behavior just seems more or less normal for some projects.

The xz/liblzma vulnerability itself is a developing story; read about it and find links to the relevant analyses in our earlier coverage here.

Is Your Mental Model Of Bash Pipelines Wrong?

[Michael Lynch] encountered a strange situation. Why was compiling then running his program nearly 10x faster than just running the program by itself? [Michael] ran into this issue while benchmarking a programming project, pared it down to its essentials for repeatability and analysis, and discovered it highlighted an incorrect mental model of how bash pipelines worked.

Here’s the situation. The first thing [Michael]’s pared-down program does is start a timer. Then it simply reads and counts some bytes from stdin, then prints out how long it took for that to happen. When running the test program in the following way, it takes about 13 microseconds.

$ echo '00010203040506070809' | xxd -r -p | zig build run -Doptimize=ReleaseFast
bytes: 10
execution time: 13.549µs

When running the (already-compiled) program directly, execution time swells to 162 microseconds.

$ echo '00010203040506070809' | xxd -r -p | ./zig-out/bin/count-bytes
bytes: 10
execution time: 162.195µs

Again, the only difference between zig build run and ./zig-out/bin/count-bytes is that the first compiles the code, then immediately runs it. The second simply runs the compiled program.

Hybrid Binaries On Windows For ARM: ARM64EC And ARM64X Explained

With ARM processors increasingly becoming part of the desktop ecosystem, porting code that was written for x86_64 platforms is both necessary and a massive undertaking. For many codebases a simple recompile may be all it takes, but where this is not straightforward Microsoft’s ARM64EC (for ‘Emulator Compatible’) Application Binary Interface (ABI) provides a transition path. Unlike Apple’s ‘Fat Binaries’, this features hybrid PE executables (ARM64 eXtended, or ARM64X) that run mixed ARM64EC and x86_64 binary code on Windows 11 ARM systems. An in-depth explanation is provided by one of the authors, [Darek Mihocka].

ARM64EC was announced by Microsoft on June 28, 2021 as a new feature in Windows 11 for ARM, and more recently Qualcomm put it forward during the 2024 Game Developers Conference (GDC) as one reason why high-performance gaming on its Snapdragon SoCs should be much easier than often assumed. Naturally, this assumes that Windows 11 is being used, as it contains the x86_64 emulator with ARM64EC support. The major difference between plain ARMv8 and ARM64EC code is that the latter has changes at the ABI level, e.g. to calling conventions, that ease interoperability between emulated x86_64 and ARM64 code.

Although technologically impressive, Windows 11’s market share is still rather small, even before looking at Windows 11 on ARM. It’ll be interesting to see whether Qualcomm’s bravado comes to fruition and makes ARM64EC more relevant for the average software developer.

Galvanize Your Grip On Grep With This Great Grep Guide

These days, you can’t throw a USB stick without hitting something that’s running Linux. It might be a phone, an embedded device, or your TV. Either way, it’s running Linux, and somewhere along the line of the development of whatever your USB stick smacked into, somebody used the Global Regular Expression Print utility, better known as Grep. But what is Grep, and why do you need it? [Anton Zhiyanov] not only answers those questions but provides Grep by example: Interactive Guide to help you along.

Grep By Example is also available as a PDF Minibook, and a Grep playground helps you learn quickly.

To understand Linux, one must understand its commercial predecessor, Unix. One of the things that made Unix (and then Linux) unique was its philosophy: write programs that work together, do one thing well, and handle text streams. This philosophy describes a huge number of programs, and one of these programs is Grep. It’s installed everywhere there’s a *nix installed, and once one becomes familiar with it, their command-line-fu reaches a whole new level.

At its core, Grep is simply a bloodhound. Its scent? A magical incantation called Regular Expressions. Regular Expressions (aka Regex) are simply a way of describing what a stream of text should look like. So when you feed Grep a bit of Regular Expression, it Prints only the text that matches that expression. Neat, right?

The trouble is that Regex can be kind of hard, and Grep has various versions and capabilities that need to be learned. And this is where the article shines: it covers both in an excellent interactive tutorial that’ll help you become a Grep Guru in no time. And if you want to do a deeper dive, check out what it takes to make your own Regex Engine from scratch!

Obfuscated C 8080 Emulator Ported

[Oscar] is no stranger to writing hard-to-read C code. While most of us do that by accident, there are those who strive to write the most unreadable code and enter it in the IOCCC — the International Obfuscated C Code Contest. One of his winning entries was a single C function that emulates an 8080. With a few support files, the plucky little emulator will run CP/M.

The emulator won best in show, but that was in 2006. Things have changed a bit and [Oscar] has updated the code so that you can continue to try it if you want to give yourself a headache reading code. The portability isn’t a CPU issue — modern CPUs will happily run code from 2006. The problem is the compiler and operating system. Compilers are much stricter these days, and Linux needs a little extra coaxing to give access to the input stream the way the faux computer needs it.


diagram of the radicle node-to-node connectivity

Radicle: An Open-Source, Peer-to-Peer, GitHub Alternative

The actions of certain large social networks have recently highlighted how a small number of people possess significant power over the masses and how this power is sometimes misused. Consequently, there has been a surge in the development of federated (or decentralized) services, such as Mastodon and Matrix.  But what about development? While GitHub and similar services are less likely to be used for political manipulation, they are still centralized services with a common failure point. Radicle is an open-source, peer-to-peer collaboration stack built on top of Git but backed with public key cryptography as a standard and a gossip protocol to ensure widespread data sharing across the network and, thus, some fault tolerance.

Essentially, code and associated documentation are secured cryptographically with an identity. The Git protocol is used for actual data transfer from peer to peer, which means that updates are only sent as deltas, not complete copies, maximizing channel bandwidth efficiency. A custom gossip protocol is used for metadata transfer around the network of peers. The project has a local-first ideology, with users running a full-stack node on their own hardware and all features available, even offline, which is great for laptop users who move between locations with sporadic access to the internet.

Judging from their Zulipchat instance, this is a highly active space, so perhaps it is worth diving in and seeing if it floats your boat. Fancy getting onto the Fediverse, but only have a spare MS-DOS machine to try it on? We’ve got it covered. Want to use Git but not online? You need a private Git server. Finally, too much Git? How about Gitless?

Thanks [Anonymous] for the tip! No, that wasn’t lost on us :D

Making Floating Point Calculations Less Cursed When Accuracy Matters

Inverting the earlier exponentiation to reduce floating point arithmetic error. (Credit: exozy)

An unfortunate reality of trying to represent continuous real numbers in a fixed space (e.g. with a limited number of bits) is that this comes with an inevitable loss of both precision and accuracy. Although floating point arithmetic standards – like the commonly used IEEE 754 – seek to minimize this error, some loss of precision across the range of a floating point variable is unavoidable. This is what [exozy] demonstrates, by showing just how big the error can get when performing a simple division of the exponential of an input value by the original value. This results in an amazing error of over 10%, which leads to the question of how best to fix this.

Obviously, if you have the option, you can simply increase the precision of the floating point variable, from 32-bit to 64- or even 256-bit, but this only gets you so far. The solution which [exozy] shows here involves redundant computation: inverting the result of e^x. In a demonstration using Python code (which uses IEEE 754 double precision internally), this almost eradicates the error. Other than proving that floating point arithmetic is cursed, this also raises the question of why this works.
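The redundant-computation trick the excerpt describes, as an added example that is not [exozy]’s own code. It assumes the quantity in question is (e^x − 1)/x, the form in which the cancellation bites (Kahan’s well-known construction; the excerpt’s wording is looser), and compares the naive ratio with the version whose denominator is recovered by inverting the exponential, using the library expm1 as ground truth.

```python
import math


def naive_ratio(x: float) -> float:
    """(e^x - 1) / x computed directly.

    For tiny x, exp(x) rounds to a value only a few ulps above 1.0, so the
    subtraction keeps just a handful of significant bits and the quotient
    can be off by 10% or more (and is exactly 0.0 once exp(x) rounds to 1.0).
    """
    return (math.exp(x) - 1.0) / x


def inverted_ratio(x: float) -> float:
    """Same quantity, denominator recovered by inverting the exponential.

    With u = exp(x), compute (u - 1) / log(u) instead of (u - 1) / x. The
    rounding error committed in u now appears in both numerator and
    denominator and largely cancels. (Kahan's construction; assumed here to
    be the trick the post describes.)
    """
    u = math.exp(x)
    if u == 1.0:  # x below half an ulp of 1.0: log(u) would be 0, return the limit
        return 1.0
    return (u - 1.0) / math.log(u)


def reference_ratio(x: float) -> float:
    """Library expm1 sidesteps the cancellation entirely; used as ground truth."""
    return math.expm1(x) / x if x != 0.0 else 1.0


def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    # Constructed sample inputs spanning harmless to catastrophic cancellation.
    for x in (1e-5, 1e-10, 1e-15, 1e-17):
        ref = reference_ratio(x)
        print(f"x={x:.0e}  naive err={relative_error(naive_ratio(x), ref):.2e}"
              f"  inverted err={relative_error(inverted_ratio(x), ref):.2e}")
```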
