Query Your C Code

If you’ve ever worked on a large project — your own or a group effort — you know it can be difficult to find exactly where you want to be in the source code. Sure, you can use ctags and most other editors have some way of searching for things. But ClangQL from [AmrDeveloper] lets you treat your code base like a database.

Honestly, we’ve often thought about writing something that parses C code and stuffs it into a SQL database. This tool leverages the CLang parser and lets you write queries like:

SELECT * FROM functions

That may not seem like the best example, but how about:

SELECT COUNT(name) FROM functions WHERE return_type="int"

That’s a bit more interesting. The functions table provides each function’s name, signature, a count of arguments, a return type, and a flag to indicate methods. We hope the system will grow to let you query on other things, too, like variables, templates, preprocessor defines, and data types. The tool can handle C or C++ and could probably work with other CLang front ends with a little work.

Continue reading “Query Your C Code”

Fortran And WebAssembly: Bringing Zippy Linear Algebra To NodeJS & Browsers

With the rise of WebAssembly (wasm) it’s become easier than ever to run native code in a browser. As mostly just another platform to target, it would be remiss if Fortran was not a part of this effort, which is why a number of projects have sought to get Fortran supported on wasm.

For the ‘why’, [George Stagg] makes the point that software packages like BLAS and LAPACK for Fortran are still great for scientific computing, while the ‘how’ is a bit more hairy, but getting better courtesy of the still-in-development LLVM front-end for Fortran (flang-new). Using it for wasm is not straightforward yet, due to the lack of a wasm32 target, but as [George] demonstrates, this is easily patched around.

We reported on Fortran and wasm back in 2016, with things having changed somewhat in the intervening eight years (yes, that long). The Fortran-to-C translator utility (f2c) is effectively EOL, while LFortran is coming along but still missing many features. The Dragonegg GCC-frontend-for-LLVM project was the best shot in 2020 for Fortran and WebAssembly, but obsolete now. Classic Flang has been in LLVM for a while, but is to be replaced with what is now called flang-new. The wish by [George] is now to find a way to get his patched flang-new code for wasm support into the project.

In the article, the diff for patching the flang-new toolchain to target wasm is provided. During compilation of the standard Fortran runtime it was then found that the flang-new code assumes that target system sizeof() results are identical to those of the host system, which of course falls flat for wasm32. One more patch (or hardcoded hack, rather) later the ‘Hello World’ example in Fortran was up and running, clearing the way to build the BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra Package) libraries and create a few example projects in Fortran-for-wasm32 which uses them.

The advantage of being able to use extremely well-optimized software packages like these when limited to a browser environment should be obvious, in addition to the benefit of using existing codebases. It is certainly [George]’s hope that flang-new will soon officially support wasm (32 and 64-bit) as targets, and he actively seeks help with making this a reality.

Jpegli: Google’s Better JPEG And Possible Death Knell For WebP

Along with the rise of the modern World Wide Web came the introduction of the JPEG image compression standard in 1992, allowing for high-quality images to be shared without requiring the download of a many-MB bitmap file. Since then a number of JPEG replacements have come and gone – including JPEG 2000 – but now Google reckons that it can improve JPEG with Jpegli, a new encoder and decoder library that promises to be highly compatible with JPEG.

To be fair, it’s only the most recent improvement to JPEG, with JPEG XT being introduced in 2015 and JPEG XL in 2021 to mostly deafening silence, right along that other, better new image format people already forgot about: AVIF. As Google mentions in their blog post, Jpegli uses heuristics developed for JPEG XL, making it more of a JPEG XL successor (or improvement). Compared to JPEG it offers a higher compression ratio, 10+-bit support which should reduce issues like banding. Jpegli images are said to be compatible with existing JPEG decoders, much like JPEG XT and XL images.

Based on the benchmarks from 2012 by [Blobfolio] between JPEG XL, AVIF and WebP, it would seem that if Jpegli incorporates advancements from AVIF while maintaining compatibility with JPEG decoders out there, it might be a worthy replacement of AVIF and WebP, while giving JPEG a shot in the arm for the next thirty-odd years of dominating the WWW and beyond.

Exploit The Stressed-out Package Maintainer, Exploit The Software Package

A recent security vulnerability — a potential ssh backdoor via the liblzma library in the xz package — is having a lot of analysis done on how the vulnerability was introduced, and [Rob Mensching] felt that it was important to highlight what he saw as step number zero of the whole process: exploit the fact that a stressed package maintainer has burned out. Apply pressure from multiple sources while the attacker is the only one stepping forward to help, then inherit the trust built up by the original maintainer. Sadly, [Rob] sees in these interactions a microcosm of what happens far too frequently in open source.

Maintaining open source projects can be a high stress activity. The pressure and expectations to continually provide timely interaction, support, and updates can easily end up being unhealthy. As [Rob] points out (and other developers have observed in different ways), this kind of behavior just seems more or less normal for some projects.

The xz/liblzma vulnerability itself is a developing story, read about it and find links to the relevant analyses in our earlier coverage here.

Is Your Mental Model Of Bash Pipelines Wrong?

[Michael Lynch] encountered a strange situation. Why was compiling then running his program nearly 10x faster than just running the program by itself? [Michael] ran into this issue while benchmarking a programming project, pared it down to its essentials for repeatability and analysis, and discovered it highlighted an incorrect mental model of how bash pipelines worked.

Here’s the situation. The first thing [Michael]’s pared-down program does is start a timer. Then it simply reads and counts some bytes from stdin, then prints out how long it took for that to happen. When running the test program in the following way, it takes about 13 microseconds.

$ echo '00010203040506070809' | xxd -r -p | zig build run -Doptimize=ReleaseFast
bytes: 10
execution time: 13.549µs

When running the (already-compiled) program directly, execution time swells to 162 microseconds.

$ echo '00010203040506070809' | xxd -r -p | ./zig-out/bin/count-bytes
bytes: 10
execution time: 162.195µs

Again, the only difference between zig build run and ./zig-out/bin/count-bytes is that the first compiles the code, then immediately runs it. The second simply runs the compiled program. Continue reading “Is Your Mental Model Of Bash Pipelines Wrong?”

Hybrid Binaries On Windows For ARM: ARM64EC And ARM64X Explained

With ARM processors increasingly becoming part of the desktop ecosystem, porting code that was written for x86_64 platforms is both necessary and a massive undertaking. For many codebases a simple recompile may be all it takes, but where this is not straightforward Microsoft’s ARM64EC (for ‘Emulator Compatible’) Application Binary Interface (ABI) provides a transition path. Unlike Apple’s ‘Fat Binaries’, this features hybrid PE executables (ARM64 eXtended, or ARM64X) that run mixed ARM64EC and x86_64 binary code on Windows 11 ARM systems. An in-depth explanation is provided by one of the authors, [Darek Mihocka].

ARM64EC was announced by Microsoft on June 28, 2021 as a new feature in Windows 11 for ARM, with more recently Qualcomm putting it forward during the 2024 Game Developers Conference (GDC) as one reason why high-performance gaming on its Snapdragon SoCs should be much easier than often assumed. Naturally, this assumes that Windows 11 is being used, as it contains the x86_64 emulator with ARM64EC support. The major difference between plain ARMv8 and ARM64EC code is that the latter has changes on an ABI level to e.g. calling conventions that ease interoperability between emulated x86_64 and ARM64 code.

Although technologically impressive, Windows 11’s marketshare is still rather small, even before looking at Windows 11 on ARM. It’ll be interesting to see whether Qualcomm’s bravado comes to fruition, and make ARM64EC more relevant for the average software developer.

Grep By Example is also available as a PDF Minibook, and a Grep playground helps you learn quickly.

Galvanize Your Grip On Grep With This Great Grep Guide

These days, you can’t throw a USB stick without hitting something that’s running Linux. It might be a phone, an embedded device, or your TV. Either way, it’s running Linux, and somewhere along the line of the development of whatever your USB stick smacked into, somebody used the Global Regular Expression Print utility- better known as Grep. But what is Grep, and why do you need it? [Anton Zhiyanov] not only answers those questions but provides Grep by example: Interactive Guide to help you along.

Grep By Example is also available as a PDF Minibook, and a Grep playground helps you learn quickly.
Grep By Example is also available as a PDF Minibook, and a Grep playground helps you learn quickly.

To understand Linux, one must understand its commercial predecessor, Unix. One of the things that made Unix (and then Linux) unique was its philosophy: Write programs that work together, do one thing well, and handle text streams.  This philosophy describes a huge number of programs, and one of these programs is Grep. It’s installed everywhere there’s a *nix installed, and once one becomes familiar with it, their command-line-fu reaches an all new level.

At its core, Grep is simply a bloodhound. It’s scent? A magical incantation called Regular Expressions. Regular Expressions (aka Regex) are simply a way of describing what a stream of text should look like. So when you feed Grep a bit of Regular Expression, it Prints only the text that matches that expression. Neat, right?

The trouble is that Regex can be kind of hard, and Grep has various versions and capabilities that need to be learned. And this is where the article shines- it covers both in an excellent interactive tutorial that’ll help you become a Grep Guru in no time. And if you want to do a deeper dive, check out what it takes to make your own Regex Engine from scratch!