Diagram of the Radicle node-to-node connectivity.

Radicle: An Open-Source, Peer-to-Peer, GitHub Alternative

The actions of certain large social networks have recently highlighted how a small number of people can hold significant power over the masses, and how that power is sometimes misused. Consequently, there has been a surge in the development of federated (or decentralized) services, such as Mastodon and Matrix. But what about software development? While GitHub and similar services are less likely to be used for political manipulation, they are still centralized services with a single point of failure. Radicle is an open-source, peer-to-peer collaboration stack built on top of Git, with public-key cryptography baked in as standard and a gossip protocol that spreads data widely across the network, giving it a measure of fault tolerance.

Essentially, code and its associated documentation are secured cryptographically with an identity. The Git protocol is used for the actual data transfer from peer to peer, which means that updates are only sent as deltas rather than complete copies, making efficient use of bandwidth. A custom gossip protocol handles metadata transfer around the network of peers. The project follows a local-first ideology, with users running a full-stack node on their own hardware and all features available even offline, which is great for laptop users who move between locations with sporadic internet access.
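Radicle's actual wire protocol is more involved, but the gossip idea itself is simple enough to sketch in a few lines of entirely hypothetical (not Radicle-specific) Python: each node relays announcements it hasn't seen before to its peers, so metadata eventually floods the whole network:

class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []     # connected Node objects
        self.seen = set()   # announcement IDs already processed

    def announce(self, msg_id, payload):
        # relay anything new to every peer; duplicates stop here,
        # which keeps the flood from circulating forever
        if msg_id in self.seen:
            return
        self.seen.add(msg_id)
        print(f"{self.name} learned: {payload}")
        for peer in self.peers:
            peer.announce(msg_id, payload)

a, b, c = Node("a"), Node("b"), Node("c")
a.peers, b.peers, c.peers = [b], [a, c], [b]
a.announce("ref-update-1", "refs/heads/main -> abc123")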

Judging from their Zulipchat instance, this is a highly active space, so perhaps it is worth diving in and seeing if it floats your boat. Fancy getting onto the Fediverse, but only have a spare MS-DOS machine to try it on? We’ve got it covered. Want to use Git but not online? You need a private Git server. Finally, too much Git? How about Gitless?

Thanks [Anonymous] for the tip! No, that wasn’t lost on us :D

Making Floating Point Calculations Less Cursed When Accuracy Matters

Inverting the earlier exponentiation to reduce floating point arithmetic error. (Credit: exozy)

An unfortunate reality of trying to represent continuous real numbers in a fixed space (e.g. with a limited number of bits) is that this comes with an inevitable loss of both precision and accuracy. Although floating point arithmetic standards – like the commonly used IEEE 754 – seek to minimize this error, some loss of precision across the range of a floating point variable is unavoidable. This is what [exozy] demonstrates, showing just how big the error can get in a seemingly simple calculation: subtracting one from the exponential of an input value and dividing the result by that value. For small inputs this produces an astonishing error of over 10%, which leads to the question of how best to fix it.

Obviously, if you have the option, you can simply increase the precision of the floating point variable, from 32-bit to 64-bit or even 256-bit, but this only gets you so far. The solution [exozy] shows here uses redundant computation, inverting the result of e^x. In a demonstration using Python code (which uses IEEE 754 double precision internally), this almost eradicates the error. Other than proving that floating point arithmetic is cursed, this also raises the question of why this works.
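Here's a minimal Python sketch of this kind of cancellation and the inversion fix (our own illustration, not [exozy]'s code). The naive version cancels catastrophically for tiny inputs, while inverting the rounded exponential with log() lets the rounding errors cancel each other out:

import math

def naive(x):
    # (e^x - 1) / x computed directly: for tiny x, e^x rounds to a
    # value so close to 1 that subtracting 1 wipes out most of the bits
    return (math.exp(x) - 1) / x

def inverted(x):
    # invert the exponentiation: log(y) recovers "x as the machine
    # rounded it", so numerator and denominator share the same
    # rounding error and it cancels in the division
    y = math.exp(x)
    if y == 1.0:
        return 1.0  # x so small that e^x rounded to exactly 1
    return (y - 1) / math.log(y)

x = 1e-15
print(naive(x))     # 1.1102230246251565 -- over 11% error
print(inverted(x))  # 1.0000000000000002 -- essentially exact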

Continue reading “Making Floating Point Calculations Less Cursed When Accuracy Matters”

The White House Memory Safety Appeal Is A Security Red Herring

In the Holy Programming Language Wars, the lingua franca of systems programming – also known as C – is often lambasted for being insecure, error-prone, and plagued by more types of behavior that are undefined than ones the C standards actually define. Many programming languages have been billed as 'C killers', yet C is still alive today. That didn't stop the US White House's Office of the National Cyber Director (ONCD) from putting out a report in which both C and C++ get lambasted for being 'unsafe' when it comes to memory management.

The full report (PDF) is pretty light on technical details, citing only blog posts by Microsoft and Google as its 'expert sources'. The claim that memory safety issues are the primary cause of CVEs is not substantiated, or at least ignores the severity of CVEs when looking at the CISA statistics for active exploits. Beyond this call for 'memory safety', the report goes on to effectively call for more testing and validation, kicking in doors that were already opened back in the 1970s with the Steelman requirements and the High Order Language Working Group (HOLWG) of 1975.

What truly is the impact and factual basis of the ONCD report?

Continue reading “The White House Memory Safety Appeal Is A Security Red Herring”

The ELIZA Archaeology Project: Uncovering The Original ELIZA

Since ELIZA was created by [Joseph Weizenbaum] in the 1960s, its success has led to many variations and ports being written over the intervening decades. The goal of the ELIZA Archaeology Project by Stanford, USC, Oxford, and other university teams is to explore and uncover as much of this history as possible, starting with the original 1960s code. As noted in a recent blog post by [Anthony Hay], most of the intervening 'ELIZA' versions seem to have been inspired by the original rather than being accurate replicas or extensions of it. This raises the question of what the original program really looked like, a question which wasn't answered until the original source code was rediscovered in 2020. Continue reading “The ELIZA Archaeology Project: Uncovering The Original ELIZA”

Filters Are In Bloom

If you are a fan of set theory, you might agree there are two sets of people who write computer programs: those who know what a Bloom filter is and those who don't. How could you efficiently test whether someone is in one set or the other? Well, you could use a Bloom filter. [SamWho] takes us through the whole thing in general terms that you could apply in any situation.

The Bloom filter does make a trade-off for its speed and compact size. It is subject to false positives but not false negatives. That is, if a Bloom filter tells you that X is not part of a set, it is correct. But if it tells you X is in the set, you may have to investigate further to see if that's true.

If it can’t tell you that something is definitely in a set, why bother? Usually, when you use a Bloom filter, you want to reduce searching through a huge amount of data. The example in the post talks about having a 20-megabyte database of “bad” URLs. You want to warn users if they enter one, but downloading that database is prohibitive. But a Bloom filter could be as small as 1.8 megabytes. However, there would be a 1 in 1000 chance of a false positive.

Increase the database size to 3.59 megabytes, and you can reduce false positives to one in a million. Presumably, if you got a positive, you could accept the risk it is false, or you could do more work to search further.
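Those sizes fall straight out of the standard Bloom filter formula for the optimal number of bits. Here's a quick sketch, assuming the database holds about a million URLs (a number picked purely because it reproduces the figures quoted above):

import math

def bloom_bits(n, p):
    # optimal bit count for n items at false-positive rate p:
    # m = -n * ln(p) / (ln 2)^2
    return -n * math.log(p) / math.log(2) ** 2

n = 1_000_000  # assumed number of bad URLs
for p in (1e-3, 1e-6):
    print(f"p = {p}: {bloom_bits(n, p) / 8 / 1e6:.2f} MB")
# p = 0.001: 1.80 MB
# p = 1e-06: 3.59 MB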

Imagine, for example, a web cache device or program. Many web pages are loaded one time and never again. If you cache all of them, you’ll waste a lot of time and push other things out of the cache. But if you test a page URL with a Bloom filter, you can improve things quite a bit. If the URL may exist in the Bloom filter, then you’ve probably seen it before, so you might want to cache it.

If it says you haven't, you can add it to the filter so that if the page is ever accessed again, it will be cached. Sure, sometimes a page will show a false positive. So what? You'll just cache the page the first time, which is what you did before anyway. If that happens only 0.1% of the time, you still win.
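In code, that caching policy is only a few lines. A sketch, assuming a BloomFilter object with add() and might_contain() methods like the one outlined after the next paragraph:

def should_cache(url, seen):
    # first sighting: just remember the URL and skip caching;
    # later sightings (and the odd false positive) say "cache it"
    if seen.might_contain(url):
        return True
    seen.add(url)
    return False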

In simple terms, the Bloom filter hashes each item using three different algorithms and sets bits in an array based on the results. To test an item, you compute the same hashes and see if any of the corresponding bits are zero. If so, the item can't be in the set. Of course, there's no assurance that all three bits being set means the set contains the item; those three bits might have been set by entirely different items.
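Here's a minimal Python sketch of that idea, with one salted SHA-256 standing in for three independent hash functions (real implementations size the bit array and pick the hash count using the formulas above):

import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # simulate independent hash functions by salting one hash
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # any zero bit proves absence; all ones only means "probably"
        return all(self.bits[pos // 8] >> (pos % 8) & 1
                   for pos in self._positions(item))

seen = BloomFilter(num_bits=80_000)
seen.add("https://example.com/cats")
print(seen.might_contain("https://example.com/cats"))  # True
print(seen.might_contain("https://example.com/dogs"))  # False (almost surely)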

Why does increasing the number of bits help? The post answers that and looks at other optimizations like a different number of hash functions and counting.

The post does a great job of explaining the filter, but if you want a more concrete example in C, you might want to read this post next. Or search for code in your favorite language. We've talked about Python string handling with Bloom filters before. We've even seen a proposal to add them to the transit bus.

WoWMIPS: A MIPS Emulator For Windows Applications

When Windows NT originally launched, it had ports to a wide variety of platforms, ranging from Intel's x86 and i860 to DEC's Alpha and the MIPS architecture. Running Windows applications written for many of these platforms is a bit tricky these days, which [x86matthew] saw as a good reason to write a MIPS emulator. This isn't just any old emulator, though: it maps 32-bit Windows applications targeted at the MIPS R4000 CPU to an x86 CPU instead. Since both platforms run in little-endian, 32-bit mode, this theoretically should be a walk in the park.

The use of the Windows PE executable format is also the same, so the first task was to figure out how to load the MIPS PE binary in a way that made sense on an x86 platform. This involved some reverse-engineering of the MIPS ntdll.dll to figure out how relocations are handled on that platform. Following this, mapping the R4000's instructions to the (CISC) x86 ISA was pretty easy, with only Floating Point Unit (FPU) support left as a future challenge. Memory access was left direct, meaning no sandboxing or isolation, for simplicity's sake.
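To get a feel for what such a mapping looks like, here's a toy Python sketch (not [x86matthew]'s actual approach) that decodes one MIPS R-type instruction and emits equivalent x86 assembly, assuming the emulated registers live in a memory array rather than being allocated to real x86 registers:

def translate(word):
    # MIPS R-type layout: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
    op    = (word >> 26) & 0x3F
    rs    = (word >> 21) & 0x1F
    rt    = (word >> 16) & 0x1F
    rd    = (word >> 11) & 0x1F
    funct = word & 0x3F
    if op == 0 and funct == 0x21:  # ADDU rd, rs, rt
        return [f"mov eax, [regs + {rs * 4}]",
                f"add eax, [regs + {rt * 4}]",
                f"mov [regs + {rd * 4}], eax"]
    raise NotImplementedError(f"op {op:#x}, funct {funct:#x}")

# addu $v0, $a0, $a1 encodes as 0x00851021
print("\n".join(translate(0x00851021)))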

The final task was mapping the native API calls, which call almost directly into the underlying host Windows OS’s API, with a bit of glue logic. With all of this done, Windows NT applications originally written for 1990s MIPS ran just fine on a modern-day x86_64 PC running Windows — as long as you don’t need an FPU (for now).

Make Your Bookshelf Clickable

We’ll confess that we have a fondness for real books and plenty of them. So does [James], and he decided he needed a way to take a picture of his bookshelves and make each book clickable to find more information. This is one of those things that sounds fairly simple until you decide to do it. You can try an example of the results and then go back and read about the journey it took to get there.

There are several subtasks involved. First, you want to identify the region each book occupies on the shelf. It wouldn't do to click on the Joy of Cooking and get information about Remembrance of Things Past.

The next challenge is reading the title of the book. This can be tricky: fonts differ, the book could be upside down, and while most titles run vertically along the spine, some run across it. The remainder of the task is fairly easy. If you know the region and the title, you can easily find a link (to Google Books, in this case) and build an SVG overlay that maps each book's area to the right link.
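The overlay step is easy to picture in code. Here's a minimal sketch (not [James]'s actual pipeline) that turns detected spine regions and their looked-up links into a clickable SVG:

def svg_overlay(width, height, photo_href, books):
    # books: list of ((x, y, w, h), url) pairs, one per detected spine
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">',
             f'  <image href="{photo_href}" width="{width}" height="{height}"/>']
    for (x, y, w, h), url in books:
        # an invisible rectangle over each spine, wrapped in a link
        parts.append(f'  <a href="{url}"><rect x="{x}" y="{y}" '
                     f'width="{w}" height="{h}" fill="transparent"/></a>')
    parts.append('</svg>')
    return "\n".join(parts)

# placeholder region and link, purely for illustration
print(svg_overlay(1200, 800, "shelf.jpg",
                  [((10, 40, 35, 400), "https://example.com/joy-of-cooking")]))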

Continue reading “Make Your Bookshelf Clickable”