A Brief History Of Unix Commands On Windows: CoreUtils (Again)

If you use Windows today and type ls, cat, grep, or awk in a terminal, there is a good chance something useful will happen. That was not always true. For most of the history of personal computing, Unix/Linux and Windows lived on opposite sides of a cultural border. Unix people had pipes, small composable tools, shell scripts, make, sed, awk, grep, tar, and the idea that everything was a file. Windows people had drive letters, backslashes, COMMAND.COM or cmd.exe, and an API that did not care much about what POSIX thought.

Yet there has always been a demand for Unix tools on Windows. Some of it came from programmers who wanted the same build scripts everywhere. Some came from administrators who missed grep and awk. Some came from companies trying to port big Unix applications to NT without rewriting them all. The result is a long, strange history of Unix-on-Windows layers, toolkits, compromises, and almost-but-not-quite compatibility.

Easy?

The simplest version of the problem sounds trivial. How hard can cat be? Open a file, copy bytes to standard output, done. Writing ls is a little more work, but Windows has directory APIs. Common commands like cp, mv, rm, and mkdir are not very mysterious. Even pipes are not foreign to Windows. A lot of the everyday Unix command set can be ported as ordinary Win32 console programs with some path handling and enough patience.
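To give a feel for how little the core of cat involves, here is a minimal sketch in Python. The function name and its parameters are invented for this illustration, and it leaves out the option handling and error reporting that real implementations carry.

```python
import shutil
import sys


def cat(paths, out=None):
    """Copy each named file, in order, to a binary output stream."""
    # Default to the raw stdout stream so bytes pass through untouched.
    out = out if out is not None else sys.stdout.buffer
    for path in paths:
        with open(path, "rb") as src:     # binary mode: no newline translation
            shutil.copyfileobj(src, out)  # copy in chunks until end of file
    out.flush()
```

Calling cat(sys.argv[1:]) from a script would give the familiar command-line behavior. Nothing in the sketch depends on the operating system, which is the point: the easy tools really are easy.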

But not all of Unix or Linux translates cleanly to Windows. The big issue is fork(). On Unix, a process can clone itself. The child gets a copy of the parent’s address space, open file descriptors, environment, signal state, and so on. Modern kernels make this efficient with copy-on-write memory, but the programming model is old and deeply baked into Unix. Shells use it constantly. Servers use it. Build systems use it. Scripting languages assume it exists, or at least that the surrounding environment behaves as though it does.

Windows process creation is different. Windows has CreateProcess(), which starts a new program. That is a perfectly reasonable model, but it is not fork(); it is closer to fork() and exec() rolled into one call. If you are just launching notepad.exe, no problem. If you are trying to implement a POSIX shell that forks, redirects file descriptors, adjusts the environment, and then starts another program, the mismatch is extreme, and you have to resort to some strange tricks to fake it.
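The difference between the two models can be sketched in Python, whose standard library exposes both. This is only an illustration: the function names, the output path, and the environment variable passed in are hypothetical, and subprocess stands in here for the spawn-style model (on Windows it is built on CreateProcess()). In the fork style, the child is a copy that rearranges itself before becoming the new program. In the spawn style, the parent has to describe everything up front.

```python
import os
import subprocess


def run_fork_style(argv, out_path, extra_env):
    """POSIX model: clone, let the child rearrange itself, then exec."""
    pid = os.fork()                       # not available on Windows
    if pid == 0:                          # child: a copy of the parent
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)                # redirect stdout in the child only
            os.close(fd)
            os.environ.update(extra_env)  # adjust the child's environment
            os.execvp(argv[0], argv)      # replace the copy with the program
        finally:
            os._exit(127)                 # reached only if something failed
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


def run_spawn_style(argv, out_path, extra_env):
    """Spawn model: describe the child completely, then create it."""
    env = dict(os.environ, **extra_env)
    with open(out_path, "wb") as out:
        return subprocess.run(argv, stdout=out, env=env).returncode
```

Both functions end up doing the same job, but only the second one can run on Windows. A shell written around the first pattern cannot simply be recompiled; every place it does work between fork() and exec() has to be restructured or emulated.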

One of the early commercial answers was the MKS Toolkit, originally from Mortice Kern Systems. MKS gave Windows users a pile of familiar commands, shells, and development tools. It was not just ls and friends; it included things like ksh, vi, grep, find, awk, make, and many of the utilities needed to move scripts and build procedures between Unix and Windows. The current PTC MKS documentation still describes it in exactly that spirit: Unix shells and hundreds of commands for interoperability with Windows.

MKS was attractive because it treated Windows as Windows. You were not necessarily pretending your machine was a Unix workstation. You were getting a Unix-flavored toolbox that could operate in a Windows environment. For many people, that was enough. You could write scripts, process text, drive builds, and avoid learning three different syntaxes for the same job.


Picking A CRC

You send a file, but how do you know it arrived intact? In other words, how do you know that it didn’t get cut off, garbled, or changed somehow? Simplistically, you could just add up all the bytes in the file — a checksum — and send that along with the file. You compute the checksum when you know the file is good, and the receiver can compare the checksum to see if they match.

However, simple addition doesn’t catch certain classes of errors, which is why there are better checksum algorithms that, for example, wrap the carry bit around or otherwise tweak the math so that files with common errors produce different checksums. Still, there are two problems with checksums. First, no matter how much you modify the algorithm, the chances that two files produce the same checksum are pretty high, especially with common error patterns.

For example, assume a very simple algorithm that simply adds the bytes and discards any carry. If a file contains 0x80, 0x80, those numbers essentially cancel each other out. If you replace them with 0, 0, you’ll get the same checksum. To some degree, using anything other than a second copy of the entire file will have this problem — some corruption goes undetected — but you want to minimize the number of times that happens.
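The collision described above is easy to reproduce. Here is a minimal sketch, assuming an 8-bit sum; the function name sum8 and the surrounding bytes 0x41 and 0x42 are made up for the demonstration.

```python
def sum8(data: bytes) -> int:
    """Add all bytes and keep only the low 8 bits (the carry is discarded)."""
    return sum(data) & 0xFF


original = bytes([0x41, 0x80, 0x80, 0x42])
corrupted = bytes([0x41, 0x00, 0x00, 0x42])  # both 0x80 bytes zeroed out

# The two messages differ, yet the checksums agree, so the error is invisible.
print(hex(sum8(original)), hex(sum8(corrupted)))  # 0x83 0x83
```

The two 0x80 bytes add up to 0x100, and the 1 falls off the top of the 8-bit result, so they contribute exactly as much as two zeros.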

The other problem is that a checksum by itself doesn’t let you correct anything. You know the data is bad, but you don’t know where. If you think about it, the simplest checksum is a parity bit on a byte: parity is simply the sum of all the bits, modulo 2. If the parity bit doesn’t match, you know the byte is bad, but you don’t know which bit is wrong. Any even number of bit errors goes undetected, but any odd number of bit errors will be caught.
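The parity behavior can be checked directly. This is a small sketch assuming the even-parity convention, where the parity bit is the sum of the data bits modulo 2; the helper names and the sample byte are invented for the example.

```python
def parity_bit(byte: int) -> int:
    """Sum of the eight data bits modulo 2 (1 when the count of 1 bits is odd)."""
    return bin(byte & 0xFF).count("1") % 2


def flip_bits(byte: int, *positions: int) -> int:
    """Return the byte with the bits at the given positions inverted."""
    for pos in positions:
        byte ^= 1 << pos
    return byte


sent = 0b10110010                 # four 1 bits, so the parity bit is 0
stored_parity = parity_bit(sent)

one_error = flip_bits(sent, 3)    # odd number of flips: parity changes
two_errors = flip_bits(sent, 3, 6)  # even number of flips: parity is unchanged

print(parity_bit(one_error) != stored_parity)   # True: the error is detected
print(parity_bit(two_errors) != stored_parity)  # False: the error slips through
```

Even in the detected case, the check only says that something flipped. It gives no hint about which bit to flip back.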

People invent better error-checking codes by devising schemes that can promise they can detect a certain number of bit flips and, at least in some cases, correct them. One of these is the cyclic redundancy check (CRC). It is easy to think of the CRC as a “strong checksum,” but it actually works differently. What’s more, there isn’t just a single CRC algorithm. You have to select or design a particular algorithm based on your needs. Most people pick a “named” implementation like CCITT or Ethernet and assume it must be the best. It probably isn’t.

A CRC is a checksum in the broad sense: you feed it a message, and it gives you a small value that you append, store, or compare later. But unlike a simple additive checksum, a CRC is based on polynomial division over GF(2), which is a fancy way of saying “divide using XOR instead of carries.” That detail matters. It gives CRCs very strong guarantees against common classes of errors, provided you choose the right polynomial for the job. That’s the key. You must choose the right polynomial.
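To make “divide using XOR instead of carries” concrete, here is a bit-at-a-time sketch of a 16-bit CRC. The function name and defaults are choices made for this illustration: polynomial 0x1021 with an initial value of 0xFFFF is the variant commonly catalogued as CRC-16/CCITT-FALSE, and nothing here is meant to suggest it is the right polynomial for any particular job.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16, most significant bit first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8                  # bring the next 8 message bits in
        for _ in range(8):
            if crc & 0x8000:              # top bit set: "subtract" the divisor
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:                         # top bit clear: just shift
                crc = (crc << 1) & 0xFFFF
    return crc


# Published check value for this parameter set.
print(hex(crc16(b"123456789")))  # 0x29b1

# The corruption that fools a simple byte sum: two 0x80 bytes become zeros.
original = bytes([0x41, 0x80, 0x80, 0x42])
corrupted = bytes([0x41, 0x00, 0x00, 0x42])
print(crc16(original) != crc16(corrupted))  # True
```

The second result is not luck. The damaged bits sit within a span of 9 bits, and a 16-bit CRC of this kind detects every error burst of 16 bits or fewer. Changing poly or init gives a different CRC with different strengths, which is why the choice matters.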


The Merits Of Comment-Driven Development As Counterweight To TDD

The world of software has seen many paradigms come and go, all of which were supposed to revolutionize its development. Still, one of the basic tenets of engineering, that there are no shortcuts to doing the work properly, also holds true in software engineering: trying to skip ‘nice to haves’ like proper documentation, code formatting, and proper testing inevitably results in developers nervously trying to ignore the looming avalanche of technical and other project debts as they keep piling up.

While Test-Driven Development (TDD) once got praised as the silver bullet, the principle of writing tests before writing code merely postpones the inevitable project collapse. The elephant in the room is that you cannot pass on the basics in engineering and expect to come out fine on the other end. There’s a reason why phrases like “all tests green, successfully failed in production” have become common.

This is where the concept of Comment-Driven Development (CDD) comes into play. What started as a bit of a joke many years ago stuck in my mind and led me to my current approach in software development that tries to effectively mirror solid engineering principles.


NASA Announces Artemis III Crew And Ambitious Goals

When the Artemis lunar program was first conceived, the third mission would have seen astronauts set foot on the Moon for the first time since Apollo 17 in 1972. But as hard as getting into space is, a sojourn to our nearest celestial neighbor is even more mindbogglingly complex, and so earlier this year it was announced that actually landing on the Moon would be pushed out to the fourth mission.

In turn Artemis III would take a page out of the Apollo 9 playbook and test out rendezvous and docking procedures with commercial landers while operating in the relative safety of low Earth orbit. Moving the target date for the landing a few years down the road gave all involved parties a little more breathing room, but it also provided a valuable opportunity to gain insight into the performance of the vehicles and systems ahead of the critical moment. In the original timeline, the first time Orion would attempt to dock with the lander would have been just before descending to the lunar surface — leaving precious little time to troubleshoot should anything go wrong.

Yesterday, NASA held a press conference to update the public on its progress towards the planned 2027 launch of Artemis III, which included the long-awaited announcement of the crew that will kick the tires on the next-generation lunar landers being developed by SpaceX and Blue Origin.


Revisiting Using AI Coding Assistants: You’re Holding It Wrong Edition

After scathing accusations of skimping on due diligence, as well as other feedback on my article about trying to use an ‘AI coding assistant’ for the first time, the only rational, academic response is to lick one’s wounds following a particularly bruising peer review and try to address the raised issues. Reality, after all, does not care about one’s feelings, and there may be more to this AI assistant technology that can be coaxed out with a more in-depth look.

To this end I’ll do my best to work through each raised point, criticism and accusation, to see what I – and perhaps others – can learn from this endeavor. Said points include the use of the wrong frontend – i.e. Copilot – and the wrong model – being Claude Haiku 4.5 – as well as the egregious flaw on my end of ‘prompting wrong’.

For the sake of due diligence, the best frontend and models for particular tasks will be investigated, and finally the verbal minefield of ‘prompt engineering’ will be examined for industry-standard approaches.


Hunting Submarines Via Gravity Is A Tough Errand

Among so many other technological advances, the Cold War saw the advent of the ballistic missile submarine. The concept was simple—pack enough nuclear warheads to destroy a small civilization into a compact metal tube, and then hide it underwater. The oceans would act as a cloak for your fleet of world-enders, and keep your enemies forever on their toes. A terrifying machine that could both start and end a war with the push of a button.

Most nation states are populated by humans with the will to live. Thus, there has been a great incentive to find ways to keep tabs on these sunken doombringers. Great efforts have gone into improving sonar and magnetic detection methods over the decades, which are the bread and butter of sub hunting to this day. However, military researchers have also explored the prospect of whether submarines could be detected via their effect on the gravitational field alone.


Remember When Flash Drives Were Going To Make Your PC Faster?

The 2000s was a decade of great change in the computer industry. The world had grown accustomed to corruptible floppy disks, blue screens of death, and achingly slow load times. In a few short years, all of that would change, as USB drives, better operating systems, and faster processors brought forth a new age of stability and speed.

Amidst this era of upheaval, Microsoft introduced a new technology. It was intended to boost performance on the cheap for a new generation of machines, but it would turn out to be little more than a gimmick that never really caught on. Let’s explore the easily-forgotten legacy of ReadyBoost.
