Fitting A Spell Checker Into 64 KB

By some estimates, the English language contains over a million unique words. This is perhaps overly generous, but even conservative estimates generally put the number at over a hundred thousand. Regardless of where the exact number falls between those two extremes, it’s certainly many more words than could fit in the 64 kB of memory allocated to the spell checking program on some of the first Unix machines. This article by [Abhinav Upadhyay] takes a deep dive into how the early Unix engineers accomplished the feat despite the extreme limitations of the computers they were working with.

Perhaps the most obvious way to build a spell checker is by simply looking up each word in a dictionary. With modern hardware this wouldn’t be too hard, but disks in the ’70s were extremely slow and expensive. To fit the dictionary into memory, it was first whittled down to around 25,000 words by various methods, including an algorithm that removed all affixes, and a Bloom filter was then used to perform the lookups. The team found that this dictionary wasn’t big enough, and had to change strategies to expand the number of words the spell checker could check. Hash compression was used at first, followed by hash differences and then a special compression method which achieved almost theoretically perfect compression.
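For readers who haven’t met a Bloom filter before, here is a minimal Python sketch of the idea. It is not the original Unix code: the class name, the SHA-256-based double hashing, and the 400,000-bit table with 11 hash positions are all assumptions picked for illustration. The point is only that a 25,000-word list can be tested for membership from a bit array that fits comfortably inside 64 kB, at the cost of occasional false positives.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(word)
        )


# 400,000 bits is 50,000 bytes, which leaves headroom in a 64 kB budget.
dictionary = BloomFilter(num_bits=400_000, num_hashes=11)
for stem in ["spell", "check", "memory"]:
    dictionary.add(stem)

print("spell" in dictionary)    # a stored word is always reported present
print("speling" in dictionary)  # an unseen word is almost always rejected
```

A word that was added is never reported missing, so a correctly spelled word is never flagged; the trade-off is that a misspelling occasionally slips through as a false positive.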

Although most computers that run spell checkers today have much more memory as well as disks which are orders of magnitude larger and faster, a lot of the innovation made by this early Unix team is still relevant for showing how various compression algorithms can be used on data in general. Large language models, for one example, are proving to be the new frontier for text-based data compression.

Booting A Desktop PDP-11

Ever heard of VENIX? There were lots of variants of Unix back in the day, and VENIX was one for the DEC Professional 380, which was, sort of, a PDP-11. The 1982 machine normally ran the unfortunately (but perhaps aptly) named P/OS, but you could get VENIX, too. [OldVCR] wanted to put one of these back online and decided the ST-506 hard drive was too risky. A solid-state drive upgrade and doubling the RAM to a whole megabyte was the plan.

It might seem funny to think of a desktop workstation that was essentially a PDP-11 minicomputer, but in the rush to corner the personal computer market, many vendors did the same thing: shrinking their legacy CPUs. DEC had a spotty history with small computers. [Ken Olsen] didn’t think anyone would ever want a personal computer, and the salespeople feared that cheap computers would eat into traditional sales. The Professional 350 was born out of DEC’s efforts to catch up, as [OldVCR] explains. He grabbed this one from a storage unit about to be emptied for scrap.

The post is very long, but you get a lot of history and a great look inside this vintage machine. Of course, the PDP-11 couldn’t actually handle more than 64K without tricks and you’ll learn more about that towards the end of the post, too.

Just as a preview, the story has a happy ending, including a surprising expression of gratitude from the aging computer. DEC didn’t enjoy much success in the small computer arena, eventually being bought by Compaq, which, in turn, was bought by HP. During their heyday, this would have been unthinkable.

The PDP-11 did have some success because it was put on a chip that ended up in several lower-end machines, like the Heathkit H11. Ever wonder how people programmed the PDP computers with switches and lights?

UNIX Archaeology Turns Up 1972 “V2 Beta”

In 1997 a set of DEC tapes was provided by Dennis Ritchie as historical artifacts for those interested in the gestation of the UNIX operating system. The resulting archive files have recently been analysed by [Yfeng Gao], who has succeeded in recovering a working UNIX version from 1972. What makes it particularly interesting is that this is not a released version; instead, it’s a work in progress sitting somewhere between versions 1 and 2. He’s therefore taken the liberty of naming it “V2 Beta”.

If you happen to have a PDP-11/20 you should be able to run this operating system for yourself, and for those of us without one, he’s provided information on which emulator will work. The interesting information for us comes in the README accompanying the tapes themselves, and in those accompanying the analysis. Aside from file fragments left over from previous users of the same tape, we learn about the state of UNIX time in 1972. This dates from the period when increments were in sixtieths of a second due to the ease of using the mains power frequency in a PDP, so with a 32-bit counter they were facing imminent roll-over. The 1970-01-01 epoch and one-second increments would be adopted later in the year, but meanwhile this is an unusual curio.
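To see why the roll-over was imminent, the arithmetic can be sketched in a few lines of Python. This assumes the counter is treated as an unsigned 32-bit value ticking exactly 60 times per second; the function and constant names are ours, not anything from the tapes.

```python
TICKS_PER_SECOND = 60   # one tick per mains cycle, as described above
COUNTER_BITS = 32       # assumption: the counter is unsigned
SECONDS_PER_DAY = 86_400


def rollover_days(bits=COUNTER_BITS, ticks_per_second=TICKS_PER_SECOND):
    """Days until a counter of the given width wraps back to zero."""
    seconds = 2 ** bits / ticks_per_second
    return seconds / SECONDS_PER_DAY


days = rollover_days()
print(f"Wraps after {days:.1f} days, or {days / 365.25:.2f} years")
```

At sixty ticks per second the counter wraps in a little over two years, whereas the same 32 bits counting whole seconds lasts sixty times longer.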

If you manage to run this OS, and especially if you find anything further in the files, we’d love to hear. Meanwhile, this is not the oldest UNIX out there.

Featured image: “PDP-11/20 Rocker Switches” by Don DeBold

Grep By Example is also available as a PDF Minibook, and a Grep playground helps you learn quickly.

Galvanize Your Grip On Grep With This Great Grep Guide

These days, you can’t throw a USB stick without hitting something that’s running Linux. It might be a phone, an embedded device, or your TV. Either way, it’s running Linux, and somewhere along the line of the development of whatever your USB stick smacked into, somebody used the Global Regular Expression Print utility, better known as Grep. But what is Grep, and why do you need it? [Anton Zhiyanov] not only answers those questions but provides Grep by Example, an interactive guide to help you along.

To understand Linux, one must understand its commercial predecessor, Unix. One of the things that made Unix (and then Linux) unique was its philosophy: Write programs that work together, do one thing well, and handle text streams. This philosophy describes a huge number of programs, and one of these programs is Grep. It’s installed everywhere there’s a *nix installed, and once you become familiar with it, your command-line-fu reaches an all new level.

At its core, Grep is simply a bloodhound. Its scent? A magical incantation called Regular Expressions. Regular Expressions (aka Regex) are simply a way of describing what a stream of text should look like. So when you feed Grep a bit of Regular Expression, it Prints only the text that matches that expression. Neat, right?
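The core behaviour is small enough to sketch in Python. This is only an illustration of the idea, not how Grep itself is implemented: the `grep` function and the sample lines are made up for this example, and Python’s `re` syntax differs in places from the regex dialects Grep supports.

```python
import re


def grep(pattern, lines):
    """Return only the lines in which the pattern matches somewhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Hypothetical input, standing in for a text stream.
sample = [
    "PDP-11/20 rocker switches",
    "a real PDP-11/73",
    "Teletype Model 33",
]

print(grep(r"^PDP-11", sample))   # lines that start with "PDP-11"
print(grep(r"[0-9]+$", sample))   # lines that end in one or more digits
```

Everything else, such as flags, inverted matches, and recursive searches, is refinement on top of that filter.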

The trouble is that Regex can be kind of hard, and Grep has various versions and capabilities that need to be learned. And this is where the article shines: it covers both in an excellent interactive tutorial that’ll help you become a Grep Guru in no time. And if you want to do a deeper dive, check out what it takes to make your own Regex Engine from scratch!

Running UNIX On A Nintendo Entertainment System

Who wouldn’t want to run a UNIX-like operating system on their NES or Famicom? Although there’s arguably no practical reason for doing so, [decrazyo] has cobbled together a working port of Little Unix (LUnix), which was originally written for the Commodore 64 and 128 by [Daniel Dallmann]. The impetus for this project was initially curiosity, but when [decrazyo] saw that someone had already written a UNIX-like OS for the 6502 processor, it seemed apparent that the NES was too similar to the C64 to not port it.

Much of this is relatively straightforward, as the 6502 MPU in the C64 is nearly identical to the Ricoh 2A03 in the NES, with the latter missing the binary-coded decimal support, which is not a crucial feature. The only significant roadblock was the lack of RAM in the NES. The console has a mere 2 KB of RAM and 2 KB of VRAM, which made it look anemic even next to the C64. Here, a Japan-only accessory came to the rescue: the Famicom Disk System (FDS), which is a proprietary floppy disk-based system that slots into the bottom of the Famicom and was used for games as well as storing saves back in the day.

By using a Famicom with the FDS, it was possible to gain an additional 32 kB of RAM, making the userspace utilities available in the shell. The fruits of this labor work well enough that he could also pop it up on an EverDrive cartridge that supports FDS ROMs and boot it up on an unmodified NES. Whether this is cooler than the NES-OS, which we covered previously, is up for debate.

Incidentally, [Maciej Witkowiak] seems to have resumed development on LUnix, with a new release in 2023, so UNIX-on-6502 may see a revival after a few decades of little happening.

Check Out This PDP-11 Running Unix With A Teletype Terminal

If you’ve spent a few years around Hackaday, you’ve probably seen or heard of the DEC PDP-11 before. It was one of the great machines of the minicomputer era, back when machines like the Apple ][ and the Commodore 64 weren’t even a gleam in their creators’ eyes. You’ve also probably heard of Unix, given that so many of us use Linux on the regular. Well, now you can see them both in action, as [HappyComputerGuy] fires up real Unix on a real PDP-11/73… with a real Teletype Model 33 to boot!

It’s a fascinating dive into the tech of yesteryear, with a rich dose of history as well. It’s mindboggling to think that video terminals were once prohibitively expensive and that teletype printers were the norm for interacting with computers. The idea of interacting with a live machine via a printed page is alien, but it’s how things were done! We’re also treated to a lesson on how to boot the PDP-11 with 2.11BSD, which is a hilariously manual process. It also takes a very long time. [HappyComputerGuy] then shows off the Teletype Model 33 rocking the banner command to great effect.

It’s awesome to see this hardware as it would really have been used back in its heyday. Computing really was different before the microcomputer format became mainstream. It’s not the only PDP-11 we’ve seen lately, either! Video after the break.

Apple System 7… On Solaris?

While the Unix operating systems Solaris and HP-UX are still in active development, they’re not particularly popular anymore and are mostly relegated to some enterprise and data center environments. They did enjoy a peak of popularity in the 90s during the “wild west” era of windowed operating systems, though. This was a time when there were more than two mass-market operating systems commercially available, with many companies fighting for market share. This led to a number of efforts to get software written for one operating system to run on others, whether that was simply porting software directly or using some compatibility layer. Surprisingly enough, it was possible in this era to run an entire instance of Mac System 7 within either of these two Unix operating systems, and this was an officially supported piece of Apple software.

The software was called the Macintosh Application Environment (MAE), and was an effort by Apple to bring Macintosh System 7 applications to various Unix-based operating systems, including Solaris and HP-UX. This was a time before Apple’s OS was Unix-compliant, and MAE provided a compatibility layer that translated Macintosh system calls and application programming interfaces (APIs) into the equivalent Unix calls, allowing Mac software to function within the Unix environments. [Lunduke] outlines a lot of the features of this in his post, including some of the details of the “scaffolding” that allowed the 68k processor to be emulated efficiently on the hardware of the time, the contents of the user manual, and even the memory management and layout.

What’s really jarring to anyone only familiar with Apple’s modern “walled garden” approach is that this is an Apple-supported compatibility layer for another system. At the time, though, they weren’t the technology giant they are today and had to play by a different set of rules to stay viable. Quite the opposite, in fact: they almost went out of business in the mid-90s, so having their software run on as many machines as possible would have been a perk at the time. While this era did have major issues with cross-platform compatibility, some of the software that attempted to solve these problems is still in active development today.

Thanks to [Stephen] for the tip!