Back To The 90s On Real Hardware

As the march of time continues on, it becomes harder and harder to play older video games on original hardware. Part of this is because the original hardware itself wears out, but another major factor is that modern operating systems, software, and even modern hardware don’t maintain support for older technology indefinitely. This is why emulation is so popular, but purists who need original hardware often have to go to extremes to scratch their retro gaming itch. This project from [Eivind], for example, is a completely new x86 PC designed for the DOS and early Windows 98 era.

The main problem with running older games on modern hardware is the lack of an ISA bus, which is where the sound cards on PCs from this era were placed. This build uses a Vortex86EX system-on-module, which has a processor running a 32-bit x86 instruction set. Not only does this mean that software built for DOS can run natively on this chip, but it also has this elusive ISA capability. The motherboard uses a Crystal CS4237B chip connected to this bus which perfectly replicates a SoundBlaster card from this era. There are also expansion ports to add other sound cards, including ones with Yamaha OPL chips.

Not only does this build provide a native hardware environment for DOS-era gaming, but it also adds a lot of ports missing from modern machines, including a serial port. Not everything needs to be original hardware, though; a virtual floppy drive and microSD card reader make it easy to transfer files to and from modern computers. This isn’t the only way to game on new, native hardware, though. Others have done similar things with new computers built for legacy industrial applications as well.

Thanks to [Stephen] for the tip!


The 386's main register bank, at the bottom of the datapath. The numbers show how many bits of the register can be accessed. (Credit: Ken Shirriff)

The Convoluted Way Intel’s 386 Implemented Its Registers

The fact that modern-day x86 processors still pretty much support the same operating systems and software as their ancestors did is quite a feat. Much of this effort had already been accomplished with the release of the 80386 (later 386) CPU in 1985, which was not only the first 32-bit x86 CPU, but was also backwards compatible with 8- and 16-bit software dating back to the 1970s. Making this work transparently was anything but straightforward, as [Ken Shirriff]’s recent analysis of the 80386’s main register file shows.

Labelled Intel 80386 die shot. (Credit: Ken Shirriff)

Using die shots of the 386’s registers and surrounding silicon, it’s possible to piece together how backwards compatibility was implemented. The storage cells of the registers are implemented using static memory (SRAM) as is typical, with much of the register file triple-ported (two read, one write).

Most interesting is the presence of six different circuits to support accessing the register file for 8-, 16- or 32-bit writes and reads. The ‘shuffle’ network, as [Ken] calls it, is responsible for handling these distinct writes and reads, which also leads to the finding that the bottom 16 bits in the registers are actually interleaved to make this process work more smoothly.
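To make the 8-, 16- and 32-bit accesses concrete, here is a minimal sketch of what software sees when it writes to part of a register such as EAX (AL, AH, AX). It models only the architectural behavior with bit masks, not the shuffle network or the interleaved layout in the silicon, and the `Reg32` class and its method names are hypothetical, made up for this illustration.

```python
class Reg32:
    """Architectural model of one 32-bit x86 general-purpose register.

    Hypothetical helper for illustration; it says nothing about how the
    386 lays the bits out physically.
    """

    def __init__(self, value: int = 0) -> None:
        self.value = value & 0xFFFFFFFF

    def write32(self, v: int) -> None:
        # Full-width write (e.g. EAX) replaces all 32 bits.
        self.value = v & 0xFFFFFFFF

    def write16(self, v: int) -> None:
        # 16-bit write (e.g. AX) keeps the upper 16 bits untouched.
        self.value = (self.value & 0xFFFF0000) | (v & 0xFFFF)

    def write8_low(self, v: int) -> None:
        # Low-byte write (e.g. AL) changes bits 0-7 only.
        self.value = (self.value & 0xFFFFFF00) | (v & 0xFF)

    def write8_high(self, v: int) -> None:
        # High-byte write (e.g. AH) changes bits 8-15 only.
        self.value = (self.value & 0xFFFF00FF) | ((v & 0xFF) << 8)

    def read16(self) -> int:
        return self.value & 0xFFFF

    def read8_low(self) -> int:
        return self.value & 0xFF

    def read8_high(self) -> int:
        return (self.value >> 8) & 0xFF


eax = Reg32(0x12345678)
eax.write8_high(0xAB)   # like "mov ah, 0xAB"
eax.write8_low(0xCD)    # like "mov al, 0xCD"
print(hex(eax.value))   # upper 16 bits are still 0x1234
```

Each of these access widths has to be routed to the right storage cells in hardware, which is the job the article attributes to the shuffle network.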

Fortunately for Intel (and AMD) engineers, this feat wouldn’t have to be repeated with the arrival of AMD64 and x86_64 many years later, when the 386’s mere 275,000 transistors on a 1 µm process would already be ancient history.

Want to dive even deeper into the 386? This isn’t the first time [Ken] has looked at the iconic chip.

Mockup of a printed copy of the Little OS Book

One Book To Boot Them All

Somewhere in the universe, there’s a place that lists every x86 operating system from scratch. Not just some bootloaders, or just a kernel stub, but documentation to build a fully functional, interrupt-handling, multitasking-capable OS. [Erik Helin and Adam Renberg] did just that by documenting every step in The Little Book About OS Development.

This is not your typical dry academic textbook. It’s a hands-on, step-by-step guide aimed at hackers, tinkerers, and developers who want to demystify kernel programming. The book walks you through setting up your environment, bootstrapping your OS, handling interrupts, implementing virtual memory, and even tackling system calls and multitasking. It provides just enough detail to get you started but leaves room for exploration – because, let’s be honest, half the fun is in figuring things out yourself.

Completeness and structure are two things that make this book stand out. Other OS dev guides may give you snippets and leave you to assemble the puzzle yourself. This book documents the entire process, including common pitfalls. If you’ve ever been lost in the weeds of segmentation, paging, or serial I/O, this is the map you need. You can read it online or fetch it as a single 75-page PDF.

Mockup photo source: Matthieu Dixte

Faster Integer Division With Floating Point

Multiplication on a common microcontroller is easy. But division is much more difficult. Even with hardware assistance, a 32-bit division on a modern 64-bit x86 CPU can run between 9 and 15 cycles. SIMD (single instruction multiple data) instruction sets like AVX or NEON, which are used for array processing, often don’t offer integer division at all (although the RISC-V vector extensions do). However, many processors support floating point division. Does it make sense to use floating point division to replace simpler integer division? According to [Wojciech Mula] in a recent post, the answer is yes.

The plan is simple: cast the 8-bit numbers into 32-bit integers and then to floating point numbers. These can be divided in bulk via the SIMD instructions and then converted in reverse to the 8-bit result. You can find several code examples on GitHub.
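As a rough scalar sketch of that plan (not [Wojciech Mula]’s SIMD code), the snippet below divides two 8-bit values by converting them to single-precision floats, dividing, and truncating the result. Python floats are doubles, so the hypothetical helper `to_f32` rounds values to 32-bit precision via the `struct` module to mimic what a SIMD lane would hold; `div_u8_via_float` is likewise a name invented for this example.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def div_u8_via_float(a: int, b: int) -> int:
    """Divide two unsigned 8-bit integers using floating point division.

    Scalar illustration only; a real implementation would do the same
    convert / divide / convert-back steps on whole vectors at once.
    """
    if not (0 <= a <= 255 and 1 <= b <= 255):
        raise ValueError("expected 0 <= a <= 255 and 1 <= b <= 255")
    fa = to_f32(float(a))        # 8-bit -> (32-bit int) -> float
    fb = to_f32(float(b))
    q = to_f32(fa / fb)          # single-precision quotient
    return int(q)                # truncate back to an integer


# Check every possible 8-bit dividend and non-zero divisor
# against ordinary integer division.
mismatches = [
    (a, b)
    for a in range(256)
    for b in range(1, 256)
    if div_u8_via_float(a, b) != a // b
]
print(len(mismatches))
```

The exhaustive check is cheap here because there are only 256 × 255 input pairs. Truncation is safe for this range because a non-integer quotient of two 8-bit values is always at least 1/255 away from the next integer, far more than single-precision rounding error.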


Intel Terminates X86S Initiative After Formation Of New Industry Group

Although the world of the X86 instruction set architecture (ISA) and related ecosystem is often accused of being ‘stale’ and ‘bloated’, we have seen a flurry of recent activity that looks to shake up and set the future course for what is still the main player for desktop, laptop and server systems. Via Tom’s Hardware comes the news that the controversial X86S initiative is now dead and buried. We reported on this proposal when it was first announced and a whitepaper released. This X86S proposal involved stripping 16- and 32-bit features and rings 1 and 2, as well as a host of other ‘legacy’ features.

This comes after the creation of a new x86 advisory group that brings together Intel, AMD, as well as a gaggle of industry giants ranging from HP and Lenovo to Microsoft and Meta. The goal here appears to be to cooperate on any changes and new features in the ISA, which is where the unilateral X86S proposal would clearly have been a poor fit. This means that while X86S is dead, some of the proposed changes may still make it into future x86 processors, much like AMD’s 64-bit extensions to the ISA did, except this time it’d be done in cooperation.

In an industry where competition from ARM especially is getting much stronger these days, it seems logical that x86-oriented companies would seek to cooperate rather than compete. It should also mean that for end users things will get less chaotic as a new Intel or AMD CPU will not suddenly sneak in incompatible extensions. Those of us who remember the fun of the 1990s when x86 CPUs were constantly trying to snipe each other with exclusive features (and unfortunate bugs) will probably appreciate this.

Reverse-Engineering The AMD Secure Processor Inside The CPU

On an x86 system, the BIOS used to be the first part of the system to become active, along with basic CPU core functionality. That changed when Intel introduced its Management Engine (IME) and AMD its AMD Secure Processor (AMD-SP). These are low-level, trusted execution environments. In the case of AMD-SP, this involves a Cortex-A5 ARM processor that, together with the Cryptographic Co-Processor (CCP) block in the CPU, performs basic initialization functions that would previously have been associated with the (UEFI) BIOS, such as DRAM initialization, as well as loading encrypted (AGESA) firmware from external SPI Flash ROM. Only once the AMD-SP environment has run through all the initialization steps will the x86 cores be allowed to start up.

In a detailed teardown by [Specter] over at the Dayzerosec blog, the AMD-SP’s elements, its memory map and its integration into the rest of the CPU die are detailed, with a follow-up article covering the workings of the CCP. The latter is used by the AMD-SP and is also part of the cryptography hardware acceleration ISA offered to the OS. Security researchers are interested in the AMD-SP (and IME) because of the fascinating attack vectors. The IME has been the most targeted, but the AMD-SP has its own vulnerabilities, including in related modules, such as an injection attack against AMD’s Secure Encrypted Virtualization (SEV).

Although both AMD and Intel are rather proud of how these bootstrapping systems enable TPM, secure virtualization and so on, their added complexity and presence invisible to the operating system clearly come with some serious trade-offs. With neither company willing to allow a security audit, it seems it’s up to security researchers to do so forcefully.

A 64-bit X86 Bootloader From Scratch

Most people turn on their computer, and it starts the operating system. However, the reality is much more complex, as [Thasso] discovered. Even modern x86 chips start in 16-bit real mode, and there is a bit of fancy footwork required to shift to protected mode and, from there, to full 64-bit long mode. Want to see how? [Thasso] shows us the ropes.

Nowadays, it is handy to develop such things because you don’t have to use real hardware. An emulator like QEMU will suffice. If you know assembly language, the process is surprisingly simple, although there is a lot of nuance and subtlety. The biggest task is setting up appropriate paging tables to control the memory mapping. In real mode, each segment gives access to a fixed 64 K block of memory unless you use some tricks. In protected mode, segments define blocks of memory that can be very small or cover the entire address space. It is possible to set the segments to cover all memory and, more or less, ignore them, but you still have to define them for the switch to protected mode.

In the bad old days, you had more reason to worry about this if you were writing a DOS Extender or using some tricks to get access to more memory. But it is still good to know if you are rolling your own operating system. Why do the processors still boot into real mode? Good question.