Reverse-Engineering The AMD Secure Processor Inside The CPU

On an x86 system the BIOS is the first part of the system to become active along with the basic CPU core(s) functionality, or so things used to be until Intel introduced its Management Engine (IME) and AMD its AMD Secure Processor (AMD-SP). These are low-level, trusted execution environments, which in the case of AMD-SP involves a Cortex-A5 ARM processor that together with the Cryptographic Co-Processor (CCP) block in the CPU perform basic initialization functions that would previously have been associated with the (UEFI) BIOS like DRAM initialization, but also loading of encrypted (AGESA) firmware from external SPI Flash ROM. Only once the AMD-SP environment has run through all the initialization steps will the x86 cores be allowed to start up.

In a detailed teardown by [Specter] over at the Dayzerosec blog the AMD-SP’s elements, the used memory map  and integration into the rest of the CPU die are detailed, with a follow-up article covering the workings of the CCP. The latter is used both by the AMD-SP as well as being part of the cryptography hardware acceleration ISA offered to the OS. Where security researchers are interested in the AMD-SP (and IME) is due to the fascinating attack vectors, with the IME having been the most targeted, but AMD-SP having its own vulnerabilities, including in related modules, such as an injection attack against AMD’s Secure Encrypted Virtualization (SEV).

Although both AMD and Intel are rather proud of how these bootstrapping systems enable TPM, secure virtualization and so on, their added complexity and presence invisible to the operating system clearly come with some serious trade-offs. With neither company willing to allow a security audit, it seems it’s up to security researchers to do so forcefully.

A 64-bit X86 Bootloader From Scratch

For most people, you turn on your computer, and it starts the operating system. However, the reality is much more complex as [Thasso] discovered. Even modern x86 chips start in 16-bit real mode and there is a bit of fancy footwork required to shift to modern protected mode with full 64-bit support. Want to see how? [Thasso] shows us the ropes.

Nowadays, it is handy to develop such things because you don’t have to use real hardware. An emulator like QEMU will suffice. If you know assembly language, the process is surprisingly simple, although there is a lot of nuance and subtlety. The biggest task is setting up appropriate paging tables to control the memory mapping. In real mode, segments have access to fixed 64 K blocks of memory unless you use some tricks. But in protected mode, segments define blocks of memory that can be very small or cover the entire address space. These segments define areas of memory even though it is possible to set segments to cover all memory and — sort of — ignore them. You still have to define them for the switch to protected mode.

In the bad old days, you had more reason to worry about this if you were writing a DOS Extender or using some tricks to get access to more memory. But still good to know if you are rolling your own operating system. Why do the processors still boot into real mode? Good question.

Comparing X86 And 68000 In An FPGA

[Michael Kohn] started programming on the Motorola 68000 architecture and then, for work reasons, moved over to the Intel x86 and was not exactly pleased by the latter chip’s perceived shortcomings. In the ’80s, the 68000 was a very popular chip, powering everything from personal computers to arcade machines, and looking at its architecture and ease of programming, you can see why this was.

Fast-forward a few years, and [Michael] decided to implement both cores in an FPGA to compare real applications, you know, for science. As an extra bonus, he also compares the performance of a minimal RISC-V implementation on the same hardware, taken from an earlier RISC-V project (which you should also check out !)

Utilizing their ‘Java Grinder’ application (also pretty awesome, especially the retro console support), a simple Mandelbrot fractal generator was used as a non-trivial workload to produce binaries for each architecture, and the result was timed. Unsurprisingly, for CISC architectures, the 68000 and x86 code sizes were practically identical and significantly smaller than the equivalent RISC-V. Still, looking at the execution times, the 68000 beat the x86 hands down, with the newer RISC-V speeding along to take pole position. [Michael] admits that these implementations are minimal, with no pipelining, so they could be sped up a little.

Also, it’s not a totally fair race. As you’ll note from the RISC-V implementation, there was a custom RISC-V instruction implemented to perform the Mandelbrot generator’s iterator. This computes the complex operation Z = Z2 + C, which, as fellow fractal nerds will know, is where a Mandelbrot generator spends nearly all the compute time. We suspect that’s the real reason RISC-V came out on top.

If actual hardware is more your cup of tea, you could build a minimal 68k system pretty easily, provided you can find the chips. The current ubiquitous x86 architecture, as odd as it started out, is here to stay for the foreseeable future, so you’d just better get comfortable with it!

Continue reading “Comparing X86 And 68000 In An FPGA”

A System Board For The 8008

Intel processors, at least for PCs, are ubiquitous and have been for decades. Even beyond the chips specifically built by Intel, other companies have used their instruction set to build chips, including AMD and VIA, for nearly as long. They’re so common the shorthand “x86” is used for most of these processors, after Intel’s convention of naming their processors with an “-86” suffix since the 1970s. Not all of their processors share this convention, though, but you’ll have to go even further back in time to find one. [Mark] has brought one into the modern age and is showing off his system board for this 8008 processor.

The 8008 predates any x86 processor by about six years and was among the first mass-produced 8-bit processors even before the well-known 8080. The expansion from four bits to eight was massive for the time and allowed a much wider range of applications for embedded systems and early personal computers. [Mark] goes into some of the details for programming these antique processors before demonstrating his system board. It gets power from a USB-C connection and uses a set of regulators and level shifters to make sure the voltages all match. Support for all the functions the 8008 needs is courtesy of an STM32. That includes the system memory.

For those looking to develop something like this, [Mark] has also added his development tools to a separate GitHub page. Although it’s always a good idea for those interested in computer science to take a look at old processors like these, it’s not always the easiest path to get original hardware like this, which also carries the risk of letting smoke out of delicate components. A much easier route is to spin up an emulator like an 8086 IBM PC emulator on an ESP32. Want to see inside this old chip? Have a look.

Continue reading “A System Board For The 8008”

BASIC Classroom Management

While we don’t see it used very often these days, BASIC was fairly revolutionary in bringing computers to the masses. It was one of the first high-level languages to catch on and make computers useful for those who didn’t want to (or have time) to program them in something more complex. But that doesn’t mean it wasn’t capable of getting real work done — this classroom management software built in the language illustrates its capabilities.

Written by [Mike Knox], father of [Ethan Knox] aka [norton120], for his classroom in 1987, the programs were meant to automate away many of the drudgeries of classroom work. It includes tools for generating random seating arrangements, tracking attendance, and other direct management tasks as well as tools for the teacher more directly like curving test grades, tracking grades, and other tedious tasks that normally would have been done by hand at that time. With how prevalent BASIC was at the time, this would have been a powerful tool for any educator with a standard desktop computer and a floppy disk drive.

Since most people likely don’t have an 80s-era x86 machine on hand capable of running this code, [Ethan] has also included a docker container to virtualize the environment for anyone who wants to try out his father’s old code. We’ve often revisited some of our own BASIC programming from back in the day, as our own [Tom Nardi] explored a few years ago.

Why X86 Needs To Die

As I’m sure many of you know, x86 architecture has been around for quite some time. It has its roots in Intel’s early 8086 processor, the first in the family. Indeed, even the original 8086 inherits a small amount of architectural structure from Intel’s 8-bit predecessors, dating all the way back to the 8008. But the 8086 evolved into the 186, 286, 386, 486, and then they got names: Pentium would have been the 586.

Along the way, new instructions were added, but the core of the x86 instruction set was retained. And a lot of effort was spent making the same instructions faster and faster. This has become so extreme that, even though the 8086 and modern Xeon processors can both run a common subset of code, the two CPUs architecturally look about as far apart as they possibly could.

So here we are today, with even the highest-end x86 CPUs still supporting the archaic 8086 real mode, where the CPU can address memory directly, without any redirection. Having this level of backwards compatibility can cause problems, especially with respect to multitasking and memory protection, but it was a feature of previous chips, so it’s a feature of current x86 designs. And there’s more!

I think it’s time to put a lot of the legacy of the 8086 to rest, and let the modern processors run free. Continue reading “Why X86 Needs To Die”

X86 ENTER: What’s That Second Parameter?

[Raymond Chen] wondered why the x86 ENTER instruction had a strange second parameter that seems to always be set to zero. If you’ve ever wondered, [Raymond] explains what he learned in a recent blog post.

If you’ve ever taken apart the output of a C compiler or written assembly programs,  you probably know that ENTER is supposed to set up a new stack frame. Presumably, you are in a subroutine, and some arguments were pushed on the stack for you. The instruction puts the pointer to those arguments in EBP and then adjusts the stack pointer to account for your local variables. That local variable size is the first argument to ENTER.

The reason you rarely see it set to a non-zero value is that the final argument is made for other languages that are not as frequently seen these days. In a simple way of thinking, C functions live at a global scope. Sure, there are namespaces and methods for classes and instances. But you don’t normally have a C compiler that allows a function to define another function, right?

Turns out, gcc does support this as an extension (but not g++). However, looking at the output code shows it doesn’t use this feature, but it could. The idea is that a nested function can “see” any local variables that belong to the enclosing function. This works, for example, if you allow gcc to use its extensions:

#include <stdio.h>

void test()
{
   int a=10;
   /* nested function */
   void testloop(int n)
   {
      int x=a;
      while(n--) printf("%d\n",x);
   }
   testloop(3);
   printf("Again\n");
   testloop(2);
   printf("and now\n");
   a=33;
   testloop(5);
}

void main(int argc, char*argv[])
{
   test();
}

Continue reading “X86 ENTER: What’s That Second Parameter?”