Sometimes you might need to use assembly sometime to reach your project objectives. Previously I’ve focused more on embedding assembly within gcc or another compiler. But just like some people want to hunt with a bow, or make bread by hand, or do many other things that are no longer absolutely necessary, some people like writing in assembly language.
In the old days of DOS, it was fairly easy to write in assembly language. Good thing, because on the restricted resources available on those machines it might have been the only way to get things to fit. These days, under Windows or Linux or even on a Raspberry Pi, it is hard to get oriented on how to get an assembly language off the ground.
What Do You Need?
Obviously, one thing you need is an assembler. Granted, if you are a true macho hacker you could just use a hex editor to build executable files by hand from your binary notation, but in reality that is probably too much for just about everyone. You also need to understand your CPU architecture, the instruction set, the mnemonics the assembler uses for the instructions, and the interface to the operating system.
In this example, we’ll look at writing code for Linux on a PC. The same ideas will apply to other Linux platforms like the Raspberry Pi. The details will be different for other systems (like the AVR found in an Arduino) but the basic outline of steps will be the same. Of course, some systems don’t have an operating system at all, which makes it both easier and harder. Easier because you don’t have to conform. Harder, because you have to find resources (like a serial port or display) and handle them yourself.
Generally, if you have a C compiler, you probably have an assembler. For gcc, this assembler is named gas or as. That’s usable, but they aren’t always as friendly as you would like. On PC-based Linux, you might consider using the netwide assembler (nasm) that you can install with your package manager.
Even for Linux, you have to consider the platform. In my case, I’m using a 64-bit Intel/AMD PC. But you might be using a 32-bit version or running on ARM (or any other CPU Linux supports). There is even a 32-bit interface for 64-bit Linux (x32), if you are interested in that. The second order of business, then, is to figure out what the CPU architecture looks like.
To start, you need to understand the registers (like the diagram below). Of course, that diagram assumes you have some idea what those registers are for, so you’d have to do more digging in a reference manual, a data sheet, or a tutorial aimed at your specific processor.
That can be daunting on any modern CPU. You can always find the data sheets, but it can take a long time to parse through one of those. There are also plenty of resources online, as you might expect. If you are working with something like an AVR, the data sheet is pretty straightforward. The more complex processors usually have a lot of details you don’t need to know, so that’s part of the problem with the data sheets. For example, an Intel processor has a lot of features aimed at people writing operating systems. Unless you are writing an operating system or programming bare metal, you don’t care about that.
Also, unless you care about bare metal programming, you need to know the application binary interface or ABI that defines how the program starts and can call libraries and services. For example, for 64 bit Linux, you can read this document.
If you start reading data sheets and ABI documents, you might feel like giving up pretty quickly. However, it really isn’t that bad. You can find many sites (like this one) that you can use to find out the particulars of calling system calls.
For example, system call with code 1 writes to a file. If you look it up on the site above, you can double-click the corresponding row and see how you need to make the call. This image is an example of how to use that table.
In this case, you’ll see that %rdi gets the file descriptor, %rsi gets a pointer to the data to write, and %rdx is the count of bytes to write. Call 60 is the exit function (you can look it up on the same table). You probably know that file descriptor 1 is the stdout device for a Linux program. So using nasm, a very simple program can look like this:
section .data msg db "Hello Hackaday!",10 msgend equ $ section .text global _start _start: mov rax,1 ; print msg using syswrite (1) mov rdi,1 ; fd 1 mov rsi,msg ; buffer mov rdx,msgend-msg ; count syscall mov rax,60 ; exit mov rdi,0 ; return code syscall
Granted, this is pretty much the simplest program you could write, but it works. A simple nasm command does the assembly:
nasm -f elf64 -o hello.o hello.asm
Then you need to link it just like you would an object file from a C compiler (you could even link in some object files produced from the C compiler, if you wanted to):
ld -o hello hello.o
Go Forth and Assemble
Clearly, you have a lot of details to fill in, but the point is, it doesn’t have to be very hard. If you use 32-bit Linux or ARM, the details will be different (there will be different syscall numbers and registers, for starters).
Just as a reminder, I’m not suggesting that writing entire programs in assembly is a great idea. But it is a challenge and one that can have unexpected benefits. If all you want to do is optimize a few hot spots, you are better off sticking with inline assembly (discussed last time). But if you want to flex your logic and programming muscles, try writing some entire programs in assembly. You might get it to work and you might not, but I’d bet you’ll learn something.