Who’s Afraid Of Assembly Language?

This week, [Al Williams] wrote a great thought piece about whether or not it was worth learning an assembly language at all anymore, and when. The comments overflowed, and we’re surprised that so many people basically agree with us: yes. Of course, it’s a Hackaday crowd, but I still didn’t expect the outpouring of love for the most primitive of languages.

Assembly language isn’t really one language, though. Every chip speaks its own dialect. Of course there are similarities: every CPU has an add function, right? But almost no CPU has just one add – there are variants with and without carry, storing and reading from working registers or RAM. And once you start talking about memory access, direct or indirect, the individual architectures of the chips demand different assembly languages.
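To make the "with and without carry" point concrete, here is a minimal sketch of how an 8-bit CPU adds two 16-bit numbers: a plain add on the low bytes, then an add-with-carry on the high bytes. The names `add8` and `add16_on_8bit_cpu` are made up for illustration and don't correspond to any particular chip's mnemonics.

```python
MASK8 = 0xFF  # an 8-bit register can only hold 0..255


def add8(a, b, carry_in=0):
    """Add two 8-bit values plus an optional carry; return (result, carry_out)."""
    total = (a & MASK8) + (b & MASK8) + carry_in
    return total & MASK8, total >> 8


def add16_on_8bit_cpu(x, y):
    """Add two 16-bit values one byte at a time, as an 8-bit CPU would."""
    lo, carry = add8(x & MASK8, y & MASK8)   # plain add: no carry in
    hi, carry = add8(x >> 8, y >> 8, carry)  # add with carry: uses the carry out of the low byte
    return (hi << 8) | lo, carry
```

On real hardware the carry lives in a status flag rather than a return value, which is why most instruction sets carry both flavors of add.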

But still, although the particular ways that CPUs do what they do can be incompatible from a strictly language perspective, they are a lot more similar in terms of the programming idioms that you’ll pick up along the way. Just as learning a set of solid algorithms will help you no matter which higher-level language you use, learning the concepts behind crafting loops and simple memory structures out of raw assembly language will serve you no matter which CPU you choose.

I have only written assembly language for a handful of CPUs, and not much of it at that, but I’ve found the microcontrollers to be the friendliest. So if you want to dip your toes in that water, pick up an AVR or an MSP430. Or maybe even the new hotness – a RISC-V. You’ll find the instruction sets small enough that you have to do most of the work yourself. And that is, after all, the point of learning an assembly language: learning to think like the silicon. If you treat it like a fun puzzle to solve, you’ll probably even enjoy the experience.

[Al]’s original question was when you should learn an assembly language: before or after a higher-level language. For 99% of our readers, I’d say the answer is right now.

Ask Hackaday: Learn Assembly First, Last, Or Never?

A few days ago, I ran into an online post where someone pointed out the book “Learn to Program with Assembly” and asked if anyone had ever learned assembly language as a first programming language. I had to smile because, if you are a certain age, your first language may well have been assembly, even if it was assembly for machines that never existed.

Of course, that was a long time ago. These days, if you are over 40, it is more likely that you learned BASIC first. Go younger, and you start skewing towards Java, JavaScript, or even C. It got me thinking, though: should people learn assembly, and if so, when?

A Literate Assembly Language

A recent edition of [Babbage’s] The Chip Letter discusses the obscurity of assembly language. He points out, and I think correctly, that assembly language is more often read than written, yet nearly every assembly language is hampered by obscurity left over from the days when punched cards had 80 columns and a six-letter symbol was all you could manage in the limited memory space of the computer. For example, without looking it up, what does the ARM instruction FJCVTZS do? The instruction’s full name is Floating-point Javascript Convert to Signed Fixed-point Rounding Towards Zero. Not super helpful.

But it did occur to me that nothing is stopping you from writing a literate assembler that is made to be easier to read. First, most C compilers will accept some sort of asm statement, and you could probably manage that with compile-time string construction and macros. However, I think there is a better possibility.

Reuse, Recycle

Since I sometimes develop new CPU architectures, I have a universal cross assembler that is, honestly, an ugly hack, but it works quite well. I’ve talked about it before, but if you don’t want to read the whole post about it, it uses some simple tricks to convert standard-looking assembly language formats into C code that is then compiled. Executing the resulting program outputs the desired machine language into a desired file format. It is very easy to set up, and in the middle, there’s a nice C program that emits machine code. It is not much more readable than the raw assembly, but you shouldn’t have to see it. But what if we started the process there and made the format readable?

At the heart of the system is a C program that lives in soloasm.c. It handles command line options and output file generation. It calls an external function, genasm, with a single integer argument. When that argument is 1, the assembler is in its first pass, and you only need to fill in label values with real numbers. When it is 2, you actually fill in the array that holds the code.

That array is defined in the __solo_info structure (soloasm.h). It includes the size of the memory, a pointer to the code, the processor’s word size, the beginning and end addresses, and an error flag. Normally, the system converts your assembly language input into a bunch of function calls it writes inside the genasm function. But in this case, I want to reuse soloasm.c to create a literate assembly language.
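The two-pass idea described above can be sketched in a few lines. This is a toy illustration in Python, not the actual soloasm code (which is C): the opcodes, the `label` and `emit` helpers, and the two-bytes-per-instruction layout are all invented for the example. Only the pass-1/pass-2 split comes from the description.

```python
def genasm(pass_no, labels, memory):
    """Toy stand-in for a generated genasm function; called once per pass."""
    addr = 0

    def label(name):
        if pass_no == 1:
            labels[name] = addr  # pass 1: only record where the label lands

    def emit(opcode, operand=0):
        nonlocal addr
        if pass_no == 2:
            # pass 2: every label is known, so forward references resolve
            value = labels[operand] if isinstance(operand, str) else operand
            memory[addr] = opcode
            memory[addr + 1] = value
        addr += 2  # both passes must advance the address identically

    # The "program": a jump to a label that is defined later
    emit(0x10, "done")  # hypothetical JMP done (forward reference)
    emit(0x01, 42)      # hypothetical LOAD 42
    label("done")
    emit(0xFF)          # hypothetical HALT


def assemble(size=16):
    """Run both passes and return the label table and the code array."""
    labels, memory = {}, [0] * size
    for pass_no in (1, 2):
        genasm(pass_no, labels, memory)
    return labels, memory
```

The first pass exists only so that the jump to `done` can be encoded before the assembler has "seen" where `done` is. That is the whole reason for running the same function twice.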

It Isn’t WebAssembly, But It Is Assembly In Your Browser

You might think assembly language on a PC is passé. After all, we have a host of efficient high-level languages and plenty of resources. But there are times you want to use assembly for some reason. Even if you don’t, the art of writing assembly language is very satisfying for some people, like an intricate logic puzzle. Getting your assembly language fix on a microcontroller is usually pretty simple, but on a PC there are a lot of hoops to jump through. So why not use your browser? That’s the point of this snazzy 8086 assembler and emulator that runs in your browser. Actually, it is not native to the browser, but thanks to WebAssembly, it works fine there, too.

No need to set up strange operating system environments or link to an executable file format. Just write some code, watch it run, and examine all the resulting registers. You can do things using BIOS interrupts, though, so if you want to write to the screen or whatnot, you can do that, too.

The emulation isn’t very fast, but if you are single-stepping or watching, that’s not a bad thing. It does mean you may want to adjust your timing loops, though. We didn’t test our theory, but we expect this is only real mode 8086 emulation because we don’t see any protected mode registers. That’s not a problem, though. For a learning tool, you’d probably want to stick with real mode, anyway. The GitHub page has many examples, ranging from a sort to factorials. Just the kind of programs you want for learning about the language.

Why not learn on any of a number of other simulated processors? The 8086 architecture is still dominant, and even though x86_64 isn’t exactly the same, there are a lot of commonalities. Besides, you have to pretend to be an 8086, at least through part of the boot sequence.

If you’d rather compile “real” programs, it isn’t that hard. There are some excellent tutorials available, too.

Assembly Language 80’s Minicomputer Style

In the days before computers usually used off-the-shelf CPU chips, people who needed a CPU often used something called “bitslice.” The idea was to have a building block chip that needed some surrounding logic and could cascade with other identical building block chips to form a CPU of any bit width that could do whatever you wanted to do. It was still harder than using a CPU chip, but not as hard as rolling your own CPU from scratch. [Usagi Electric] has a Centurion, which is a 1980s-vintage minicomputer based on a bitslice processor. He wanted to use it to write assembly language programs targeting the same system (or an identical one). You can see the video below.
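As a rough illustration of the cascading idea, here is a sketch that chains identical 4-bit adder "slices" into an adder of any width by passing the carry from one slice to the next. The 4-bit width and the function names are assumptions made for the example. A real bitslice part does far more than add, and this is not a model of the Centurion's CPU.

```python
SLICE_BITS = 4
SLICE_MASK = (1 << SLICE_BITS) - 1


def slice_add(a, b, carry_in):
    """One building block: add two 4-bit values, return (result, carry_out)."""
    total = (a & SLICE_MASK) + (b & SLICE_MASK) + carry_in
    return total & SLICE_MASK, total >> SLICE_BITS


def cascaded_add(x, y, n_slices):
    """Chain n_slices identical slices into an adder n_slices * 4 bits wide."""
    result, carry = 0, 0
    for i in range(n_slices):
        shift = i * SLICE_BITS
        # each slice sees only its own 4 bits plus the carry from its neighbor
        nibble, carry = slice_add(x >> shift, y >> shift, carry)
        result |= nibble << shift
    return result, carry
```

Four slices give a 16-bit adder, and six would give a 24-bit one, without changing the building block.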

Truthfully, unless you have a Centurion yourself, the details of this are probably not interesting. But if you have wondered what it was like to code on an old machine like this, you’ll enjoy the video. Even so, the process isn’t quite authentic since he uses a more modern editor written for the Centurion. Most editors from those days were more like CP/M ed or DOS edlin, which were painful, indeed.

The target program is a hard drive test, so part of it isn’t just knowing assembly but understanding how to interface with the machine. That was pretty common, too. You didn’t have a lot of help from canned routines in those days. For example, it was common to read an entire block from a hard drive, tape, or drum and have to figure out what part of it you were actually interested in instead of, say, opening a file and reading a stream of characters.
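To give a feel for block-oriented I/O, the sketch below reads one whole block from a stand-in "drive" and then picks a single record out of it by offset. The 512-byte block, the 32-byte record layout, and the helper names are all assumptions made up for the example. Nothing here reflects the Centurion's actual disk format.

```python
import io
import struct

BLOCK_SIZE = 512            # assumed block size
RECORD_FORMAT = "<H30s"     # assumed record: 16-bit ID plus a 30-byte name
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 32 bytes, so 16 records per block


def read_block(device, block_no):
    """Read one entire block; the device gives you nothing smaller."""
    device.seek(block_no * BLOCK_SIZE)
    block = device.read(BLOCK_SIZE)
    if len(block) != BLOCK_SIZE:
        raise IOError("short read")
    return block


def record_from_block(block, index):
    """Dig the record you actually wanted out of the raw block."""
    offset = index * RECORD_SIZE
    rec_id, name = struct.unpack_from(RECORD_FORMAT, block, offset)
    return rec_id, name.rstrip(b"\x00").decode("ascii")


def make_fake_drive(n_blocks=2):
    """Build an in-memory 'drive' full of numbered records for the demo."""
    per_block = BLOCK_SIZE // RECORD_SIZE
    data = b"".join(
        struct.pack(RECORD_FORMAT, i, f"REC{i}".encode("ascii"))
        for i in range(n_blocks * per_block)
    )
    return io.BytesIO(data)
```

Getting record 19 means knowing it sits in block 1 at index 3, reading all 512 bytes, and throwing most of them away: `record_from_block(read_block(drive, 1), 3)`.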

If nothing else, fast forward over to the 25-minute mark and see what a hard drive from that era looked like. Guess how much storage was on that monster? If you guessed more than 10 MB, you probably didn’t live through the 1980s. We won’t even guess what the price tag was, but you can bet it was spendy.

If you think entering programs like this is painful, try a front panel. That made paper tape seem like a great thing.

ARM Programming By Example

The ARM processor is popping up everywhere. From Raspberry Pis, to phones, to Blue Pill Arduino-like boards, you don’t have to go far to find an ARM processor these days. If you program in C, you probably don’t care much or even think about it. But do you know ARM assembly language? Well, if you look at it one way, it can’t be too hard. The CPU only has about 30 distinct operations — that’s why it is called RISC. Of course, sometimes fewer instructions actually make things more difficult. But you can get a great starting tutorial with the 21 programs on the ARM Assembly by Example website.

You need a 32-bit ARMv6 or better, so a Raspberry Pi will work here. The compiler, of course, is gcc and all the associated tools. If you have the right hardware, there are sections on using the floating point unit and the NEON co-processor, too.

Building MS-DOS From Scratch Like It’s 1983

Building a complete operating system by compiling its source code is not something for the faint-hearted; a modern Linux or BSD distribution contains thousands of packages with millions of lines of code, all of which need to be processed in the right order and the result stored in the proper place. For all but the most hardcore Gentoo devotees, it’s way easier to get pre-compiled binaries, but obviously someone must have run the entire compilation process at some point.

What’s true for modern OSes also holds for ancient software such as MS-DOS. When Microsoft released the source code for several DOS versions a couple of years ago, many people pored over the code to look for weird comments and undocumented features, but few actually tried to compile the whole package. But [Michal Necasek] over at the OS/2 Museum didn’t shy away from that challenge, and documented the entirely-not-straightforward process of compiling DOS 2.11 from source.

The first problem was figuring out which version had been made available: although the Computer History Museum labelled the package simply as “MS-DOS 2.0”, it actually contained a mix of OEM binaries from version 2.0, source code from version 2.11 and some other stuff left from the development process. The OEM binaries are mostly finished executables, but also contain basic source code for some system components, allowing computer manufacturers to tailor those components to their specific hardware platform.

Compiling the source code was not trivial either. [Michal] was determined to use period-correct tools and examined the behaviour of about a dozen versions of MASM, the assembler likely to have been used by Microsoft in the early 1980s. As it turned out, version 1.25 from 1983 produced code that most closely matched the object code found in existing binaries, and even then some pieces of source code required slight modifications to build correctly. [Michal]’s blog post also goes into extensive detail on the subtle differences between Microsoft-style and IBM-style DOS, which go deeper than just the names of system files (MSDOS.SYS versus IBMDOS.COM).

The end result of this exercise is a modified DOS 2.11 source package that actually compiles to a working set of binaries, unlike the original. And although this does not generate any new code, since binaries of DOS 2.11 have long been available, it does provide a fascinating look into software development practices in an age when even the basic components of the PC platform were not fully standardized. And don’t forget that even today some people still like to develop new DOS software.