Assembly Language For Real

August 25, 2020

We all probably know that for ultimate control and maximum performance, you need assembly language. No matter how good your compiler is, you’ll almost always be able to do better by using your human smarts to map your problem onto a computer’s architecture. Programming in assembly for PCs, though, is a little tricky. A lot of the information about PC assembly language dates back to when assembly was more common, and it covers old modes that, while still available, aren’t the best answer for the latest processors. [Gpfault] has launched a series on 64-bit x86 assembly that tries to remedy that, especially if you are working under Windows.

So far there are three entries. The first covers setting up your toolchain and creating a simple program that does almost nothing. But it is a start.

The second entry talks more about FASM and how to use macros and other features to simplify your programming. In particular, he shows macros that can wrap details like PE tables and calling convention protocols to make things easier. You wind up with a working Hello World program.

The third entry starts work on a fantasy CPU emulator, QBX. This isn’t a bad idea, since emulating a CPU forces you to use many of the host computer’s instructions and doesn’t require any special knowledge beyond what you probably already have if you are trying to learn assembly language.
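
To make the emulator idea concrete, here is a minimal sketch of a fetch-decode-execute loop. It is written in Python rather than assembly so it stays short, and the instruction set, register count, and three-byte encoding are all invented for illustration; none of it is taken from QBX or from the series.

```python
# Hypothetical opcodes for a made-up machine with four 8-bit registers.
# Every instruction is three bytes long: opcode, operand a, operand b.
HALT, LOADI, ADD, DEC, JNZ = range(5)


def run(program, max_steps=10_000):
    """Execute `program` (a list of ints) and return the final register values."""
    regs = [0, 0, 0, 0]
    pc = 0  # program counter: index of the next instruction
    for _ in range(max_steps):
        # Fetch
        op, a, b = program[pc], program[pc + 1], program[pc + 2]
        pc += 3
        # Decode and execute
        if op == HALT:
            return regs
        elif op == LOADI:   # regs[a] = immediate value b
            regs[a] = b & 0xFF
        elif op == ADD:     # regs[a] += regs[b], wrapping at 8 bits
            regs[a] = (regs[a] + regs[b]) & 0xFF
        elif op == DEC:     # regs[a] -= 1, wrapping at 8 bits
            regs[a] = (regs[a] - 1) & 0xFF
        elif op == JNZ:     # jump to address b if regs[a] is not zero
            if regs[a] != 0:
                pc = b
        else:
            raise ValueError(f"unknown opcode {op} at address {pc - 3}")
    raise RuntimeError("step limit reached without HALT")


# Multiply 7 by 6 using repeated addition.
PROGRAM = [
    LOADI, 0, 0,   # address 0:  r0 = 0 (accumulator)
    LOADI, 1, 7,   # address 3:  r1 = 7
    LOADI, 2, 6,   # address 6:  r2 = 6 (loop counter)
    ADD,   0, 1,   # address 9:  r0 += r1
    DEC,   2, 0,   # address 12: r2 -= 1
    JNZ,   2, 9,   # address 15: loop back while r2 != 0
    HALT,  0, 0,   # address 18
]

if __name__ == "__main__":
    print(run(PROGRAM))
```

Even a toy like this touches loads, arithmetic, masking, comparisons, and branches, which is why an emulator makes a reasonable tour of a host instruction set.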

Of course, if you are writing boot code, you need to know all that old-fashioned legacy stuff. We liked [Ben Jojo’s] tutorial for that. If Linux is more your jam, we have an introduction for that, too.

Header: AMD Ryzen x86-64 processor, Fritzchens Fritz / CC0

48 thoughts on “Assembly Language For Real”

John Benham says:

August 25, 2020 at 8:20 pm

Takes me back a bit, I remember writing an assembly I/O routine to allow a PDP 11/23 to talk to a somewhat specialized $1.0M Watkins Johnson radio receiver. The receiver received and reported its frequency in plain binary in multiple words on a parallel interface bus. Debugging an error in binary numbers as large as 18*10^9 proved to be a PITFA. Also those were the days when you had to squeeze your code into 256K of memory so some skills in memory link mapping were required. As I discovered after a long day of pulling my hair out, if you accidentally removed a module name from the linker command line while forgetting to remove the accompanying comma the DEC linker was quite happy to link in a chunk of blank memory for your program to fall into.
Later on I needed some tools for microwave analysis work and wrote a matrix + FFT library using the MASM assembler. I used the BIOS INT 15h, AH=87h call to run the code in extended memory on the 80286/80287 as a way to avoid the 640K memory limit. It came with a memory garbage collection system that was fairly efficient. And it was so much faster than the code you got from the Fortran compilers back then.
Yes, assembler is a useful tool if all else fails!
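
As a rough illustration of why a value near 18*10^9 has to span several words on a 16-bit machine like the PDP-11, the sketch below splits an integer into 16-bit words and reassembles it. The least-significant-word-first order is an assumption made for the example; the receiver's actual bus format is not described above.

```python
WORD_BITS = 16  # the PDP-11 uses 16-bit words


def to_words(value, word_bits=WORD_BITS):
    """Split a non-negative integer into words, least significant word first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    mask = (1 << word_bits) - 1
    words = []
    while True:
        words.append(value & mask)
        value >>= word_bits
        if value == 0:
            return words


def from_words(words, word_bits=WORD_BITS):
    """Reassemble an integer from words given least significant word first."""
    value = 0
    for index, word in enumerate(words):
        value |= word << (index * word_bits)
    return value


if __name__ == "__main__":
    frequency = 18 * 10**9
    print(frequency.bit_length())              # bits needed for the value
    print([hex(w) for w in to_words(frequency)])
```

A value of 18*10^9 needs 35 bits, so it occupies three 16-bit words, and a single wrong bit in any of them is hard to spot by eye.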

Steve L says:

August 25, 2020 at 8:34 pm

“No matter how good your compiler is, you’ll almost always be able to do better by using your human smarts to map your problem onto a computer’s architecture.”

As someone who did a lot of x86 assembler when it was just 808x, I will say that on a modern CISC processor, you might be hard pressed to beat a GOOD C compiler, like Intel’s. The “x86” is probably 100x more complicated than it was 40 years ago with all sorts of timing variability and state dependencies. Maybe you could if you were an obsessive wizard but your productivity would be degraded, even with a big library of macros.

Even assembly has become fat. On a PC. I wrote .COM programs in assembler that people wanted and that did useful things that were nine bytes long (OK, I cheated, called the BIOS). I know, code/data segregation and security makes that size impossible today.

I wonder what the smallest possible size is today for “hello, world” on a PC?

1. Carl Smith says:
  
  August 25, 2020 at 9:13 pm
  
  If you wish to go down a rabbit hole just Google “smallest hello world executable.”
  
2. hammarbytp says:
  
  August 26, 2020 at 2:08 am
  
  I agree. Writing assembly that is more efficient than a good compiler is virtually impossible nowadays. Assembly is useful in edge cases where you need to access a specific register or hardware, but coding efficiency and speed is rarely a good enough reason. Also, assembler tends to make your code less readable and portable.
  
3. Greg A says:
  
  August 26, 2020 at 6:44 am
  
  yeah i second this. compilers do a pretty good job and even assembly isn’t what it used to be. i recently disassembled a .com file i made for DOS back in the day, it was pretty eye-opening.
  
  i want to expand a little on why compiler-generated code is usually better. if you’ve got a talented and determined programmer looking at a small inner loop, i think he will typically be able to do a better job if he’s working in assembly than in C. but your program is going to be a lot larger than just that small inner loop, and no matter how talented your programmer is, he won’t be able to keep up that level of effort over 10,000 lines of assembler code. but he *will* be able to keep up a relatively high effort over 1,000 lines of C code.
  
  the point is, if you have to generate a lot of lines of assembly code, you are forced to make a bunch of performance compromises just to make the code manageable (and even so, it isn’t really manageable). the equivalent C code will generally be *MUCH* more performant, as well as (of course) more manageable.
  
  i’ve had the privilege of interacting with a few large ASM projects and it is really amazing how poorly performing they are. it’s just impossible to keep up that level of effort across such a large volume of bloated unmanageable code.
  
  as an aside, i work with a fancy macro assembler and a lot of times large assembly projects make heavy use of that macro assembler. the result is that it can take a lot longer to compile assembly code than C code!!
  
  1. Dominic says:
    
    August 26, 2020 at 11:19 am
    
    “if you’ve got a talented and determined programmer looking at a small inner loop, i think he will typically be able to do a better job if he’s working in assembly than in C”
    
    Then he will leave for another job or retire, and you will not find anyone to maintain that software anymore.
    
    1. Boz says:
      
      August 26, 2020 at 4:27 pm
      
      not all software needs updating.
      
4. Pat says:
  
  August 26, 2020 at 6:47 am
  
  ” I will say that on a modern CISC processor, you might be hard pressed to beat a GOOD C compiler, like Intel’s.”
  
  OK. There’s a bit of silliness here, especially for x86. x86 processors do so much magic behind the scenes to translate x86 into a more-optimized instruction stream that really, “x86 assembly” isn’t actually “assembly” anymore at all, in the sense that “assembly” is supposed to be ‘what the CPU executes’. And because the CPU vendors are *translating* the instructions (via microcode, register renaming, instruction reordering, branch prediction) they *don’t optimize the CPU* for hand-coded patterns. They optimize it for the code a compiler puts out.
  
  So really, you’re just saying “on an x86 system, you’d be hard pressed to get the same performance by hand-writing instructions for Intel’s black box rather than using Intel’s black box to generate instructions for Intel’s *other* black box.” Not particularly surprising.
  
  It’s also silly because if you’re writing for a CPU (not MCU, so like, GHz-ish ARM or x86) you’re likely running it on an operating system, likely with a bunch of interfacing libraries. You’ve *already given up* performance for convenience at that point. You’re not comparing C and assembly at that point anymore.
  
  But that’s the key: assembly *is not a programming language*. Not in the way we think of them anymore. You can’t write “assembly libraries” that are easily interoperable because there’s no “assembly language calling convention” or “assembly language function linking” or anything like that. Those are things that a high-level language *gives* you. So when you say “you’d be hard pressed to beat a GOOD C compiler” – even on a non-x86, where you *can* actually program in “assembly” – you’re really saying “unless you write the entire freaking system from scratch yourself, you’re hard pressed to beat a C compiler.” Again. Not particularly surprising.
  
  Just to explain what I mean, I’ve actually worked with a Python tool that ‘translates’ C code into assembly for a softcore processor. You can’t use temporary variables, or complex expressions. Functions can’t take parameters or return anything. You’ve got 32 global variables, and that’s it, plus goofy syntax to view those variables as multi-register objects. RAM/IO access is via functions.
  
  It still *looks* like C, and it translates exactly to the processor (duh), but you can’t write libraries that way. You have to start agreeing on *some* minimal calling convention, and memory organization. And as soon as you do that, you do give up performance.
  
  That’s my point. If you hardcoded the entire thing in assembly, start to finish, bare metal, of *course* you’d beat any operating system implementation. Duh. You’d win on startup time alone by huge amounts. But that’s not the metric we actually judge things by.
  
5. RubyPanther says:
  
  August 26, 2020 at 8:42 am
  
  Yeah, this is what I was thinking too.
  
  On ARM, sure, I can do better by hand if I want to spend the time and lose the maintainability. But on x86? No way! That’s not a reasonable assumption in any way, shape, or form, and it ignores the reasons that CISC even still exists.
  
6. Steve says:
  
  August 26, 2020 at 1:29 pm
  
  “I wonder what the smallest possible size is today for “hello, world” on a PC?” I remember back in the late 80’s I believe, someone wrote a binary using only 7 bit instruction mnemonics, making it transmittable as ascii text. One could copy/paste the ascii text from a BBS and save it as a COM file and run it. It worked! I was impressed. :-P This was years before uuencoding mind you. Is uuencode even used anymore?
  
  1. Steve says:
    
    August 26, 2020 at 1:41 pm
    
    “years before uuencoding” Just a correction, as this forum picks nits. :-P I meant to say years before uuencode was ported to DOS. I of course realize it’s been with unix since the early 80’s thus unix-to-unix encoding.
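
For readers who have not met it, the idea behind uuencoding can be sketched with Python's standard `binascii` module: arbitrary bytes go in, and lines made only of 7-bit ASCII characters come out. This shows only the per-line encoding (without the usual `begin`/`end` framing), and it is a different trick from the hand-built printable COM file described above, which was directly executable. The helper names are made up for the example.

```python
import binascii

CHUNK = 45  # binascii.b2a_uu accepts at most 45 bytes per line


def uuencode_lines(data):
    """Encode bytes as a list of uuencoded lines (each ends with a newline)."""
    return [binascii.b2a_uu(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]


def uudecode_lines(lines):
    """Decode a list of uuencoded lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


if __name__ == "__main__":
    payload = bytes(range(256))  # every possible byte value
    for line in uuencode_lines(payload):
        print(line.decode("ascii"), end="")
```

Every encoded byte stays below 0x80, which is what made such text safe to paste through channels that only passed 7-bit characters.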
    
John says:

August 25, 2020 at 10:25 pm

I was just thinking about learning a bit of assembly, then this interesting post popped up. I read the first tutorial and good grief, what a tangled mess assembly is. I’m not sure I want to know assembly that bad.

1. Miroslav says:
  
  August 26, 2020 at 5:39 am
  
  Assembly on a CISC (like modern x86) is difficult. Assembly on RISC (some microcontrollers) can be a poetic experience.
  
  1. Pat says:
    
    August 26, 2020 at 7:12 am
    
    Assembly on any superscalar/out of order processor is fairly pointless. The actual assembly you pass isn’t “really” what gets executed, so you basically need to perform code flow analysis to figure out if what you’re doing is actually faster. Which… means you basically need a compiler.
    
2. jawnhenry says:
  
  August 26, 2020 at 6:03 am
  
  The approach taken here is in no way representative of how the very valuable subject of Assembly Language should be taught. What it IS representative of is how to discourage a complete neophyte from learning any Assembly Language, for any processor.
  [It most certainly does NOT help that the author of this particular approach decided to START with the Assembly Language of–arguably–one of the most arcane and difficult processors one could possibly choose to use for an introduction to this valuable subject. Starting with the most complex example is no way to teach a rudimentary subject.]
  
  Be well aware that any Assembly Language is totally, completely tied to a particular processor, or CPU. Starting out by learning Assembly for a less-difficult, far-less-complex processor should, perhaps, be one of your choices.
  
  Any time one presumes to teach Assembly Language, and then you find that the first requirements of that particular approach are to learn high-level ‘aids’ to “help you”, stop. Look elsewhere. [This is precisely the reason that so many expert books which purported to teach ‘Raspberry Pi Assembly Language’ were such utter failures and disasters. When was the last time you saw ANY mention of a stand-alone, complete RPi Assembly Language program?]
  
  The worst that happens when this approach is taken is shown, writ bold, here: people are thoroughly discouraged, and most will probably never make another attempt to learn the subject, for ANY processor, because the one thing they’ve already been taught (by this technique) is how hard Assembly Language is.
  
  The learning of any subject can be made either relatively easy or relatively hard, depending only on the approach taken as to how it is taught.
  
  1. Pat says:
    
    August 26, 2020 at 10:49 am
    
    Why are we pretending that “assembly language” is a thing?
    
    x86 assembly is not ARM assembly, which is not ARM Thumb, which is not RISCV assembly, which is not PIC assembly, which is not AVR assembly, which is not 8051 assembly.
    
    I’m completely fluent in the assembly language of several processors, some enough that I don’t even need mnemonics. Doesn’t help me in the slightest understanding x86 or ARM assembly. Hell even if we forced all of them to start using similar syntax it wouldn’t help, because the entire reason to *use* assembly is to use all the features of the processor.
    
    Why do we bother calling them the same thing? Why don’t we just call it “x86 language”, “ARM language”, “ARM Thumb language”. That’s what they *are*, after all. Pretending that they have *any* relationship to each other just leads to total silliness like this.
    
3. a Jaded Hobo says:
  
  August 26, 2020 at 10:34 am
  
  Get yourself something small and 8-bit like an Atmel ATtiny13. On such devices programming in assembly is fun and rewarding (and often more understandable than C).
  
  1. nbunnell776 says:
    
    August 26, 2020 at 11:03 am
    
    Second that! I taught myself on the ATtiny45/85s and am getting into some light ARM stuff now. I hear the low end PICs are simple and fun too
    
Xeon says:

August 25, 2020 at 10:52 pm

ASM has always been the most powerful way to code anything.
Hard, Fast and a lot more real than any compiled language.

1. RubyPanther says:
  
  August 26, 2020 at 8:47 am
  
  On some platforms, but this is about x86 ASM.
  
  Your ASM instructions get ignored by this processor, and it does other stuff it wants to instead. ASM is no more “real” than anything else on these processors.
  
  If you want to write code that is “hard” and “real” you better have a processor that honors your instructions!
  
  I can write embedded Ruby on RISC that is more “hard” and more “real” than ASM on CISC. If I turn off compiler optimizations I can even know exactly how many cycles each line of Ruby will take. You can’t know that using ASM on CISC.
  
  1. Pat says:
    
    August 26, 2020 at 7:15 pm
    
    It’s not “RISC” vs “CISC.” This isn’t the 1990s.
    
    Any superscalar/out-of-order processor throws away your exact instructions and does what it wants, so long as the end result is architecturally the same. That’s the entire point. You say “move r7 to r3”, it says “ehh…. don’t need to ACTUALLY do that” and ignores it. As soon as a processor is superscalar/out of order, you will have no idea how long each instruction will take without doing *some* code flow analysis.
    
    It gets super bad with branch prediction, obviously, but the entire point of a superscalar processor is that it can look at *groups* of instructions and say “yeah, no, I’ll do this instead, that’s fine.”
    
Cyna says:

August 25, 2020 at 11:25 pm

While I agree that x86 ASM is not as common today (especially when most programs rely on JIT), ASM for microcontrollers is still highly relevant. Not for everything, of course, but for special cases. One example is when I tried to get gcc to remove enough fat that a 16 MHz AVR (obligatory in that case) could meet the specification for the WS2812 (and meet it strictly). Stripping out most of the stack/frame calls while interleaving some memory and arithmetic instructions in critical places did the trick, but it would have been impossible with pure C (again, in this case: you should always optimize for the specific conditions after profiling and consulting Mr. Knuth).
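
A back-of-the-envelope calculation shows why every instruction counts here. The sketch below converts pulse widths into clock cycles at 16 MHz. The 16 MHz figure comes from the comment; the bit period and high times are assumptions based on values commonly quoted for WS2812-style LEDs, so check the datasheet for the actual part before relying on them.

```python
F_CPU_HZ = 16_000_000  # 16 MHz AVR clock, as stated in the comment

# ASSUMPTION: nominal timings in nanoseconds, as commonly quoted for
# WS2812-style LEDs. They are illustrative, not taken from the comment.
BIT_PERIOD_NS = 1250
HIGH_TIME_NS = {"0 bit": 350, "1 bit": 700}


def cycle_ns(f_cpu_hz=F_CPU_HZ):
    """Duration of one CPU clock cycle in nanoseconds."""
    return 1e9 / f_cpu_hz


def ns_to_cycles(ns, f_cpu_hz=F_CPU_HZ):
    """Number of clock cycles (possibly fractional) that fit into `ns`."""
    return ns * f_cpu_hz / 1e9


if __name__ == "__main__":
    print(f"one cycle: {cycle_ns()} ns")
    print(f"cycles per bit: {ns_to_cycles(BIT_PERIOD_NS)}")
    for name, ns in HIGH_TIME_NS.items():
        print(f"{name} high time: {ns_to_cycles(ns):.1f} cycles")
```

Under these assumed figures, one bit lasts about 20 cycles, so a single stray register save or function prologue inside the loop can push the waveform out of tolerance.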

1. Cyna says:
  
  August 25, 2020 at 11:33 pm
  
  And in this case, the 5% rule of thumb actually fit pretty well (mostly due to the number of NOPs required when unrolling the loops).
  
Steve says:

August 25, 2020 at 11:40 pm

When Covid first hit and I was unable to do anything else, as we were under lock down, I wrote (in 8085 assembly) an entire operating system for a 1970’s single board computer (the Rigel computer) since nobody had ever done that before. All that it had was a machine language monitor in rom. So I expanded the ram using modern memory, expanded the rom region using flash memory, wrote the operating system and flashed it in. I then wrote a C compiler for the computer (C98 version) and wrote the first C compiled and run on the SBC ever. Thinking how awesome this was (since I haven’t even powered that computer up in over 30 years) I went out on the net to see if anyone else had a Rigel and wanted the software. Nothing. No interest. :-/ But it was a fun exercise in retro computing at least. I don’t normally deal much with assembly these days. :-P

1. Cyna says:
  
  August 25, 2020 at 11:50 pm
  
  I assume you meant C89?
  
  1. Steve says:
    
    August 25, 2020 at 11:52 pm
    
    Of course. :-P Transposition, sorry. :-P
    
2. open_source? says:
  
  August 26, 2020 at 1:37 am
  
  Very interesting, do you have the code and notes published somewhere?
  
3. nbunnell776 says:
  
  August 26, 2020 at 9:37 am
  
  Sweet! I’d love to see the documentation if you have any! They’ve got a tip-line around here somewhere.
  
Stephen Walters says:

August 26, 2020 at 12:51 am

FORTH anyone? Quick, compact, mature…
https://www.facebook.com/groups/PROGRAMMINGFORTH

1. really? says:
  
  August 26, 2020 at 1:37 am
  
  Facebook?
  
2. Artenz says:
  
  August 26, 2020 at 10:46 am
  
  The average Forth enthusiast spends more time writing Forth interpreters than actually writing Forth code.
  
scompo says:

August 26, 2020 at 2:07 am

Another interesting resource about assembly is this guy https://www.youtube.com/c/WhatsACreel

1. mj says:
  
  August 26, 2020 at 3:02 am
  
  Thanks! Looks like hours of fun!
  
John says:

August 26, 2020 at 2:26 am

If you don’t write in assembler then you are not writing software; you are using other people’s software in various orders of execution to make it do what you want, in much the same way as you use an Excel spreadsheet.

1. Cyna says:
  
  August 26, 2020 at 3:40 am
  
  And the x86 assembly you write is somehow not further being interpreted/decoded by the RISC core embedded inside the CPU? (mine is)
  
  1. Alexander Wikström says:
    
    August 26, 2020 at 5:56 am
    
    Not wanting to repeat myself.
    But the line between RISC and CISC, and whether or not modern x86 processors are RISC under the hood, is actually not as clear-cut as some might think.
    
    Nor is one superior over the other, it all depends on a lot of factors.
    
    Here is a comment I made on another article explaining the situation in a bit more depth:
    https://hackaday.com/2020/08/12/degrees-of-freedom-booting-arm-processors/#comment-6270448
    
2. nah! says:
  
  August 26, 2020 at 5:10 am
  
  you must be fun at parties
  
3. chango says:
  
  August 26, 2020 at 5:43 am
  
  https://xkcd.com/378/
  
4. Truth says:
  
  August 26, 2020 at 5:46 am
  
  Butterflies https://xkcd.com/378/
  
5. RubyPanther says:
  
  August 26, 2020 at 8:53 am
  
  Now find out that the CPU runs firmware that’s underneath the published interface, and that your ASM is also just software.
  
  And that when you go even deeper, and get to the hardware, you’re just selecting which circuits you want, when, much the same as when you use a spreadsheet.
  
  And then there are more layers that are like that, where at each layer you’re simply selecting between choices provided by other engineers in the past. This is what “on the shoulders of giants” means.
  
  1. Alexander Wikström says:
    
    August 26, 2020 at 10:47 am
    
    Standing on the shoulders of giants is indeed something that most people will need to do.
    
    For example, I have been developing an architecture for a while, and it uses fractions instead of floating point. But when handling fractions, it is rather nice to have an instruction for finding the greatest common divisor of two numbers.
    
    Now, I don’t want to fiddle with trying to find an efficient solution to the problem. Instead, I just use the binary GCD algorithm (Stein’s algorithm). That saves me a ton of time trying to figure it out myself. And the algorithm is proven to be very efficient, to the point where even a more efficient algorithm isn’t going to be a major improvement. A lot of the bit operations in Stein’s algorithm can be done in a single cycle with dedicated hardware, greatly improving its performance. The work needed to implement Stein’s algorithm in hardware is rather trivial to me; figuring out the maths behind the algorithm, though, is “black magic” as far as I am concerned.
    
    By using an off the shelf solution, one can instead focus on the larger things within one’s project, and thereby more efficiently use one’s time. It is a collaboration, though, at times some people just “pick” a thing thinking it is great even if it might not be good at all for one’s application. (Like always using floating point regardless of what one is doing with it…)
    
    So obviously one shouldn’t be careless when choosing what off the shelf stuff one uses in one’s project, but in general, there is nothing wrong with building on a foundation made by others.
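
For reference, the binary GCD algorithm mentioned above can be sketched in software as follows. This is a plain Python version using only shifts, subtraction, and parity tests; it says nothing about how the hardware implementation described above is actually built, and `reduce_fraction` is a hypothetical helper added to show the fraction use case.

```python
def binary_gcd(a, b):
    """Greatest common divisor of two non-negative integers (Stein's algorithm)."""
    if a < 0 or b < 0:
        raise ValueError("arguments must be non-negative")
    if a == 0:
        return b
    if b == 0:
        return a

    # Factor out the powers of two that both numbers share.
    shift = 0
    while (a | b) & 1 == 0:
        a >>= 1
        b >>= 1
        shift += 1

    # Make a odd; any remaining factors of two in a are not shared with b.
    while a & 1 == 0:
        a >>= 1

    while b != 0:
        # a is odd, so factors of two in b cannot be common: drop them.
        while b & 1 == 0:
            b >>= 1
        # Both are odd now. Keep the smaller in a; the difference is even.
        if a > b:
            a, b = b, a
        b -= a

    # Restore the shared powers of two.
    return a << shift


def reduce_fraction(numerator, denominator):
    """Reduce a fraction to lowest terms; the denominator must be positive."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    g = binary_gcd(abs(numerator), denominator)
    return numerator // g, denominator // g


if __name__ == "__main__":
    print(binary_gcd(48, 18))
    print(reduce_fraction(6, 8))
```

Because the loop uses no division or modulo, each step maps onto simple shifter and subtractor logic, which matches the point about dedicated hardware.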
    
6. Steve says:
  
  August 26, 2020 at 1:07 pm
  
  “If you don’t write in assembler then you are not writing software” Those of us with grey hair can say things like “assembler is for lazy folks” as we started out programming micros by hex entry on a keypad, byte by byte. Or at least I did anyway, on a Kim-1, my first 6502. Most early micros were poorly supported. When assemblers became available, that made life a whole lot easier. Imagine counting a branch forward or reverse in hex and getting it off by just one byte. That would usually cause a crash. Early micros were VERY frustrating, so I’m not embarrassed using a modern macro assembler, especially where macros are useful (embedded controllers). It makes assembly almost as easy as any early higher level language. :-)
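
The branch arithmetic that used to be done by hand can be sketched like this. On the 6502, a relative branch is two bytes long and its operand is a signed 8-bit offset measured from the address of the next instruction. The function name and the example addresses are made up for illustration.

```python
def branch_offset(branch_addr, target_addr):
    """Return the operand byte for a 6502 relative branch.

    The offset is counted from the instruction that follows the two-byte
    branch and is stored as a signed 8-bit two's-complement value.
    """
    delta = target_addr - (branch_addr + 2)
    if not -128 <= delta <= 127:
        raise ValueError(f"target out of range for a relative branch: {delta}")
    return delta & 0xFF


if __name__ == "__main__":
    # Backward branch: instruction at $0210 jumping back to $0200.
    print(hex(branch_offset(0x0210, 0x0200)))
    # Forward branch: instruction at $0200 jumping ahead to $0210.
    print(hex(branch_offset(0x0200, 0x0210)))
```

Forgetting the "+ 2" or miscounting one byte gives an operand that lands in the middle of an instruction, which is the off-by-one crash described above.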
  
7. L says:
  
  August 26, 2020 at 1:42 pm
  
  Every time anyone talks about assembly, I read through the comments to see if anyone says assembler instead of assembly. I was most definitely correct.
  
  1. Steve says:
    
    August 27, 2020 at 7:17 pm
    
    Assembler is the utility and assembly is the language, you are correct and we are bad. :-P
    
jawnhenry says:

August 26, 2020 at 6:53 pm

“By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality.”–Donald Knuth

By “…machine-oriented language…”, Dr. Knuth means, precisely, exactly, Assembly Language.

********************************

Donald Knuth is, arguably, one of the premier, world-class computer scientists of the 20th and 21st centuries. See

“Donald Knuth”
Wikipedia
https://en.wikipedia.org/wiki/Donald_Knuth

Old Guy says:

August 26, 2020 at 8:14 pm

“[A] simple program that does almost nothing” ? That’s right in my wheelhouse! What I was put on Earth to do. Haven’t played with assemblY since the ’80s.

#WhatADelight

1. Steve says:
  
  August 27, 2020 at 7:15 pm
  
  So what are we talking here? A single no operation (nop) instruction?
  
jawnhenry says:

August 27, 2020 at 2:48 pm

A VERY SIMPLE TEST…

…for all you people who really want to learn Assembly Language:

The simplest test of the quality of the instruction purported to BE given is this: it shouldn’t cost you anything!
That’s right: nothing.
All manufacturers who offer assembly-language-programmable devices offer Assembler programs for their devices for free, or for some nominal sum very close to ‘free’ (these ‘Assemblers’ are also referred to as ‘Macro Assemblers’, but since all support the ‘macro’ capability, manufacturers may or may not use this adjective).

No, and I do mean NO, Assembler (the program) REQUIRES YOU to generate your code with an expensive add-on usually referred to as an “IDE” (“Integrated Development Environment”). You WILL be told by some that you absolutely can NOT get by without purchasing one; that you absolutely ‘NEED’ an IDE. Run away. As fast as you possibly can. Find someone who is really an expert to get your advice from (you just encountered one way to sort out the experts from the other types).

Any Assembler worth its salt will happily accept–and generate PERFECT machine code from–plain English-language code written with any text editor or word processor which has been set to generate (save its output as) pure text, or ‘ASCII’ code. That is the only crucial part: again–make certain that the program you’re using is set to save its output in ‘pure text’, or ASCII format.
This is the only way I have ever written Assembly Language programs all the years I’ve been doing it. Never had a problem. Never will have.

Have a lot of fun; Assembly Language programming IS a lot of fun, to say nothing of being extremely satisfying.
