From High Level Language To Assembly

If you cut your teeth on Z-80 assembly and have dabbled in other assembly languages, you might not find much mystery in creating programs using the next best thing to machine code. However, if you have only used high level languages, assembly can be somewhat daunting. [Shikaan] has an introductory article aimed at getting you started at the “hello world” level of x86-64 assembly language. The second part is already up, too, and covers control structures.
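The linked tutorials aren’t reproduced here, but for a taste of what the “hello world” level looks like, here is a minimal Linux x86-64 example in NASM syntax (our own sketch, not code from [Shikaan]’s article):

```
; hello.asm -- minimal Linux x86-64 "hello world" (NASM syntax)
; assemble:  nasm -f elf64 hello.asm && ld -o hello hello.o
section .data
msg:    db  "hello, world", 10      ; string plus a newline
len:    equ $ - msg                 ; length computed by the assembler

section .text
global _start
_start:
    mov rax, 1          ; syscall number 1 = write
    mov rdi, 1          ; fd 1 = stdout
    mov rsi, msg        ; buffer address
    mov rdx, len        ; byte count
    syscall
    mov rax, 60         ; syscall number 60 = exit
    xor rdi, rdi        ; exit status 0
    syscall
```

No libc and no runtime: two system calls and you are done, which is much of the appeal at this level.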

You can argue that you may not need to know assembly language these days, and we’ll admit it’s certainly not as important as it used to be. However, there are unusual cases where you really need either the performance or the small footprint that only assembly language can provide. What’s more, it is super useful to be able to read the assembly your high-level tools generate when something goes wrong.

Of course, one of the problems is that each assembly language is different. For example, knowing x86 assembly doesn’t completely transfer to writing ARM instructions. However, in most cases, the general concepts apply, and it is usually fairly easy to learn your second, third, or fourth instruction set.

We’ve had our own tutorials on this topic. You can also debate if you should learn assembly first or wait, although in this case, the audience is people who waited.

24 thoughts on “From High Level Language To Assembly”

  1. I don’t know about others, but I found compilers to be pretty darn optimised now.
    Bear in mind that I’m not a coder by trade, so no expert by any means, but the last time I wrote some inline asm was in Turbo Delphi, to try optimising the calculation of whether a point is a member of a fractal set. The first few attempts were SLOWER than Pascal compiled code, and it took me a couple of hours of optimising before I started being faster in asm, and then the gain was only marginal. So if people don’t bother, I can understand.
    Also, not great for people who want to write platform independent code obviously.
    Then again, some others might have examples where asm did make a significant improvement, presumably because they wrote a large chunk of code in asm, and they are just much better than me (I never learnt asm, and my programming teaching was very sparse as a mech eng).

    1. Yeah, this is my experience. And I am a software engineer by trade. Even if you are in need of more optimized code, you don’t drop to assembly, but use compiler intrinsics and still leverage the compiler’s vast intelligence about register allocation and instruction reordering.

      The only times I’ve needed ASM are for some really low level things, usually solved with one or two lines of inline ASM.

      (Note: This is for modern machines; if you go to vintage hardware, the compilers aren’t as good at handling that. SDCC vs asm Z80 can make a huge difference)

      1. I, too, was a software engineer. I have spent decades working in assembler on 4-bit and 8-bit real-time systems. A lot of these systems did not have a UART and I had to roll my own. Having later access to the C language (and UARTs) came as a great relief.

        I once had to refactor a function that used interrupts to receive a radio transmission and was intermittently missing a message. It was coded in C. After rewriting the function I reviewed the assembly code output (the compiler went straight to binary, but had an intermediate assembler output listing) and could not improve it anywhere. The rewrite worked and improved the reliability, but I never understood how!

        For reference, it was an IAR compiler running on a 16-bit H8 CPU.

    2. Sure, compilers are optimized and there is little need for major efforts in assembly. Even in embedded work this is the case.
      But, I’d argue that the level of understanding one achieves of the low-level workings of the processor is well worth the effort. Also, as mentioned, being able to read the assembly produced by compilers can be handy in troubleshooting and in understanding where the compiler might not be doing what you think it is.
      In addition, when you need to write, for example, comm protocols that involve tight, specific timing it can be invaluable.

      1. I agree, but these days, with machine code being further processed by CPUs into one or more micro-operations, it is not as low level of an understanding as it once was. The CPU reads the μops from a ROM, stacks them up in buffers and then typically executes them out of order to maximise register and cache usage while minimising memory latency. x86/x64 do this, and the real processor under the hood is RISC based!

    3. One reason to understand assembler is because compilers are “pretty darn optimised”. Many people don’t adequately understand the consequences of various C language constructs and compiler flags (I know I don’t adequately know :) ). Hence it is beneficial to see that the generated code is what you expect (and/or want) it to be.

      Another is when single-stepping through code during debugging.

      And, of course, to check that the compiler has done a good enough job combining all the clauses of an I/O operation into a single instruction.

      Besides, knowing how your tools work, what they do and don’t do, is always beneficial.

    4. well, my first introduction to that sort of thing was when i discovered the fortran compiler knew where to plunk some strategic nops to wring the most out of our little vax-11/730.

      and you should see the things itanium compilers do to code. “hmm. he’s reading a longword here, but only ever uses the high-order byte, so i’m’a just use a byte read instead.” no, compiler, that particular longword is on the far end of a vmebus adapter and if you use a byte read you’ll get the wrong byte-swapping mode. gotta keep on your toes.

  2. I agree it’s good to have some knowledge of assembly, even in these modern times, but its application is extremely limited. Apart from (obviously) the small group of gods who design the compilers and the instruction sets for the computers, the next useful application is school projects that study computer architecture. After that, a practical application is being able to read assembly. (Not write it!)

    I once bought a PIC16F84, which was the first cheap uC with flash. Its instruction set was horrible (although there are worse). And mostly because of that I did not do much with it. Then came school, and for some school projects I had to write some assembly. We used an 8085 for that, and it had a fairly easy to understand and decent instruction set. It was good and educational to learn to write some assembly.

    After that I bought the AT90S2313. It was one of my first encounters with Open Source software. I bought that processor because there was a port of GCC available for it. For the first time in history it was easy and affordable to write C code for microcontrollers, and it was wonderful. In those days I also started reading the ASM output of the compiler. GCC normally (always?) works in steps: it first generates assembly, and then assembles it. With a few compiler options you can add the C code as comments to the generated ASM, which makes it easy to examine what the compiler does. Changing your C code a bit so the optimizer can generate better assembly is useful sometimes. One of the horrors (of GCC 20 years ago) was that when a function was called from within an ISR, GCC pushed (and popped) nearly all of the 32 registers of the AVR controller. Such things are good to know (and be able to discover / examine).

    But these days microcontrollers easily run at 100MHz or more, have multiple clock domains, and it all gets very complicated. In my last STM32 program, I set an output bit and immediately after that wrote data to the uart output register, and I saw a delay of a few microseconds; with the thing running at 96MHz, that translates to a few hundred instructions. I examined the assembly and still could not figure it out, because there was nothing in between those two actions in the assembly either. Just now I realize it may be possible that writing to the uart data register triggers an ISR which then runs before the data arrives in the uart hardware.
    And even though I never studied the instruction sets of either the AVR controllers or the ARM Cortex M3, it’s still pretty easy to read and get a good idea what is going on without knowing all the details. For most practical uses this is good enough.

  3. People would be surprised if they knew how much Assembly is needed to keep industries rolling.
    That’s also true about analog electronics…
    And it’s the basis of programming: by learning it, any programmer becomes better at programming, no matter the language one works with.
    And all the pretty high level stuff can do marvels, but it’s the good ol’ assembly that keeps the revolution wheel going on.

  4. It really depends on what hardware – and software – you’re working with. Z80 was a simple device so programming it in assembly was easy. But assembly on a Motorola CPU with Amiga OS was on another level, the OS insides were so clean and organized (in an object-oriented way that was visible even at this lowest level) that it allowed you to open windows and display buttons and textboxes inside them without breaking a sweat – and not to mention using the graphics chips to push blocks of memory back and forth, animate sprites etc. Tried assembly on a 386 afterwards, it was so ugly in comparison that I gave up on it.

  5. i think every programmer should learn assembly, but i don’t know that they should even use it (though i use it for a lot of different reasons). the thing about assembly is that if you know it then you can imagine how the compiler and the cpu fit together. you know, i think every computer science student should study compilers and CPU implementation a little bit too :)

    i want to take this opportunity to complain about the worst thing in assembly: src,dst vs dst,src. different platforms make different choices. different assemblers make different choices on the same platform (“AT&T style” vs “Intel style”), and some assemblers support both with like a “.intel_syntax” statement. some platforms seem very consistently dst,src within themselves and then you get to the STORE instruction and it’s following the “reg,mem” convention from the LOAD instruction even though now that comes across as src,dst. every now and then i run into the torture where different stages of the same compiler to assembler to disassembler pipeline use different representations for the same code.

    i’m partial to dst,src but mostly i just wish i never had that moment of disorientation whenever i start to read asm in a different context. there are a ton of quirks in the world but being unconfident about the meaning of “mov %ecx,%ebx” is a serious handicap to understanding the rest of it, and it never seems to go away for me…if i switch around enough to become disoriented then i start distrusting all of the platforms. out of all the assembly languages i’ve used, there’s only one i know well enough that my confidence hasn’t eroded from dealing with x86 assembly in particular.
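To make the two conventions from this comment concrete, here is the same register-to-register move in both syntaxes (GNU as accepts either; the `.intel_syntax noprefix` directive switches modes):

```
# AT&T syntax (GNU as default): source first, destination second
mov %ecx, %ebx          # copies ECX into EBX

; Intel syntax (NASM/MASM, or GNU as after ".intel_syntax noprefix"):
; destination first, source second -- the same instruction as above
mov ebx, ecx
```

So `mov %ecx,%ebx` reads “ECX into EBX” — exactly the kind of thing that trips you up after a day spent in Intel-syntax listings.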

    1. Agreed that every programmer should learn assembly! We HAD to take VAX assembly as part of our CS degree. It was an eye opener. I then learned Z-80, x86, x86-64, and later ARM. Now that I have a Pico 2, I will start kicking around the RISC-V assembly instruction set as well.

      I find it ‘fun’ once in a while to write assembly. And not because I can write better code than a C compiler or it makes me more productive… It’s just ‘fun’ once in a while to make a utility or text game in assembly (and without using ready-built libraries)… Write your ‘own’ for a lean, mean application.

    2. I’m afraid to say that ship has sailed.
      I’m an EE; assembly came third, after Fortran and at the same time as Basic, in high school.
      Might have touched some APL and even some COBOL (will never confess IRL) in those first years.

      But we’re living in the age of Javascript…on the server. (spit).
      There are industries existing to provide the children with their expected tools wherever they go.
      Code monkeys can throw JS mixed with Python at you, just like a zoo ape throws its shit. (Ewww, you got node in your hair.)

      Kids should learn C first.
      JS (and the libraries, god the libraries…) has rotted most of their brains.
      Only those that get pointers can handle assembler.

      1. Back in my day we used Basic only in High School as the introduction to programming. In College the CS learning language was Pascal, used throughout. There was a class called Programming Landscape which introduced us to Cobol, Fortran, Lisp and probably a few other languages I’ve forgotten. Assembly was a separate class. I learned ‘C’ on my own as I saw it was being used out in the industry. As Pascal used pointers, C was an easy transition. Paid off, as my first 15 years of career were working with ‘C’ and Assembly in Real Time Systems for SCADA applications. Turbo Pascal and eventually Delphi, and Borland C/C++, gcc, were used on the non-realtime side.

        You’re probably right on the ‘ship sailing’. I notice CS majors now at my old college don’t have to take Physics, Chem, or much math. I was one class away from a Math degree when I graduated. Wasn’t going back just for that ‘one’ class though…. Had to take all the Physics classes offered too. That weeded out a lot of would-be CS majors. Thinking back, I believe only about 1/3 of freshman CS majors graduated. Now they have a ‘bunch’ of different tracks (focuses) and call them CS degrees (networking, security, software engineering, control systems, business, etc.) so not quite the same …

      2. As for C being first, I would make it the second language. Python first, as it makes programming fun. Sort of ‘whet the appetite’ (basically how Basic was used in my day — turned me on to programming). Then focus on ‘C’ if you are serious about programming and want to continue in the CS world, which forces you to think a bit about architecture and data structures (linked lists, double linked lists, red-black trees, binary trees, hash tables, etc.). The basics. And finally move on to C++ for OO focus. Or C#, Java…. Languages that ‘hide’ the pointers and data structures…. But now you know when to use a map, list, etc. and why. Assembly would be taught as part of the Computer Architecture class. Working with big endian, little endian, stacks and registers and such… At least that is my opinion….

        That said, I suppose WebAssembly/JavaScript (which I don’t really care for) would have to be a class now, as a LOT of applications are now done in the browser context. Depends on the career path I suppose (those different CS tracks I mentioned) …

        1. By the time the ‘good ones’ get to formal CS or EE education they’ve been coding for 6-8 years or more.
          ‘I had to walk uphill both ways. 10 miles in the snow. Barefoot. Just to get to the card punch.’

          The problem is they’ve mostly been coding in the worst possible tool and have gotten to know its bugs.
          Now they have coders’ Stockholm syndrome. They think they like JS. See no reason not to use it everywhere.

          There is hope. RP nano is simple enough to develop ‘close to metal’ mindset.
          Too bad about the web programmers, but ‘World needs (ditch diggers/JS coders) too’.

          CS has always been watered down ‘EE with computer focus’.
          At least your school once had rigor.
          Better than some.
          Some CS programs are in the business school.

          Never hire those clowns.
          Like people with BAs in a science.
          That’s code. e.g. BA in Chem=failed p-chem 3 times. Given BA and told ‘go away’.
          Just say ‘hell no’.

  6. Performance is one reason to use assembly but reliability is another.

    Compiler quality varies so assembly provides an independent means to validate compilers for applications where accuracy is critical.

    My normal workflow involves prototyping code in C, and then writing key routines in assembly. I end up with many pairs of routines that should be functionally equivalent. I unit test them by randomly generating test data and feeding that as input to each variant of a routine, and then compare the output and measure the runtime.

    In general I find that I can’t trust the competence of compiler writers to properly do optimization, so I never enable optimization as a C compiler option. I rely instead on piecewise optimization through assembly routines.

  7. You’re correct Al. Hobbyists rarely need to bother with assembly these days. If you’re designing a professional battery powered product though, it’s the difference between a couple AA batteries and a single coin cell.

  8. I’ve written and analysed many whole programs in assembly, and there’s always been the argument that assembly is more optimised and faster. This is not always true and depends on the programmer.
    Assembly can be faster if well written and the programmer understands the processor architecture, has a good program architecture and knows the device well.
    However well written C, or whatever language your compiler accepts, can generate just as few instructions with less development time, require less intimate knowledge of the processor and maintains portability to other architectures.
    It’s good to have experienced assembly language as it teaches you what is easy for the processor and what is hard and maybe you shouldn’t do in your higher level language.
