Learn ARM Assembly With The Raspberry Pi

We live in a time when you don’t have to know assembly language to successfully work with embedded computers. The typical processor these days has resources that would shame early PCs, and some of the larger ones are getting close to what was a powerful desktop machine only a few years ago. Even so, there are some cases where you really want to use assembly language. Maybe you need more speed. Or maybe you need very precise control over timing. Maybe you just like the challenge. [Robert G. Plantz] from Sonoma State University has an excellent book online titled “Introduction to Computer Organization: ARM Assembly Language Using the Raspberry Pi.” If you are interested in serious ARM assembly language, you really need to check out this book.

If you are more interested in x86-64 assembly and Linux, [Plantz] has you covered there, too. Both books are free to read on the Internet, and you can pick up a printed version of the Linux book for a small payment if you want.

Since these are meant to be college textbooks, they aren’t quick reads, but they also are a lot more detailed than the typical blog post about how to do assembly. Even if you don’t want to read it cover to cover, you might find some of the specific write-ups about debugging and interacting with C code useful.

The Raspberry Pi book is written using a system called PreTeXt, which looks interesting. We liked how the output looked, although it would be handy if you could dump it to a PDF for your book reader.

We were very impressed by the comprehensive nature of both books. We’ve looked at very simple and brief introductions before. We’ve even done our own short takes about Linux and assembly with C.

41 thoughts on “Learn ARM Assembly With The Raspberry Pi”

  1. Sweet! I can retire in less than two years and want to get back into assembly. In my Advanced Physics Lab class in college, we programmed in assembly on Commodore 64 machines. We wired A/D converters directly to the address bus on the CPU (through optical isolators) and got data points every 10 ms.

    The professor shot a BB gun into a wooden block; and we got a data curve of the deceleration. Majoring in Engineering Physics was fun.

    1. Opto-isolators are very slow – their propagation delay is microseconds, if not more. They are not suited for direct interfacing to a CPU bus. If you are not using a separate supply with a separate ground, then an opto-isolator shouldn’t be there.

      There are some fast logic-level/specialized ones these days, but those weren’t available back then.

      1. Nice, but you are talking about a 1 MHz computer with slow bus speeds …… It’s a limiting factor more than the opto-isolators. Inputs back then weren’t as robust as they are today, so you definitely needed opto-isolators.

        You had to learn assembly and then convert it to PEEK and POKE in the Basic routines in order to have any kind of speed. I had a bunch of Assembly routines for my printer because you had no screen resolution to print graphs and such and doing graphics routines was slow too. Actually the printers were more powerful than the C-64 was and through assembly you could also leverage them to do some processing.

        1. You need optoisolators only if you have separate grounds. Otherwise simple resistors, transistors or buffers are the way to go.
          And with a 1MHz CPU clock you can do 100kHz sampling, at least for direct write to memory (and limited by its size). That would be 10µs sampling, not 10ms.

          1. Well. . . resistors and transistors are far more components. We just ran wires to the opto isolator ICs – simple. A 1MHz CPU won’t provide 100kHz sampling. Some commands require several clock cycles. You have to read the address bus, then calculate the next address to store the data, then actually transfer the data to the address, then go back to the address bus for the next data point.

            10 ms was the fastest we could get with the C64 and was plenty fast to get a good curve for decelerating a BB in a block.
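
The disagreement above comes down to how many clock cycles one sample costs. A rough sketch of that arithmetic follows. The four-instruction loop is a hypothetical example, not the code anyone in the thread actually ran, and the cycle counts are the standard 6502 figures assuming no page-crossing penalty and a nominal 1 MHz clock.

```python
# Back-of-the-envelope sample-rate estimate for a tight 6502 polling loop.
# Assumptions (not from the thread): the loop body below and a nominal
# 1 MHz clock; cycle counts are standard 6502 figures with no page crossing.

CLOCK_HZ = 1_000_000  # nominal; a real C64 runs slightly off this

# (instruction, cycles) for a hypothetical read-and-store loop
LOOP = [
    ("LDA port      ; read the A/D value (absolute addressing)", 4),
    ("STA buffer,X  ; store it (absolute,X addressing)", 5),
    ("INX           ; advance the buffer index", 2),
    ("BNE loop      ; branch taken back to the top", 3),
]


def cycles_per_sample(loop):
    """Total CPU cycles spent on one pass through the loop."""
    return sum(cycles for _, cycles in loop)


def sample_rate_hz(loop, clock_hz=CLOCK_HZ):
    """Best-case samples per second if nothing else steals cycles."""
    return clock_hz / cycles_per_sample(loop)


def cycles_available(period_s, clock_hz=CLOCK_HZ):
    """How many CPU cycles fit into one sampling period."""
    return round(period_s * clock_hz)


if __name__ == "__main__":
    n = cycles_per_sample(LOOP)
    print(f"{n} cycles per sample -> {sample_rate_hz(LOOP) / 1000:.1f} kHz")
    print(f"10 us budget: {cycles_available(10e-6)} cycles")
    print(f"10 ms budget: {cycles_available(10e-3)} cycles")
```

Under these assumptions the loop lands in the tens of kilohertz: short of 100 kHz, but far faster than one sample every 10 ms. The 8-bit index also caps this loop at 256 samples, and on real hardware interrupts and video DMA would add jitter unless they are disabled.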

    2. Can you email me? Trying to avoid bots with email address. I am older and working on a Raspberry Pi project for schools, but lack this experience interacting between the device and the output for students. Trying to incorporate assembly language, NFC, displays, etc. Would love to chat live. -Mark

  2. Why would anyone want to program in slooow ASSembly?

    Java is much much faster, and gets faster every time the program runs. It just needs a few GB of ram, a few cores, and a few GB of disk space for the helper libraries and classes.

    ASSembly has all kinds of security risks, while java is safe and secure!

    Why risk crashing your computer, Java is the future.

    1. Different instruction sets. Same principles. AVR is simpler. I have no experience writing assembly code for AVR but I can read assembly output reasonably well because the bulk of it is made up by just a few instructions which I remember without checking the manual. The register system is also very simplistic on AVR.

  3. Maybe I missed it above, but there is a good intro book on this called “Assembly Language Raspbian Beginners” by Bruce Smith. I liked it anyway as a good introduction to ARM assembly (32 bit). I’ll check out the above.

    As above, when I retire, I’ll dig more into writing apps (simple) in assembly when maximum ‘productivity’ is not a concern. It’s my machines, so don’t care about crashing the app, or even the machine. I too started on 6502 assembly back in college as well as VAX, then did some drivers in Z80, 68XXX, in my RT work for automation projects. Dabbled in x86 land when I had to write graphics algorithms in assembly to get max speed out of them. Yep, I want to get back to it … for fun. I miss writing programs where you write instructions to the processor without having a compiler do it for you.

  4. “more speed” – most people will have a hard time beating a modern compiler on anything but the tiniest function
    “precise control over timing”, that’s out the window from the get go when you are running an OS like linux/windows

    assembler is a good way to realize how a cpu works and occasionally for debugging, but in most cases it has little practical use

    1. Lots of compilers still don’t make good use of modern vector extensions, so humans can definitely still do a good job there. Of course, you can usually use them with intrinsics so you don’t need to drop down into assembly.

      Also, precise control over timing is out of the window as soon as you’re using anything much more complex than an AVR – even smaller ARM cores come with caches and branch predictors, and anything superscalar will make it very difficult to get precise timing.

      Of course, if you want to bring up a system or write a compiler (or a JIT compiler, which is what I spend most of my time doing) then you’ll need a good understanding of assembly and even machine code.

      1. ARM microcontrollers can usually be configured to run with deterministic timings, but it usually requires giving up significant performance (there’s no dynamic branch prediction, but flash prefetch and even tiny caches give a huge speed boost at higher clock frequencies).

  5. There are some very basic tests which one can employ to answer the question, “Can the Raspberry Pi (or ANY processor, for that matter) be programmed in Assembly Language?”–

    1. Get a full “Programmer’s Model” from the processor manufacturer;
    2. Get a full, descriptive processor instruction set–including machine code and timing–from the manufacturer;
    3. Obtain a full, complete description of the processor’s interrupt mechanism, including latency;
    4. Obtain a full and complete description of the processor’s Interrupt Vector Table, and where it must reside in the memory map;
    5. Obtain (know absolutely) the starting address required for the very first ‘boot’ word;
    6. Be able to obtain a full, complete, and modern–in every sense of the word–Assembler; free or paid for–it doesn’t matter; but one absolutely must be available;
    7. Obtain from the processor manufacturer any and all limitations regarding the placement and location of the stack within the memory map.

    If one cannot get answers to all of these VERY simple assembly-language-processing BASICS, one will not learn the processor-specific assembly language, of any machine. These are the basics. There ARE no short-cuts.
    Bottom line–if you can’t get full transparency from a manufacturer regarding programming one of their processors, you will NOT learn how to program that processor in Assembly Language.

    There are certain ‘models’ of ‘teaching’ Raspberry Pi assembly language which purport to make one an assembly-language programmer ‘expert’ via the reverse process of writing high-level code first, and THEN looking at the machine language produced after running this code through a compiler. This, quite simply, doesn’t work. Never has; never will. Save your money.
    As Douglas Adams (“Hitch-Hiker’s Guide to the Galaxy”) says–
    “The first thing you get when you take a cat apart is a dead cat.”.

    There’s no ‘magic bullet’, no easy way to learn assembly language, even though Assembly Language has made it into the IEEE’s list of the Top-10 most desirable programming languages this year.
    You just may want to take a clue from the experts.

    1. You don’t really need information about most of that stuff to learn assembly, given that it’s perfectly possible to write a user mode assembly program where most of those things are handled by the operating system. You also don’t necessarily need to know instruction or interrupt latencies (and in some cases, such as when trying to write code which will run on a range of machines, you don’t even know which machine’s latency you should be aware of). Most modern large-ish machines will also have variable instruction and interrupt latencies depending on pipeline state.

      1. Here’s a REAL clue for you, and everyone else who wants to THINK they’re an assembly-language programmer, AND, by the way, an indication that you are, beyond a shadow of a doubt, being scammed by all those books which tell you they’re going to MAGICALLY make you into an assembly-language programmer with absolutely NO hard work–at ALL–required :


        Have this statement chiseled in granite. Put it on your desk.

        [From a previous comment] You not only need “…most of that stuff…”, you need ALL of that “stuff”, as well as one more VERY important item which most people will find an insurmountable deal-breaker: HARD WORK.

        Do you need absolutely concrete, unassailable proof? Try getting a job as an assembly-language programmer without following this track.

        Best of luck…

          1. Wrong.

            I’ve had my conversation.
            Other than the rapier-wit, information-packed “That is literally, factually false…”, when do we get something even slightly resembling a conversation on this subject from you?

            The only thing which is “…clear…” is that you don’t write assembly-language programs, do you?

            Your examples from your obviously hard-won assembly-language programming experience–along with details as to the actual machines you’ve programmed–are anxiously awaited, by absolutely everyone.


        1. We had to learn the hardware specs with a Z80 board my old company designed, and write some board support packages for 68xxx boards. Not for the faint of heart. As a caveat, when I came on board, I just had to ‘maintain’ the Z80 code and add a few features thank goodness! The Z80 code was ‘one’ monolithic app written in assembly. Create a hex file and burn to an EPROM. On the 68XXX boards some Assembly startup code that did all the setup and then hand-off to the real-time app to run. On the 68XXX family we used VRTX RTOS, so we only wrote certain sections in assembly (like serial port drivers), but main app written in C (cross compiled). These also were burned to EPROM (later flash). The RPI, I believe, you don’t have to write the SD boot code (proprietary as I remember) as I’ve tested FreeRTOS on the RPI (blink the light test). About as far as I got a few years ago trying an RTOS on RPI.

          At this stage of my life, I’ll be just content writing some assembly apps in user mode on top of an OS such as Linux, rather than write a boot loader and OS from the ground up. Not planning on getting a job in writing Assembly Language apps any time soon. Assembly language programming is ‘hard-work’ and I do agree there is no ‘magic wand’ to make you a proficient assembly programmer. That said, if I am only a dabbler (freely admit it), and not a ‘real assembly programmer’ as defined in your mind, well, that’s ok with me :) .

          1. Me too. I have built Z80 systems and boards, and then had to make them work. First in assembler and then in CP/M. The OS does help to get results quickly, but with tradeoffs, which included loss of control of basic processes. But I do not have to write a driver for the hard drives or the serial communication; that is where assembly programmers can and do shine the best.

        2. utter nonsense, a program written in assembly is an assembly language program no matter where and how it is run.

          your list of demands is more related to Bare-Metal Programming, which on some platforms like cortex-M can be done entirely in C with no assembly required

        3. Take your gatekeeping and leave it. Writing a hand optimised assembly language function within a C program is just as valid as bootstrapping the CPU and hardware directly in assembly.

          “They say great science is built on the shoulders of giants. Not here. At Aperture, we do all our science from scratch. No hand holding” – Jawnhenry (Probably)

        4. Jawn, think of it this way. An assembly program is like the tractor on a semi truck. True, it can run by itself, but it can also pull a trailer; likewise, an assembly language program can run alongside other programs and provide more functionality. If you think assembly language was meant to run by itself, go back to school and find out the other uses it has so you might be able to get the full benefit of using it. You are amazingly short on what you know if you really believe what you say. I just found this text today, so that’s why it’s late, but be more creative, Jawn: the only limitation to a computer is your imagination, and yours isn’t going far at the moment.

    2. These are all available for ARM, although the specific timings and latencies vary somewhat between cores. Also some of the higher-end ARMs add multi-stage pipelines, branch prediction, and a cache hierarchy, which you also have to take into consideration for best results. It can be a little annoying to pull together some of the information (ARM’s extensive documentation is terribly formatted, for example), but it’s all there.

  6. I have just started reading this book and am only at chapter 5. There is just one thing: the equation at 5.5.8 can be further simplified to ‘(x+y), requiring only 1 NOT gate and 1 OR gate.
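
A claim like this is easy to check by brute force over the truth table. The sketch below is only an illustration: the book's actual equation 5.5.8 is not quoted in the comment, so `original` is a hypothetical stand-in (NOT x AND NOT y), and `simplified` is the commenter's proposed form, read here as NOT (x OR y).

```python
from itertools import product


def equivalent(f, g, n_inputs):
    """True if f and g agree on every combination of n_inputs Boolean inputs."""
    return all(
        f(*bits) == g(*bits)
        for bits in product((False, True), repeat=n_inputs)
    )


# Hypothetical stand-in for the unsimplified form; NOT the book's equation.
def original(x, y):
    return (not x) and (not y)


# The commenter's proposed form: one OR gate feeding one NOT gate.
def simplified(x, y):
    return not (x or y)


if __name__ == "__main__":
    print(equivalent(original, simplified, 2))
```

To test the real equation, swap the book's expression into `original`. With only two inputs there are four rows to compare, so an exhaustive check is cheap.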
