Learn Assembly The FFmpeg Way

You want to learn assembly language. After all, understanding assembly unlocks the ability to understand what compilers are doing and it is especially important for time-critical code. But most tutorials are — well — boring. So you can print “Hello World” super fast. Who cares?

But decoding video data is something where assembly can really pay off, so why not study a real project like FFmpeg to see how they do things? Sounds like a pain, but thanks to the FFmpeg asm-lessons repository, it’s actually quite accessible.

According to the repo, you should already understand C — especially C pointers. They also expect you to understand some basic mathematics. Most of the FFmpeg code that uses assembly uses the single instruction multiple data (SIMD) opcodes. This allows you to do something like “add 5 to these 200 data items” very quickly compared to looping 200 times.
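
As a rough illustration of the "add 5 to these 200 data items" idea, here is a minimal sketch in C rather than in the assembly the lessons teach. The function name `add_const_u8` is hypothetical, and the sketch assumes an x86 target with SSE2; on other targets it falls back to the plain scalar loop. It is not taken from FFmpeg or the asm-lessons repository. Note that a loop is still needed: the vector path simply handles 16 bytes per iteration, and a scalar tail handles whatever is left over.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics (x86 only) */
#endif

/* Add k to each of the n bytes in data, wrapping on overflow.
 * Hypothetical helper for illustration only. */
void add_const_u8(uint8_t *data, size_t n, uint8_t k)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Broadcast k into all 16 byte lanes of a 128-bit register. */
    const __m128i vk = _mm_set1_epi8((char)k);

    /* Vector loop: 16 bytes per iteration, unaligned loads and stores. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        v = _mm_add_epi8(v, vk);
        _mm_storeu_si128((__m128i *)(data + i), v);
    }
#endif

    /* Scalar tail (or the whole array when SSE2 is unavailable). */
    for (; i < n; i++)
        data[i] = (uint8_t)(data[i] + k);
}
```

For 200 bytes, the SSE2 path runs the vector loop 12 times (192 bytes) and the scalar tail 8 times, instead of 200 scalar iterations.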

There are three lessons so far. Of course, some of the material is a little introductory, but they do jump in quickly to SIMD including upcoming instruction sets like AVX10 and older instructions like MMX and AVX512. It is no surprise that FFmpeg needs to understand all these variations since it runs on behalf of (their words) “billions of users.”

We enjoyed their link to a simplified instruction list, not to mention the visual organizer for SIMD instructions.

The course’s goal is to prepare developers to contribute to FFmpeg. If you are more interested in using FFmpeg, you might enjoy this browser-based GUI. Then again, not all video playback needs high performance.

26 thoughts on “Learn Assembly The FFmpeg Way”

      1. Just because it’s not relevant for you, specifically, doesn’t mean it’s not relevant for someone else, and there are arguably way more use cases targeting constrained IoT devices interfacing HW than there are for SIMD optimizations that modern compilers don’t (yet) handle optimally.

        And yes, I do both, just for good measure :P

      2. hahaha i feel exactly and precisely the opposite.

        the PIC assembly experience is universal. the struggles you have there will be the same struggles everywhere. i learned x86 assembly as a kid using debug.exe, and m68k and sparc assembly at college, and that has really served me well. my whole life since, i’m constantly learning new assembly languages, and it’s just the same ideas over and over.

        SIMD, though. man. every SIMD architecture is different. and they’re constantly being thrown away. learning one doesn’t help you even a little bit, because it’s obsolete by the time you finished learning it. so you’re always in the process of learning a new one. and when you’re doing that, the general awareness of assembly and differences between assemblies is going to serve you a lot better than any in depth knowledge of any single SIMD set.

        SIMD is just so fantastically arcane. i assume people get good at learning one vector instruction set after another, but they only do that because they have a strong basis in general assembly. it’s not a shortcut to learn SIMD without learning general assembly. and for learning general assembly, there’s no wrong choice. if i wanted to learn modern SIMD, starting with PIC to get a foundation in general assembly would be a better path than starting with just the MMX subset of pentium instructions from 28 years ago, even though MMX is SIMD and PIC is basically a toy

    1. Why do we pretend PIC assembly and x86 assembly are remotely similar?

      I don’t get it. Learning PIC assembly only helps you learn x86 assembly in the same way that learning any language helps you learn another one.

      I know at least 5 different assembly languages and I don’t consider any of them the same language.

      1. if you’ve learned 5 different assembly languages then you know it’s basically the same struggle everywhere. out of the dozens of assembly languages i’ve used, i am always “using a new assembly language”, i am never working from memory and familiarity (ok that’s not true, there’s one i know pretty well). and when i’m using a new one, i am always asking the same questions of the reference manual:

        can i have loads/stores of different bit widths?

        what addressing modes are available on load/store, compared to add/sub?

        what’s the conditional branch idiom look like?

        what are the string accelerators like? (move multiple, etc)

        is it “mov src,dst” or “mov dst,src”? (and why isn’t that consistent across the instructions! argh!)

        so i’m good at asking those questions and reading them off of a reference manual. the biggest struggle in learning a ‘new assembly language’ for me is just finding a good reference manual and orienting to its index and jargon. so using PIC isn’t really any different from using RP2040 or x86 for me, it’s just a question of how good the reference manual is. FWIW, PIC has better reference manuals than ARM. that’s the only difference :)

        1. “i am always asking the same questions of the reference manual:”

          This is literally what learning a new programming language is.

          There’s a reason why Rosetta Code exists. You know what you want to do, every language has different idioms for how to do it, and if you know how to program, you go grab it there and stick it in.

          The mental gymnastics you go through when learning a new device’s assembly language is exactly the same thing you do when learning a new programming language. The only way learning one device’s assembly helps you with another device (assuming they’re totally different devices, obviously) is “you know how to program, so your brain’s already wired that way.”

          1. yes and no. the difference is that once i learn a new language, i know it. but i basically never ‘learn’ a new assembly language.

            like, i know perl. i do use the reference manual quite a bit but off the top of my head i really know a lot about variables, functions, types, scoping, and some of the unique operators like =~. i know C. i do use the man pages but off the top of my head i know variables, functions, types, scoping. java, C++, ML, scheme…all of these, i know the variables, functions, types, scoping, and some subset of the unique features.

            i’ve rather inadvertently learned one assembly language but i’m gonna put that aside because it’s not representative. in general i don’t know any assembly languages. i’ve never learned them. i do not know about load/store addressing modes, arithmetic operands, conditional branch, string accelerators, and src/dst ordering. i really don’t! i’ve written so much assembly code in different contexts in my life and i’ve never learned any of that stuff. i always look it up in the reference manual. i might remember it to the end of the day…sometimes i may even remember it to the end of the project but i only accidentally ever remember it from one project to the next.

            learning assembly is the process of learning how to use the reference manual to the exclusion of the process of learning the instructions and operands. that’s unique to assembly imo.

          2. I’m super-super confused why you think that those two things are any different. There are people who never learn a higher level language and just crib stuff from reference manuals, too.

            Like I said, there’s a reason Rosetta Code exists. Yes, there are people who just crib assembly examples to remember stuff, but that’s the same thing as someone who just grabs code examples because they’re forced to use something for a quick tool or something.

            And conversely there are people who actually do know assembly for a given device backwards and forwards. Sometimes you just had to.

          3. The word “script kiddie” comes to mind. But seriously, is there really a dichotomy here? Open up, let it soak in, what doesn’t soak in, the manual’s right there. As President Adams said “Go not abroad in search of monsters to destroy.” And definitely don’t say that in a job interview. Or do, your call. NMP.

            If you have to “orient to the index” of the manual there may be a bigger problem.

  1. Nice article. For signal processing or compiler writing, it is essential to know what’s really going on under the hood.

    ARM Neon SIMD mode assembly is also quite fun to play with and GCC has intrinsics for all the assembly instructions which lowers the barrier to getting started a little. Even a lower-end Cortex-A makes a fairly accomplished DSP, and runs Linux at the same time.

    “This allows you to do something like “add 5 to these 200 data items” very quickly compared to looping 200 times.”

    I think you would still need a loop in this case, it’s just that you might update eight or sixteen of your data points in parallel every time around the loop, and the predicate for exiting the loop might need to be calculated a bit in advance to squeeze out the best performance.

    1. Even when I was writing intrinsics and doing a lot of experiments to get the most performance, gcc would outperform me from time to time with just plain C code.

      Optimizing at this level is black magic. Sometimes using smaller registers would give more performance, it’s not even that the highest supported type instruction would always be the best…

    2. “aren’t that many instances where someone writing raw x86 or x64 assembly could do that much better”

      It depends on how big the problem is. The reason compilers are awesome isn’t because they generate great code (seriously, they don’t) it’s because they allow you to factor the problem and approach it section by section and then integrate and reuse that code over and over.

      Once you start saying “okay I need to use an operating system” or “I need to use this other code I’ve written” you start having to make the same tradeoffs the compiler’s making.

    3. Yeah, but sometimes writing assembly is just more fun. I started on z-80 and 8088 assembly, then 68K for drivers, PPC, and now play around with ARM assembly (32 and 64bit) now and then on the RPIs. Not as productive as a compiler/interpreter but it is fun and keeps the mind working twiddling the bits now and then…

    4. heh i’m gonna disagree with everyone here. most compilers do generate FANTASTIC code, in addition to letting you factor and maintain in a way that would create better results even if they generated poor code. but the reason you beat the compiler sometimes — especially for ffmpeg — is that compilers barely support SIMD. and when they do support SIMD, it’s usually by a thin layer exposing the underlying assembly features to you, rather than by like automatically turning your for loops into vector instructions.

      it’s not so much that a human can outperform a compiler at creating general assembly code, more like there’s a coprocessor there that the compiler simply doesn’t know about.

      and the funny thing is, i strongly suspect that most of the hand-generated SIMD code out there is pretty awful in a lot of ways, but since the co-processor is a zillion times more efficient than the general instruction set, it is still a huge win. it’s not that you use the vector coprocessor more efficiently than a compiler would, but just that you use it at all.

  2. “you should already understand C — especially C pointers.” And once again that gets me out of an enormous amount of work. It’s a huge lacuna in my education. I fiddled with ASM in the 8086 world in my misspent youth but never worked at it to my everlasting regret. Stay in school, kids!

      1. “C is basically a scripting language for assembly.”

        It seriously isn’t. Some of it is. You can write code in C that you can turn into assembly with scripts (I’ve literally done this).

        But there are a ton of restrictions on this. The main one is simple – functions can’t take arguments or return values. Because that’s what a high-level language provides you. That’s literally the entire root history of ALGOL and its many descendants, including C. You represent code as functional algorithms.

        This is how C allows you to have code that’s totally separate and isolated and combined together like Legos: because all the code’s been factored into an algorithm form with a fixed function calling interface.

        That’s not the entirety of it, though, because C also removes some of the features that virtually every processor has – the most common being carry flag tricks. This is usually the point at which I tend to fall back into assembly on microcontrollers, because implementing soft shift registers in C is usually disastrously slow.
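
To make the carry flag point concrete, here is a minimal sketch of a software shift register in C. The function name `shift_left_1` and the byte layout (least significant byte first) are assumptions chosen for illustration. Because C has no way to name the carry flag, the bit that falls out of each byte has to be extracted and re-inserted by hand; in assembly on many small processors the same job is typically a chain of rotate-through-carry instructions. Whether a given compiler recognises this pattern depends on the compiler and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift a multi-byte register left by one bit.
 * reg[0] is assumed to be the least significant byte.
 * carry_in is shifted into bit 0; the bit shifted out of the top is returned.
 * Hypothetical helper for illustration only. */
uint8_t shift_left_1(uint8_t *reg, size_t len, uint8_t carry_in)
{
    uint8_t carry = carry_in & 1u;

    for (size_t i = 0; i < len; i++) {
        /* Save the top bit before it is lost: this stands in for the carry flag. */
        uint8_t next = (uint8_t)(reg[i] >> 7);
        reg[i] = (uint8_t)((reg[i] << 1) | carry);
        carry = next;
    }
    return carry;
}
```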

  3. I have a Mechanical background and became religiously dedicated to learning computer science (I learned to say architecture) six years ago. Nothing impresses me more (ignoring microcode and FPGAs due to applicability of knowledge) than assembly language and I think this article is very keen to point readers to a learning mechanism, FFmpeg, for SIMD with assembly.

  4. The only time I’ve needed assembly has been to write marker bytes on the stack in the init file on a bare metal project for monitoring, and for figuring out stack traces in crash data. It’s interesting and useful to know how a language like C actually works, but if you need to optimise your code with assembler something is probably not good with your C code.
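
The marker byte technique mentioned above can be sketched in C. This is a simulation on an ordinary array, not real startup code: on a bare metal target the fill would normally happen in the init file before the stack is in use. The names `STACK_MARKER`, `stack_paint`, and `stack_unused`, the `0xA5` pattern, and the downward-growing stack are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_MARKER 0xA5u  /* arbitrary fill pattern (assumption) */

/* Fill the whole stack region with the marker byte. */
void stack_paint(uint8_t *base, size_t size)
{
    for (size_t i = 0; i < size; i++)
        base[i] = STACK_MARKER;
}

/* Count marker bytes still intact at the low end of the region.
 * Assumes the stack grows downward from base + size toward base,
 * so untouched bytes form a run starting at base. */
size_t stack_unused(const uint8_t *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_MARKER)
        n++;
    return n;
}
```

A monitoring task could call `stack_unused` periodically and report the smallest value it has seen. A stack value that happens to equal the marker would be counted as unused, so the result is an estimate.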
