Flashing An ARM With No Soldering

[Sami Pietikäinen] was working on an embedded Linux device based on an Atmel SAMA5D3x ARM Cortex-A5 processor. Normally, embedded Linux boxes boot off of flash memory or an SD card. But if you’re messing around, or just want to sidestep normal operation for any reason, you might want to bypass the normal boot procedure. According to the chip’s datasheet, you can enter boot mode by soldering on a wire to pull the BMS pin. As [Sami] demonstrates, there’s also a software way in, and it makes use of mmap, a ridiculously powerful Linux function that you should know about.

Embedded Linux devices and the microcontroller in your Arduino or clone aren’t nearly as different as you’d think — you just haven’t read the datasheet for the former. If you’ve gotten deep into microcontrollering, you’re used to the paradigm of controlling the chip’s functionality by twiddling bits that lie in memory-mapped hardware registers. Flip this bit and an LED lights up. Flip that bit and you change the PWM peripheral’s clock speed. That sort of thing.
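To make the bit-twiddling idea concrete, here is a minimal sketch of the usual read-modify-write pattern on a 32-bit register value. The bit positions (an LED on bit 3, a three-bit clock prescaler field starting at bit 8) are invented for illustration and are not taken from any datasheet; the helper names are hypothetical too.

```python
REG_MASK = 0xFFFFFFFF  # treat the register as 32 bits wide


def set_bits(reg, mask):
    """Return reg with every bit in mask switched on."""
    return (reg | mask) & REG_MASK


def clear_bits(reg, mask):
    """Return reg with every bit in mask switched off."""
    return reg & ~mask & REG_MASK


def write_field(reg, shift, width, value):
    """Return reg with the width-bit field at shift replaced by value."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the field")
    field_mask = ((1 << width) - 1) << shift
    return ((reg & ~field_mask) | (value << shift)) & REG_MASK


# Hypothetical layout: bit 3 drives an LED, bits 8..10 hold a prescaler.
LED_BIT = 1 << 3
PRESCALER_SHIFT = 8
PRESCALER_WIDTH = 3

reg = 0x00000700                      # pretend this was read from hardware
reg = set_bits(reg, LED_BIT)          # "flip this bit and an LED lights up"
reg = write_field(reg, PRESCALER_SHIFT, PRESCALER_WIDTH, 2)  # new prescaler
```

On real hardware the starting value would be read from the register and the result written back, which is why the helpers return a new value instead of touching memory themselves.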

In Atmel’s SAMA5D3x, there’s a register that controls the initial boot media. Not surprisingly, nobody has written a Linux device driver for setting these bits, so you’ll have to flip them yourself. And the easy way to do that is using mmap, which does just what it says: it maps a file or device into your program’s address space. Map the right window of physical memory (typically through /dev/mem) and the peripheral’s registers show up as ordinary memory that you can read and write.
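As a rough sketch of what that looks like in practice, the Python below maps one page of a memory-like file and reads or writes a 32-bit word inside it. It is not [Sami]’s code: the function names are made up, the address in the usage comment is a placeholder and not a real SAMA5D3x register, and registers are assumed to be 4-byte aligned and little-endian. Note also that Python does not promise the access is a single 32-bit bus transaction, so for real hardware a C program with a volatile pointer is the more usual choice.

```python
import mmap
import os
import struct

PAGE = mmap.ALLOCATIONGRANULARITY  # mmap offsets must be multiples of this


def split_address(addr):
    """Split an address into (page-aligned base, offset within that page)."""
    base = addr & ~(PAGE - 1)
    return base, addr - base


def read32(path, addr):
    """Read a little-endian 32-bit word at addr from a memory-like file."""
    base, off = split_address(addr)
    fd = os.open(path, os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ, offset=base) as m:
            return struct.unpack_from("<I", m, off)[0]
    finally:
        os.close(fd)


def write32(path, addr, value):
    """Write a little-endian 32-bit word at addr to a memory-like file."""
    base, off = split_address(addr)
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE,
                       offset=base) as m:
            struct.pack_into("<I", m, off, value & 0xFFFFFFFF)
            m.flush()
    finally:
        os.close(fd)


# Usage on a target board, as root (placeholder address, NOT from a datasheet):
#   old = read32("/dev/mem", 0x12345678)
#   write32("/dev/mem", 0x12345678, old | 0x1)
```

Taking the path as a parameter keeps the sketch easy to try on an ordinary file before pointing it at /dev/mem, where a wrong address can hang or crash the system.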

Most of the time, you’re better off using the kernel and its drivers instead of directly setting registers with mmap. There are no mechanisms to prevent multiple access, for instance. Fooling around with mmap, with the possible exception of directly memory-mapping files, is best left for debugging device drivers or crashing your system. Unless you need to control the boot mode of a chip by setting some bits directly, that is.

[Sami]’s demonstration isn’t anything super secret, and some of you will cry “not a hack!1!!” because he’s just using information straight out of the manufacturer’s datasheet, but we found his writeup to be a nice tutorial and reminder of just how powerful memory-mapping can be in bridging the gap between what you might think of as a computer and a microcontroller. If you want to mess around with an embedded Linux system, and you can get root, mmap may be just the ticket. Or you may just be interested in blinking LEDs very quickly. Pick your poison.

7 thoughts on “Flashing An ARM With No Soldering”

  1. Actually, I would call it a hack… an outright ugly one.
    It’s fine when you’re developing something and wish to interactively tinker with the peripheral, but once you have a solid understanding down pat, I’d recommend moving that code into a kernel driver for maintenance reasons.

    1. meh. it’s ugly from the perspective of a multi-user unix environment. downright slick from the perspective of embedded development. cramming the two together produces some aesthetic confusion but if you’re gonna do it, accessing the memory-mapped registers directly is not unreasonable.

      1. Well, memory mapped I/O is how many systems interface to peripherals. CGA on the IBM PC/XT is memory mapped for example, and yes, many applications bypassed “the interface” (BIOS calls) to write directly to it. Usually this is done within the kernel on the device. If the device is simple enough, the application is the kernel (e.g. many 8-bit home computers). DOS is more of a boot-loader and filesystem driver for applications than a kernel.

        mmapping /dev/mem from userspace is a non-portable hack that can be a real pain to maintain. Using that locks you down to that specific combination of hardware and that particular OS. You can use mmap to bitbang I²C for example, or you can make use of i2c-gpio if you’re using Linux. One locks you down to a specific implementation of GPIO, the other is generic, and while it won’t be as fast, the difference in performance isn’t likely to be a killer.

        I’ve used devmem and its ilk to play around with registers before, and that was invaluable whilst getting to understand the beast that was the I²S routing logic in the Freescale i.MX27. When you’re learning how to drive a particular piece of hardware, it’s an invaluable tool.

        Having gained an understanding of how to work a piece of hardware, even on a single user system, it is cleaner to then write a kernel module that exposes the hardware to userland through device nodes. Those device nodes can then be assigned permissions that allow your application to run as an unprivileged user, reducing the impact of security exploits.

        If you move hardware, you can write a new driver to present the same interface, and the application in userspace needs no changes. It forces modularisation of your application which is generally a good thing, and you can do things that aren’t easily possible just hacking it with mmap from userspace.

  2. I just taught myself a couple of months back how to use mmap. The Linux sysfs-interface to the GPIO-pins is excruciatingly slow and I wanted to speed this up on my C.H.I.P. — toggling-speed went from a couple of hundred kilohertz to about 5.4MHz with not-very-optimized code. Also, the sysfs-interface offers no way of setting the internal pull-up/pull-down resistors — they are always disabled — but using mmap-access I could set those myself, which is quite nice.

    Mmap-access is good stuff and the sysfs-interface is only good for extremely rudimentary, slow things. I hear there’s a new GPIO-system in the newer kernels and I have yet to play around with it, but I am under the impression that it doesn’t support setting pull-ups/pull-downs, either, so it seems I’m still going to be using mmap for the foreseeable future.

  3. mmap() is also worth its weight in gold for low latency I/O if you have a performance critical task that doesn’t belong in the kernel but cannot afford the multiple microseconds burned in context switches and a trip through the scheduler.

    It is also useful for interacting with hugetlbfs mounts to allocate large pages, either to get physically contiguous chunks up to 1GB for your low latency DMA needs, or just to have a large hash table or tree in core without having to take a TLB miss in addition to a cache miss for every random access. In addition, on x86_64 at least, any software prefetch instruction that asks for an address not currently mapped in the hardware TLB will be ignored, so if your working set exceeds what fits in the TLB in 4k pages you can’t effectively prefetch ahead of yourself without the help of hugetlbfs and mmap().

  4. I call it ‘mmaping into /dev/mem’… This technique is not new and has been used on the RPi and beaglebone to access a variety of peripherals for years now. For example, direct register access to the GPIO module via mmaping into /dev/mem allows much faster GPIO toggle speeds (10s of MHz) than what can be achieved with the traditional sysfs-based approach (100s of kHz). It also allows for fast PWM on the GPIO via use of the DMA and doing a few other neat tricks.

    Only drawback… is that any program that accesses /dev/mem needs to be run as root. RPi Foundation’s workaround was to create an alternate access mechanism, /dev/gpiomem… mmaping into that one doesn’t seem to need root access.
