CortexProg Is A Real ARM-Twister

We’ve got a small box of microcontroller programmers on our desktop. AVR, PIC, and ARM, or at least the STMicro version of ARM. Why? Some program faster, some debug better, some have nicer cables, and others, well, we’re just sentimental about. Don’t judge.

[Dmitry Grinberg], on the other hand, is searching for the One Ring. Or at least the One Ring for ARM microcontrollers. You see, while all ARM chips have the same core, and thus the same SWD debugging interface, they all write to flash differently. So if you do ARM development with offerings from different chip vendors, you need to have a box full of programmers or shell out for an expensive J-Link. Until now.

[Dmitry] keeps his options open by loading up the flash-specific portion of the code as a plugin, which lets the programmer figure out what chip it’s dealing with and then look up the appropriate block size and flash memory procedures. One Ring. He also implements a fast printf-style debugging aid that he calls “ZeroWire Trace” that we’d like to hear more about. Programming and debugging are scriptable in Lua, and it can do batch programming based on reading chip IDs.
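To make the plugin idea concrete, here is a minimal sketch of an ID-to-plugin lookup followed by block-sized chunking of a firmware image. It is not CortexProg's actual code: the chip IDs, family names, block sizes, base addresses, and the names `FlashPlugin`, `select_plugin`, and `split_into_blocks` are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple


@dataclass(frozen=True)
class FlashPlugin:
    """Per-family flash parameters (every value below is made up)."""
    name: str
    block_size: int   # bytes written per programming operation
    flash_base: int   # address where flash starts


# Hypothetical registry: chip ID read from the target -> flash plugin.
PLUGINS: Dict[int, FlashPlugin] = {
    0x1001: FlashPlugin("vendor_a_family", block_size=1024, flash_base=0x0800_0000),
    0x2002: FlashPlugin("vendor_b_family", block_size=256, flash_base=0x0000_0000),
}


def select_plugin(chip_id: int) -> FlashPlugin:
    """Return the plugin for a chip ID, or raise if the chip is unknown."""
    try:
        return PLUGINS[chip_id]
    except KeyError:
        raise LookupError(f"no flash plugin for chip ID {chip_id:#06x}") from None


def split_into_blocks(image: bytes, plugin: FlashPlugin) -> Iterator[Tuple[int, bytes]]:
    """Yield (address, data) pairs, padding the last block with 0xFF."""
    size = plugin.block_size
    for offset in range(0, len(image), size):
        chunk = image[offset:offset + size]
        chunk = chunk + b"\xff" * (size - len(chunk))  # erased flash reads as 0xFF
        yield plugin.flash_base + offset, chunk
```

Batch programming then amounts to reading an ID, calling `select_plugin`, and feeding each `(address, data)` pair to whatever write routine that plugin provides.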

You can build your own CortexProg from an ATtiny85, two diodes, and two current-limiting resistors: the standard V-USB setup. The downside of the DIY? Slow upload speed, but at least it’ll get you going. He’s also developed a number of fancier versions that improve on this. Version four of the hardware is just now up on Kickstarter, if you’re interested.

If you’re just using one vendor’s chips or don’t mind having a drawer full of programmers, you might also look into the Black Magic Probe. It embeds a GDB server in the debugger itself, which is both a cool trick and the reason that you have to re-flash the programmer to work with a different vendor’s chips. Since the BMP firmware is open, you can make your own for the cost of a sacrificial ST-Link clone, about $4.

On the other hand, if you want a programmer that works across chip families, is scriptable, and can do batch uploads, CortexProg looks like a caviar programmer on a fish-bait budget. We’re going to try one out soon.

Oh and if you think [Dmitry Grinberg] sounds familiar, you might like his sweet Dreamcast VRU hack, his investigations into the Cypress PSOCs, or his epic AVR-based Linux machine.

32 thoughts on “CortexProg Is A Real ARM-Twister”

    1. Exactly what I was thinking of. Run that on a Raspberry Pi, you can program just about anything. In fact, using that with a cheap ST-Link clone will work with non-ST chips too (but not necessarily perfectly – on a project I was doing a while back (using a Kinetis MKE04Z8VWJ4 – nice little micro but had to compile OpenOCD myself to get support for the required flash driver) I went to the Pi because of issues I was having, but I can’t say for sure whether that was OpenOCD or the ST-Link to blame).

    2. Yeah, their strict review process lacks reviewers.
      Testers are even more scarce because of the often very niche hardware.

      In October I posted 6 patches to Gerrit. Only one of them ever got feedback.

  1. (Sigh)
    The only reason why one “needs” so many different debuggers for (modern, read: Cortex) ARM cores is because different debug vendors can only be bothered to support some specific devices – 90% of the code that they use to talk to a specific chip is implementation agnostic and supports anything that is an ARM Cortex. This is what [Dmitry] has realised because he bothered to read the ADIv5 spec (that’s the ARM Debug Interface specification – version 5 is the one that came along with Cortex cores – all of them, not just Cortex-M although that is all [Dmitry] is looking at).

    Enough with the hyperbole – what [Dmitry] is doing is no different to what debug vendors can do – in the end it all comes down to a trade-off between cost, support and speed. While pushing things to “open source” will help with some aspects, support can be patchy and speed limited (usually because no one wants it to cost). The biggest hurdle [Dmitry] will be facing (along with the open-source community in general) is the magic runes that may be needed by the debugger in order to gain security access to some features (or even knowledge of where those features are), as they’re often behind NDAs or only released to the “main players” (Lauterbach, Green Hills, etc.). That is sometimes why you need many different debuggers (software and/or hardware dongles): different vendors were given different access.

    1. Yeah, recently I debugged a custom-made ARM processor. Luckily there was a special “programmer” script for it, which was just a series of writes to special memory addresses. I was able to program and debug it with an ST-Link without many problems (the ST-Link utility didn’t even detect the target, but after proper configuration, Keil managed it). You could do everything with an ST-Link; you just need to provide the proper “knocking procedure” for some targets.

    2. And you completely skipped the part of the article where it’s explained that it’s not the *debug* interface that differs between chips, it’s the *flash hardware*, which is a manufacturer supplied peripheral with a non-standardised interface, just like all the peripherals that a manufacturer adds to the ARM core.

      So sure, if all you want is to use breakpoints and singlestep through code that is already present in flash, then ADIv5 is all you need. If you want to actually modify that code, it’s not sufficient.

      1. But that is also a solved problem. I don’t see him supporting more chips and programming algorithms than OpenOCD, st-link, and urJTAG do between themselves already. It is not like he was the first one who thought about creating a universal JTAG/SWD dongle.

        Worse, you have to reflash the dongle each time you want to work with a different target – which is a pain in the butt. Microchip’s Pickit 3 uses this system too and it is very annoying, plus there is always a possibility that something goes pear shaped and the dongle bricks itself.

      2. Those non-standardised bits you mention are the “magic runes” of which I wrote – getting details of them can be incredibly difficult (if not impossible outside of closed circles) and, if they are freely available, then I would expect those devices to be widely supported by a multitude of debug tools.
        And therein lies the rub – for [Dmitry] to create a true universal (ARM) debugger, he would need to gain better access than some of the larger players in the world can gain! (not sure about them nowadays but TI used to be really bad on this front).

    3. I think the biggest problem with what [Dmitry] did is that he has duplicated the efforts behind projects like OpenOCD and BMP for little reason, creating a support nightmare for himself – all the while pairing this with a very crappy hardware solution. The bit-banged V-USB implementation is both slow and notorious for being flaky, not working properly with many USB ports and hubs, etc.

      That’s about the last thing one needs when debugging – having to debug the debugger. A cheap generic SWD/JTAG dongle paired with OpenOCD will do a much better job for 90% of the common use cases a developer (especially a hobbyist) is going to find. And pros will certainly not have issues buying J-Link or Segger or whatever else. Heck, the cheap ST Nucleo and Discovery dev boards come with an ST-Link debugger included and a clone from China costs $2 …

  2. Is “ZeroWire Trace” using the standard ARM “Instrumentation Trace Macrocell” peripheral? It’s a fast and efficient logging mechanism; unfortunately the ST-Link dongles miss the required pin, which annoys me.

    1. No, that’s his specific hack if I have understood it correctly. The ZeroWire trace is closer to the regular semihosting (you hook up your printf() to write into a specific memory location where the dongle will fetch the data from), just he claims it is faster.

    2. If you’re using a trace port then it’s using a wire, so it’s tough to call it “zerowire”. However, I have seen some marketing spin sometimes used to describe the single-pin trace port as ‘free’ if the original design used JTAG and one migrates to SWD, which frees the JTAG TDO pin (not used in the JTAG to SWD pin mapping) for trace data, aka SWO (“Serial Wire Output”, which is the aforementioned ITM with a single-pin Trace Port Interface Unit/TPIU).

      After that, to use no pins at all, one either needs to do, as Jan Ciger described, semi-hosting (which has an impact on real-time behaviour and code images) or using an on-chip trace buffer to capture data that can then be read back over SWD/JTAG at a slower rate (in ARM parlance: an ETB to store data from an ITM or ETM that can then be read back via the DAP).
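The mailbox approach described in this thread (the target writes log bytes to a known RAM location and the dongle reads them back over SWD) can be sketched as a single-producer, single-consumer ring buffer. This is only a simulation of the general technique in Python, not [Dmitry]'s implementation: the buffer layout, the drop-when-full policy, and the names `TraceMailbox`, `target_write`, and `host_drain` are assumptions.

```python
class TraceMailbox:
    """Simulated ring buffer living in target RAM (layout is hypothetical)."""

    def __init__(self, size: int = 16) -> None:
        self.buf = bytearray(size)
        self.head = 0  # next write position, advanced by the target only
        self.tail = 0  # next read position, advanced by the host only

    def target_write(self, data: bytes) -> int:
        """Target side: copy bytes in until full; return bytes accepted."""
        written = 0
        for byte in data:
            nxt = (self.head + 1) % len(self.buf)
            if nxt == self.tail:  # full: drop the rest instead of blocking
                break
            self.buf[self.head] = byte
            self.head = nxt
            written += 1
        return written

    def host_drain(self) -> bytes:
        """Host side: read everything the target has queued so far."""
        out = bytearray()
        while self.tail != self.head:
            out.append(self.buf[self.tail])
            self.tail = (self.tail + 1) % len(self.buf)
        return bytes(out)
```

Because each index has exactly one writer, neither side needs a lock. Dropping data when the buffer is full keeps the target from stalling, at the price of losing log output if the host polls too slowly.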

  3. I just wish ARM would add a standardized flash programming interface to the next iteration of the Cortex-M spec and require all licensees to implement a standard register interface and state machine. Of course that would add cost to implementations of Cortex-M, but it would reduce overall cost to the industry.

  4. ” So if you do ARM development with offerings from different chip vendors, you need to have a box full of programmers or shell out for an expensive J-Link. Until now.” *waves finger* Don’t do that! Or more accurately, they want lock-in with all the advantages it gives them, and the disadvantages it gives you. Suppose we should be thankful they’re not ignoring us completely.

    1. If the ASIC has been correctly implemented (as per ARM guidelines) then the debug infrastructure (and so the SWD interface) should have no relation to the CPU being in reset or not. In fact, a typical use-case (particularly with debugging startup firmware) is being able to debug from reset which requires the debug logic to be setup while the core is held in reset.
      If you really wanted to deep dive (and, as mentioned before, if it was correctly implemented) then the Debug Port (the first module on the layers of abstraction as covered in the ADIv5) even has provision for controlling system vs debug power domains (which if the domain is off then the reset is effectively held)

      1. Which is fine for large pin-count devices. But on an 8-pin device, dedicating 2 pins for debug is quite limiting.
        Since there is no reset line on cortexprog, maybe it toggles Vcc to force the target into power-on-reset. But that won’t help when you have external power for your target.

        1. Sorry, maybe I misunderstood your original question but you seemed to be implying that one could have a reset control pin (externally) that can be asserted – while there’s no combined SWDIO/SWCLK solution, there are existing ways to push all the reset controls inside the device (i.e. no pins).
          If you toggle Vcc to the target then you’d certainly hit power-on-reset, which has the effect of resetting everything inside the device, even the debug logic, so most traditional debug tools would keep that static (although there’s nothing to stop an ASIC implementation from doing something esoteric and having, say, PoR coming off at one voltage on Vcc but keeping the core in reset until a higher voltage is seen – but that is very much in the land of “implementation defined” and not something cortexprog could rely on)

          1. I’m not talking about step-by-step debugging, which I know requires dedicating the SWDIO and SWCLK pins to debug. For pin-limited devices I want to use SWDIO and SWCLK for GPIO. When I don’t even have a single spare GPIO for UART debug output, I save debug logs in RAM or EEPROM. When I want to check the logs, I reset and then read them.
            What I really want is something like AVR debugWIRE, which lets me use the reset pin for interactive debugging and logging, while leaving 5 pins free for GPIO on an 8-pin device.

          1. You’re missing his point – if you’ve only got a few pins, you could be using SWDIO / SWCLK for something else as well and can’t use them for debugging or reset.

  5. Err what?!? There’s a standard interface for all ARM processors called CMSIS-DAP, and there’s a standard and open debugger firmware to use CMSIS-DAP called DAPLink, which is actually used by plenty of vendors out-of-the-box, with the most notable exception being STMicro, although there are ports to STMicro controllers as host as well, e.g. here:
