Better Stepping With 8-Bit Micros

The electronics for motion control systems, routers, and 3D printers are split into two camps. The first uses 8-bit microcontrollers, usually AVRs, which are regarded as slower and incapable of cool acceleration features. The second camp consists of 32-bit microcontrollers, and these are able to drive a lot of steppers very quickly and very smoothly. While 32-bit micros are obviously the future, there are a few very clever people squeezing the last drops out of 8-bit platforms. That’s what the Buildbotics team did with their ATxmega chip: they’re using a clever application of DMA channels as counters to drive steppers.

The usual way of driving steppers quickly with an ATmega or other 8-bit microcontroller is abusing the hardware timers. It’s quick, but there is a downside. It takes time for these timers to start and stop, and if you’re doing it two hundred times per second with four stepper motors, the resulting jitter will ruin your CNC machine. The solution is to use a DMA channel to count down, with each count sending out a pulse to a stepper. It’s a clever abuse of the hardware, and the only drawback is that the micro can’t send more than 2¹⁶ pulses in any 5 ms period. That’s not really an issue, because hitting that limit would mean some very, very fast acceleration.
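To put that ceiling in perspective, here is a rough sketch of the arithmetic. It assumes the limit is simply a 16-bit count per 5 ms window; the constant and function names are made up for illustration and are not taken from the Buildbotics firmware.

```python
# Upper bound on the average step rate if at most 2**16 pulses fit in one
# 5 ms window. COUNTER_BITS and WINDOW_S restate the figures from the text;
# the names themselves are hypothetical.
COUNTER_BITS = 16
WINDOW_S = 0.005


def max_step_rate_hz(counter_bits=COUNTER_BITS, window_s=WINDOW_S):
    """Return the highest average pulse rate a window-limited counter allows."""
    return (2 ** counter_bits) / window_s


if __name__ == "__main__":
    print(f"{max_step_rate_hz() / 1e6:.2f} MHz")
```

That works out to a bit over 13 million pulses per second.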

The Buildbotics team currently has a Kickstarter running for their four-axis CNC controller using this technique. It’s designed for Taig mills, 6040 routers, K40 lasers, and various other homebrew robots. It’s an interesting solution to the apparent end of the age of 8-bit microcontrollers in CNC machines and certainly worth checking out.

43 thoughts on “Better Stepping With 8-Bit Micros”

  1. They’re already using an RPi, so why don’t they drive the stepper controllers with it? IIRC the RPi can output signals at 80MHz. The Beaglebone can run peripherals even faster. Their clever hack gives them a top frequency of a bit over 13MHz. They should drop the ATXmega and drive the controllers directly. Unless they can’t because they are bad programmers…

    1. RPis are terrible for this kind of work because the usual OS (Linux) is horrible for the real-time control work needed to drive steppers directly. The time it takes Linux to get back to an ISR is horrid in that it’s extremely random, which is just fine for regular computing but not so much when driving external hardware where precise timing is critical.

      That’s why you use a Pi for interfaces and high-level stuff, while an ATMega, STM8, SAM32, etc. is left to do the real work of driving steppers, where timings are clock-cycle accurate.

        1. I agree that there are better solutions, just like you said.

          But you asked:
          > They’re already using an RPi, so why don’t they drive the stepper controllers with it?

          And [RobHeffo] was kind enough to spend some lifetime answering that question of yours (and what he pointed out was also perfectly true).

          So coming back with “Still there are better solutions..” is not proving your point or anything along those lines.
          But whatever.. I am not the Internet-Behaviour-Police.


          1. I mentioned the Beaglebone. And the RPi can run an RTOS. But okay, let’s assume that the RPi sucks. There are alternatives to using and abusing an 8-bit micro. Cheaper ones, and simpler…

        2. I agree with [Moryc]’s thought provocation.

          Personally I would use an 8-bit micro-controller driving a CPLD to handle the functions that need fine time graduation.

          Using a 32-bit micro-controller is not that much different, but as soon as your stepper drive code has to share time granulation with an OS then it’s just useless.

          Time granulation is key to driving steppers well and an OS is no good for that.

          1. I’m getting too old to follow this newfangled vocabulary. I’m translating to old-speak:
            fine time graduation = fine timing
            share time granulation with an OS… = share time divisions (or time slices) with an OS
            Time granulation is key… = timing is key

          2. @[TheRegnirps]

            Time granulation is the smallest unit of time needed for the task.

            You want the OS to return to the stepper code at precise times. The problem is that most OSes have tasks that use huge chunks of time to complete, so the OS doesn’t return to your stepper code at the precise times needed. This is catastrophic for stepper control in terms of acceleration.

            You can fix the OS by redesigning it or finding a kernel that has been fixed, OR just hand the high-speed tasks to a micro or CPLD/FPGA.

            Timing is key for fast steppers. When you are working with close to optimum acceleration and deceleration, the two considerations are friction and mass/momentum.

            With acceleration you are adding small amounts of energy at the right times so that it adds momentum to the system. If the step event is too early or too late then you are instead taking energy from the system.
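To make "the right times" concrete, here is a minimal sketch of where step events fall under constant acceleration from rest. It assumes the ideal relation n = ½·a·t² between step count n and time t, with a in steps/s²; the function names are hypothetical and not from any firmware mentioned in this thread.

```python
import math


def step_times(accel_steps_s2, n_steps):
    """Times (s) at which steps 1..n_steps fire when accelerating from rest.

    Solves n = 0.5 * a * t**2 for t at each whole step n.
    """
    return [math.sqrt(2.0 * n / accel_steps_s2) for n in range(1, n_steps + 1)]


def step_intervals(accel_steps_s2, n_steps):
    """Gaps (s) between consecutive steps, starting from t = 0."""
    times = [0.0] + step_times(accel_steps_s2, n_steps)
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for i, gap in enumerate(step_intervals(20000.0, 5), start=1):
        print(f"step {i}: wait {gap * 1e6:.1f} us")
```

Each gap is shorter than the one before it, so a scheduling delay that is harmless at the first step becomes a large fraction of the interval a few hundred steps in.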

          3. My Bad –
            CPLD – Complex Programmable Logic Device – You can get one of these and program the internal connections to build custom logic circuits. CPLDs are smaller FLASH-based programmable devices that don’t have a lot of logic or RAM.

            FPGA – Field Programmable Gate Array – Like a CPLD on steroids, it has lots and lots of programmable logic and even largish blocks of RAM. It isn’t FLASH-based, so it has to read in its configuration from an external serial memory when it’s turned on.

            Both are very good for custom or parallel or timing critical tasks.

      1. I think the larger problem with the Pi is the lack of available I/O and integrated peripherals. The SoC is not really meant for this type of application, so it lacks hardware that even the cheapest micros have.

        Linux can be “manhandled” into soft-realtime without too much hassle, especially if you write a kernel module (“driver”) to do this (how do you think disk or USB I/O is done?). However, without having an actual peripheral to program it is rather pointless.

        1. Running at a GHz (from cache) on an ARM means greater than 1000 instructions per microsecond. Think of it as running a 1000 instruction piece of code a million times a second. And in the ARM instruction set, that is typically a couple hundred lines of C code. You should be able to do anything related to the fastest of motors with ease – lots of motors at once. FreeRTOS, MicroC/OSIII, etc. There are real-time OSes that will run on the Pi and the clones and compatibles.

      2. isolcpus + nohz_full + maybe smp_affinity, no need for real-time hacks/patches; instead you dedicate/sacrifice a core to one userspace process and gain 100%* of cycles without interruption. This is how the big boys do it to achieve ridiculous real-time throughputs.

        *Pees LOVE to throttle invisibly to the kernel (unless pee foundation finally wrote module properly reporting back), either underclock or provide active cooling, otherwise you will be scratching your head wondering why guaranteed real time code speed execution varies.
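As a rough illustration of the userspace half of that recipe: once a core has been set aside with the `isolcpus=` and `nohz_full=` kernel boot parameters, the process still has to move itself onto it. The sketch below uses Python's Linux-only `os.sched_setaffinity`; the helper name and the choice of core 3 are assumptions for the example, and a real step generator would be written in something faster than Python.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU and return the new mask.

    Linux-only: relies on os.sched_setaffinity / os.sched_getaffinity.
    """
    os.sched_setaffinity(0, {cpu})  # pid 0 means "this process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    ISOLATED_CPU = 3  # hypothetical: must match the core named in isolcpus=
    if ISOLATED_CPU in os.sched_getaffinity(0):
        print("now running on", pin_to_cpu(ISOLATED_CPU))
    else:
        print("CPU", ISOLATED_CPU, "is not available to this process")
```

Pinning alone does not stop the kernel from scheduling other work on that core; that part is what the boot parameters are for.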

    1. What for? To implement timers with output compare? Which you have built into any reasonable micro?

      Moreover, small CPLDs suck for this because they don’t have enough flip-flops. Common CPLDs have around 72-144 flip-flops. The exception is Lattice parts, but those are more like small FPGAs than CPLDs (and cost as much too). One 32-bit timer needs 32 flip-flops (surprise), then for each output compare line you need another 32 flip-flops for a register, some more for clock division, some more again for talking to the host system, etc. And you would be doing it with a much more expensive part too.
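The flip-flop count above can be tallied quickly. This is only a back-of-the-envelope sketch: the four compare channels are an assumption (one per axis of a four-axis controller), and clock division and the host interface are left out, so the real figure would be higher.

```python
def timer_flipflops(timer_bits=32, compare_channels=4):
    """Flip-flops for one counter plus one compare register per output line."""
    counter = timer_bits
    compare_registers = compare_channels * timer_bits
    return counter + compare_registers


if __name__ == "__main__":
    needed = timer_flipflops()
    print(f"{needed} flip-flops needed, against 72-144 in a common CPLD")
```

With those assumptions the total is 160, already past the top of the 72-144 range before any glue logic is added.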

      Moreover, you will still need a G-code interpreter and all the rest that you can’t really do on a CPLD.

      1. @[Jan Ciger]

        72-144 flip-flops reminds me of the old 5V-era CPLDs like the Xilinx XC95x00 series. That was a decade ago or more, and things have moved on since then.

        Even the ancient Altera EPM series leaves that for dead!

        The newer and cheap Cyclone II series starts at a level much higher than that.

        This project is about a stepper driving system that is driven by a Raspberry PI.

        I am saying that the high-speed function should be in a simple CPLD that is driven by a micro-controller of whatever bits.

        CPLD is inherently good at isolated parallel functions.

        Throwing the whole lot including G code decoding into a larger FPGA would be less cost effective.

    1. 16 bit never took off because Texas Instruments marketing fails miserably in promoting their processors. Nevertheless in practical terms 16 bit were never necessary because you can either abuse 8 bit or “under use” 32 bit microprocessors.

  2. While this is a neat trick, the practical usefulness of this is probably going to be limited by the fact that it really does need an ATxmega (lesser AVRs don’t have DMA) and those work out to be more expensive than 32-bit ARM chips such as the STM32F103.

    1. Pretty much this. Everyone and their grandma is moving to ARM these days because the parts are plentiful and cheap. Building something on an obscure architecture like the XMega is pretty much doomed from the start – expensive, completely proprietary tooling required, and the XMega may well be on the chopping block now because it competes with way too many other offerings Microchip has.

      1. The XMega does not need proprietary tooling. I use gcc and avrdude, both of which are open source. ARM chips are great, but I have yet to find a Cortex whose peripherals come even close to the quality and ease of use of the XMega series chips.

        I like to work on bare metal. Most people using ARM (or AVR for that matter) pile on bloated hardware abstraction libraries. I like the XMegas because the hardware itself provides a great API without the need to add software layers.

    1. The TinyG moved to ARM because of the problems we fixed. We did start with the TinyG firmware but have moved way beyond it. The only part of the TinyG firmware that still remains in the Buildbotics controller is the motion planner. Even that has been heavily modified and I’m in the process of completely replacing it.

  3. I’ve always thought that a good way to go would be to use very low-end 8-bit MCUs, with a dedicated MCU per axis, linked by a fast UART bus.
    The nice thing with this is that there is no possibility of jitter caused by activity on other axes, and it’s scalable to any number of axes. The code stays very simple, so testing & debugging is easier. The MCU can also handle current and endstop sensing, maybe also temperature monitoring and encoder feedback.

        1. If not tuned correctly. A properly tuned PID servo loop with a decent encoder count will outperform stepper motors. Almost all industrial CNC machines use servo motors these days. I’ve been using servo motors (Geckodrives) for years on my bench mills. Never had a problem with under/overshoot.

    1. Sounds good. It reminds me of the propeller chip, though I have never used a propeller.

      The issue with the need for different step rates could be to specify an endpoint along with the time for that endpoint to be reached.
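A minimal sketch of that idea, assuming each axis MCU receives a signed step count and a shared duration and derives its own constant step interval from them. The function name and message layout are hypothetical, and acceleration ramps are ignored.

```python
def plan_axis_move(delta_steps, duration_s):
    """Turn (endpoint offset, time to get there) into a direction and step interval.

    Returns (direction, interval_s): direction is +1, -1, or 0, and
    interval_s is the time between steps, or None when the axis does not move.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if delta_steps == 0:
        return 0, None
    direction = 1 if delta_steps > 0 else -1
    return direction, duration_s / abs(delta_steps)


if __name__ == "__main__":
    # The same 2-second duration is sent to every axis, so both arrive together.
    for axis, delta in (("X", 800), ("Y", -200)):
        print(axis, plan_axis_move(delta, 2.0))
```

Because every axis gets the same duration, the differing step rates fall out of the division on each MCU and the host never has to send per-step timing.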

  4. “The electronics for motion control systems, routers, and 3D printers are split into two camps. The first is 8-bit microcontrollers, usually AVRs, and are regarded as being slower and incapable of cool acceleration features. The second camp consists of 32-bit microcontrollers, and these are able to drive a lot of steppers very quickly and very smoothly.”

    Been out of the loop for a while, but I remember when there were dedicated inexpensive ICs for driving motors, since it’s such a common need.

  5. There are tons of CNC controllers out there with a 32-bit MCU and an FPGA driving the steppers for under $200 USD.
    Even the CSMIO/IP-M 4-axis Motion Controller is only 229.00 €.
    And you can buy it right now, there are experienced users who can help, and there is a real manufacturer.

    So why would anybody buy a project for more than 2.5x (the lowest pledge is $475, the next level is $650) which is not ready, who knows when it will be ready, with no real-life experience with it, etc.? Oh, I got it: the same people who buy plastic CNCs for more than 10x.

    Anyway, just don’t drive critical signals from software.

    1. Now thinking –
      Micro-controller controlling bi-lateral switches that switch capacitors in and out of a 555 oscillator circuit, and said micro reading the output frequency to adjust capacitance. Said 555 then driving a 4-state Phase Locked Loop with each phase output driving a step coil of the stepper motor. The step numbers could be displayed in nixies and the PLL phase error could then drive a dekatron.

      To do the 555 true justice you could have a Raspberry PI do nothing more than generate the square wave that drives the voltage step-up for the nixie and dekatron.
