Modern microcontroller platforms spoil us with their performance and expansive spec sheets. These days it’s not uncommon to be developing for a cheap micro that has a clock rate well in excess of 100MHz, with all manner of peripherals baked in. DACs, WiFi, you name it – it’s in there, with a bunch of libraries to boot. It wasn’t always this way, and sometimes you would even find yourself lacking hardware serial support. In these cases, the bitbanged software UART is your friend, and [MarcelMG] decided to document just how it’s done.
The amateur programmer’s first recourse may be to use delays to properly time the output data stream. This has the drawback of wasting processor cycles and doesn’t let the microcontroller do much else useful. Instead, [Marcel] discusses the proper way to do things, through the use of interrupt service routines and hardware timers.
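To make the timer-and-interrupt idea concrete, here is a minimal sketch in portable C. It is not [Marcel]’s actual code: the names `soft_tx_t`, `soft_tx_start()` and `soft_tx_tick()` are invented for illustration, and the sketch assumes 8N1 framing and a hardware timer whose ISR fires once per bit period and writes whatever `soft_tx_tick()` returns to the TX pin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transmitter state (illustrative names, not from the project). */
typedef struct {
    uint16_t frame;      /* start bit, 8 data bits, stop bit; LSB goes out first */
    uint8_t  bits_left;  /* frame bits not yet placed on the line */
} soft_tx_t;

/* Queue one byte for transmission (8N1 assumed).
 * Returns false if the previous frame is still being sent. */
bool soft_tx_start(soft_tx_t *tx, uint8_t byte)
{
    if (tx->bits_left != 0)
        return false;
    /* bit 0 = start (0), bits 1..8 = data LSB first, bit 9 = stop (1) */
    tx->frame = (uint16_t)((1u << 9) | ((uint16_t)byte << 1));
    tx->bits_left = 10;
    return true;
}

/* Call once per bit period, e.g. from a timer compare ISR.
 * Returns the level to drive on the TX pin; 1 is the idle (mark) level. */
uint8_t soft_tx_tick(soft_tx_t *tx)
{
    if (tx->bits_left == 0)
        return 1;                       /* nothing queued: keep the line idle */
    uint8_t level = (uint8_t)(tx->frame & 1u);
    tx->frame >>= 1;
    tx->bits_left--;
    return level;
}
```

Between ticks the processor is free to do other work, which is the whole advantage over delay loops. The main loop only has to call `soft_tx_start()` and retry if it returns false.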
[Marcel]’s implementation is for the ATtiny24A, though it should be easily portable to other 8-bit AVR processors. Taking up just 2 bytes of RAM and 276 bytes of program space, it’s compact – which is key on resource-limited 8-bit devices. The code is available on GitHub if you fancy trying it out yourself.
It’s a technique that is more than familiar to the old hands, but useful to those new to the art. It can be particularly useful if you need to get data out of a legacy platform with limited options. As times change, it’s important to pass on the techniques of yesteryear to the new generation. Of course, if things are really tight, you can even do a half-duplex UART on a single pin.
The USI hardware on ATtiny parts is actually really good for UART functionality. For writing, of course, it’s super simple. You can queue up to 16 bits and let it run, preferably loading up another 8 bits each time the buffer empties.
Reading is a bit more difficult because the signal is asynchronous. You can’t sample just once per bit period: you’d risk sampling right around a transition and getting the occasional doubled or missed bit. Typical hardware samples around 16 times per bit period and adjusts the sample window when it sees an edge on the data line. You can get away with 4 (or even 3) samples per bit period and still get reliable communication.
So for a bi-directional UART, you need 4 bits of buffer space for each bit on the channel. You end up servicing it on average every 2 channel bits, but since there are two USI buffers you can tolerate up to 4 channel bits of interrupt latency and still not drop anything.
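The oversampling scheme described above can be sketched in plain software, independent of the USI. This is an illustration under stated assumptions, not anyone’s production code: the names `rx4_t`, `rx4_init()` and `rx4_sample()` are made up, framing is assumed to be 8N1, and something (a timer ISR, say) is assumed to call `rx4_sample()` four times per bit period with the current pin level (0 or 1).

```c
#include <stdbool.h>
#include <stdint.h>

#define OVERSAMPLE 4u   /* samples per bit period (assumption; hardware often uses 16) */

/* Hypothetical receiver state (illustrative names). */
typedef struct {
    uint8_t prev;    /* previous line sample */
    uint8_t phase;   /* samples since the last assumed bit boundary */
    uint8_t bit;     /* 0 = start bit, 1..8 = data bits, 9 = stop bit */
    uint8_t shift;   /* data bits collected so far, LSB first */
    bool    active;  /* true while a frame is being received */
} rx4_t;

void rx4_init(rx4_t *rx)
{
    rx->prev = 1;            /* the line idles high */
    rx->phase = 0;
    rx->bit = 0;
    rx->shift = 0;
    rx->active = false;
}

/* Feed one line sample (0 or 1).
 * Returns true and stores the byte in *out when a valid frame completes. */
bool rx4_sample(rx4_t *rx, uint8_t level, uint8_t *out)
{
    bool edge = (level != rx->prev);
    rx->prev = level;

    if (!rx->active) {
        if (edge && level == 0) {      /* falling edge: a start bit begins */
            rx->active = true;
            rx->phase = 0;
            rx->bit = 0;
            rx->shift = 0;
        }
        return false;
    }

    if (edge)
        rx->phase = 0;                 /* an edge marks a bit boundary: re-align */
    else
        rx->phase = (uint8_t)((rx->phase + 1u) % OVERSAMPLE);

    if (rx->phase != OVERSAMPLE / 2u)
        return false;                  /* only act in the middle of a bit */

    if (rx->bit == 0) {                /* middle of the start bit */
        if (level != 0)
            rx->active = false;        /* it was a glitch, not a start bit */
        else
            rx->bit = 1;
        return false;
    }
    if (rx->bit <= 8) {                /* data bits arrive LSB first */
        rx->shift = (uint8_t)((rx->shift >> 1) | (level ? 0x80u : 0u));
        rx->bit++;
        return false;
    }
    rx->active = false;                /* middle of the stop bit */
    if (level == 0)
        return false;                  /* framing error: drop the byte */
    *out = rx->shift;
    return true;
}
```

Re-aligning on every edge is what lets this tolerate a sender whose clock is noticeably off, as long as the data contains enough transitions.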
Another way to do things is to sample on the start bit. This is a little more tricky but can save a lot of processor time and code complexity. When the line goes low, it’s a start bit. You then wait 1.5 bit periods and take 8 samples, one per bit period. And then of course set up for the next start bit. If things get out of sync with this method, it can be very difficult to resync if the bitstream is steady.
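The sample-on-start-bit approach can be sketched like this. To keep the example runnable on a PC, it assumes the line has already been captured into an array at two samples per bit period (that capture step, the `TICKS_PER_BIT` constant and the `soft_rx_frame()` name are all assumptions for illustration). On real hardware a pin-change interrupt would typically start a timer instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TICKS_PER_BIT 2u  /* assumption: the line was sampled at twice the baud rate */

/* line[] holds logic levels captured every half bit period (idle = 1).
 * Scans from *pos for a start bit, then decodes one 8N1 frame.
 * Returns true and stores the byte on success; *pos is advanced either way. */
bool soft_rx_frame(const uint8_t *line, size_t n, size_t *pos, uint8_t *out)
{
    size_t start = *pos;
    while (start < n && line[start] != 0)
        start++;                        /* wait for the line to go low */

    /* The last sample needed is the stop bit, 9.5 bit periods after the edge. */
    size_t stop = start + 9u * TICKS_PER_BIT + TICKS_PER_BIT / 2u;
    if (stop >= n) {
        *pos = n;                       /* no complete frame left in the capture */
        return false;
    }

    uint8_t byte = 0;
    size_t t = start + TICKS_PER_BIT + TICKS_PER_BIT / 2u;  /* wait 1.5 bit periods */
    for (uint8_t bit = 0; bit < 8; bit++, t += TICKS_PER_BIT) {
        if (line[t])
            byte |= (uint8_t)(1u << bit);   /* data arrives LSB first */
    }

    *pos = stop + 1;                    /* set up for the next start bit */
    if (line[stop] == 0)
        return false;                   /* stop bit not high: framing error */
    *out = byte;
    return true;
}
```

Note that a framing error here just resumes the search at the next low sample, which illustrates the resync problem mentioned above: in a steady bitstream, the next low level may be a data bit rather than a true start bit.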
That’s true, but the disadvantage of the USI hardware is that it’s tied to specific pins. But if you’re okay with using the specific pins you’re right, USI is probably the way to go.
Yeah, they just stupidly assigned MOSI and MISO to the same pins as DI and DO. Possibly the USI uses the same hardware as flash programming. This is why I prefer software UART.
Yes, the USI is the way to receive on an ATtiny45/85.
I’m doing MIDI, which is 31.25 kbit/s, and not dropping anything.
http://blog.dspsynth.eu/the-dsp-g1-synth-on-the-attiny85/
It’s a nice tool to have, but in most cases, learning to use a more capable microcontroller is the better option.
The ones where you need to read 3.1415kg of documentation before you are even able to blink an LED?
;-)
Btw.: Happy PI day!
You mean 3.1416 right? ;-)
Who’s taken a bit of my Pi?
0.0001. Om nom nom nom!
Not like your typical user is going to read documentation anyways. He/she could copy/paste some code and use a framework without wanting to understand any of this.
Just for the record, a simple Pi approximation: 355/113 (mnemonic rule: 11 33 55). I read it in a math or programming book many (40 ?) years ago (sorry, can’t remember the title now).
Best regards,
Daniel.
“How I wish I could calculate pi”
3 .1 4 1 5 9 2
(the number of letters in each word)
Very nice, Marcel. Cycle counting and bit-banging a serial interface can be quite fun. I posted an example on PICLIST many years ago for interrupt driven full-duplex 9600 baud on a PIC16F819 or PIC12F683 (8-MHz INTOSC) with 16 byte RX and TX circular buffers. The timer driven interrupt fired at three times the bit rate (every 59 instruction cycles) and used 34 or 35 cycles (about 50% ‘overhead’). The ISR driver, Init, Put, and Get routines used ~189 words of program memory.
it’s a good example of one of the bad points of arduino. by wrapping all the hardware into a couple of general purpose features you generally are not aware of how powerful that hardware can actually be. sure you can delay(), or pulseout() all you want. but you don’t really learn how such low level systems actually work. also the attiny has a usi which can be configured to do half-duplex asynchronous serial, why don’t you use it?
I actually think of this as an advantage of Arduino and other “high” level uC frameworks. You get lots of convenience, but you can always drop it and seamlessly work with the lower level stuff when you need. Of course, if you have the attitude of “meh I don’t care how any of this Arduino magic works” it can make you a shitty programmer, but it can also enable a skilled user to focus on the parts they care about.
i’m generally pro arduino but this is one of the areas where it is weak. i see a lot of people do something in a roundabout way when the chip already has that feature built in.
Yeah, I started learning Arduino and within a few months I was already digging past the high-level layer and learning about direct port manipulation, AVR assembly, etc. Now I’m actually much more likely to pull up the datasheet for the device I’m dealing with and write specific values to registers rather than using Arduino functions (but I still do use a few – delay() is still convenient for example). Sure, it makes my code less portable (even to other AVRs) but I can optimize my code in ways that no framework can even begin to offer.
But I would argue that if I hadn’t had Arduino to get me started I might not have made the jump to the lower-level stuff as easily. I would often get something working with Arduino and a third-party library, and then go “hmm, I’m using 25K bytes of my 32K of flash… can I optimize this?” Then I’d look at the code from the library and dig into the Arduino framework’s source code and ultimately I’d rewrite exactly what I need using lower-level constructs, and I got to where I could take a 25K firmware and get it down to like 7K while making it more efficient and faster at the same time.
For hobbyists there’s no real advantage to sourcing, say, an ATmega88P versus an ATmega328P, so simply reducing the code size isn’t necessarily a big deal for hobby projects, but at the least I have a marketable skill! (And in larger volume, an 88P might actually be notably cheaper–or even just easier to source–than a 328P.)
I tried to do it mainly for educational purposes from the ground up. I think it just makes you feel proud when you did something completely from scratch (and it works!) vs. just using someone’s library. :)
Another advantage of this approach is that you can use any GPIO pin, not only the ones tied to the USI.
276 bytes of flash for code is over 4x the size of my half-duplex soft UART that you linked to. It’s not just for single-pin operation as a one-line code change lets it work with separate Rx and Tx pins.
Or just use the version I tweaked further that defaults to PB0/PB1 for Rx/Tx. It takes 54 bytes of flash for the code.
https://github.com/nerdralph/nerdralph/tree/master/avr/libs/bbuart
You are missing the point here. Your code uses an idle loop for delays, wasting those cycles. Your code is stuck in the rx and tx procedures until the end of the byte transmission.
[MarcelMG]’s point is to avoid that by using a timer and interrupts.
Pretty sure the graphic for this post would have been a meme for some AVR or PIC Fanboy page 10-15 years ago. Microchip logo on an AVR part.
Better to have a Microchip logo on the package than a XiaoJing logo ;)
Atmel sold some IP to Asia, and compatible chips with ATmega cores have been on the market for a long time.
Check https://en.wikipedia.org/wiki/AVR_microcontrollers#Other_vendors
I implemented a soft RX routine that doesn’t waste any CPU cycles by timestamping signal transitions within the interrupt routine and reconstructing bytes offline.
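For readers wondering what reconstructing bytes from timestamps might look like, here is a rough sketch. It is not the code from the linked soft_rx.c: the function name `decode_edges()`, its parameters and the 8N1 framing are assumptions made for illustration. The idea is that the interrupt routine only records a timer value at every signal transition, and the main loop later walks that list, working out the line level in the middle of each bit slot.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode 8N1 bytes from edge timestamps (hypothetical helper).
 * edges[]   times, in timer ticks, at which the line toggled, starting from idle high
 * bit_ticks length of one bit period in the same ticks
 * Returns the number of bytes written to out[]. */
size_t decode_edges(const uint32_t *edges, size_t n_edges,
                    uint32_t bit_ticks, uint8_t *out, size_t out_cap)
{
    size_t count = 0;
    size_t k = 0;

    while (k < n_edges && count < out_cap) {
        uint32_t start = edges[k++];   /* falling edge that opens the start bit */
        uint8_t level = 0;             /* line level after that edge */
        uint8_t byte = 0;

        /* Slots 1..8 are the data bits, slot 9 is the stop bit. */
        for (uint32_t slot = 1; slot <= 9; slot++) {
            uint32_t sample = slot * bit_ticks + bit_ticks / 2u;  /* mid-slot offset */
            /* Apply every edge that happened before this sample point.
             * The unsigned subtraction keeps this correct across timer wraparound. */
            while (k < n_edges && (uint32_t)(edges[k] - start) <= sample) {
                level = (uint8_t)!level;
                k++;
            }
            if (slot <= 8 && level)
                byte |= (uint8_t)(1u << (slot - 1u));   /* data arrives LSB first */
        }

        if (level) {
            out[count++] = byte;       /* stop bit was high: accept the byte */
        } else if (k < n_edges) {
            k++;                       /* framing error: skip the edge back to idle */
        }
    }
    return count;
}
```

The trade-off is the one raised in the reply: each transition costs a timestamp’s worth of RAM until the main loop gets around to decoding it. The sketch also assumes the final frame in the buffer has finished arriving before it is decoded.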
https://github.com/sdo9/talking-multimeter/blob/master/dmm-talking-arduino-sketch/soft_rx.c
nice idea, just needs more RAM.
I did a similar routine for the Z-80 CPU (which has no UART), 35 years ago.
More people need to learn how to do stuff like this! Writing highly optimized code that deals with low-level stuff is becoming a lost art. If nothing else I’m glad we have platforms like Arduino today that make the barrier to entry lower, but I honestly have more fun trying to figure out how to get something to work in 2K of memory than I do writing “normal code” for modern computers.
This is a bit of an older post, and I’m a bit of a novice, but I arrived here by trying to address a perennial issue with assorted flight controllers: too few UARTs and I2C ports (or no I2C at all). I’m getting familiar with all this, but it sure seems to me like it should be possible (or easy even!) to consolidate the usual assortment of navigation sensors into a single low-bitrate, low-frequency data stream. The three usual sensors are GPS (2 coords, typically occupying one UART), magnetometer (compass, typically I2C), and barometer (also I2C, but it can share the bus with the compass). From my layman’s perspective it seems like we could bit-bang that data into a single UART using a resource/technique like this and allow the flight controller to read and react to it. Please let me know if/why I’m wrong. These three sensors enable a whole assortment of features, so I’m always saddened when I can’t get this critical trifecta.