Often the Morse Code centered projects that we feature are to help you practice transmitting messages. This one takes a tack and builds an automatic decoder. We think [Nicola Cimmino’s] project is well worth featuring simply based on his explanation of the Digital Signal Processing used on the signal coming in from the microphone. Well done. But he’s really just getting warmed up.
What makes this really stand out is a brilliant algorithm that allows conversion from Morse to ASCII using a lookup table of only 64 bytes. This provides enough room for A-Z and 0-9 without chance of collision but could be expanded to allow for more characters. Below is a concise description of how the algorithm works but make sure you take the time to read [Nicola’s] project description in its entirety.
The algorithm can be decribed as follows. Have an index inside the lookup string inizialied to zero. Have an initial dash jump size of 64. At every received element (dot or dash) halve the initial dash jump and then increase by 1 the index inside the lookup string if a dot was received and by dash jump size if a dash was received. Repeat until a letter separator is reached, at that point the index inside the lookup string will point to the ASCII corresponding to the decoded morse.
Have you heard of this technique before? If so, tell us about it in the comments below. Before you jump all over this one, realize that Magic Morse uses a different technique.
27 thoughts on “Morse Decoder’s Lean And Sexy Search Algorithm”
If I’m reading it properly, this is just a 6-deep binary tree stored in linear format — as commonly used in heaps. Actually it’s a nonstandard version of linear format, but it seems to work just as well (the standard version would give a string of “.ETIANMSURWDKGOHVF.L.PJBXCYZQ..54.3…2…….16…….7…8.90”, also 64 characters)
His string is:
So for your 543210 bit positions, his is 123450
Sound a bit like a binary search in which you take left side or right side. That would eventually get you to the letter based on the length of the symbol.
In this case, the two sides are folded together and offset by one. Not sure if there are any computational significant advantages as the search algorithm is of the same order as binary search. May be you would on the number of local temp variables.
It is different way of storing the search data and that’s about it.
incidentally, searching for “ETIANMSURWDKGOHVF” on Google turns up a substantial number of hits. “EISH54V” gets many less, but one of them is a solution to LiraNuna’s morse code golf problem on SO, so apparently someone thought of that representation before.
Can you not treat each if dot/dash as a binary bit ‘0’ or ‘1’ in serial bit stream and treat “word” as a binary and use that to directly index into the array for look up?
i.e. serial -> parallel conversion -> look up number -> letter
tekkieneet: you can’t, because if you did, U = “..-” = 001 and T = “-” = 1 would get the same index. So would L = “.-..” = 0100 and D = “-..” = 100, and many other pairs. You need an extra bit to account for the length of the code. You could do it by prepending a 1 bit to all of them, which gets you yet another 64-entry table :)
Ah. Forgot about the variable length part.
So it is really has 3 values: dot/dash/end of word, with the end of word only happens once. Hence your suggestion of prepending would work.
Sounds like something that can be implemented in a CPLD + some timing logic and drives a 14/16 segment display. (pun intended)
Saw this (or at least similar) algorithm in a Swedish computer magazine in the 80’s. I thought it was brilliant back then, but it is (as many before me points out) a binary tree in linear form.
At my mind, the performance of any algo isn’t in the searching of the character in any matrix, but in the detection of the valid code despite dots and dashes that haven’t the good shape during the transmitting.
It is generally the case during manual transmitting.
But perhaps, I didn’t really understood your algo ?
That’s what I was thinking too… and if you want to correct poorly transmitted/received data, a simpler lookup table may be more flexible. The speed and memory requirements for Morse code are very low anyway.
The letters and digits in Morse are all five or fewer elements, but several common symbols are longer than that. The question mark, period, and comma are all six elements long. There are quite a few prosigns in common use that are composed of double characters with no space in between them — for example, BK, (meaning roughly “over”, or “It’s your turn to send” is formed with an B and K stuck together with no space. That’s seven symbols, -…-.- A sequence of eight dots is the “telegrapher’s backspace”, in other words, it means that there was an error in the previous word or letter. At nine elements, perhaps the longest prosign is also the most commonly known one, SOS. Though it’s written with three letters, it is sent on the air as one long character, with no space between the dots and dashes.
More of the common prosigns are listed on Wikipedia http://en.wikipedia.org/wiki/Prosigns_for_Morse_code
So while that 64 byte table may handle over 90% of the Morse characters transmitted on the air, it will miss some important ones.
73 de AG6QR
Thanks. Learned a lot in that article. Who am I kidding- I was made aware of how little I know.
That was actually one way, long ago (about 45 years ago) that Morse code was taught. I can recall my dad bringing home sheets of paper with one “tree” (what we’d call a binary tree today) on each page. I no longer have the pages and can’t credit the original author.
The morse code tree is what I saw as i read this description.
In fact you can see the sequence running from left to right in the diagram. Hit submit too quickly. 73
Yeah, this seems like the obvious way to encode it. I did the same thing for a demo back in college: http://www.kk4ead.org/pic16cw.asm
I was always thinking of how a keying device could be used in place of a keyboard. This might a good way to start.
Here’s a ready made solution for that :)
Damnit, I really need to learn German. lol.
You’re not the first one who was thinking about that. You can find a morse key-keyboard using an Arduino over here: http://www.elektronik-labor.de/Arduino/MorseKB.html
And a similar one using an AtTiny over here: http://www.elektronik-labor.de/AVR/VUSBmorse.html
Thank you. Love the second circuit. It looks simple and interesting.
Come to think of it… isn’t Morse code formally an expression of asynchronous sequential logic?
The algorithm to me looks like a DFA that has been modified to trade computations efficiency (storing the pointer to the next state in the table) for storage efficiency (calculating the next position in a binary tree). If what your trying to recognize fits compactly into a binary tree (such trying to decode Morse dots and dashes) then this works nicely.
I’m doing something similar using a CPLD for a project. The crazy thing is just how compact the code is when a lookup table enters the picture. Without it, you’d end up with a switch statement with at least 100 potential outcomes. With it, I think it’s around 10 lines of code and still could be shorter if I didn’t unroll a loop unnecessarily.
Im new to this what would the premp circuit look for this kind of microphone?
Please be kind and respectful to help make the comments section excellent. (Comment Policy)