Conventional wisdom holds that the best way to learn a new language is immersion: just throw someone into a situation where they have no choice, and they’ll learn by context. Militaries use immersion language instruction, as do diplomats and journalists, and apparently computers can now use it to teach themselves Morse code.
The blog entry by the delightfully callsigned [Mauri Niininen (AG1LE)] reads like a scientific paper, with good reason: [Mauri] really seems to know a thing or two about machine learning. His method uses curated training data to build a model, namely Morse snippets and their translations, as is the usual approach with such systems. But things take an unexpected turn right from the start, as [Mauri] uses a Tensorflow handwriting recognition implementation to train his model.
Using a few lines of Python, he converts short, known snippets of Morse to a grayscale image that looks a little like a barcode, with the light areas being the dits and dahs and the dark bars being silence. The first training run only resulted in about 36% accuracy, but a subsequent run with shorter snippets ended up being 99.5% accurate. The model was also able to pull Morse out of a signal with -6 dB signal-to-noise ratio, even though it had been trained with a much cleaner signal.
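The image-conversion idea is simple enough to sketch. This is not [Mauri]'s actual code, just a minimal illustration of turning Morse timing into a barcode-like grayscale array (the character set and `unit` width are arbitrary choices here):

```python
import numpy as np

# Standard Morse timing ratios: dit = 1 unit, dah = 3 units,
# 1 unit between elements, 3 units between letters.
MORSE = {"A": ".-", "B": "-...", "E": ".", "T": "-"}  # tiny subset

def morse_to_row(text, unit=4):
    """Render a Morse string as a 1-D intensity row: bright = key down."""
    row = []
    for ch in text.upper():
        for sym in MORSE[ch]:
            on = 1 if sym == "." else 3      # dit = 1 unit, dah = 3 units
            row += [255] * (on * unit)       # key down -> light pixels
            row += [0] * unit                # inter-element gap -> dark
        row += [0] * (2 * unit)              # pad the gap to 3 units between letters
    return np.array(row, dtype=np.uint8)

row = morse_to_row("ET")
# Tile the row vertically so it looks like a barcode image
img = np.tile(row, (16, 1))
```

Stacked this way, the light bars are the dits and dahs and the dark bars are silence, which is close enough to handwriting strokes for an HTR network to chew on.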
Other Morse decoders use lookup tables to convert sound to text, but it’s important to note that this one doesn’t. By comparing patterns to labels in the training data, it inferred what the characters mean, and essentially taught itself Morse code in about an hour. We find that fascinating, and wonder what other applications this would be good for.
Thanks to [Gordon Shephard] for the tip.
19 thoughts on “Machine Learning System Uses Images To Teach Itself Morse Code”
Morse code using machine learning? Seriously?
Waiting for the next “breakthrough in AI”: decoding I2C using GPU clusters.
It is a bit like I2C, except it’s sent by humans, who tend to have inaccurate oscillators. The receiver also doesn’t get the luxury of a discrete clock signal for reference, and the signals often travel intercontinental distances, which makes them just a bit noisy.
Now try dolphins.
That I would like to see.
Given how simple and straightforward Morse code is, what’s the point of having a machine learn it? Simply tell it that dot-dash is A and so forth.
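The “simply tell it” approach really is just a lookup table. A minimal sketch (small subset of the code table for illustration):

```python
# A plain lookup table from dit/dah patterns to characters
CODE = {
    ".-": "A", "-...": "B", "-.-.": "C", ".": "E",
    "..": "I", "-": "T", "..-": "U",
}

def decode(patterns):
    """patterns: list of dit/dah strings, one per character."""
    return "".join(CODE.get(p, "?") for p in patterns)

print(decode(["..", "-"]))  # "IT" -- the same dits and dahs,
print(decode(["..-"]))      # grouped differently, spell "U"
```

The catch, as the replies below point out, is that this assumes something upstream has already turned audio into cleanly separated patterns.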
How does a computer tell what is a “dot” and what is a “dash”? How does it know whether a run of sounds is one character or two? For example, “IT” and “U” have the same pattern of dits and dahs; how do you distinguish between them?
Reception will involve static, interfering signals, and fading, not to mention human-induced variations in sending (each operator has a characteristic “fist” that’s different from another’s).
From what I understand of existing code readers, they can do a pretty good job, but not a perfect one. I haven’t seen one in person.
This would be a good project for a neural network. It’s relatively simple but not trivial.
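The classical (non-ML) answer to the “how does it know” question is Morse's own timing ratios: a dah is three dit-lengths, a letter gap is three, a word gap is seven. A minimal sketch, assuming an envelope detector has already produced a list of (key-down?, duration) pairs:

```python
def classify(durations, unit):
    """durations: list of (is_on, length) pairs from an envelope detector.
    Classic timing rules: dah = 3 dits, letter gap = 3 dits, word gap = 7."""
    out = []
    for is_on, length in durations:
        units = length / unit
        if is_on:
            out.append("." if units < 2 else "-")
        else:
            if units >= 5:
                out.append(" / ")   # word gap
            elif units >= 2:
                out.append(" ")     # letter gap
            # ~1-unit gaps separate elements within a character: no marker
    return "".join(out)

# "IT" vs "U": identical dits and dahs, different gap lengths
it = [(1, 1), (0, 1), (1, 1), (0, 3), (1, 3)]   # .. <letter gap> -
u  = [(1, 1), (0, 1), (1, 1), (0, 1), (1, 3)]   # ..-
```

With a clean machine-sent signal the thresholds are trivial; with a human “fist”, fading, and static, picking `unit` and the thresholds is exactly where these decoders fall down, and where a learned model might do better.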
A computer is not needed – a state machine that walks a tree will do. This is how the Navy did it with AN/UGA-3 Morse to TTY converters back in the mid 1960s. They did not work that well when trying to convert sloppy code, but apparently did for clean code.
I understand the machines did not last long in service, as the good old human ear and brain is so much better (and it gives the radioman something to do).
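The “state machine that walks a tree” approach is worth sketching: from the root of a binary tree, a dit steps one way and a dah steps the other, and a gap emits whatever letter the current node holds. For brevity this sketch stores the tree as a path-to-letter map rather than explicit nodes (small subset shown):

```python
# Each key is the dit/dah path from the root; the value is the letter
# at that node. A full tree has ~60 nodes; this is a subset.
TREE = {
    ".": "E", "-": "T",
    "..": "I", ".-": "A", "-.": "N", "--": "M",
    "...": "S", "..-": "U", "-..": "D", "---": "O",
}

def decode_stream(symbols):
    """symbols: iterable of '.', '-', and ' ' (a gap ends a character)."""
    out, path = [], ""
    for s in symbols:
        if s == " ":
            out.append(TREE.get(path, "?"))   # emit letter, return to root
            path = ""
        else:
            path += s                          # walk one level deeper
    if path:
        out.append(TREE.get(path, "?"))
    return "".join(out)

print(decode_stream("... --- ..."))  # SOS
```

As with the AN/UGA-3, everything depends on the upstream stage handing over clean dots, dashes, and gaps; sloppy code breaks the walk.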
I’d meant to mention that in Ham Radio magazine around 1971, someone wrote about their Morse decoder, built with transistors and maybe some logic ICs. Very complicated at the time, but yes, done without a computer. I think he did some preprocessing to more easily distinguish dots from dashes.
These things are worth studying to see if they offer a different path to bring into the computer age.
It proves that you can teach a NN to recognise any related modulation scheme and symbol encoding method. The next step would be to do unsupervised learning where the patterns are automatically clustered and classified so you are extracting symbols from an unknown protocol, then you just need to decide what the symbols mean, which another NN could do by looking at the relationship between symbols in the stream. AI is like that, many derps doth make the genius.
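A first unsupervised step is easy to picture: cluster the mark durations into two groups and you’ve recovered “dit” and “dah” with no labels at all. A toy sketch with a tiny 1-D k-means (the duration numbers are hypothetical, meant to mimic hand-sent code):

```python
def kmeans_1d(xs, iters=20):
    """Two-cluster 1-D k-means; returns the two centroids (low, high)."""
    c0, c1 = min(xs), max(xs)            # init centroids at the extremes
    for _ in range(iters):
        a = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        b = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(a) / len(a), sum(b) / len(b)
    return c0, c1

# Noisy mark lengths from a hand-sent signal (made-up numbers)
marks = [9, 11, 10, 31, 28, 12, 30, 8, 29]
dit, dah = kmeans_1d(marks)
symbols = ["." if abs(m - dit) <= abs(m - dah) else "-" for m in marks]
```

Mapping the clustered symbols onto meaning, as the comment says, is the genuinely hard second step.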
Morse code is well defined. The problem is it’s usually sent by people, not machine, so it varies from person to person, and maybe even varies as one person sends.
So machines can get confused by that.
I think it was the eighties, but I don’t remember an exact date: 73 Magazine had an article about a CW processor. Not just a filter for the tone and a detector to turn it into DC, but a few op-amps to try to decide what was a dot and what was a dash before the signal was sent into the computer.
The computer was then presented with dots, dashes, and I think spaces, and just had to translate them into characters. It made sense, though I guess computers today could as easily do the preprocessing themselves.
-6 dB signal-to-noise ratio at an unspecified bandwidth, eh?
The code that adds the noise to the audio is here: https://github.com/ag1le/LSTM_morse/blob/master/MorseDecoder.py#L293-L314. In this experiment I used an 8 kHz sampling frequency, so the audio bandwidth is 4 kHz. When demodulating the audio I use a 3rd-order Butterworth lowpass filter with a cutoff frequency of 25 Hz – see the code here: https://github.com/ag1le/LSTM_morse/blob/master/MorseDecoder.py#L81-L98
>> “Conventional wisdom holds that the best way to learn a new language is immersion… ”
In my experience, this is not true. I read 7 languages, and speak several of them, so I’ve thought about this a lot.
I’ve seen people try the sink-or-swim method, and many times they drown. It’s very frustrating and people don’t have the tools to make sense of the new language.
In machine learning, I would guess that the system needs to be told which parts of the input are valid, otherwise you get false matches (e.g. the system might pay attention to irrelevant patterns in the background of an image). You could call this “modified immersion.”
There *are* some cases where immersion learning works okay (e.g. young children or when you only need a limited vocabulary, for example bargaining at a market).
Indeed. What people forget is that adults do not get exposed to the same basic level of interactions that kids do. People are also less helpful and patient when an adult asks for unknown words or explanations.
The amount of necessary data and correlations is just not there, the information is way too “high-level” and specific to learn just by “sink-or-swim”.
An adult learns much better by “compressed learning”, or difference learning: using similarities, key concepts, and basic differences, then using that theoretical model to build an intuitive one. Then you can hone the details by practicing in the everyday world of an adult.
If you struggle with basic communication, barely anyone will provide you with the necessary information or rich enough data. So many concepts already exist, and differ between languages, that you need to make sense of them first and make a new coherent whole out of it all.
Bilingual kids take longer to start speaking for this very reason. But as an adult you start with much more developed and detailed concepts, so you cannot just start over and relearn from scratch.
But with rudimentary skills, you can use the vast amount of now-useful information to fine-tune, and let it trickle down slowly into your already existing intuitive models.
The images that AG1LE shows seem to only drop the contrast rather than increase the noise level. To those above who think reading Morse is easy (I could never get the hang of it, even on noiseless training tapes): it isn’t that hard when the signal is clear, but a signal that has come from around the world has picked up all sorts of noise and interference; signal levels fade and noise sources pop, so AI might be able to read a low-SNR signal where a human would struggle to decide whether that last pulse was a dot, a dash, or a pop. So I see a lot of value in this.
The test images shown just look like a drop in contrast (i.e. signal above a noise floor) rather than random noise rising to cover the signal; at -6 dB I’d realistically expect something more like static, with faint lines barely visible in the background. Impressive work, but I’m not sure the test data is a realistic representation of RF noise!
The images are created from noisy audio files. As explained in the article, a few lines of Python code take a 4-second sample from an existing WAV audio file, find the signal’s peak frequency, demodulate and decimate the data so that we get a (1, 4096) vector, reshape it to (128, 32), and write it into a PNG file. You are correct that the test data is not a realistic representation of RF noise – I am using Gaussian white noise mixed with the CW signals. There are other software packages, such as Dave W1HKJ’s excellent linsim package (http://www.w1hkj.com/files/test_suite/guide.html), that can produce more realistic RF propagation noise and interference on audio files.
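The pipeline AG1LE describes (find the peak frequency, demodulate, decimate to a 4096-sample vector, reshape to 128×32) can be sketched in a few lines of NumPy. This is an illustration with a synthetic signal, not the repository code; the 700 Hz tone is an assumed value, and the PNG-writing step is omitted:

```python
import numpy as np

fs = 8000
n = 4 * fs                         # a 4-second sample, as described above
t = np.arange(n) / fs
audio = np.sin(2 * np.pi * 700 * t)        # assumed 700 Hz CW tone

# 1) find the signal's peak frequency via an FFT
spectrum = np.abs(np.fft.rfft(audio))
peak_hz = np.fft.rfftfreq(n, 1 / fs)[spectrum.argmax()]

# 2) demodulate (rectify) and decimate down to 4096 samples
#    by block-averaging consecutive samples
envelope = np.abs(audio)
vec = envelope[: n - n % 4096].reshape(4096, -1).mean(axis=1)

# 3) reshape the (4096,) vector into a (128, 32) image for the
#    handwriting-recognition network (PNG writing omitted here)
img = vec.reshape(128, 32)
```

The 128×32 shape matches the fixed input size of the handwriting-recognition model mentioned in the article.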
Related survey of methods here: https://www.hindawi.com/journals/wcmc/2019/5629572/
There is also a project on github somewhere that does signal type recognition.
Converting the time series to a 2D “barcode” representation strikes me as odd; it seems this simply repeats the horizontal dimension along the newly added vertical dimension (and hence doesn’t really add anything new). Maybe it was done to allow using an out-of-the-box model expecting 2D input, but I think this only results in more computation for the same performance.
As explained in the article, I took a handwriting-recognition ML network and used it to train on and decode Morse code. I’m sure a lot of performance optimizations could be done… this was just a proof of concept that decoding Morse code from noisy audio files is possible using deep-learning ML models.
Please be kind and respectful to help make the comments section excellent. (Comment Policy)