There’s No AI In A Markov Chain, But They’re Fun To Play With

Amid all the hype about AI it sometimes seems as though the world has lost sight of the fact that software such as ChatGPT contains no intelligence. Instead it’s an extremely sophisticated system for extracting plausible machine-generated content from the corpus on which it is trained. There’s a long history behind machine-generated text, and perhaps the simplest example comes in the form of a Markov chain. [Ben Hoyt] takes us through how these work, and provides some Python code so that you can roll your own.

If you’re uncertain what a Markov chain is, consider the predictive text on your phone. It works by offering the statistically most likely next word in your sentence, and should you accept all of its choices it will deliver sentences which are superficially readable but otherwise complete nonsense. [Ben] demonstrates with very short, simple source texts how a collocate probability map is generated for two-word phrases, and how a likely next word can be extracted from it. It’s not AI, but it can be a lot of fun to play with and it opens the door to the entire field of computational linguistics. We haven’t set one loose on Hackaday’s archive yet, but we suspect it would talk a lot about the Arduino.
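To make the two-word map concrete, here is a minimal sketch in Python. It is not [Ben]’s code: the function names `build_chain` and `generate` and the sample sentence are made up for illustration. Each pair of consecutive words becomes a key, and the value is the list of words seen after that pair, so a word that follows more often is proportionally more likely to be picked.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        # Repeats are kept on purpose: they encode how likely each follower is.
        chain[key].append(words[i + order])
    return chain


def generate(chain, start, length=20, seed=None):
    """Walk the chain from `start`, adding up to `length` randomly chosen words."""
    rng = random.Random(seed)
    output = list(start)
    for _ in range(length):
        followers = chain.get(tuple(output[-len(start):]))
        if not followers:  # dead end: this phrase was never followed by anything
            break
        output.append(rng.choice(followers))
    return " ".join(output)


# Hypothetical source text, chosen only to keep the map small.
SAMPLE = "the cat sat on the mat and the cat sat on the hat"
chain = build_chain(SAMPLE)
print(generate(chain, ("the", "cat"), length=8, seed=1))
```

With a source text this small the output mostly parrots the input; feed it a few thousand words and the superficially readable nonsense starts to appear.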

We’re talking about Markov chains here with respect to language, but it’s also worth remembering that they work for music too.

Header: Bad AI image with Dall-E prompt, “Ten thousand monkeys with typewriters”.

Native Alaskan Language Reshapes Mathematics

The languages we speak influence the way that we see the world, in ways most of us may never recognize. For example, researchers report seeing higher savings rates among people whose native language has limited capacity for a future tense, and one Aboriginal Australian language requires precise knowledge of cardinal directions in order to speak at all. And one Alaskan Inuit language called Iñupiaq is using its inherent visual nature to reshape the way children learn and use mathematics, among other things.

Arabic numerals are near universal in the modern world but, except perhaps for the number “1”, they are simply symbols representing ideas. Users must understand the quantities before they can engage with the underlying mathematical structure of this base-10 system. There are not only other bases, though, but also other ways of writing numbers. Iñupiaq counting is a base-20 system, and the characters for its numbers are drawn so that information about the numbers themselves can be read from their visual representation.
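For anyone who hasn’t worked outside base 10, a short Python sketch shows what positional base-20 notation means. It deals only with the arithmetic, not with the Iñupiaq characters themselves, and the function names `to_base20` and `from_base20` are hypothetical.

```python
def to_base20(n):
    """Return the base-20 digits of a non-negative integer, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, remainder = divmod(n, 20)
        digits.append(remainder)
        if n == 0:
            break
    return digits[::-1]


def from_base20(digits):
    """Rebuild the integer from its base-20 digits."""
    value = 0
    for digit in digits:
        value = value * 20 + digit
    return value


# 2023 = 5 * 400 + 1 * 20 + 3, so it needs three base-20 digits.
print(to_base20(2023))  # [5, 1, 3]
```

Each position is worth twenty times the one to its right, so a single digit can hold any value from 0 to 19.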

This leads to some surprising consequences, chiefly that certain operations like addition, subtraction, and even long division can be strikingly easy to do, since the visual nature of the characters makes it obvious what each answer should be. Often the operations can be seen as being done to the characters themselves, whereas in the Arabic system the idea of each number must be known before it can be manipulated in this way.

This project was originally started as a way to make sure that the Iñupiaq language and culture weren’t completely lost after centuries of efforts to eradicate them and other native North American cultures. But now it may eventually get its own set of Unicode characters, meaning that it could easily be printed in textbooks and used in computer programming, opening up a lot of doors not only for native speakers of the language but for those looking to utilize its unique characteristics to help students understand mathematics rather than just learn it.

The Curious Etymology Of The Elements

It’s not often that the worlds of lexicography and technology collide, but in a video by the etymologist [RobWords] we may have found a rare example. In a fascinating 16-minute video he takes us through the origins of the names you’ll find in the periodic table. Here’s a word video you don’t have to be on the staff of a dictionary to appreciate!

Etymology is a fascinating study, in which the scholar must disentangle folk etymologies and mistaken homophones to find the true root of a word. Fortunately in the case of most elements they bear a name bestowed on them by the scientists who discovered them, so their etymologies are rarely in dispute.

The etymologies split neatly into categories: Latin or Greek descriptions, places (including the Swedish village of Ytterby, which has more elements named after it than anywhere else), mythological figures, and people.

He artfully skates over the distinction between aluminium and a curiously similar metal the Americans call aluminum, because etymologists are used to deflecting controversy when language differences colour, or even color, people’s emotions. Thank you, Noah Webster!

It’s an entertaining diversion for anyone with a love of both science and of language, and should remind us that the study of language has just as much scientific rigour in its research as any of those elements.


Create A Compiler Step-By-Step

While JavaScript might not be the ideal language to write a production compiler, you might enjoy the “Create Your Own Compiler” tutorial that does an annotated walkthrough of “The Super Tiny Compiler” and teaches you the basics of writing a compiler from scratch.

The Super Tiny Compiler itself is about 200 lines of code. The source file is, well, over 1,000 lines, but that’s because of the literate programming comments. The fancy title comments are about half as large as the actual compiler.

The compiler’s goal is to take Lisp-style functions and convert them to equivalent C-style function calls. For example: (add 5 (subtract 3 1)) would become add(5,subtract(3,1)).

Of course, there are several shortcut methods you could use to do this pretty easily, but the compiler uses a structure like most full-blown modern compilers. There is a parser, an abstract representation phase, and code generation.
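To give a feel for that structure, here is a minimal sketch of the same three stages. It is written in Python rather than the tutorial’s JavaScript, it is not the tutorial’s code, and it collapses the abstract representation into a single nested tuple. All of the names (`tokenize`, `parse`, `generate`, `compile_lisp`) are hypothetical, and it handles only names and integer arguments.

```python
import re

# Parens, identifiers, and integer literals; anything else (e.g. spaces) is skipped.
TOKEN_RE = re.compile(r"\(|\)|[A-Za-z_]\w*|\d+")


def tokenize(source):
    """Split the source into a flat list of token strings."""
    return TOKEN_RE.findall(source)


def parse(tokens):
    """Build a nested AST: numbers become ints, calls become (name, [args])."""
    def peek(pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        return tokens[pos]

    def walk(pos):
        token = peek(pos)
        if token == "(":
            name = peek(pos + 1)
            if name in ("(", ")") or name.isdigit():
                raise SyntaxError(f"expected a function name, got {name!r}")
            pos += 2
            args = []
            while peek(pos) != ")":
                node, pos = walk(pos)
                args.append(node)
            return (name, args), pos + 1  # step past the closing paren
        if token.isdigit():
            return int(token), pos + 1
        raise SyntaxError(f"unexpected token {token!r}")

    ast, pos = walk(0)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after expression")
    return ast


def generate(node):
    """Emit a C-style call string from the AST."""
    if isinstance(node, int):
        return str(node)
    name, args = node
    return f"{name}({','.join(generate(arg) for arg in args)})"


def compile_lisp(source):
    return generate(parse(tokenize(source)))


print(compile_lisp("(add 5 (subtract 3 1))"))  # add(5,subtract(3,1))
```

Keeping the stages separate is what makes the approach scale: a real compiler would transform the tree between `parse` and `generate` without touching either.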


English Words In French Gaming? Non Merci!

Are you a gamer? If you’re French, it seems that you shouldn’t be using so much English in pursuit of your goals.

It’s a feature of an active language that it will readily assimilate words from others. Pizza, karaoke, vuvuzela, parka, gateau, schadenfreude: they have all played their part in bringing a little je ne sais quoi into our everyday speech. This happens as a natural process as whatever the word is describing becomes popular, and sometimes these new words cause a backlash from those who see themselves as the language’s defenders.

Often this is a fringe activity, such as the British politician who made a fool of himself in a radio interview by insisting on the now-archaic postal romanization “Peking” rather than the vastly more common Pinyin “Beijing”, but for some tongues it’s no laughing matter. Nowhere is this more the case than in the Francophone world, in which the Académie Française and the French and Québécois governments see themselves as very much the official guardians of French. And now it seems that the French ministry of culture have turned their eyes upon gamers.

It’s nothing new for words associated with technology to fall under this scrutiny. A quarter century ago in the CD-ROM business it was de rigueur for localized discs to talk about le logiciel, l’ordinateur, and télécharger instead of program, computer, and download. The talk of the industry was that Sony refused to do this for PlayStation consoles sold in Quebec during the 1990s, and thus all their sales in the province had to be under-the-counter. But there’s a sense from reading the reports that this intervention is a little clumsy; while it’s easy to say logiciel, we’re not so sure that jeu vidéo de compétition (video game competition) for e-sports and joueur-animateur en direct (live player-animator) for streamer aren’t just too much of a mouthful for easy adoption. For the first one, we can’t help remembering that sport is also an everyday French word, so couldn’t they have come up with something less clumsy such as réseau-sports, or network-sports?

Here at Hackaday more than one of us is an unrepentant Francophile, so the evolution of French words in our field is of interest to us. Habitez-vous en France ou au Québec? Donnez-nous vos idées dans les commentaires! (mais en anglais s’il vous plaît pour les Américains, excusez-nous)

Header image: Christopher Macsurak, CC BY-SA 4.0.

A Programming Language To Express Programming Frustration

Programming can be a frustrating endeavor. Certainly we’ve all had moments, such as forgetting punctuation in C or messing up whitespace in Python. Even worse, an altogether familiar experience is making a single change to a program that should have resulted in a small improvement but instead breaks the program. Now, though, there’s a programming language that can put these frustrations directly into the code itself in a cathartic, frustration-relieving syntax. The language is called AHHH and it’s quite a scream.

While it may not look like it on the surface, the language is Turing complete and can be used just like any other programming language. The only difference is that there are only 16 commands in this language, all of which are variants of strings of four upper- or lower-case “H” characters. The character “A” in the command “AHHH” starts the program, and from there virtually anything can be coded as a long, seemingly unending scream. The programming language is loosely related to COW, which uses various “moos” to create programs instead of screams, and of course is also distantly related to brainfuck, an esoteric programming language created in order to have the smallest possible compiler.
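The count of 16 is no accident: four characters that can each be upper or lower case give 2⁴ combinations. A quick Python sketch enumerates them; which operation each string actually performs is defined by the language itself and is not reproduced here.

```python
from itertools import product

# Every four-character string built from "h" and "H": 2 ** 4 = 16 of them.
variants = ["".join(chars) for chars in product("hH", repeat=4)]

print(len(variants))  # 16
print(variants[:4])   # ['hhhh', 'hhhH', 'hhHh', 'hhHH']
```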

While we can’t really recommend that beginner programmers start to learn this language instead of something more practical like Python, esoteric languages like these can teach us a lot about the way that computers work. This language, for example, lets you code in pixels instead of characters. Others are more for fun, such as this language which turns your code into an ’80s rock ballad.

Thanks to [Kyle F] for the tip!

Greeking Out With Arduinos

Learning a new language is hard work, but they say that the best way to learn something is to teach it. [Angeliki Beyko] is learning Greek, and what better way to teach than to build a vocabulary flash-card game from Arduinos, color screens, 1602 text screens, and arcade buttons? After the break, we have a video from the creator talking about how to play, the hardware she chose, and what to expect in the next version.

Pegboard holds most of the hardware except the color screens, which are finicky when it comes to their power source. The project is like someone raided our collective junk drawers and picked out the coolest bits to make a game. Around the perimeter are over one hundred NeoPixels to display the game progress and draw people like a midway game. Once invested, you select a category on the four colored arcade buttons by looking at the adjacent LCD screens’ titles. An onboard MP3 shield reads a pseudo-random Greek word and displays it on the top-right 1602 screen in English phonetics. After that, it is multiple choice with your options displaying in full-color on four TFT monitors. A correct choice awards you a point and moves to the next word, but any excuse to mash on arcade buttons is good enough for us.

[Angeliki] does something we see more often than before: she’s covering what she learned, what she struggled with, what she would do differently, and how she wants to improve. We think this is a vital sign that the hacker community is showcasing what we already knew: hackers love to share their knowledge and improve themselves.

Typing Greek with a modern keyboard will have you reaching for an alt-code table unless you make a shortcut keyboard, and if you learn Greek, maybe you can figure out what armor they wore to battle.
