The range of characters that can be represented by Unicode is truly bewildering. If there’s a symbol that was ever used to represent a sound or a concept anywhere in the world, chances are pretty good that you can find it somewhere in Unicode. But can many of us recall the proper keyboard calisthenics needed to call forth a particular character at will? Probably not, which is where this Unicode binary input terminal may offer some relief.
“Surely they can’t be suggesting that entering Unicode characters as a sequence of bytes using toggle switches is somehow easier than looking up the numpad shortcut?” we hear you cry. No, but we suspect that’s hardly [Stephen Holdaway]’s intention with this build. Rather, it seems geared specifically at making the process of keying in Unicode harder, but cooler; after all, it was originally his intention to enter this in last year’s Odd Inputs and Peculiar Peripherals contest. [Stephen] didn’t feel it was quite ready at the time, but now we’ve got a chance to give this project a once-over.
The idea is simple: a bank of eight toggle switches (with LEDs, of course) is used to compose the desired UTF-8 character, which is made up of one to four bytes. Each byte is added to a buffer with a separate “shift/clear” momentary toggle, and eventually sent out over USB with a flick of the “send” toggle. [Stephen] thoughtfully included a tiny LCD screen to keep track of the character being composed, so you know what you’re sending down the line. Behind the handsome brushed aluminum panel, a Pi Pico runs the show, drawing glyphs from an SD card containing 200 MB of TrueType font files.
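For a feel of what the operator is doing at the panel, here is a minimal Python sketch of the compose-then-send flow. It is not [Stephen]’s firmware; the function names `switches_to_byte` and `expected_length` are made up for illustration, and the only thing it relies on is the standard UTF-8 rule that the first byte announces how many bytes the character takes.

```python
def switches_to_byte(switches: str) -> int:
    """Turn eight switch positions, written MSB first like '11100010', into a byte."""
    if len(switches) != 8 or set(switches) - {"0", "1"}:
        raise ValueError("expected exactly eight 0/1 switch positions")
    return int(switches, 2)


def expected_length(lead: int) -> int:
    """Number of bytes in a UTF-8 sequence, judged from its first byte."""
    if lead & 0b1000_0000 == 0b0000_0000:
        return 1  # 0xxxxxxx: plain ASCII
    if lead & 0b1110_0000 == 0b1100_0000:
        return 2  # 110xxxxx
    if lead & 0b1111_0000 == 0b1110_0000:
        return 3  # 1110xxxx
    if lead & 0b1111_1000 == 0b1111_0000:
        return 4  # 11110xxx
    raise ValueError("not a valid UTF-8 lead byte")


# "Shift" three bytes into the buffer, one switch pattern at a time.
buffer = bytearray()
for pattern in ("11100010", "10000010", "10101100"):
    buffer.append(switches_to_byte(pattern))

# The lead byte 0xE2 promises a three-byte character, so the buffer is complete.
complete = len(buffer) == expected_length(buffer[0])
character = buffer.decode("utf-8")  # this is the "send" step
print(complete, character)
```

The three patterns above are the bytes E2 82 AC, which decode to the euro sign (U+20AC).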
At the end of the day, it’s tempting to look at this as an attractive but essentially useless project. We beg to differ, though — there’s a lot to learn about Unicode, and [Stephen] certainly knocked that off his bucket list with this build. There’s also something wonderfully tactile about this interface, and we’d imagine that composing each codepoint is pretty illustrative of how UTF-8 is organized. Sounds like an all-around win to us.
That thing looks beautiful. Really well done! I want two of them.
Absolutely beautiful!
I love it!
Art I would put in my living room.
Beautiful
“If there’s a symbol that was ever used to represent a sound or a concept anywhere in the world, chances are pretty good that you can find it somewhere in Unicode.”
Does anyone know about the utf-8 code for cuneiform 20? I found U+12399. That doesn’t want to display on my work PC (Win 10).
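For reference, the UTF-8 bytes for a code point like U+12399 can be worked out by hand. The sketch below is a plain Python illustration of the standard encoding rules (the helper name `utf8_encode` is made up here), checked against Python’s built-in encoder.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


encoded = utf8_encode(0x12399)
print(encoded.hex(" ").upper())                 # F0 92 8E 99
print(" ".join(f"{b:08b}" for b in encoded))    # the same bytes as switch positions

# Cross-check against the built-in encoder.
matches_builtin = encoded == chr(0x12399).encode("utf-8")
```

So on the terminal in the article, this character would take four bytes: F0, 92, 8E, 99.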
It’ll only display if you use a font that has the appropriate glyph. Many run of the mill ones don’t carry the more obscure stuff like that.
Yah. It’s weird though. 1-9, 10, 30, 40 and 50 are good. It’s base 60 so those plus 20 are all you need. At least it is assuming 2 characters per place value. But it’s only 20 that apparently isn’t in the default font. But there are a lot of variants of each number, I was just picking the ones that looked most like the cuneiform tutorials I found online. But I only found the one non-working 20. I thought maybe there is another 20 that is more commonly included in fonts but wasn’t on the lists I found.
You might want to install one of the Noto fonts (search for “noto fonts” on wikipedia).
Be aware that any font or fontset that covers a major portion of the whole Unicode space is going to be HUGE, and can noticeably slow down Photoshop and CAD software simply by being installed (they often try to scan all characters of all fonts to build up an idea of character geometry, etc, and that’s a LOT of glyphs). Better software processes and caches that metadata once, but since fonts weren’t traditionally that large, a lot of software just brute-forces it every time it needs the info. A machine with fonts for comprehensive Unicode can see a significant delay every time the application does anything related to fonts.
As of April 2021, the Google-sponsored noto font kit covers 95% of all non-CJK glyphs, and 32% of all CJK glyphs, for a combined coverage of 54% of the glyphs defined in the Unicode 13 standard.
Since nobody normally needs ALL of the Unicode glyph space, there are tools that let you rebuild a Noto font to cover just the glyphs you think you will need. This can often dramatically reduce the size and the slowdowns, while still giving you every glyph you would ever actually use.
Oh, wow. This is great, thank you!
This reminds me of the projects that use switches to drive an HD44780U-compatible 1602 LCD.
Why is the example character a wolfsangel?
I looked it up for you: https://en.wikipedia.org/wiki/Nuosu_language. Yi Syllable Xyx
It’s still a very red flag-y character to choose when showcasing your project
Agreed, of all the characters they chose to showcase they chose this.
oy vey
Hey, a few people have pointed out similarities to other symbols. This Yi character I picked at random is from a writing system used by 2 million people to communicate, but we’re very good at pattern recognition. To keep comments focused on the build, I re-shot the main image on the project page, which was updated shortly after your comment.
Can’t tell the size from the photo, but it’d be even better if it fit into a 5 1/4″ drive bay :)
It’s just shy of 8″ so would need some rejiggering, but I love the idea! Drive bay accessories from the 2000s vibes
Huh, really nice. Then again, I’ve got a salvaged panel with a set of hexadecimal thumbwheel switches (two blocks of eight wheels each), and for several years I’ve been wondering what exactly I should use them for. Selecting Unicode characters or sequences might be a nice application. One block would be enough to select a character using UTF-32, though it might be more fun to use both blocks and treat them as eight bytes of a UTF-8 stream, and just make room on the display for up to eight characters.
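A rough Python sketch of both thumbwheel ideas, assuming each wheel shows one hex digit: eight wheels spell one UTF-32 value, and sixteen wheels spell eight bytes of UTF-8. The function names are hypothetical and nothing here touches real hardware.

```python
def utf32_wheels(ch: str) -> str:
    """Eight hex digits that select a single character as a UTF-32 value."""
    return f"{ord(ch):08X}"


def decode_utf8_wheels(digits: str) -> str:
    """Treat sixteen hex digits as eight bytes of a UTF-8 stream and decode them."""
    if len(digits) != 16:
        raise ValueError("expected sixteen hex digits (two blocks of eight wheels)")
    return bytes.fromhex(digits).decode("utf-8")


one_block = utf32_wheels("\u20ac")
# E2 82 AC (3 bytes) + C3 A9 (2 bytes) + 48 69 21 (3 bytes) = 8 bytes
two_blocks = decode_utf8_wheels("E282AC" + "C3A9" + "486921")
print(one_block)
print(two_blocks)
```

Eight bytes hold anywhere from two four-byte characters to eight ASCII characters, which is why the display would need room for up to eight glyphs. A stream that ends in the middle of a multi-byte character raises `UnicodeDecodeError`, so a real build would need to decide how to show an incomplete setting.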
b̴̡̧̻̝̺̗̤͓̺͍̞̝̼̟̊̓̉̏͋̿͆̄̈́̚͜ͅǔ̷̡̱̖̻̠̯̎͒͛̀̐t̷̨̢̩̟̊͛͛̓̿͊̽̈̏͋̏͘͝͝͝ ̶̗̟͎̼̹͎̝̱̺͍͚̠̜̓̈́͋͐̀̄̎̀ç̷̮̺̮͚̪̺̙͈͎̟͈̻̖̔̃͆̔́̌̒̑̈́̓̋̐̀̚͘͜͜͝ȧ̷̡̨̙̞̖̹̖̥̯̱̹̥̖̞̃̕̕͘͜n̵̡̢̢̻̰͙̗̋̎̐́̓͋̀̈́̓͛̿̿͋͜͠ ̶̧͈̺̱͎͛̈́͝į̵̢̛̺̖̼̭̦̇̐̀͐̈́̒̄ͅt̸̹̲͈̿͊̂̄̆͊̈́̌̀̚͝ ̴̡̲̱͔̩͖͚̟̓̃́̾́̇̈̓̽̆͛̍͜r̵̨̢̛͕̠̯͚͖̺̫̭̦̬̙̰͒̀̈͆̓͗̑̆͒̾̒́̅̐͘e̷̜͊̽͆͐̑́ņ̵̰̝̮͓̖͍̊̒̅d̸̢̛̝̣͓̹̐̍̇ę̶̱̳͇͙̯͕͆̇̊́̃̓̌̐̊̊̌͛͜ŗ̴̨͈̫̦͓͚̞̜̩̤̙̣̪͙̀͂̂̌̉̈̍̋̊ͅ ̵̼̹̠̬̮̿̑͛͗̄̂͆̑̆̐̆̕͝͝͠͝Z̴̡̞̪̦̦̤̬̗̮͍͆̊̍͒̏̊̒̆͐̽̚̕͜͠ͅä̷̧̠͚́́̉͗͝ļ̴́͋̏̈́̃̔̃̋̾̈́̔͆͑͌̍̊g̵̛̛̲͐̈́̔̉̉́̑͜͝͠o̴̝̱͇̙̜̒͑͒͋̀̀̓̀̌̌͋̊͝͠͠ ̶̡̺͔̩̣̄̐̈́̀͛̌͘̕͠͝T̸̗͚̍̿e̶̡̫̦̯̲͇̙͑̈́̓͑̄̑͋̓͐͠x̷̨̛̺̰̬̰̱̱̦̌̓̌̈́̑̄̉͛̈́̕͝ţ̵̧̛͙̣͖͖̖̬͖̲̗̻̲̩̽̉͗̔͐͝͠
Why? So you can generate CAPTCHAs?
No, thanks, no fancy Unicode for me, I will stick to Alt codes.
Using a regex to parse XML, good use of UTF-8 in a stack exchange answer:
https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
> Using a regex to parse XML
as soon as you hear that sequence of words, run, run fast! Nothing good can ever come from trying to do that.
You can create a Unicode input with little more than an LED array. The first step is to turn it into an input device that can detect your fingertip so that you can draw the character you want. The next step is to use a K210 chip to run a neural network for recognising the character you want.
Why UTF-8? With only 24 more switches (and wouldn’t THAT be great?), you could have pure Unicode.
Why so many switches? With just a single switch, you can say anything.
It appears we hold mutually incompatible values.
I like your thinking. Going one step further, I nerd-sniped myself wondering how big a keyboard with standard spacing would need to be to cover every codepoint and still be physically possible to use. If you arranged it as a circular panel someone sits in the middle of, and assume they can reach 1m across and 1.5m vertically, the whole console would be ~12 metres in diameter with 80 rows of keys at a 40° slant. Seems practical.
Beautiful result and nicely documented. I liked seeing the different things tried and the justifications behind it. The 1980s style technical manual is a nice touch!
what’s unicode :-)
It is a unique code.
Or is it a unique ode.
I forget…
> “Surely they can’t be suggesting that entering Unicode characters as a sequence of bytes using toggle switches is somehow easier than looking up the numpad shortcut?”
That’s for Windows. The Unix & Plan9 side of things has had Compose keys since the ’80s, which allow for much more memorable sequences (say Compose, e, ‘ to get é): https://en.wikipedia.org/wiki/Compose_Key