Llama.ttf Is AI, In A Font

It’s a great joke, and like all great jokes it makes you think. [Søren Fuglede Jørgensen] managed to cram a 15 M parameter large language model into a completely valid TrueType font: llama.ttf. Being an LLM-in-a-font means that it’ll do its magic across applications – in your photo editor as well as in your text editor.

What magic, we hear you ask? Say you have some text, written in some non-AI-enabled font. Highlight that, and swap over to llama.ttf. The first thing it does is to change all “o” characters to “ø”s, just like [Søren]’s parents did with his name. But the real magic comes when you type a length of exclamation points. In any normal font, they’re just exclamation points, but llama.ttf replaces them with the output of the TinyStories LLM, run locally in the font. Switching back to another font reveals them to be exclamation points after all. Bønkers!

This is all made possible by the HarfBuzz font extensions library. In the name of making custom ligatures and other text shaping possible, HarfBuzz allows fonts to contain Web Assembly code and runs it in a virtual machine at rendering time. This gives font designers the flexibility to render various Unicode combinations as unique glyphs, which is useful for languages like Persian. But it can just as well turn all “o”s into “ø”s or run all exclamation points through an LLM.

Something screams mischief about running arbitrary WASM while you type, but we remind you that since PostScript, font rendering engines have been able to run code in order to help with the formatting problem. This ability was inherited by PDF, and has kept malicious PDFs in the top-10 infiltration vectors for the last fifteen years. [Citation needed.] So if you can model a CPU in PDF, why not an LLM in TTF? Or a Pokemon clone in an OpenType font?

We don’t think [Søren] was making a security point here, we think he was just having fun. You can see how much fun in his video demo embedded below.

Continue reading “Llama.ttf Is AI, In A Font”

In Future, Printer Documents You

[Jason Dookeran] reminded us of something we don’t like to think about. Your printer probably adds barely noticeable dots to everything you print. It does it on purpose, so that if you print something naughty, the good guys can figure out what printer it came from. This is the machine identification code and it has been around since the days that the US government feared that color copiers would allow wholesale counterfiting.

The technology dates back to Xerox and Canon devices from the mid-80s, but it was only publicly acknowledged in 2004. With color printers, the MIC — machine identification code — is a series of tiny yellow dots. Typically, each dot is about 10 microns across and spaced about a millimeter from each other. The pattern prints all over the page so that even a fragment of, say, a ransom note can be identified.

Continue reading “In Future, Printer Documents You”

This Week In Security: Chat Control, Vulnerability Extortion, And Emoji Malware

Way back in 2020, I actually read the proposed US legislation known as EARN IT, and with some controversy, concluded that much of the criticism of that bill was inaccurate. Well what’s old is new again, except this time it’s the European Union that’s wrestling with how to police online Child Sexual Abuse Material (CSAM). And from what I can tell of reading the actual legislation (pdf), this time it really is that bad.

The legislation lays out two primary goals, both of them problematic. The first is detection, or what some are calling “upload moderation”. The technical details are completely omitted here, simply stating that services “… take reasonable measures to mitigate the risk of their services being misused for such abuse …” The implication here is that providers would do some sort of automated scanning to detect illicit text or visuals, but exactly what constitutes “reasonable measures” is left unspecified.

The second goal is the detection order. It’s worth pointing out that interpersonal communication services are explicitly mentioned as required to implement these goals. From the bill:

Providers of hosting services and providers of interpersonal communications services that have received a detection order shall execute it by installing and operating technologies approved by the Commission to detect the dissemination of known or new child sexual abuse material or the solicitation of children…

This bill is careful not to prohibit end-to-end encryption, nor require that such encryption be backdoored. Instead, it requires that the apps themselves be backdoored, to spy on users before encryption happens. No wonder Meredith Whittaker has promised to pull the Signal app out of the EU if it becomes law. As this scanning is done prior to encryption, it’s technically not breaking end-to-end encryption.

You may wonder why that’s such a big deal. Why is it a non-negotiable for the Signal app to not look for CSAM in messages prior to encryption? For starters, it’s a violation of user trust and an intentional weakening of the security of the Signal system. But maybe most importantly, it puts a mechanism in place that will undoubtedly prove too tempting for future governments. If Signal can be forced into looking for CSAM in the EU, why not anti-government speech in China?

Continue reading “This Week In Security: Chat Control, Vulnerability Extortion, And Emoji Malware”

This Week In Security: Unicode Strikes Again, Trust No One (Redditor), And More

There’s a popular Sysadmin meme that system problems are “always DNS”. In the realm of security, it seems like “it’s always Unicode“. And it’s not hard to see why. Unicode is the attempt to represent all of Earth’s languages with a single character set, and that means there’s a lot of very similar characters. The two broad issues are that human users can’t always see the difference between similar characters, and that libraries and applications sometimes automatically convert exotic Unicode characters into more traditional text.

This week we see the resurrection of an ancient vulnerability in PHP-CGI, that allows injecting command line switches when a web server launches an instance of PHP-CGI. The solution was to block some characters in specific places in query strings, like a query string starting with a dash.

The bypass is due to a Windows feature, “Best-Fit”, an automatic down-convert from certain Unicode characters. This feature works on a per-locale basis, which means that not every system language behaves the same. The exact bypass that has been found is the conversion of a soft hyphen, which doesn’t get blocked by PHP, into a regular hyphen, which can trigger the command injection. This quirk only happens when the Windows locale is set to Chinese or Japanese. Combined with the relative rarity of running PHP-CGI, and PHP on Windows, this is a pretty narrow problem. The XAMPP install does use this arrangement, so those installs are vulnerable, again if the locale is set to one of these specific languages. The other thing to keep in mind is that the Unicode character set is huge, and it’s very likely that there are other special characters in other locales that behave similarly.

Downloader Beware

The ComfyUI project is a flowchart interface for doing AI image generation workflows. It’s an easy way to build complicated generation pipelines, and the community has stepped up to build custom plugins and nodes for generation. The thing is, it’s not always the best idea to download and run code from strangers on the Internet, as a group of ComfyUI users found out the hard way this week. The ComfyUI_LLMVISION node from u/AppleBotzz was malicious.

The node references a malicious Python package that grabs browser data and sends it all to a Discord or Pastebin. It appears that some additional malware gets installed, for continuing access to infected systems. It’s a rough way to learn. Continue reading “This Week In Security: Unicode Strikes Again, Trust No One (Redditor), And More”

This Week In Security: Recall, Modem Mysteries, And Flipping Pages

Microsoft is racing to get into the AI game as part of Windows 11 on ARM, calling it Copilot+. It’s an odd decision, but clearly aimed at competing with the Apple M series of MacBooks. Our focus of interest today is Recall, a Copilot+ feature that not only has some security problems, but also triggers a sort of visceral response from regular people: My computer is spying on me? Eww.

Yes, it really sort of is. Recall is a scheme to take screen shots of the computer display every few seconds, run them through character recognition, and store the screenshots and results in a database on the local machine hard drive. There are ways this could be useful. Can’t remember what website had that recipe you saw? Want to revisit a now-deleted tweet? Is your Google-fu failing you to find a news story you read last week? Recall saw it, and Recall remembers. But what else did Recall see? Every video you watched, ever website you visited, and probably some passwords and usernames you typed in.

Continue reading “This Week In Security: Recall, Modem Mysteries, And Flipping Pages”

Tunneling TCP By File Server

You want to pass TCP traffic from one computer to another, but there’s a doggone firewall in the way. Can they both see a shared file? Turns out, that’s all you need. Well, that and some software from [fiddyschmitt].

If you think about it, it makes sense. Unix treats most things as a file, so it is pretty easy to listen on a local TCP port and dump the data into a shared file. The other side reads the file and dumps the same data to the desired TCP port on its side. Another file handles data in the other direction. Of course, the details are a bit more than that, but that’s the basic idea.

Performance isn’t going to be wonderful, and the files keep growing until the program detects that they are bigger than 10 megabytes. When that happens, the program purges the file.

The code is written in C# and there are binaries for Windows and Linux on the release page. The examples show using shared files via Windows share and RDP, but we imagine any sort of filesystem that both computers can see would work. Having your traffic stuffed into a shared file is probably not great for security but, you know, you are already jumping a firewall, so…

Of course, no firewall can beat an air gap. Unless you can control the fans or an LED.

To the left, a breadboard with the ATMega328P being attacked. To the right, the project's display showing multiple ;) smiley faces, indicating that the attack has completed successfully.

Glitching An ATMega328P Has Never Been Simpler

Did you know just how easily you can glitch microcontrollers? It’s so easy, you really have no excuse for not having tried it out yet. Look, [lord feistel] is doing glitching attacks on an ATMega328P! All you need is an Arduino board with its few SMD capacitors removed or a bare 328P chip, a FET, and some sort of MCU to drive it. All of these are extremely generic components, and you can quickly breadboard them, following [lord feistel]’s guide on GitHub.

In the proof-of-concept, you can connect a HD44780 display to the chip, and have the victim MCU output digits onto the display in an infinite loop. Inside of the loop is a command to output a smiley face – but the command is never reachable, because the counter is reset in an if right before it. By glitching the ATMega’s power input, you can skip the if and witness the ;) on your display; it is that simple.

What are you waiting for? Breadboard it up and see for yourself, this might be the method that you hack your next device and make it do your bidding. If the FET-and-MCU glitching starts to fail you at some point, there’s fancier tools you can use, like the ChipWhisperer. As for practical examples, [scanlime]’s elegant glitching-powered firmware hack is hard to forget.