Text Compression Gets Weirdly Efficient With LLMs

It used to be that memory and storage space were so precious and so limited of a resource that handling nontrivial amounts of text was a serious problem. Text compression was a highly practical application of computing power.

Today it might be a solved problem, but that doesn’t mean it doesn’t attract new or unusual solutions. [Fabrice Bellard] released ts_zip which uses Large Language Models (LLM) to attain text compression ratios higher than any other tool can offer.

LLMs are the technology behind natural language AIs, and applying them in this way seems effective. The tradeoff? Unlike typical compression tools, the lossless decompression part isn’t exactly guaranteed when an LLM is involved. Lossy compression methods are in fact quite useful. JPEG compression, for example, is a good example of discarding data that isn’t readily perceived by humans to make a smaller file, but that isn’t usually applied to text. If you absolutely require lossless compression, [Fabrice] has that covered with NNCP, a neural-network powered lossless data compressor.

Do neural networks and LLMs sound far too serious and complicated for your text compression needs? As long as you don’t mind a mild amount of definitely noticeable data loss, check out [Greg Kennedy]’s Lossy Text Compression which simply, brilliantly, and amusingly uses a thesaurus instead of some fancy algorithms. Yep, it just swaps longer words for shorter ones. Perhaps not the best solution for every need, but between that and [Fabrice]’s brilliant work we’re confident there’s something for everyone who craves some novelty with their text compression.

[Photo by Matthew Henry from Burst]

Turning Soviet Electronics Into A Nixie Tube Clock

Sometimes you find something that looks really cool but doesn’t work, but that’s an opportunity to give it a new life. That was the case when [Davis DeWitt] got his hands on a weird Soviet-era box with four original Nixie tubes inside. He tears the unit down, shows off the engineering that went into it and explains what it took to give the unit a new life as a clock.

Each digit is housed inside a pluggable unit. If a digit failed, a technician could simply swap it out.

A lot can happen over decades of neglect. That was clear when [Davis] discovered every single bolt had seized in place and had to be carefully drilled out. But Nixie tubes don’t really go bad, so he was hopeful that the process would pay off.

The unit is a modular display of some kind, clearly meant to plug into a larger assembly. Inside the unit, each digit is housed in its own modular plug with a single Nixie tube at the front, a small neon bulb for a decimal point, and a bunch of internal electronics. Bringing up the rear is a card edge connector.

Continues after the break…

Continue reading “Turning Soviet Electronics Into A Nixie Tube Clock”

Explore FFmpeg From The Comfort Of Your Browser

If you’re looking to manipulate video, FFmpeg is one of the most powerful tools out there. But with this power comes a considerable degree of complexity, and a learning curve that looks suspiciously like a brick wall. To try and make this incredible tool a bit less obtuse, [Sam Lavigne] has developed a web interface that lets you play around with FFmpeg’s vast collection of audio and video filters.

To try out a filter, you just need to select one from the window on the left and it will pop up in the central workspace. Here, the input, output, and any enabled filters will show up as boxes that can be virtually “wired” together. Selecting a filter will populate its options on the right hand side, with sliders and input boxes that allow you to play around with their parameters. When you want to see the final result, just click “Render Preview” and wait a bit.

If there was any downside, it seems like whatever box the site is running on the overhead of running in the browser doesn’t provide it a lot of horsepower. Even with the relatively low resolution of the demo videos available, the console output at the top of the page shows FFmpeg sometimes flirts with a processing speed measured in single-digit frames per second. Still, for a filter playground, it gets the job done. Perhaps the best part of the whole tool is that you can then copy your properly formatted command right out of the browser window and into your terminal so you can put it to work on your local files.

FFmpeg is one of those programs you should really be familiar with because it often proves useful in unexpected ways. The ability to manipulate audio and video with just a few keystrokes can really come in handy, and we’ve seen this open-source tool used for everything from compressing podcasts onto floppy disks to overlaying real-time environmental data onto a video stream.

Linux On A Commodore 64

We are used to seeing Linux running on almost everything, but we were a bit taken aback to see [semu-c64] running Linux on a Commodore 64. But between the checked-out user name and the caveat that: “it runs extremely slowly and it needs a RAM Expansion Unit”, one can already start piecing together what’s happening here.

The machine running Linux is really a RISC-V32. It just so happens that the CPU is virtual, with the C64 pretending it is a bigger machine. The boot-up appears to take hours, so this is in no way practical, even though the comment is that optimization might be able to get a 10X speed up. It would still be about as slow as you can imagine.

To further add a layer of abstraction, the code hasn’t run yet on real Commodore hardware. Instead, it is running on an emulator. The emulator has “warp” mode to run faster than a real machine, and it is still slow. So think about that before you rush out to volunteer to boot this on your real hardware.

Tricks like this fall into the talking dog category. If a dog can talk, it isn’t that you think it will have something important to say. You just marvel that it can do it at all. Still, we get it. We spend a lot of time doing things at least as pointless. But at least it is fun!

Maybe emulate the whole thing in VR? Or maybe write some virtualization code for the C64 so you can emulate a Linux box and a quantum computer simultaneously.

Accurate Cycle Counting On RP2040 MicroPython

The RP2040 is a gorgeous little chip with a well-defined datasheet and a fantastic price tag. Two SDKs are even offered: one based on C and the other MicroPython. More experienced MCU wranglers will likely reach for the C variant, but Python does bring a certain speed when banging out a quick project or proof of concept. Perhaps that’s why [Jeremy Bentham] ported his RP2040-based vehicle speedometer to MicroPython.

The two things that make that difficult are that MicroPython tries to be pretty generic, which means some hackery is needed to talk to the low-level hardware, and that MicroPython doesn’t have a reputation for accurate cycle counting. In this case, the low-level hardware is the PWM peripheral. He details the underlying mechanism in more detail in the C version. On the RP2040, the PWM module can count pulse edges on an input. However, you must start and stop it accurately to calculate the amount of time captured. From there, it’s just edges divided by time. For this, the DMA system is pulled in. A DMA request can be triggered once the PWM counter rolls over. The other PWM channel acts as a timer, and when the timer expires, the DMA request turns off the counter. This works great for fast signals but is inaccurate for slow signals (below 1kHz). So, a reciprocal or time-interval system is included, where the time between edges is captured instead of counting the number of edges in a period,

What’s interesting here is how the hardware details are wrapped neatly into pico_devices.py. The uctypes module from MicroPython allows access to MMIO devices such as DMA and PWM. The code is available on GitHub. Of course, [Jeremy] is no stranger to hacking around on the RP2040, as he has previously rolled his own WiFi driver for the Pico W.

Going To Extremes To Block YouTube Ads

Many users of YouTube feel that the quality of the service has been decreasing in recent years — the platform offers up bizarre recommendations, fails to provide relevant search results, and continues to shove an increasing amount of ads into the videos themselves. For shareholders of Google’s parent company, though, this is a feature and not a bug; and since shareholder opinion is valued much more highly than user opinion, the user experience will likely continue to decline. But if you’re willing to put a bit of effort in you can stop a large chunk of YouTube ads from making it to your own computers and smartphones.

[Eric] is setting up this adblocking system on his entire network, so running something like Pi-hole on a single-board computer wouldn’t have the performance needed. Instead, he’s installing the pfSense router software on a mini PC. To start, [Eric] sets up a pretty effective generic adblocker in pfSense to replace his Pi-hole, which does an excellent job, but YouTube is a different beast when it comes to serving ads especially on Android and iOS apps. One initial attempt to at least reduce ads was to subtly send YouTube traffic through a VPN to a country with fewer ads, in this case Italy, but this solution didn’t pan out long-term.

A few other false starts later, all of which are documented in detail by [Eric] for those following along, and eventually he settled on a solution which is effectively a man-in-the-middle attack between any device on his network and the Google ad servers. His router is still not powerful enough to decode this information on the fly but his trick to get around that is to effectively corrupt the incoming advertising data with a few bad bytes so they aren’t able to be displayed on any devices on the network. It’s an effective and unique solution, and one that Google hopefully won’t be able to patch anytime soon. There are some other ways to improve the miserable stock YouTube experience that we have seen as well, like bringing back the dislike button.

Thanks to [Jack] for the tip!

GTA 6 Hacker Found To Be Teen With Amazon Fire Stick In Small Town Hotel Room

International cybercrime, as portrayed by the movies and mass media, is a high-stakes game of shadowy government agencies and state-sponsored hacking groups. Hollywood casting will wheel out a character in a black hoodie and shades, probably carrying a metallic briefcase as they board an executive jet.

These things aren’t supposed to happen in a cheap hotel room in your insignificant hometown, but the story of a British teen being nabbed leaking the closely guarded details of Grand Theft Auto 6 in a Travelodge room in Bicester, Oxfordshire brings the action from the global into the local for a Hackaday scribe. Bicester is a small town best known for a tacky outlet mall and as a commuter dormitory stop on the line to London Marylebone, it’s not exactly Vice City.

The teen in question is one [Arion Kurtaj], breathlessly reported by the BBC as part of the Lapsus$ gang, which is a sensationalist way of talking up a group of kids expert at computer infiltration but seemingly inept at being criminals. After compromising British telcos he was exposed by another group and nabbed by the authorities, before being moved to the hotel for his own safety.

Here the story becomes more interesting for Hackaday readers, because though denied access to a computer he purchased an Amazon Fire stick presumably at the Argos in the Sainsburys next door, and plugged it into the Travelodge TV. Using this he was able to access cloud services, we’re guessing a virtual Linux environment or similar, before continuing to compromise further organisations including Rockstar Games to leak that GTA 6 footage. He’s yet to be sentenced, but we’re guessing that he’ll continue to spend some time at His Majesty’s pleasure.

The moment of excitement in one’s hometown and the sensationalist reporting aside, we can’t help feeling sad that a teen with that level of talent evidently wasn’t given the support and encouragement by Oxfordshire’s education system necessary to put it to better use. Let’s hope when he’s older and wiser the teenage conviction won’t prevent him from having a useful career in the field.