Your Text Needs More JPEG

We’ve all been victims of bad memes on the Internet, but they’re not all just bad jokes gone wrong. Some are simply bad as a result of being copies-of-copies, as each reposter adds another layer of compression to an already lossy image format like JPEG. Compression can certainly be a benefit in areas like images and videos, but [Michal] had a bit of a fever dream imagining this process applied to text. Rather than let the idea escape, he built the Lossifizer to add JPEG-like compression to text.

JPEG compression uses a system similar to the fast Fourier transform (FFT) called the discrete cosine transform (DCT) to reduce the amount of data in an image by essentially removing some frequency information. The data lost is often not noticeable to the human eye, at least until it gets out of hand. [Michal]’s system performs the same transform on text instead, with a slider to control the “amount of JPEG” in the output text. The code for this script uses a “perceptual” character map, clustering similarly-looking and similarly-sounding characters next to each other, resembling “leet speak” from days of yore, although at high enough compression this quickly gets out of hand.

One of the quirks that [Michal] discovered is that certain AI chat bots have a much less difficult time interpreting this JPEG-ified text than a human probably would have, which provides a bit of insight into how some of these algorithms might be functioning under the hood. For some more insight into how JPEG actually works on images, we posted about a deep dive into the image format a while back.

Art of 3D printer in the middle of printing a Hackaday Jolly Wrencher logo

G-code Goes Binary With Proposed New Format

G-code is effective, easily edited, and nearly ubiquitous when it comes to anything CNC. The format has many strengths, but space efficiency isn’t one of them. In fact, when it comes to 3D printing in particular file sizes can get awfully large. Partly to address this, Prusa have proposed a new .bgcode binary G-code format. You can read the specification of the new (and optional) format here.

The newest version of PrusaSlicer has support for .bgcode, and a utility to convert ASCII G-code to binary (and back) is in the File menu. Want to code an interface of your own? The libbgcode repository provides everything needed to flip .gcode to .bgcode (with a huge file size savings in the process) and vice versa in a way that preserves all aspects of the data. Need to hand-edit a binary G-code file? Convert it to ASCII G-code, make your changes, then flip it right back.

Prusa are not the only ones to notice that the space inefficiency of the G-code file format is not ideal in all situations. Heatshrink and MeatPack are two other solutions in this space with their own strong points. Handily, the command-line tool in libgcode can optionally apply Heatshrink compression or MeatPack encoding in the conversion process.

In a way, G-code is the assembly language of 3D printers. G-code files are normally created when slicing software processes a 3D model, but there are some interesting tricks to be done when G-code is created directly.

Text Compression Gets Weirdly Efficient With LLMs

It used to be that memory and storage space were so precious and so limited of a resource that handling nontrivial amounts of text was a serious problem. Text compression was a highly practical application of computing power.

Today it might be a solved problem, but that doesn’t mean it doesn’t attract new or unusual solutions. [Fabrice Bellard] released ts_zip which uses Large Language Models (LLM) to attain text compression ratios higher than any other tool can offer.

LLMs are the technology behind natural language AIs, and applying them in this way seems effective. The tradeoff? Unlike typical compression tools, the lossless decompression part isn’t exactly guaranteed when an LLM is involved. Lossy compression methods are in fact quite useful. JPEG compression, for example, is a good example of discarding data that isn’t readily perceived by humans to make a smaller file, but that isn’t usually applied to text. If you absolutely require lossless compression, [Fabrice] has that covered with NNCP, a neural-network powered lossless data compressor.

Do neural networks and LLMs sound far too serious and complicated for your text compression needs? As long as you don’t mind a mild amount of definitely noticeable data loss, check out [Greg Kennedy]’s Lossy Text Compression which simply, brilliantly, and amusingly uses a thesaurus instead of some fancy algorithms. Yep, it just swaps longer words for shorter ones. Perhaps not the best solution for every need, but between that and [Fabrice]’s brilliant work we’re confident there’s something for everyone who craves some novelty with their text compression.

[Photo by Matthew Henry from Burst]

Explore FFmpeg From The Comfort Of Your Browser

If you’re looking to manipulate video, FFmpeg is one of the most powerful tools out there. But with this power comes a considerable degree of complexity, and a learning curve that looks suspiciously like a brick wall. To try and make this incredible tool a bit less obtuse, [Sam Lavigne] has developed a web interface that lets you play around with FFmpeg’s vast collection of audio and video filters.

To try out a filter, you just need to select one from the window on the left and it will pop up in the central workspace. Here, the input, output, and any enabled filters will show up as boxes that can be virtually “wired” together. Selecting a filter will populate its options on the right hand side, with sliders and input boxes that allow you to play around with their parameters. When you want to see the final result, just click “Render Preview” and wait a bit.

If there was any downside, it seems like whatever box the site is running on the overhead of running in the browser doesn’t provide it a lot of horsepower. Even with the relatively low resolution of the demo videos available, the console output at the top of the page shows FFmpeg sometimes flirts with a processing speed measured in single-digit frames per second. Still, for a filter playground, it gets the job done. Perhaps the best part of the whole tool is that you can then copy your properly formatted command right out of the browser window and into your terminal so you can put it to work on your local files.

FFmpeg is one of those programs you should really be familiar with because it often proves useful in unexpected ways. The ability to manipulate audio and video with just a few keystrokes can really come in handy, and we’ve seen this open-source tool used for everything from compressing podcasts onto floppy disks to overlaying real-time environmental data onto a video stream.

Never Too Rich Or Thin: Compress Sqlite 80%

We are big fans of using SQLite for anything of even moderate complexity where you might otherwise use a file. The advantages are numerous, but sometimes you want to be lean on file storage. [Phiresky] has a great answer to that: the sqlite-zstd extension offers transparent row-level compression for SQLite.

There are other options, of course, but as the post mentions, each of these have some drawbacks. However, by compressing each row of a table, you can retain random access without some of the drawbacks of other methods.

Continue reading “Never Too Rich Or Thin: Compress Sqlite 80%”

Squeezing A Wordle Clone Onto The Game Boy

The popular word game Wordle is both an addictive brain teaser for some and a perpetual social media annoyance for others. Its runaway success has spawned a host of clones, among them one created for the Nintendo Game Boy with a reduced vocabulary. [Alexander Pruss] took on the challenge of improving it by fitting the entire 12972-strong 5-letter-word vocabulary as well as the 2315-word answer list into a 32K cartridge along with the code. The challenge in compression on a platform of such meager resources is to devise an algorithm which does not require more computing power or memory than the device has at its disposal. His solution is both elegant and easy to understand.

Starting by dividing the words into lists by first letter such that he can ignore the letter, he can reduce each word to 20 bits as four 5-bit letters. The clever part comes when he organises the words alphabetically, meaning that the 20-bit numbers representing each word are in numerical order.

Thus instead of storing the full number he could store the difference between it and its predecessor. With a few extra tweaks he was able to get the full list down to an impressive 20186 bytes, but was still faced with not enough space. Turning to the Wordle code he found that a library function call could be switched to an alternative with a much more efficient footprint, resulting in a new ROM with all words in place and ready to play.

Of course our community have applied their minds to Wordle and we’ve featured more than one hack based upon it. Mostly they have involved automated solving, so this retro gaming version breaks new ground.

Header image: Sammlung der Medien und Wissenschaft, CC BY 4.0.

Several frames from Bad Apple

PineTime Smartwatch And Good Code Play Bad Apple

PineTime is the open smartwatch from our friends at Pine64. [TT-392] wanted to prove the hardware can play a full-motion music video, and they are correct, to a point. When you watch the video below, you should notice the monochromatic animation maintaining a healthy framerate, and there lies all the hard work. Without any modifications, video would top out at approximately eight frames per second.

To convert an MP4, you need to break it down into images, which will strip out the sound. Next, you load them into the Linux-only video processor, which looks for clusters of pixels that need changing and ignores the static ones. Relevant pixel selection takes some of the load off the data running to the display and boosts the fps since you don’t waste time reminding it that a block of black pixels should stay the way they are. Lastly, the process will compress everything to fit it into the watch’s onboard memory. Even though it is a few minutes of black and white pictures, compiling can take a couple of hours.

You will need access to the watch’s innards, so hopefully, you have the developer kit or don’t mind cracking the seal. Who are we kidding, you aren’t here for intact warranties. The video resides in the flash chip and you have to transfer blocks one at a time. Bad Apple needs fourteen, so you may want to practice on a shorter video. Lastly, the core memory needs some updating to play correctly. Now you can sit back and…watch.

Pine64 had a rough start with the single-board computers, but they’re earning our trust with things like soldering irons and Google-less Linux mobile phones.

Continue reading “PineTime Smartwatch And Good Code Play Bad Apple”