Your Text Needs More JPEG

We’ve all been victims of bad memes on the Internet, but they’re not all just bad jokes gone wrong. Some are simply bad as a result of being copies-of-copies, as each reposter adds another layer of compression to an already lossy image format like JPEG. Compression can certainly be a benefit in areas like images and videos, but [Michal] had a bit of a fever dream imagining this process applied to text. Rather than let the idea escape, he built the Lossifizer to add JPEG-like compression to text.

JPEG compression uses the discrete cosine transform (DCT), a close relative of the discrete Fourier transform that the fast Fourier transform (FFT) computes, to reduce the amount of data in an image by essentially discarding some frequency information. The data lost is often not noticeable to the human eye, at least until it gets out of hand. [Michal]’s system performs the same transform on text instead, with a slider to control the “amount of JPEG” in the output text. The script uses a “perceptual” character map that clusters similar-looking and similar-sounding characters next to each other. The output resembles “leet speak” from days of yore, although at high enough compression it quickly becomes unreadable.
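To make the idea concrete, here is a minimal sketch of DCT-based lossy text in plain Python. It is not [Michal]’s code: the `ALPHABET` ordering, the `lossify` name and the `step` parameter are all assumptions made up for illustration, and a plain alphabetical ordering stands in for the perceptual character map.

```python
import math

# Hypothetical character ordering; NOT the Lossifizer's perceptual map.
ALPHABET = " abcdefghijklmnopqrstuvwxyz"

def dct(xs):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(xs)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(xs))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(cs):
    """Inverse of dct() above."""
    n = len(cs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(cs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def lossify(text, step):
    """Round the DCT coefficients of `text` to multiples of `step`, then invert."""
    # Characters outside ALPHABET are treated as spaces (index 0).
    codes = [max(ALPHABET.find(ch), 0) for ch in text.lower()]
    quantized = [round(c / step) * step for c in dct(codes)]
    result = []
    for value in idct(quantized):
        index = min(max(round(value), 0), len(ALPHABET) - 1)  # clamp to valid range
        result.append(ALPHABET[index])
    return "".join(result)

print(lossify("your text needs more jpeg", 8))
```

The output always has the same length as the input. A very small `step` reproduces the text, while larger steps generally distort it more, which is roughly what the “amount of JPEG” slider does.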

One of the quirks that [Michal] discovered is that certain AI chat bots have a much easier time interpreting this JPEG-ified text than a human probably would, which provides a bit of insight into how some of these algorithms might be functioning under the hood. For some more insight into how JPEG actually works on images, we posted about a deep dive into the image format a while back.

Parametric Press Unravels The JPEG Format

This is the first we’ve heard of Parametric Press, a digital magazine with deep dives into a variety of subjects (particle physics, “big data” and the like), with interactive elements or simulations of various types embedded within each story.

The first one that sprang up in our news feed is a piece by [Omar Shehata] on the humble JPEG image format. In it, he explains the how and why of the JPEG encoding process, allowing the reader to play with the various concepts along the way, in real time, within the browser.

Subsampling in the RGB colour space doesn’t affect each component to the same degree, due to the response of the human eye’s cone cells. Subsampling the chroma components is also much less noticeable than subsampling the luminance.

For those not familiar with the format, the first step (which is actually optional) in JPEG encoding is to transform the image from the RGB color space into a YCbCr (luminance, chrominance) color space. Since the human eye is far more sensitive to luminance (brightness) differences than it is to Cb (chroma relative blueness) and Cr (chroma relative redness) differences, these latter two components can be subsampled by storing only a single value of each for every 2×2 block of pixels. JPEG allows other block sizes, but 2×2 is the most common.
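As a rough sketch of those two steps, the snippet below converts one RGB pixel to YCbCr using the full-range BT.601 coefficients that JFIF files use, then reduces a chroma plane by 2×2. The function names are made up for illustration, and averaging each 2×2 block is just one reasonable way to pick the single stored value; an encoder may choose differently.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion as used by JFIF; inputs in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_2x2(plane):
    """Average each 2x2 block of a plane (list of rows; even dimensions assumed)."""
    out = []
    for row in range(0, len(plane), 2):
        out_row = []
        for col in range(0, len(plane[0]), 2):
            total = (plane[row][col] + plane[row][col + 1]
                     + plane[row + 1][col] + plane[row + 1][col + 1])
            out_row.append(total / 4)
        out.append(out_row)
    return out

# A 2x4 chroma plane shrinks to 1x2: one stored value per 2x2 block.
cb_plane = [[100, 104, 200, 204],
            [102, 106, 202, 206]]
print(subsample_2x2(cb_plane))
```

Only the Cb and Cr planes would go through `subsample_2x2`; the Y plane keeps its full resolution, so the three planes together shrink to half their original sample count.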

This sets the scene for the clever bit that comes next, and allows more of the harder-to-perceive chroma information to be discarded. It’s fun to play with the chroma sub-sampling slider and see how the different colours are not equally affected, due to the relative sensitivities of the human eye cone cells.

Next, each of the three YCbCr components is independently put through a discrete cosine transform and quantization. The transform turns each 8×8 pixel block into 64 spatial-frequency coefficients. The JPEG compression level (which you can change) sets how coarsely those coefficients are quantized, which determines how many of the upper-frequency components round to zero, and thus how much of the fine spatial detail gets discarded. This is the main source of JPEG image quality loss. The quantized blocks are then delta encoded, with the DC (average) coefficient of each block stored as the difference from that of the previous block. Like the colour-space conversion, this doesn’t offer any compression on its own, but it produces smaller values that the subsequent run-length and Huffman stages can pack more effectively, giving more (lossless) compression. Finally, the whole lot is Huffman compressed, with the table stored in the JPEG header. Want to play with JPEGs some more? Here’s the GitHub source.
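The transform, quantization, delta and run-length stages can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than a real encoder: the function names are invented, the quantization table (`step * (1 + u + v)`) is made up and is not one of the standard JPEG tables, and the zigzag scan and Huffman stage are left out. Baseline JPEG applies the delta step to the first (DC) coefficient of successive blocks, so that is what the usage lines at the bottom do.

```python
import math

N = 8  # JPEG works on 8x8 blocks

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos(math.pi * (x + 0.5) * u / N)
                          * math.cos(math.pi * (y + 0.5) * v / N))
            out[u][v] = scale(u) * scale(v) * s
    return out

def quantize(coeffs, step):
    """Divide by a step that grows with frequency, then round to an integer.
    The table used here is a made-up illustration, not a standard JPEG table."""
    return [[round(coeffs[u][v] / (step * (1 + u + v))) for v in range(N)]
            for u in range(N)]

def delta_encode(values):
    """Keep the first value, then store each difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def run_length(values):
    """Collapse runs of equal values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

# Two nearly identical flat blocks: only the DC coefficient survives quantization.
blocks = [[[100] * N for _ in range(N)], [[102] * N for _ in range(N)]]
quantized = [quantize(dct_2d(b), 10) for b in blocks]
print(delta_encode([q[0][0] for q in quantized]))   # DC values stored as differences
print(run_length([c for row in quantized[0] for c in row]))
```

Raising `step` pushes more of the high-frequency coefficients to zero, which lengthens the zero runs and is where most of the size reduction (and the quality loss) comes from.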

If all of this theoretical stuff is a bit useless to you and you just want to decode some JPEGs, here is a speedy library for just that.

Video Compression Explainer — Like We’re Five-Year-Olds

[Ottverse] has an interesting series in progress to demystify video compression. The latest installment promises to explain discrete cosine transforms as though you were five years old.

We’ll be honest. At five, we probably didn’t know how to interpret this sentence:

…the Discrete Cosine Transform takes a set of N correlated (similar) data-points and returns N de-correlated (dis-similar) data-points (coefficients) in such a way that the energy is compacted in only a few of the coefficients M where M << N.

Still, the explanation is pretty clear and we really liked the analogy with the spheres and the stars in a constellation.
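The energy-compaction claim in that quoted sentence is easy to check numerically. The sketch below is our own illustration, not something from [Ottverse]’s article: it takes N = 16 correlated samples (a gentle ramp, chosen arbitrarily) and measures how much of the signal’s energy lands in the first M coefficients. The `energy_fraction` helper is a made-up name.

```python
import math

def dct(xs):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(xs)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(xs))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def energy_fraction(xs, m):
    """Fraction of the total energy held by the first m DCT coefficients.
    Assumes xs is not all zeros."""
    coeffs = dct(xs)
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:m]) / total

# N = 16 similar (correlated) samples: a gentle ramp.
ramp = [100 + 2 * i for i in range(16)]
print(energy_fraction(ramp, 2))  # M = 2 coefficients out of N = 16
```

For this ramp, the first two coefficients carry more than 99% of the energy, so the remaining fourteen can be stored coarsely or dropped with little visible effect. That is the M << N property in action.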
