Video Compression Explainer — Like We’re Five-Year-Olds

August 28, 2020

[Ottverse] has an interesting series in progress to demystify video compression. The latest installment promises to explain discrete cosine transforms as though you were five years old.

We’ll be honest. At five, we probably didn’t know how to interpret this sentence:

…the Discrete Cosine Transform takes a set of N correlated (similar) data-points and returns N de-correlated (dis-similar) data-points (coefficients) in such a way that the energy is compacted in only a few of the coefficients M where M << N.

Still, the explanation is pretty clear and we really liked the analogy with the spheres and the stars in a constellation.

The example Matlab code is probably also lost on a five-year-old, but we liked it. Anyone, we think, can understand the practical result where removing too much data — high compression — resulted in a poor quality image, but the image quality was pretty good even when 75% of the data vanishes.

So while you might not want to show this to your five-year-old, you might enjoy it and even learn something. The rest of the series is pretty good, too. There are discussions of data compression, codecs, and encoders. We are sure there’s more to come, also.

If you are not into Matlab, you could probably do the same trick with SciPy pretty easily. Or, try Octave, one of several open-source projects that are similar to Matlab.

12 thoughts on “Video Compression Explainer — Like We’re Five-Year-Olds”

Krishna says:

August 28, 2020 at 7:59 pm

Hey guys – author of that blog post here. My colleague submitted it to HN and I got a lot of positive comments and constructive feedback (the way I like to look at it). I guess my core audience got it immediately, but, when exposed to HN, I learned a lot more about explaining it right.

I’ve expanded on the article using a new example of how “20 questions and the kid game “i spy with my little eye”” is in-fact a pretty good approximation of the DCT. Hope this helps everyone interested in dipping their toes into the amazing world of image & video processing, compression, and delivery.

Hope this helps, and thanks for sharing and the summarization!

Report comment

Reply
1. John says:
  
  August 29, 2020 at 12:15 am
  
  HN?
  
  Report comment
  
  Reply
  1. John says:
    
    August 29, 2020 at 12:16 am
    
    Ah, hacker news. Sorry. Always Google first.
    
    Report comment
    
    Reply
RP says:

August 28, 2020 at 8:25 pm

I’m afraid I still don’t know what DCT is. I get that it’s used in compression but since I’m not familiar with MatLab, the functional code example in the linked to article reads like:

inputPixels = ones(8,8) * 255;
dctCoeffs = MAGIC(inputPixels);
reconstructedPixels = TADA(dctCoeffs);

Report comment

Reply
1. Truth says:
  
  August 28, 2020 at 11:36 pm
  
  Images of the DCT helped me to understand: https://www.youtube.com/watch?v=Q2aEzeMDHMA
  
  Report comment
  
  Reply
2. Joe says:
  
  August 29, 2020 at 9:24 am
  
  DCT is some spiffy linear algebra, short for Discrete Cosine Transform. Transform means it just changes the way the data is represented, but (theoretically) without changing the actual data. If you feed it the graph of a sound wave over a short period of time, with a point for the amplitude for each discrete moment in time (called samples), it will spit out a graph with the intensity of each sound frequency possible in that time slice. You can reverse the transform by taking each frequency intensity and making a sine wave of that amplitude and adding them all together. The useful thing about DCT is that if you remove some of the smaller intensities (called coefficients) and just set them to zero, you still get a pretty close approximation of the original wave. The neat thing is that it extends to any dimensional space you need, like 2D for images.
  
  Report comment
  
  Reply
3. Dude says:
  
  August 29, 2020 at 10:31 am
  
  Simplest version:
  
  You can take a row of pixels and imagine that this row repeats again forever. In this way, you’ve transformed a line of the image into something that resembles a continuous tone. The repeating pixel pattern includes frequencies that modify the tone, so you find what individual frequencies are present in the data and mark down their cosine (sine) wave coefficient: amplitude, frequency, and relative phase.
  
  When you sum all the found cosine (or sine) waves together, it should form the exact same pattern. If you’ve summed enough waves, when you round it down to the resolution and bit-depth of the original image, it should return the exact same pixels back.
  
  The compression happens when you order the coefficients from largest to smallest, and discard the smallest coefficients that have the least effect on the end result. That way you end up with an approximation of the original data.
  
  There’s multiple advantages in representing the picture this way. One is that you don’t have to use the same resolution to draw the picture, because the waves are continuous – so you can freely scale up and down. The other is that you can transmit over noisy channels by sending the largest coefficients with more redundancy, so if the receiver can’t get all of the data, they’re likely to get at least a blurry version of the image instead of nothing at all.
  
  Report comment
  
  Reply
  1. Dude says:
    
    August 29, 2020 at 10:42 am
    
    Of course you don’t have to take an individual line. You can put all the lines of the image one after the other and interpret that as the repeating pattern.
    
    Or, like you have in JPEG and MPEG compression, you take 8×8 pixel blocks and calculate the DCT over each of those. That way you retain a low resolution version of the original image you can draw, like a 1/8th thumbnail of the picture, and add detail if needed.
    
    Report comment
    
    Reply
4. RP says:
  
  August 29, 2020 at 6:48 pm
  
  Thanks All. I have a much better understanding now!
  
  Report comment
  
  Reply
Greg A says:

August 29, 2020 at 12:19 pm

I haven’t parsed this article yet, but I put it on my reading list because I always appreciate a different angle on this material.

I’m just posting to promote The Scientist and Engineer’s Guide to Digital Signal Processing, by Steven W Smith (dspguide.com). You can download it for free as a PDF…I first downloaded it back in 1997 and I have worked my way very slowly through different parts of it since. It is very good, in my opinion, if you want a formal introduction to these concepts. The first time, I will say, I didn’t feel like I got a lot out of it, but each time I come back to it the concepts gel a little more thoroughly. Really an incredibly valuable resource.

Report comment

Reply
M C says:

August 29, 2020 at 6:34 pm

Now take all you’ve learned about spatial compression and spread it across time… temporal compression.
I had a fascinating conversation with the head geek about 15 years ago) at Snell & Wilcox (UK) about their clever video processing algorithms.
What I learned is that I should have paid more attention in math classes!

Report comment

Reply
Jonathan Wilson says:

September 3, 2020 at 4:16 am

I wish there was a series like this that helped understand the inner workings of MP3 audio and explained in simple language (not the theory behind how it does what it does but the complex math involved)

Report comment

Reply

Hackaday

Video Compression Explainer — Like We’re Five-Year-Olds

12 thoughts on “Video Compression Explainer — Like We’re Five-Year-Olds”

Leave a Reply to TruthCancel reply

Search

Never miss a hack

If you missed it

From PostScript To PDF

To See Within: Detecting X-Rays

Why Physical Media Deserved To Die

Supercon 2024: Exploring The Ocean With Open Source Hardware

Porting COBOL Code And The Trouble With Ditching Domain Specific Languages

Our Columns

From Good Enough To Best

Hackaday Podcast Ep 318: DIY Record Lathe, 360 Degree LIDAR, And 3D Printing Innovation Lives!

This Week In Security: XRP Poisoned, MCP Bypassed, And More

FLOSS Weekly Episode 830: Vibes

Keebin’ With Kristina: The One With The Part Picker

12 thoughts on “Video Compression Explainer — Like We’re Five-Year-Olds”

Leave a Reply to TruthCancel reply

Search

Never miss a hack

Subscribe

If you missed it

Our Columns