What Exactly Is A Gaussian Blur?

July 21, 2021

Blurring is a commonly used visual effect when digitally editing photos and videos. One of the most common blurs used in these fields is the Gaussian blur. You may have used this tool thousands of times without ever giving it greater thought. After all, it does a nice job and does indeed make things blurrier.

Of course, we often like to dig deeper here at Hackaday, so here’s our crash course on what’s going on when you run a Gaussian blur operation.

It’s Math! It’s All Math.

A 2D Gaussian distribution shown in a 3D plot. Note the higher values towards the center, and growing smaller towards the outside in a bell curve shape.

Digital images are really just lots of numbers, so we can work with them mathematically. Each pixel that makes up a typical digital color image has three values- its intensity in red, green and blue. Of course, greyscale images consist of just a single value per pixel, representing its intensity on a scale from black to white, with greys in between.

Regardless of the image, whether color or greyscale, the basic principle of a Gaussian blur remains the same. Each pixel in the image we wish to blur is considered independently, and its value changed depending on its own value, and those of its surroundings, based on a filter matrix called a kernel.

The kernel consists of a rectangular array of numbers that follow a Gaussian distribution, AKA a normal distribution, or a bell curve.

This diagram shows the manner in which each pixel is processed. For a 3×3 kernel, the pixel of interest and all directly surrounding pixels are sampled. The kernel is then used to generate a new output pixel value based on a weighted average of the sampled pixels based on the Gaussian distribution.

Our rectangular kernel consists of values that are higher in the middle and drop off towards the outer edges of the square array, like the height of a bell curve in two dimensions. The kernel corresponds to the number of pixels we consider when blurring each individual pixel. Larger kernels spread the blur around a wider region, as each pixel is modified by more of its surrounding pixels.

For each pixel to be subject to the blur operation, a rectangular section equal to the size of the kernel is taken around the pixel of interest itself. These surrounding pixel values are used to calculate a weighted average for the original pixel’s new value based on the Gaussian distribution in the kernel itself.

A 5×5 Gaussian kernel. Note the external factor, which ensures that the total values all add up to 1. This avoids adding any intensity to the image, solely average the pixels without otherwise changing their intensity.

Thanks to the distribution, the central pixel’s original value has the highest weight, so it doesn’t obliterate the image entirely. Immediately neighboring pixels having the next highest influence on the new pixel, and so on. This local averaging smoothes out the pixel values, and that’s the blur.

Edge cases are straightforward too. Where an edge pixel is sampled, the otherwise non-existent surrounding pixels are either given the same value of their nearest neighbor, or given a value matching up with their mirror opposite pixel in the sampled area.

The same calculation is run for each pixel in the original image to be blurred, with the final output image made up of the pixel values calculated through the process. For grayscale images, it’s that simple. Color images can be done the same way, with the blur calculated separately for the red, green, and blue values of each pixel. Alternatively, you can specify the pixel values in some other color space and smooth them there.

Here we see an original image, and a version filtered with a Gaussian blur of kernel size three and kernel size ten. Note the increased blur as the kernel size increases. More pixels incorporated in the averaging results in more smoothing.

Of course, larger images require more calculations to deal with the greater number of pixels, and larger kernel sizes sample more surrounding pixels for each pixel of interest, and can thus take much longer to calculate. However, on modern computers, even blurring high-resolution images with huge kernel sizes can be done in the blink of an eye. Typically, however, it’s uncommon to use a kernel size larger than around 50 or so as things are usually already pretty blurry by that point.

The Gaussian blur is a great example of simple mathematics put to a powerful use in image processing. Now you know how it works on a fundamental level!

29 thoughts on “What Exactly Is A Gaussian Blur?”

Truth says:

July 21, 2021 at 7:33 am

You can also use kernel’s to highlight features in an image like detecting edges by suppressing everything that is not high contrast (ref: https://en.wikipedia.org/wiki/Kernel_(image_processing) )

Report comment

Reply
im-pro says:

July 21, 2021 at 7:44 am

I used a Gaussian blur for my etch a selfie project (https://hackaday.com/2019/07/30/etch-a-selfie/)
There two Gaussian blurs are subtracted and this makes it possible to convert Images into nice black-and-white images.
(ref: XDog https://users.cs.northwestern.edu/~sco590/winnemoeller-cag2012.pdf)

Report comment

Reply
1. quinor says:
  
  July 22, 2021 at 4:47 am
  
  That’s a great paper. I’ve been using a hillbilly version of that algorithm for my hanging plotter project but this one is significantly nicer! How did you stumble upon it?
  
  Report comment
  
  Reply
  1. im-pro says:
    
    July 23, 2021 at 1:59 am
    
    I don’t remember … It was years ago where I implemented an BW filter for laser.im-pro.at ….
    
    Report comment
    
    Reply
Steven Clark says:

July 21, 2021 at 8:34 am

Also the classic Sharpen filter is a RREF-style reversal of a gaussian blur kernel. Meanwhile Unsharp Mask is a gaussian blur applied in negative which is why it’s easier to parametrize: there’s no solving step.

Report comment

Reply
snarkysparky says:

July 21, 2021 at 9:28 am

now how about inverting an applied gaussian blur to recover the prior image.

Report comment

Reply
1. RustyHydrogen says:
  
  July 21, 2021 at 9:45 am
  
  Ah, the fun worlds of deconvolution, and blind deconvolution.
  
  Report comment
  
  Reply
grounded says:

July 21, 2021 at 9:39 am

And if you apply the same blur sequentially? Are two 3×3 blurs the same as a single 5×5?
I guess I should do some math…

Report comment

Reply
1. Marcus says:
  
  July 21, 2021 at 9:57 am
  
  Yep, doing the math helps here, to some degree:
  
  the convolution of a Gaussian with a Gaussian is still a Gaussian (without that, there wouldn’t be the central limit theorem, i.e. the reason why sums of independent identically distributed random variables add up to Gaussian distribution for large numbers of summands). So, yes, if we take a Gaussian with some sigma² and convolve it with itself, we get a new Gaussian with twice the sigma² (Variance).
  
  This means that if we decided the original discrete 3×3 kernel was “large” enough to cover the “interesting parts of the Gaussian because the stuff that was “cut off” from the (infinitely extending) Gaussian function was small enough for our tastes to not matter.
  
  (I’m ***very*** much at odds with the article: “larger kernels spread further”: No; you can put the same 3×3 in a 11×11 matrix and fill in with the smaller values from the actual Gaussian function. What the author means is that a higher-variance Gaussian spreads further.)
  
  But 5 is not at least twice 3… so, you’d get a worse approximation to an actual Gaussian filter with this finite kernel.
  
  However 5×5 is indeed the size you get when you, in the discrete domain, convolve two 3×3 kernels. So, by doing it that way, you get a different Gaussian kernel, but it’s a worse approximation to the Gaussian of twice the variance of the Gaussian in the 3×3 kernel.
  
  Report comment
  
  Reply
  1. Lasse says:
    
    July 21, 2021 at 10:07 am
    
    Also the Gaussian is separable, so you can do the 2D blur by doing a 1D horizontally and then vertically.
    
    Report comment
    
    Reply
Marcus says:

July 21, 2021 at 9:49 am

Is it just me that’s bothered by the logo in the post header *not* being the result of the sharp hackaday logo being filtered with a symmetric kernel,i.e. especially *not* with a Gaussian? That blur isn’t symmetrical at all!

Report comment

Reply
1. Chris Cox says:
  
  July 21, 2021 at 5:40 pm
  
  Nope, not just you. That kinda jumped out at me as well.
  
  Report comment
  
  Reply
Marcus says:

July 21, 2021 at 10:13 am

Honestly, I like when HaD does such articles!

I must still voice my disagreement on a few things: the first two are fundamental, the third I think is a bit of an overgeneralization but not a fundamentally wrong statement; my apologies.

> The kernel consists of a rectangular array of numbers that follow a Gaussian distribution,

Please be careful with the terminology here; what you write will mislead readers :(

These numbers don’t follow a Gaussian distribution. They are deterministic; they are samples taken from the *Gaussian function*, which happens to be the probability density of the Gaussian distribution.

> Larger kernels spread the blur around a wider region,

This is wrong – the size of the kernel doesn’t define the spread, just the number of pixels taken into account.

The spread is defined by the variance of the Gaussian function that gets sampled for the kernel. You can have a very narrow blur “spread” in a large filter, if you need the accuracy, or a very wide (=high variance) blur in a small filter.

The question is really just how much of the actual continuous function you cut off when you decide on your filter size.

But these are two different properties of the filter, and you’re conflating them!

> as each pixel is modified by more of its surrounding pixels.

That’s right – a larger Kernel takes more pixels into account.

> Edge cases are straightforward too. Where an edge pixel is sampled, the otherwise non-existent surrounding pixels are either given the same value of their nearest neighbor, or given a value matching up with their mirror opposite pixel in the sampled area.

These are **two** options, but they are not necessarily the right ones for any particular application. In fact, that choice can become very awkward for the pixels that are not at the very image border – the actual border pixel get their “weight” increased very much, so that their values overshadow that of their neighbors.

So, not a great generalization to make!

Report comment

Reply
echodelta says:

July 21, 2021 at 11:46 am

Now for a math routine to put shake and jiggle into stable video and stills for enhanced viewing.

Report comment

Reply
1. Nick says:
  
  August 11, 2021 at 2:39 pm
  
  Oh god, don’t give them ideas! We already have fake film lines and blemishes, and fake artefacts and glitches. Overuse of those is bad enough.
  
  Report comment
  
  Reply
Ian Dobbie says:

July 21, 2021 at 1:57 pm

I would question the large kernels taking significant processing time. Applying the blur is a convolution and almost any sensible algorithm will actually compute this as a multiplication in Fourier space. Take your kernel and image, FFT both, multiple them together and then do in the inverse FFT. Almost certainly massively faster than a direct real space convolution where you generate a new image, multiple every pixel by the kernel and sum the the relevant pixels into the results image.

Report comment

Reply
1. Chris Cox says:
  
  July 21, 2021 at 5:29 pm
  
  ROFL. Nope. Gaussian kernels are separable, and can use the central limit theorem, plus some linear and constant time algorithms, to do the work much more quickly than you could do a 2D FFT (much less the multiply and reverse FFT). Plus the separability makes the operation much more cache friendly than an FFT. (see also: bicubic and sinc based resampling kernels)
  If you are doing a convolution that isn’t separable, then an FFT (or other frequency space transform) might be a good idea. But please talk to algorithms and performance experts first, and measure, measure, measure.
  
  Report comment
  
  Reply
  1. snarkysparky says:
    
    July 23, 2021 at 5:06 am
    
    FFT is faster for large data.
    
    https://ccrma.stanford.edu/~jos/ReviewFourier/FFT_Convolution_vs_Direct.html
    
    Report comment
    
    Reply
    1. intheoryonly says:
      
      July 23, 2021 at 11:35 am
      
      The source you linked to only considers theoretical number of operations. In practice, memory access order and caches are far more important. The crossover point where FFT-based convolution is faster than separable convolution will only happen for much larger inputs.
      
      Report comment
      
      Reply
    2. Chris Cox says:
      
      July 23, 2021 at 1:55 pm
      
      Yeah, that completely ignores better algorithms and cache effects. It’s using the number of operations for a direct convolution – not separable, not box filters (certainly not the constant time algorithms for box filters).
      Next time, you probably should consult with someone who actually knows algorithms and performance.
      
      Report comment
      
      Reply
      1. snarkysparky says:
        
        July 24, 2021 at 4:00 am
        
        These people at Analog Devices are idiots. Shouldn’t consult them at all.
        
        “”This chapter presents two important DSP techniques, the overlap-add method, and FFT
        convolution. The overlap-add method is used to break long signals into smaller segments for
        easier processing. FFT convolution uses the overlap-add method together with the Fast Fourier
        Transform, allowing signals to be convolved by multiplying their frequency spectra. For filter
        kernels longer than about 64 points, FFT convolution is faster than standard convolution, while
        producing exactly the same result. “”
        
        https://www.analog.com/media/en/technical-documentation/dsp-book/dsp_book_Ch18.pdf
        
        Mr Algorithm guru would you be so kind as to point me to where I can gain understanding about this from the REAL experts?
        
        Report comment
      2. Chris Cox says:
        
        July 24, 2021 at 1:42 pm
        
        Yes sparky, you are in way over your head. FFTs are good in some places, and not in many others. In real world image processing, an FFT for gaussian blur would be among the slowest algorithms (only exceeded by brute force convolution). You are confusing theory (written by people who don’t know the practice and who are talking entirely in generalities) with actual practice. I’m talking about actual code in major applications that has been highly optimized, measured, copied by other applications, and hits the limits for throughput on a variety of architectures and inputs (image size, kernel size, etc.).
        How can you gain experience? Listen to the experts and stop trying to claim you know more than they do by quoting other people who know even less.
        Here’s a hint: click on the link in my name above.
        
        Report comment
      3. snarkysparky says:
        
        July 25, 2021 at 12:40 pm
        
        Ok smartie Chris. You win. All i did was google fft vs convolution and the first 63455 articles all went with fft as faster once past a very minimal kernel size. I looked for something agreeing with you but couldn’t find it. One thing i have always noticed is that when someone makes the boneheaded claim about the difference between theory and practice they don’t actually know the theory. What they have in brain is a conglomeration of anecdotes that support their view.
        
        Report comment
      4. Chris Cox says:
        
        July 25, 2021 at 1:43 pm
        
        Again, sparky, you are in way over your head.
        You thought you could “one up” the experts by looking things up on Google. But you got the theory (and ignored all the warning signs that it was just theory), with zero practice and zero experience.
        Again, you probably should click on the link in my name here to learn who it is that you are addressing. Yes, I know the theory, and I lecture on it a few times a year at the larger universities here in the SF Bay Area. I also spent many years on the practice: measuring everything, improving the algorithms until they were limited by the speed of RAM instead of the CPU, squeezing out a bit more performance with significant cache optimization, then re-optimizing for each new major chip architecture. There is a reason why the chip and motherboard makers consult with me about their latest designs and how their changes will affect real world throughput…
        Next time, try asking questions instead of claiming the experts are wrong based on your 5 seconds of Google “research”.
        
        Report comment
      5. im-pro says:
        
        July 25, 2021 at 11:55 pm
        
        Hallo Chris,
        
        I really appreciate your input to this topic. I’m now myself working in a research institute and my experience is not really good. We have an high incentive to write publications and there the quality is often not optimal. Furthermore when doing research I takes a lot of effort to filter out bad papers and find the practical problems.
        I hob if somebody searches about FFT for convolution this will show up to save them a lot of time to go throw with the implementation and find in the end that it will not be faster for real applications.
        
        Thanks Chris!
        
        @sparky here is one link about chis you should open : http://www.photoshophalloffame.com/chris-cox
        
        Report comment
Chris Maple says:

July 21, 2021 at 2:45 pm

The article would be greatly improved if the actual formula for pixel weighting was provided.

Report comment

Reply
1. Chris Cox says:
  
  July 21, 2021 at 5:36 pm
  
  Yeah, it is lacking a bit in detail.
  See https://en.wikipedia.org/wiki/Gaussian_filter
  
  Report comment
  
  Reply
brandon says:

July 25, 2021 at 11:45 am

you guys are gaussian blurring and im still reticulating splines :(

Report comment

Reply
Alan says:

July 27, 2021 at 4:29 am

If you have any interest in stitching images together (to make a panorama) then it helps to know about Gaussian blur.
Blurring to different kernel sizes, then subtracting (Difference of Gaussians), is one of the steps in keypoint detection algorithms like SIFT and SURF.

https://en.wikipedia.org/wiki/Image_stitching#Keypoint_detection

Report comment

Reply