FFT On The Raspi’s GPU

The Raspberry Pi has been around for two years now, and there’s still little the hardware hacker can actually do with the integrated GPU. That just changed: the Raspberry Pi Foundation has announced a library for Fourier transforms that runs on the GPU.

For those of you who haven’t yet taken your DSP course, Fourier transforms take a function in the time domain (an audio signal, radio signal, or what have you) and output its frequency spectrum: how much of each frequency the signal contains. It’s damn useful for everything from software defined radios to guitar pedals, and the new GPU_FFT library is about ten times faster at this task than the Raspi’s CPU.
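To make the time-domain to frequency-domain idea concrete, here is a minimal sketch in plain Python. It does not use GPU_FFT or the GPU at all; it is a textbook recursive radix-2 FFT run on a synthetic two-tone signal. The sample rate, window length, and tone frequencies are assumptions picked so that each tone lands exactly on a frequency bin (bin spacing is the sample rate divided by the window length).

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])   # transform of even-indexed samples
    odd = fft(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


SAMPLE_RATE = 8000  # Hz (assumed for this example)
N = 256             # window length, a power of two

# Synthetic input: a 500 Hz tone plus a quieter 1250 Hz tone.
signal = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1250 * t / SAMPLE_RATE)
    for t in range(N)
]

spectrum = fft(signal)                            # complex values, one per bin
magnitudes = [abs(c) for c in spectrum[: N // 2]]  # keep the non-mirrored half

# Bin k corresponds to k * SAMPLE_RATE / N hertz.
peaks = sorted(range(N // 2), key=lambda k: magnitudes[k], reverse=True)[:2]
peak_freqs = sorted(k * SAMPLE_RATE / N for k in peaks)
print(peak_freqs)
```

Both tones show up as separate peaks, which is the point: the output is a whole spectrum, not a single frequency. Real recordings rarely line up with the bins this neatly, so expect the energy to smear across neighbouring bins.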

You can get a copy of the GPU_FFT library by running rpi-update on your Pi. If you happen to build anything interesting – something with a software defined radio or even a guitar pedal – you’re more than welcome to send it in to the Hackaday tips line. We’d love to see what you’re up to.

17 thoughts on “FFT On The Raspi’s GPU”

  1. So was this made by the foundation or someone else (using the reverse engineered videocore info?). It’s not really clear in the article…

    1. I think it was made by the foundation, if the firmware is being pushed using rpi-update. Anyway, will this make real time fft possible?

    2. Broadcom employee. Binary blob, as far as I can see. If you want something a bit more exciting, the NEON coprocessor on more modern ARM chips (like the ones found on the similarly priced BeagleBone Black) is quite exciting, and open and documented. I know people who’ve done some really quite impressive real-time computer vision, software defined radio, and other bits and bobs with it.

      1. Can confirm Neon power. One important note: inspect your assembly file (-S compiler option for gcc) to check whether the compiler actually used your intrinsics the way you thought it would. Doesn’t require intimate arm asm knowledge.

    1. NO, runs on the Videocore natively. The VC4 has FFT HW IIRC, and a couple of vector scalar processors with their own architecture which are used to pass data and instructions to the various HW blocks, as well as doing their own 16 way SIMD processing.

  2. Very nice! It would be great if some programming tools and specs came out on how to program the VideoCore IV. I don’t like the way Broadcom handles drivers and interfaces…
    I’m having a nasty memory leak with multithreading over their EGL implementation and can’t look at a single line of code to understand whether it is my fault or a bad API implementation (maybe you could take a look: http://www.raspberrypi.org/forum/viewtopic.php?f=67&t=66313)

  3. Fourier transforms don’t “output the fundamental frequency”, they convert the signal from the time domain to the frequency domain and thus show the full frequency spectrum for the source signal.

    1. God, that’s the exact math definition of FT. FFT is an algorithm that speeds up the calculation using some shortcuts and restrictions (buffer size a power of 2). FT produces complex numbers (a+jb); in many applications (VU meters) only the magnitude is kept and the phase is discarded. I agree with GK, the definition of FT in the post isn’t accurate.

    2. Actually, the ‘fundamental frequency’ is an INPUT to the Fourier transform. It is the sampling frequency divided by the analysis window length. From that, the FT (e.g. FFT) solves for the harmonic composition. Often the result is misused (and misunderstood) to present the frequency content of the original signal, which it only more or less approximates.

  4. >by running rpi-update on you pi.
    I think you mean “your pi.”

    There’s some interesting stuff in the comments of the linked piece as well. Eben says:

    >Hopefully we can provide accelerators for a few more common operations in the future. 1d and 2d convolutions, and FIR, IIR filters are obvious possibilities. We’re still looking to see whether we can make the QPUs available for third parties to write code on.

    Dom also points out:

    >These timings are with a pi on stock frequency. With my overclock settings, I get about 30% faster results.

    >Disabling hdmi output (tvservice -o) saves a few % if you are running headless.
    >Reducing resolution or depth with fbset helps a little if you need the display.

  5. hey guys, i need some help/advice. i am working on a project that involves comparing sounds from a bmw x5 hatch spindle to ones in a database. each spindle sound from the database has a number 1-10 associated with it based on the sound it makes when opening and closing the hatch. i’d love to use RPi and the fft function to listen to a spindle while being run then comparing it to the database then output a number 1-10. can this be done? i’m a mechanical engineering student and would appreciate direction since all this is new to me. i have been able to do something similar in MATLAB but want to see if it can be done with just the RPi, Mic, and display. Thanks!
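On the spindle question above, one possible starting point is to compare magnitude spectra rather than raw waveforms. The sketch below is only an illustration under loud assumptions: the "recordings" are synthetic sine tones, the database is made up, every function name is hypothetical, and the transform is a slow plain DFT in pure Python rather than GPU_FFT. It picks the database entry whose spectrum has the highest cosine similarity to the unknown sound.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Plain O(n^2) DFT magnitudes for bins 0..n/2-1 (slow, but no dependencies)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def cosine_similarity(a, b):
    """1.0 for spectra with the same shape, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tone(freq_hz, sample_rate=8000, n=128):
    """Synthetic stand-in for a recorded sound (assumption, not real data)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def best_rating(recording, database):
    """Return the rating whose reference spectrum best matches the recording."""
    spec = magnitude_spectrum(recording)
    return max(database, key=lambda rating: cosine_similarity(spec, database[rating]))


# Hypothetical database: rating -> reference magnitude spectrum.
database = {
    1: magnitude_spectrum(tone(500)),
    5: magnitude_spectrum(tone(1000)),
    10: magnitude_spectrum(tone(2000)),
}

# An "unknown" sound: the same tone as rating 5, recorded more quietly.
unknown = [0.8 * s for s in tone(1000)]
print(best_rating(unknown, database))
```

Cosine similarity ignores overall loudness, which is why the quieter copy still matches. Real hatch recordings would need far more care (windowing, averaging over time, noise, microphone placement), so treat this as a shape for the problem rather than a solution.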
