Running OpenCL on a Raspberry Pi GPU

This is an interesting development for media users and machine learning hackers: [doe300] has implemented OpenCL on the Raspberry Pi 3 Model B+, in a project called VC4CL. That’s big news because the Pi 3 B+ has a Graphics Processing Unit (GPU) built into the processor that has been generally underutilized. The VideoCore IV GPU is built into the Broadcom BCM2837B0 and is surprisingly capable for a low-power chip. Although this GPU is well documented, it hasn’t been used that widely because you have to write code specifically for this class of GPU. Adding support for a high-level framework like OpenCL will make it much easier to run and adapt existing packages.

This OpenCL implementation is the end result of a master’s thesis by Daniel Steadelmann at Nuremberg Tech, and it supports the embedded profile for OpenCL 1.2, which includes only a subset of the full OpenCL feature set. It does support the installable client driver (ICD) mechanism, though, which means that you can have another OpenCL implementation installed alongside it. That’s a neat trick if you want to run OpenCL tasks on the GPU and CPU at the same time using a CPU implementation like POCL.
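As a rough illustration of what ICD support means in practice, here is a minimal sketch that lists every OpenCL platform the loader knows about, along with its devices. It assumes the third-party pyopencl package is installed, which the article does not mention. The platform and device names in the stand-in data are placeholders, not the strings that VC4CL or POCL actually report.

```python
from types import SimpleNamespace


def describe_platforms(platforms):
    """Return one 'platform: device' string per device on each platform."""
    lines = []
    for platform in platforms:
        for device in platform.get_devices():
            lines.append(f"{platform.name}: {device.name}")
    return lines


def discover():
    """Ask the ICD loader for all registered platforms.

    Returns an empty list if pyopencl is missing or no platform is registered.
    """
    try:
        import pyopencl as cl  # assumption: third-party package is installed
    except ImportError:
        return []
    try:
        return cl.get_platforms()
    except cl.Error:
        # Raised when the loader cannot find any registered implementation.
        return []


# Stand-in objects so the sketch runs without OpenCL hardware.
# The names below are hypothetical placeholders.
gpu = SimpleNamespace(
    name="GPU platform (placeholder)",
    get_devices=lambda: [SimpleNamespace(name="VideoCore IV (placeholder)")],
)
cpu = SimpleNamespace(
    name="CPU platform (placeholder)",
    get_devices=lambda: [SimpleNamespace(name="ARM CPU (placeholder)")],
)

for line in describe_platforms([gpu, cpu]):
    print(line)
```

On a Pi with both implementations registered, calling `describe_platforms(discover())` should show two separate platforms, and a program can then create one context per platform to split work between GPU and CPU.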

The performance of the VideoCore IV GPU won’t exactly set the world on fire: the author estimates the maximum performance at about 24 GFLOPS. Contrast that with the 8200 GFLOPS that an Nvidia GTX 1080 can manage, and you can appreciate that you might not get far when mining Ethereum. However, that would be enough to make running programs like Plex and Kodi on a Raspberry Pi more realistic if implemented, as it is enough to support real-time transcoding of a video stream.
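To put those numbers in perspective, here is a back-of-the-envelope sketch of one common way to arrive at a 24 GFLOPS peak figure. The hardware parameters (12 QPUs, 4 physical SIMD lanes each, one add plus one multiply per lane per cycle, 250 MHz clock) are assumptions not stated in the article, and the author's own estimate may be derived differently.

```python
# Assumed VideoCore IV parameters (not taken from the article).
QPUS = 12             # quad processing units
LANES_PER_QPU = 4     # physical SIMD lanes per QPU
FLOPS_PER_LANE = 2    # one add and one multiply issued per cycle
CLOCK_HZ = 250e6      # assumed 250 MHz clock

# Theoretical peak: every lane busy with both operations on every cycle.
vc4_gflops = QPUS * LANES_PER_QPU * FLOPS_PER_LANE * CLOCK_HZ / 1e9

gtx1080_gflops = 8200  # figure quoted in the article
ratio = gtx1080_gflops / vc4_gflops

print(f"VideoCore IV peak: {vc4_gflops:.0f} GFLOPS")
print(f"GTX 1080 advantage: about {ratio:.0f}x")
```

Under these assumptions the desktop card comes out a little over 340 times faster, and real workloads on the Pi will land well below the theoretical peak.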

28 thoughts on “Running OpenCL on a Raspberry Pi GPU”

    1. I’ve rarely seen realtime transcoding using OCL. Dedicated encoder/decoder hardware is usually more power-efficient and cheaper, and the Pi already has hardware H.264 encode/decode capability; I’m fairly certain it’ll go up to 1080p.

    1. A decoder that makes use of OpenCL would also be nice for my Nvidia ION board that is unable to play 10 bit H.264 and H.265 at 1080p. That GPU has 54 GFLOPs according to Wikipedia.

  1. Worth pointing out that the Raspberry Pi is already a pretty good Kodi platform. It has hardware 1080p decode for H.264 as standard (with decent 1080i deinterlacing on the Quad core models), MPEG2 and VC-1 hardware decode with the correct licences, H.264 hardware encode (no licence needed, comes as standard)

    For 1080p MPEG2/H.264/VC-1 Kodi playback duties the Pi is already a good solution.

    There are video hardware transcode solutions already based on the Pi (leveraging the hardware decoder and encoder)

    There are also GPU accelerated HEVC optimised decoders now in recent Kodi/ffmpeg builds I believe, which will play some 1080p 10-bit HEVC/h.265 content.

  2. One issue not mentioned in the article is that programs using this OpenCL implementation must be run as root, as the DMA interface for the VC4 does not have a memory management unit and thus can access any part of system memory(!!) It’s a major security concern.

    1. Hi,

      There are some applications in “VC4CL/test”.

      But when I try to compile them “cmake .”, I get an error (“CMake Error at CMakeLists.txt:10 (ExternalProject_Get_Property): Unknown CMake command “ExternalProject_Get_Property”.”)
      Grr .. :-(


    1. OpenCL is an open GPU programming framework. Without OpenCL, the RPi will never get good GPU-accelerated computational libraries. At the moment the RPi is nearly useless for deep learning because there is very little support for GPU acceleration, and the CPUs are way too slow for most applications. As soon as someone comes up with a similarly priced product that can do GPU-accelerated machine learning without the complexity of custom GPU programming, Raspberry is going to die because everyone will want to switch. That’s why Raspberry desperately needs OpenCL.

    2. OpenCL is short for Open Computing Language. GPUs typically have lots and lots of very small processing elements that can do basic maths operations in parallel. So they can be used to accelerate video encoding, audio encoding, cryptographic hashing functions, fast Fourier transforms, matrix-vector multiplication, … and lots more. Basically they can be used to accelerate any mathematical operations that benefit from parallel execution.

      1. Hi,

        Looking at the specs of these GPUs, it is mentioned that these devices can do quite a lot of different things, 2D graphics (openVG), 3D graphics (openGL / openGL ES), real-time video encoding and decoding.

        What I never understood is this:
        Is this actually all done by the same hardware? Are these all the same processors / processing elements accessed through different APIs? Or are these all different pieces on the die, each specialised to do its own thing?


  3. Would love to see the RPi as the new Amiga/C64 of the demoscene and see how far they could go.

    A 5-watt device throwing out crazy 3D animation in 1080p at 60fps, or finding a way to go even higher.
