This is an interesting development for media users and machine learning hackers: [doe300] has implemented OpenCL on the Raspberry Pi 3 Model B+ in a project called VC4CL. That’s big news because the Pi 3+ has a Graphics Processing Unit (GPU) built into the processor that has been generally underutilized. The VideoCore IV GPU is built into the Broadcom BCM2837B0 and is surprisingly capable for a low-power chip. Although this GPU is well documented, it hasn’t been used that widely because you have to code specifically for this class of GPU. Adding support for a high-level framework like OpenCL will make it much easier to run and adapt existing packages.
This OpenCL implementation is the end result of a masters these by Daniel Stadelmann at Nuremberg Tech, and it supports the embedded profile for OpenCL 1.2, which includes only a subset of the full OpenCL feature set. It does support an installable client driver (ICD) though, which means that you could run another OpenCL implementation alongside it. That’s a neat trick if you want to run OpenCL tasks on the GPU and CPU at the same time using a CPU implementation like POCL.
The performance of the VideoCore IV GPU won’t exactly set the world on fire: the author estimates the maximum performance at about 24 GFLOPS. Contrast that with the 8200 GFLOPS that an Nvidia GTX 1080 can manage, and you can appreciate that you might not get far when mining Ethereum. However, that would be enough to make running programs like Plex and Kodi on a Raspberry Pi more realistic if implemented, as it is enough to support real-time transcoding of a video stream.
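For a sense of where a figure like 24 GFLOPS can come from, here is a back-of-envelope sketch. The hardware breakdown (12 QPUs, 4 physical SIMD lanes each, one add plus one multiply per lane per cycle, 250 MHz clock) is an assumption taken from commonly cited VideoCore IV figures, not something stated in the article, and the real clock on a given Pi may differ. The 8200 GFLOPS number is the one quoted above.

```python
# Back-of-envelope peak throughput estimate.
# ASSUMPTION: the hardware breakdown below is not from the article; it is a
# commonly cited description of the VideoCore IV and may not match every Pi.
QPUS = 12                      # assumed number of QPU cores
SIMD_LANES = 4                 # assumed physical SIMD lanes per QPU
FLOPS_PER_LANE_PER_CYCLE = 2   # assumed: one add + one multiply per cycle
CLOCK_HZ = 250e6               # assumed GPU clock


def peak_gflops(units, lanes, flops_per_cycle, clock_hz):
    """Theoretical peak in GFLOPS: every lane busy on every cycle."""
    return units * lanes * flops_per_cycle * clock_hz / 1e9


videocore = peak_gflops(QPUS, SIMD_LANES, FLOPS_PER_LANE_PER_CYCLE, CLOCK_HZ)
gtx1080 = 8200.0               # figure quoted in the article
ratio = gtx1080 / videocore

print(f"VideoCore IV peak: {videocore:.1f} GFLOPS")
print(f"GTX 1080 on paper: about {ratio:.0f}x the VideoCore IV")
```

These are theoretical peaks; real kernels on either chip will land well below them, so treat the ratio as an order-of-magnitude guide only.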
” masters these ”
Those are nice to have. ;-D
Maybe it’s the plural of thesis :P
Real time transcoding, to what codec? I’d love to have a dinky little piece of hardware that can crank through 1080p video to HEVC in real time or faster.
I suspect you’ll need a hardware video encoder for that. Consider for a moment that dual xeon e5-2670s are similar in performance to the raspi’s GPU and those CPUs cannot handle real time HEVC (maybe they’d do it without all the bells and whistles of x265 turned on :P).
Nvidia NVENC + Mini-ITX ?
I’ve rarely seen real-time transcoding done with OpenCL. Dedicated encoder/decoder hardware is usually cheaper and more power-efficient, and the Pi already has hardware H.264 encode/decode capability. I’m fairly certain it’ll go up to 1080p.
It just makes far more sense to use something like the NeTV2 for serious real-time transcoding. You’re not doing real-time anything on a microprocessor architecture.
Does it work even with the new VC4 driver, or does it rely on the old proprietary firmware?
Maybe not transcoding, but adding new video decoders not supported in hardware could be a good idea.
A decoder that makes use of OpenCL would also be nice for my Nvidia ION board that is unable to play 10 bit H.264 and H.265 at 1080p. That GPU has 54 GFLOPs according to Wikipedia.
Why only Pi 3+ if older BCM chips also have VideoCore IV?
The article is misleading; the source repository makes no mention of any particular model.
It says on the GitHub site that it’s for all Pis.
Can this be used for retro gaming frontends? (Particularly to help along the N64 emulation)
That’s exactly what I was wondering.
Worth pointing out that the Raspberry Pi is already a pretty good Kodi platform. It has hardware 1080p decode for H.264 as standard (with decent 1080i deinterlacing on the quad-core models), MPEG2 and VC-1 hardware decode with the correct licences, and H.264 hardware encode (no licence needed; it comes as standard).
For 1080p MPEG2/H.264/VC-1 Kodi playback duties the Pi is already a good solution.
There are already hardware video transcode solutions based on the Pi (leveraging the hardware decoder and encoder).
There are also GPU-accelerated, HEVC-optimised decoders in recent Kodi/ffmpeg builds, I believe, which will play some 1080p 10-bit HEVC/H.265 content.
Doesn’t the Raspberry Pi still require a binary blob to be loaded in order to use it? Meh.
And those blobs will remain until at least 2025, when the codec patents have expired in every country (ref: https://www.raspberrypi.org/forums/viewtopic.php?t=201449 )
One issue not mentioned in the article is that programs using this OpenCL implementation must be run as root, as the DMA interface for the VC4 does not have a memory management unit and thus can access any part of system memory(!!). It’s a major security concern.
That is an interesting caveat, but I think given the Pi’s single board nature and the use cases for this most people won’t mind giving it root (even though it’s not the right thing to do.)
It would be good to have some examples of how to write and run code for these. Transcoding videos aside, this looks like a great way to do digital signal processing for things such as software defined radio.
Hi,
There are some applications in “VC4CL/test”.
But when I try to compile them with “cmake .”, I get an error (“CMake Error at CMakeLists.txt:10 (ExternalProject_Get_Property): Unknown CMake command “ExternalProject_Get_Property”.”)
Grr .. :-(
Kristoff
Read entire thing, still have no idea what OpenCL is or why I’d want it. Come on!
OpenCL is an open GPU programming framework. Without OpenCL, the RPi will never get good GPU-accelerated computational libraries. At the moment the RPi is nearly useless for deep learning because there is very little support for GPU acceleration, and the CPUs are way too slow for most applications. As soon as someone comes up with a similarly priced product that can do GPU-accelerated machine learning without the complexity of custom GPU programming, Raspberry is going to die because everyone will want to switch. That’s why Raspberry desperately needs OpenCL.
OpenCL is short for Open Computing Language. GPUs typically have lots and lots of very small processing elements that can do basic maths operations in parallel. So they can be used to accelerate video encoding, audio encoding, cryptographic hashing functions, Fast Fourier transforms, matrix-vector multiplication, … and lots more. Basically they can be used to accelerate any mathematical operations that benefit from running in parallel.
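To make the "lots of small processing elements" idea concrete, here is a minimal sketch of the data-parallel model in plain Python. It is not real OpenCL and does not touch the GPU: `enqueue_nd_range` is a hypothetical stand-in for a kernel launch that simply runs each work-item in a loop. The point is the shape of the code. Each work-item gets its own ID and computes exactly one output element, so on real hardware they could all run at once.

```python
# CPU-only emulation of the OpenCL-style "one work-item per output" model.
# ASSUMPTION: enqueue_nd_range and matvec_work_item are illustrative names,
# not part of any real OpenCL binding.

def matvec_work_item(global_id, matrix, vector, out):
    """Body of one work-item: compute a single row of matrix * vector."""
    acc = 0.0
    for a, x in zip(matrix[global_id], vector):
        acc += a * x
    out[global_id] = acc


def enqueue_nd_range(kernel, global_size, *args):
    """Hypothetical 1-D launch: runs work-items serially, in ID order.

    A real device would run many of these at the same time, which is safe
    here because no work-item reads another work-item's output.
    """
    for gid in range(global_size):
        kernel(gid, *args)


matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
vector = [1.0, 0.5, 2.0]
result = [0.0] * len(matrix)

enqueue_nd_range(matvec_work_item, len(matrix), matrix, vector, result)
print(result)  # [8.0, 18.5]
```

The same pattern covers the other workloads in that list: one work-item per filtered sample, per hashed block, or per output pixel.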
Hi,
Looking at the specs of these GPUs, it is mentioned that these devices can do quite a lot of different things: 2D graphics (OpenVG), 3D graphics (OpenGL / OpenGL ES), and real-time video encoding and decoding.
What I never understood is this:
Is this actually all done by the same hardware? Are these all the same processors / processing elements accessed through different APIs? Or are these all different pieces on the die, each specialised to do its own thing?
Kr.
It’s the same hardware for everything. What changes from one API to another is the way they manage data.
You could technically do machine learning on OpenGL, but you would need to do very whacky stuff to fit your data to a format that the API accepts, like converting it into textures. Also, the API has limitations that would make doing anything that is not graphics very difficult if not impossible.
For quite a few generations now, GPUs have been designed to be number-crunching machines instead of specialized graphics processors, so they can be used in different fields.
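As an illustration of the "converting it into textures" trick mentioned above, here is a minimal sketch of one common approach: storing the four bytes of each 32-bit float as the R, G, B and A channels of one 8-bit texel. The function names are made up for this example, and it only covers the CPU-side packing; a real graphics-API pipeline would also need shader code to rebuild the float from the four channels, which is where most of the pain lives.

```python
import struct

# ASSUMPTION: function names and the little-endian byte order are choices made
# for this illustration, not part of any graphics API.

def floats_to_rgba_texels(values):
    """Pack each value as a 32-bit float and split it into an (R, G, B, A) texel."""
    return [tuple(struct.pack("<f", v)) for v in values]


def rgba_texels_to_floats(texels):
    """Inverse of the packing above: rebuild each float from its four bytes."""
    return [struct.unpack("<f", bytes(t))[0] for t in texels]


weights = [0.5, -3.25, 1024.0]   # exactly representable in 32-bit floats
texels = floats_to_rgba_texels(weights)
restored = rgba_texels_to_floats(texels)

print(texels[0])   # four channel values, each in the range 0..255
print(restored)
```

With OpenCL none of this is necessary: a kernel simply receives a buffer of floats.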
And cue the totally impractical and money-losing coin miner project in 3… 2… 1…
Would love to see the RPi as the new Amiga/C64 of the demoscene and see how far they could go.
A 5-watt device throwing out crazy 3D animation in 1080p at 60fps, or finding a way to go even higher.