Import GPU: Python Programming With CUDA

February 25, 2025 by Bryan Cockfield 17 Comments

Every few years or so, a development in computing results in a sea change and a need for specialized workers to take advantage of the new technology. Whether that’s COBOL in the 60s and 70s, HTML in the 90s, or SQL in the past decade or so, there’s always something new to learn in the computing world. The introduction of graphics processing units (GPUs) for general-purpose computing is perhaps the most important recent development for computing, and if you want to develop some new Python skills to take advantage of the modern technology take a look at this introduction to CUDA which allows developers to use Nvidia GPUs for general-purpose computing.

Of course CUDA is a proprietary platform and requires one of Nvidia’s supported graphics cards to run, but assuming that barrier to entry is met it’s not too much more effort to use it for non-graphics tasks. The guide takes a closer look at the open-source library PyTorch which allows a Python developer to quickly get up-to-speed with the features of CUDA that make it so appealing to researchers and developers in artificial intelligence, machine learning, big data, and other frontiers in computer science. The guide describes how threads are created, how they travel along within the GPU and work together with other threads, how memory can be managed both on the CPU and GPU, creating CUDA kernels, and managing everything else involved largely through the lens of Python.

Getting started with something like this is almost a requirement to stay relevant in the fast-paced realm of computer science, as machine learning has taken center stage with almost everything related to computers these days. It’s worth noting that strictly speaking, an Nvidia GPU is not required for GPU programming like this; AMD has a GPU computing platform called ROCm but despite it being open-source is still behind Nvidia in adoption rates and arguably in performance as well. Some other learning tools for GPU programming we’ve seen in the past include this puzzle-based tool which illustrates some of the specific problems GPUs excel at.

What Kind Of GPU Are You?

July 22, 2021 by Al Williams 9 Comments

In the old days, big computers often had some form of external array processor. The idea is you could load a bunch of numbers into the processor and then do some math operations on all of the numbers in parallel. These days, you are more likely to turn to your graphics card for number crunching support. You’ll usually use some library to help you do that, but things are always better when you understand what’s going on under the hood. That’s why we enjoyed [RasterGrid’s] post on GPU architecture types.

If you can tell the difference between IMR (immediate mode) and TBR (tile-based) rendering this might not be the post for you. But while we knew the terms, we found a lot of interesting detail including some graphics and pseudo code that clarified the key differences.

Continue reading “What Kind Of GPU Are You?” →

GPU Programming For Easy & Fast Image Processing

May 30, 2012 by Brian Benchoff 15 Comments

If you ever need to manipulate images really fast, or just want to make some pretty fractals, [Reuben] has just what you need. He developed a neat command line tool to send code to a graphics card and generate images using pixel shaders. Opposed to making these images with a CPU, a GPU processes every pixel in parallel, making image processing much faster.

All the GPU coding is done by writing a bit of code in GLSL. [Reuben]’s command line utility takes that code, sends it to the graphics card, and returns the image calculated by the GPU. It’s very simple for to make pretty Mandebrolt set images and sine wave interference this way, but [Reuben]’s project can do much more than that. By sending an image to the GPU and performing a few operations, [Reuben] can do very fast edge detection and other algorithmic processing on pre-existing images.

So far, [Reuben] has tested his software with a few NVIDIA graphics cards under Windows and Linux, although it should work with any graphics card with pixel shaders.

Although [Reuben] is sending code to his GPU, it’s not quite on the level of the NVIDIA CUDA parallel computing platform; [Reuben] is only working with images. Cleverly written software could get around that, though. Still, even if [Reuben]’s project is only used for image processing, it’s still much faster than any CPU-bound method.

You can grab a copy of [Reuben]’s work over on GitHub.