Powerful graphics cards are pretty affordable these days. Even though we rarely do high-end gaming on our daily machine, we still have a GeForce 9800 GT. That goes to waste on a machine used mainly to publish posts and write code for microcontrollers. But perhaps we can put the GPU to good use when it comes to compile time. The KGPU package enlists your graphics card to help the kernel do some heavy lifting.
This won’t work for just any GPU. The technique uses CUDA, which is a parallel computing package for NVIDIA hardware. But don’t let lack of hardware keep you from checking it out. [Weibin Sun] is one of the researchers behind the technique. He posted a whitepaper (PDF) on the topic over at his website.
Add this to the growing list of non-graphics applications for today's graphics hardware.
UPDATE: Looks like we won’t be trying this out after all. Your GPU must support CUDA 2.0 or higher. We found ours on this list and it’s only capable of CUDA 1.0.
[Thanks John]
I don’t have nvidia
Good to know.
Don’t worry, I’m sure support for the Hercules will be coming out soon.
Indeed, fuck nvidia's proprietary crap when there is a damn open standard. CUDA on linux? That's just insulting in my view.
Nvidia because he got PAID by Nvidia to do his “research”.
The irony is it's buggy as hell and it's considered a success…
I actually use nvidia over ATI because of driver stability and performance. ATI actually has more ALUs and still can’t compete, and that’s saying something…
I have been using nVidia's binary drivers ever since I bought myself an MX200. In the very beginning there were a couple of rough patches, but overall I have had very few issues with the stability of nVidia's binary drivers. ATI is a mess.
I've been through nvidia driver hell more than once over the years, but the same goes for ATI; they're both very experimental and beta, in software as well as hardware, but that's the name of the game. Still, nvidia said they would embrace OpenCL (and DirectCompute, but that's a windows thing), and seeing as OpenCL is the open standard that fits open source linux, it's annoying if they push the CUDA nonsense (which at some point they even said they'd get rid of) onto linux. It's like encouraging people to make DirectX stuff for linux.. you'd also go WTF??!?
Right. At least their linux drivers aren’t discontinued. And of course we all know how usable the open source alternatives are. In my view, nvidia is currently the only way to go on linux if you want decent 3d and/or gpu acceleration.
Great, so we should toss all of the work out?
You’re sure none of the work or ideas could be ported over to OpenCL or something?
So, what is the cheapest 4xx, 5xx, or 6xx GPU card out there? And, is the cheapest card good enough to do useful work?
Which series? There are quite a number of them. Going for cheap and good, you can get high-end GT-series cards with CUDA 2.1 for 60 dollars. What is 'useful work'? What do you want to do? What can you do now? Have you integrated with CUDA systems before? All the cards will improve your power; by how much varies based on, well, everything.
Depending on the code and the system you can get 40x+ improvement. See here for some well documented examples:
http://www.nvidia.com/object/cuda_showcase_html.html
Here to benchmark your own:
http://code.google.com/p/cuda-benchmarking/
Any series. I thought it might be useful to practice programming a GPU. I have no project in mind, other than learning. I left the definition of useful work up to people who might answer. I have not interfaced with CUDA before. I have stayed away from nvidia cards for years, since nvidia would not produce open drivers. I don't play games (unless learning is a game); Intel and ATI video has done everything I needed. I'll probably start with the author's encrypted filesystem drivers using GPUs and some benchmarking of code to factor integers.
Based on fluffers comment, I ordered a GT 610 based card today from Amazon.
I meant fluffles.
I have a 9800 GT myself – its best feature is that it's passive [no fan noise]. To reach CUDA 2.0 or higher, but stay passive, I'm looking at the ZOTAC GT 640. Lousy for fancy games, fine for a little CUDA-powered processing oomph.
Nevermind. Shellshocker currently, GT 610 for 35 dollars:
http://www.newegg.com/Special/ShellShocker.aspx?cm_sp=ShellShocker-_-14-130-849-_-12142012_4
I’ve gotten significant improvement (orders of magnitude) porting personal projects to CUDA on the GeForce 8400. It’s a different computing architecture, and with the proper work load, 8 cuda cores can beat the crap out of a single x86 core.
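For anyone wondering what that kind of port actually looks like, here is a minimal toy sketch (my own illustrative example, nothing to do with the poster's projects) of the pattern a GPU rewards: one thread per element, every thread doing the same simple operation.

```
// Toy CUDA example: apply the same operation to every element of a big array.
// All names here (scale, d_in, d_out) are illustrative, not from KGPU or the paper.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(const float *in, float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global element index
    if (i < n)
        out[i] = in[i] * factor;                     // identical work for every element
}

int main()
{
    const int n = 1 << 20;                           // ~1M elements
    size_t bytes = n * sizeof(float);
    float *h_in = new float[n], *h_out = new float[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;        // enough blocks to cover all elements
    scale<<<blocks, threads>>>(d_in, d_out, 2.0f, n);
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    printf("h_out[1000] = %f\n", h_out[1000]);       // expect 2000.0
    cudaFree(d_in); cudaFree(d_out);
    delete[] h_in; delete[] h_out;
    return 0;
}
```

The serial C version is just a for-loop over the same body; the speedup comes from launching enough threads to keep all of the GPU's cores and memory channels busy.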
You can get a brand new GT 640 for about £75, which will support CUDA 2.1. I run a GTX 460, which is a hair over £100 normally; it also runs 2.1. I have no idea which is the more powerful card; the x60 models are intended for games whereas the x40 models are intended for media, but then the 4xx series is older than the 5xx or 6xx, so it isn't unreasonable to suggest that it is slower (mine is factory overclocked though, and I am thinking about cranking it further).
It seems the 610 also supports CUDA 2.1 so that might also be a cheap option.
As for useful work, define useful… CUDA will be faster than the CPU any day, but the gaming performance of the cheap cards will barely beat the old motherboard/CPU graphics. The 640 isn't a bad card; it's not intended for gaming, but I imagine most things will at least run on it. The 460, although old, copes very nicely with newer games: I run Skyrim with most settings maxed (a few knocked a notch or two down), and with Fraps used to test framerate over a 1-hour period I got a minimum of 59 and a max of 61, so nicely V-synced as it should be.
Just bear in mind: laptops cannot have their GPUs upgraded, and that's not the kind of limitation an old Hackaday hack can overcome (the exception to the rule: Thunderbolt ports now have external PCIe enclosures available, many of which can run GPUs). Also keep note of what your computer's power supply can put out, as it may need upgrading before a GPU upgrade.
http://www.videocardbenchmark.net/gpu.php?gpu=GeForce+GT+640
http://www.videocardbenchmark.net/video_lookup.php?gpu=GeForce+GTX+460
Seems to suggest the older 460 is indeed more powerful than the newer 640.
Me neither; I only have a crappy mobile 8400GS (CUDA 1.1). I hope this gathers some kernel devs' attention and makes them rewrite the idea using OpenCL instead.
EVGA GTX 680 Classified w/ Backplate
* using windows : ( *
Two problems:
1) GPUs are not energy-efficient. Not even remotely. Intel is cranking out server CPUs with boatloads of cores that use less than 50 W. A GTX 480 uses upwards of 200-250 W.
2) It has long been the stance of many Linux folks that things like network and crypto accelerators are a waste of money – that the money is better spent on a slightly faster CPU or better overall architecture – because when you’re not using the extra capacity for network or crypto (for example), it’s available for everything else to go a bit faster.
A pretty nice graphics card is about $200. That’s more than the difference between the bottom end and top end Intel desktop CPUs on Newegg…
Except that, with the right problem, a GPU performs *orders of magnitude* faster than a CPU — I don’t know what definition of efficiency you use, but 5x the power draw for 10x, 100x, even 1000x the performance seems to fit the definition I know.
On your second point, modern GPUs are not the same as crypto or network accelerators because they stopped being single-purpose circuits many years ago. In point of fact, GPUs can be used to accelerate crypto and network functions, audio processing, image processing — any application where you do the same basic operation over a large amount of data; think the map portion of map-reduce.
Spending $200 on a GPU gives you something on the order of 1 or 2 *thousand* simple CPU cores, plus a gig or two of high-bandwidth RAM, and the memory architecture to make it all work nicely. Spending $200 on a CPU upgrade buys you a few hundred Mhz and *maybe* another couple cores, depending where your baseline CPU is.
It’s very simple really: a CPU is optimized for low-latency processing — to give you a single result as quickly as it can. GPUs are optimized for high-throughput processing — to give you many results reasonably fast, essentially by utilizing economies of scale at the silicon and software level (which, to achieve, you sacrifice some ISA capability, though that will change in the next round or two of GPUs).
Finally, your point falls flat because many people, indeed most, already have a GPU that’s capable of performing compute functions, and that number is only going to grow as even integrated GPUs provide some pretty non-trivial grunt. Integrated GPUs in some cases even perform better than either CPUs *or* high-end GPUs because the problem isn’t bottle-necked by the PCIe bus.
Honestly, if you went out and bought/built a computer in the last 3 years, you literally have to go out of your way to come out with a system that’s not capable of any GPU computing.
TLDR version:
-accelerated crypto
-tried accelerated RAID 6 recovery, was on par with CPUs
oh, forgot to mention latency:
-shitty
:)
Cuda doesn’t run all of your kernels in parallel; instead it glues them into bundles called “warps” and runs them in series :/ You need a whole lot of them to get decent performance. Oh, did I mention how branches make whole “warps” invalid? If you have a branch inside a warp, the GPU will run the WHOLE “warp” twice for every branch.
Cuda is a nightmare.
Where did you get the idea that CUDA runs all the warps in series? CUDA bundles the kernels into warps, yes, and then the threads within a warp run in lockstep on a set of cores — but that’s only using 32 cores. The cores are massively hardware-multithreaded (like hyperthreading but much more so!), so several warps are running in parallel on that set of 32 cores, and there are dozens and dozens of sets of cores — so you can easily have a couple of hundred warps running in parallel.
This, of course, is why you need to create so many parallel kernels at the same time to get decent performance.
Also, if you have a branch inside a warp, CUDA will NOT run the whole warp twice; that wasn’t true even on the early versions. All it does is run both sides of the branch, in series (but it doesn’t run code outside the branch twice), and it only even does that if some threads in the warp take one side and some the other. If all the threads take the same side of the branch, there’s no slowdown, and if the branched code is small, it’s only a minor issue. The only way it will effectively “run the whole warp twice” is if you’ve got the entire kernel wrapped in a big if statement and some threads in the warp take the “true” branch and some take the “false” branch.
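To make the divergence point concrete, here is a hypothetical pair of kernels (illustrative names, not from the paper or KGPU). In the first, even and odd threads of the same 32-thread warp take different sides of the branch, so that warp executes both sides in series; in the second, the condition depends only on the warp index, so each warp takes a single side and pays no penalty.

```
// Sketch of warp divergence. Launch with a multiple of 32 threads per block.

// Bad: threads within one warp disagree on the branch, so both sides run (serially)
// for that warp, with the inactive lanes masked off.
__global__ void branchy_bad(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (i % 2 == 0)                 // even/odd threads share a warp -> divergence
        data[i] = data[i] * 2.0f;
    else
        data[i] = data[i] + 1.0f;
}

// Better: the condition is uniform across each warp, so every warp executes
// only one side of the branch and there is no divergence cost.
__global__ void branchy_ok(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int warp = i / 32;              // all 32 threads of a warp compute the same value
    if (warp % 2 == 0)
        data[i] = data[i] * 2.0f;
    else
        data[i] = data[i] + 1.0f;
}
```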
This isn’t a very accurate post. Firstly, the GPU has ~10x greater memory throughput than a single-socket CPU. Secondly, submitting literally thousands of threads to the CUDA cores is a mechanism to mask latency while threads are busy waiting for memory to move around. It’s not a very serial operation. If you want fast serial code, use x86. Which brings me to your complaint about warp branching…
On x86 one of the heuristics to improve serial code execution performance is something called “speculative execution”. When a processor sees a branch, it speculates on the path taken, and continues executing code along that path before fully deciding which path to take! The purpose of this is to keep the pipeline full while waiting for a potentially lengthy conditional statement. It’s part of the reason an x86 processor uses more power than a simple embedded processor. On some RISC processors, this latency hiding was accomplished with a “delay slot”. So thirdly, the GPU uses the immense hardware parallelism to execute both sides of a branch, and pick the one that is correct after the conditional. There is a limit to the depth which can be done, just as with speculative execution.
However, latency to send data back and forth between the GPU and CPU is one of the big bottlenecks that makes me scratch my head about this. Ideally you would have some sort of task that you want to run over a large chunk of data before returning, such as crypto (a rough timing sketch follows this comment).
CUDA is actually a very nice framework and toolchain, compared to other tools for high performance massively parallel architectures. Take a look at programming for the Cell architecture. Without even optimizing for branch prediction, it is fairly easy to port data parallel C code to CUDA and achieve 10x or more speedup. I would say that your complaints towards a GPU architecture applies even more so for x86, which is by far uglier, with an instruction set to legitimately have nightmares over…
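On the transfer-latency point above, here is a rough timing sketch (made-up names, nothing from the paper): time the host-to-device copy and the kernel separately with CUDA events, and it becomes obvious why you want one big chunk of work per round trip rather than many small ones.

```
// Toy sketch: measure PCIe transfer time vs. kernel time with CUDA events.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void square(float *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] * d[i];
}

int main()
{
    const int n = 1 << 24;                             // 16M floats, 64 MB
    size_t bytes = n * sizeof(float);
    float *h = new float[n];
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, bytes);

    cudaEvent_t t0, t1, t2;
    cudaEventCreate(&t0); cudaEventCreate(&t1); cudaEventCreate(&t2);

    cudaEventRecord(t0);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);   // the PCIe round trip
    cudaEventRecord(t1);
    square<<<(n + 255) / 256, 256>>>(d, n);            // the actual compute
    cudaEventRecord(t2);
    cudaEventSynchronize(t2);

    float copy_ms = 0, kern_ms = 0;
    cudaEventElapsedTime(&copy_ms, t0, t1);
    cudaEventElapsedTime(&kern_ms, t1, t2);
    printf("copy: %.2f ms, kernel: %.2f ms\n", copy_ms, kern_ms);

    cudaEventDestroy(t0); cudaEventDestroy(t1); cudaEventDestroy(t2);
    cudaFree(d);
    delete[] h;
    return 0;
}
```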
cptfalcon: “So thirdly, the GPU uses the immense hardware parallelism to execute both sides of a branch, and pick the one that is correct after the conditional.”
No; this is not speculative execution, nor “immense hardware parallelism”, nor anything relating to hiding latency. There is no speculation involved. It’s much more like lane-masking on SIMD vector operations.
What’s happening is that, for the 32-“core” warp, there is only one instruction decoder, and so all 32 cores must be executing the same instruction at the same time. Thus, if there is a branch where some cores go one way and some go the other, the whole set needs to execute the instructions for both branches. However, the instructions from each branch are “masked” so that, when they are executed on a core that is supposed to be taking the other branch, they don’t write their results into the registers. This gives the effect of one set of cores taking one branch and the others taking the other branch — but a key point is that the mask is known before the instruction executes.
There are no direct performance advantages for this, unlike with speculative execution on CPUs. The performance advantage is indirect: If you only have one instruction scheduler for every 32 compute units, you can put many more compute units in the same size-and-power budget.
Thanks for the clarification. The Kepler architecture seems to have 2 dispatch units per warp scheduler (http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf p10) which would allow for 2 different instructions run within one warp scheduler at once. However, this feature does not seem to be used by branches in any way.
The likelihood of most people running higher-end GPUs and using Linux enough to warrant kernel compilation is rather slim, as gaming on Linux, while fully achievable, is a headache at best. That being said, I’m tinkering with facial recognition and CUDA on the Linux platform.
I don’t think so. I have a GTX570 and I even write kernel code. And I know more people that do too. I think that the picture that kernel developers only use intel cards is some sort of a stereotype.
And then there are tons of people who use Gentoo and derivatives. I think most of us compile our kernels. I think most people multiboot for gaming. Using windows for anything besides gaming is a headache.
I was coming at this from the perspective of the average computer tinkerer deciding to see the performance difference of CPU vs GPU (aka me). Now, I’m no fanboy of either OS, but I do use windows as my main platform since I do a lot of audio/visual work and I really do like coding in C#.
Unfortunately my main laptop has switchable graphics and there is just no hope of ever getting it to work effectively on Linux :(. I have done some kernel mods and looked into kernel coding but I have never needed to reinvent the wheel for most of my needs.
A lot of my friends and the majority of my work peers use lower-end GPUs in their Linux-based systems because they feel that is sufficient for their needs. Noise is also a big factor with GPUs.
I think it’s just interesting getting different views on things :).
If it is nvidia optimus you’re talking about, just look at ‘bumblebee’. It’s working perfectly now (if you don’t mind wrapping all apps that need it with ‘optirun’).
I only want my nVidia card to play Quake.
I gather the NEON graphics built into my Beaglebone isn’t good enough, then?
NEON isn’t really graphics hardware at all; it’s ARM’s SIMD unit, and it has no CUDA cores or CUDA support whatsoever.
I’ve been waiting for this to happen for a while now. I can’t wait for someone to implement this with radeon cards as well. I believe this is the future of software RAID, crypto and many other applications not yet imagined as well.
I’m hyped. I rarely get hyped.
The future of software RAID is using a $200 GPU to achieve the same performance as a CPU?
No. Technically, if you are doing a number of small parallel threads the GPU will be much, much faster. RAID stands for Redundant Array of Independent Disks, meaning all RAID does is increase storage reliability or storage speed. RAID (disk I/O) and processing speed are not related. If your program is not multiprocessing-aware it will only run on one of the many computational units inside a GPU, just like if it was only running on your CPU. If you want faster software RAID, then buy several of the latest SATA solid state drives and set them up for striping. CUDA will do nothing for RAID performance, just as putting a more powerful engine in your car won’t make the suspension better.
I detect a little newspeak in your comment, “RAID stands for Redundant Array of Independent Disks.” The I in RAID is supposed to stand for inexpensive. I can see how independent would suit some better though.
The bottleneck in RAID 5 is parity generation & checking. If any of that can be offloaded (to GPU or another CPU on a RAID card) the throughput should be higher.
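For illustration only: RAID 5 parity is a byte-wise XOR across the data disks, which is exactly the kind of wide, branch-free work a GPU is comfortable with. A toy kernel sketch (made-up names; not how md or KGPU actually handle it):

```
// Toy sketch: compute the RAID 5 parity block for one stripe on the GPU.
// 'disks' holds num_disks data blocks of stripe_bytes each, laid out back to back.
__global__ void xor_parity(const unsigned char *disks,
                           unsigned char *parity,
                           int stripe_bytes, int num_disks)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;     // one thread per parity byte
    if (i >= stripe_bytes) return;

    unsigned char p = 0;
    for (int d = 0; d < num_disks; ++d)                // XOR the i-th byte of every block
        p ^= disks[d * stripe_bytes + i];
    parity[i] = p;
}
```

Whether this ever beats the CPU in practice depends on getting the data across the PCIe bus cheaply enough, which is the same latency caveat raised elsewhere in this thread.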
I wonder if it would be possible to modify the linux kernel itself to be OpenCL/CUDA accelerated. Would certainly be an interesting experiment
My guess is that there’s not much in the kernel that would be worth accelerating — because GPUs are designed for high data throughput at the expense of latency, and nearly everything in an operating system kernel needs to have very low latency and doesn’t involve a lot of data.
I did recently see an interesting research paper that was using GPU acceleration to implement automated cache prefetch, but it involved using something closer to an AMD Fusion core with the GPU on the same chip as the CPU, and I think it also involved some specialized hardware to make it really work properly. And even then it wasn’t clear to me that it was really doing anything that wasn’t just “let’s power up these GPUs that aren’t running at the moment and use lots of energy for a marginal speedup of our CPU code.”
I would have liked to see OpenCL used, but CUDA is more mature, and hey, I didn’t write the code.
Problem with OpenCL is AMD dropping the ball. NVIDIA is showering researchers with money, AMD ignores them :(
For those who might be interested, Coursera is currently running a course on Heterogeneous Parallel Programming which touches on CUDA.
https://class.coursera.org/hetero-2012-001
As a Linux ‘noob’, this sounds interesting. The general consensus I get from reading this post and the comments is “more processing power”.
But what, if anything, does it mean to a user that doesn’t do much more in linux than: internet, XP in VMware, e-mail and chat?
I *do* like over-powered without overclocking…
For you, very little; changes would need to be made at a much, much lower level to offer any benefit to those tasks.
However, several internet browsers can make use of your GPU for drawing now; Chrome is one which will use GPU acceleration (without installing this package).
Pretty awesome. I wonder how resource management works though? For example what happens when many applications (and the kernel) ask for more resources from the GPU than it can provide? Will their compute time get prioritized fairly? Or will new requests just fail outright?
The GPU kernel driver is responsible for marshalling all of those requests into a single execution queue that gets fed to the GPU, so the answer to most of those questions depends on how the driver is written. However, there’s no task pre-emption once things are actually running on the GPU (with current technology), so once a task is running, it will run to completion unless completely cancelled; there’s nothing the driver can do about it.
Since it’s a queue rather than a “do this now” programming model, there’s no real sense of “more resources from the GPU than it can provide”; the queue just keeps getting longer and tasks wait longer to execute. I assume at some point you get an error when the driver runs out of queue space, but that may only happen when the whole machine is out of memory.
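From user space, that queue model looks roughly like the sketch below (toy example with made-up names, not KGPU’s in-kernel plumbing): independent clients enqueue work on separate streams, the launches return immediately, and the driver drains the queues as the GPU frees up; a kernel that has started runs to completion.

```
// Toy sketch of the queue/stream model: two independent streams just enqueue work.
#include <cuda_runtime.h>

__global__ void busy_work(float *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = buf[i] * 0.5f + 1.0f;
}

int main()
{
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMemset(a, 0, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Both launches return immediately; the work waits in the queues until the GPU gets to it.
    busy_work<<<(n + 255) / 256, 256, 0, s1>>>(a, n);
    busy_work<<<(n + 255) / 256, 256, 0, s2>>>(b, n);

    cudaStreamSynchronize(s1);   // wait for each queue to drain
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```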
So what I am wondering is: does this mean it will be easier for the slightly-more-than-average user to brute force a password?
*touches finger to nose* Ya notice how no one has even mentioned what they “daily” use this for too, eh? Otherwise, HaD is full of cancer/aids researchers lol.
Yep that is what bitcoin is about
Hm, so it’s just a hack to run CUDA code from kernelspace; you still have to write your own implementations to get them CUDA-accelerated.
What I could see is whether this code could be converted to work on something like the Parallella. That way you leverage the extra processors without having to write a bunch of extra/custom code.
What would be nice is a modified version of the Linux kernel that supports AMD GPU processing automatically; not all of us are lucky enough to have Nvidia video chipsets on our laptops! So it would be nice to see some AMD GPU programming :P
I need more speed! :)