Startup Claims It Can Boost CPU Performance By 2-100X

As Moore’s Law slows and chip makers reach the physical limits of transistor size, researchers are looking beyond simply cramming more transistors onto a chip to increase CPU performance. ARM is having a bit of a moment by improving the performance-per-watt of many computing platforms, but other ideas need to come to the forefront to make any big pushes in this area. A startup called Flow Computing claims it can improve modern CPUs by a significant amount with a slight change to their standard architecture.

It hopes to make these improvements by adding a parallel processing unit, which it calls the “back end”, to a more-or-less standard CPU, the “front end”. The two units would sit on the same chip, with a shared bus letting them communicate extremely quickly, so the front end can rapidly offload tasks that are better suited to parallel processing. Since the front end keeps essentially the same components as a modern CPU, the startup hopes to maintain backwards compatibility with existing software while letting developers optimize for the new parallel unit when needed.
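Flow hasn’t published its programming model, but the division of labor it describes maps closely onto how a CPU offloads work to a GPU today. Here’s a minimal CUDA sketch of that pattern, with the host standing in for the “front end” and the device for the “back end” (the task and all names are ours, purely for illustration):

```cuda
#include <cuda_runtime.h>

// A task suited to the parallel "back end": same operation on many elements.
__global__ void backEndTask(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];  // e.g. square every element
}

int main() {
    const int n = 1 << 20;
    float *d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    // The "front end" runs normal serial code, then hands the
    // parallel-friendly part off with a cheap launch call...
    backEndTask<<<(n + 255) / 256, 256>>>(d, n);

    // ...and is free to keep doing serial work here while the
    // back end crunches. The two halves only rendezvous when
    // the results are actually needed.
    cudaDeviceSynchronize();

    cudaFree(d);
    return 0;
}
```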

While we’ll take a step back and refrain from declaring this the future of computing until we see some results, and maybe a prototype or two, the idea does show some promise. It’s similar to some ARM computers that have multiple cores optimized for different tasks, or to machines that offload non-graphics work to a GPU better suited to parallel processing. Even the Raspberry Pi is starting to take advantage of external GPUs for tasks like these.

Retrogadgets: The Ageia PhysX Card

Old computers meant for big jobs often had an external unit to crunch data in specific ways. A computer doing weather prediction, for example, might have a SIMD (single instruction, multiple data) vector unit that could multiply a whole batch of numbers by a constant in one swoop. These days, many computers crunch physics equations so you can play your favorite high-end computer game. Instead of vector processors, we have video cards: cards with many processing units that can execute “kernels”, small programs run on large groups of data at once.
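To make that concrete, here’s a minimal CUDA sketch (our own example, not from any particular game) of such a “kernel”: every GPU thread scales one element, so the whole array gets multiplied by a constant in one swoop, much like the old vector units did:

```cuda
// On a CPU this is a serial loop: for (int i = 0; i < n; i++) x[i] *= k;
// On a GPU, thousands of threads each handle one element instead.
__global__ void scaleByConstant(float *x, float k, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) x[i] *= k;  // guard: the grid may be slightly larger than n
}

// Launched with one thread per element, in blocks of 256, e.g.:
//   scaleByConstant<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
```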

Awkward Years

However, there was that awkward in-between stage when personal computers needed fast physics simulation, but it wasn’t feasible to put array processing and video graphics on the same board. Around 2006, a company called Ageia produced the PhysX card, which promised to give PCs the ability to do sophisticated physics simulations without relying on a video card.

Keep in mind that when this was built, multi-core CPUs were an expensive oddity, and games were struggling to manage everything they needed with limited memory and compute resources. The PhysX card was a “PPU”, or Physics Processing Unit, and used the PCI bus. Like many companies, Ageia made the chips and expected other companies, notably Asus, to make the actual board you’d plug into your computer.

Continue reading “Retrogadgets: The Ageia PhysX Card”

A standard-compliant MXM card installed into a laptop, without heatsink

MXM: Powerful, Misused, Hackable

Today, we’ll look into yet another standard in the embedded space: MXM. It stands for “Mobile PCI Express Module”, and is basically intended as a PCIe-based GPU interface for laptops, but there’s way more to it: it can work for any high-power, high-throughput PCIe device, with a fair few DisplayPort links if you need them!

You will see MXM sockets in older generations of laptops, barebones desktop PCs, servers, and even automotive computers: certain generations of Tesla cars used to ship with MXM-socketed Nvidia GPUs! Given that GPUs are in vogue today, it pays to know how you can get one in a low-profile form factor and avoid putting a giant desktop GPU inside your device.

I only had a passing knowledge of the MXM standard until recently, but my friend [WifiCable] has been playing with it for a fair bit now. On a long Discord call, she guided me through all the cool things we should know about the MXM standard, its history, compatibility woes, and hackability potential. I’ve summed it all up in this article. Let’s take a look!

This article is based on info that [WifiCable] has shared with me, and it certainly won’t be the last where I interview a hacker and condense their knowledge into a writeup. If you’re interested, let’s chat!

Continue reading “MXM: Powerful, Misused, Hackable”

Homebrew GPU Tackles Quake

Have you ever wondered how a GPU works? Even better, have you ever wanted to make one? [Dylan] certainly did, because he made FuryGPU — a fully custom graphics card capable of playing Quake at over 30 frames per second.

As you might have guessed, FuryGPU isn’t in the same league as modern graphics cards, which are made of thousands of cores specialized in math and programmed with whatever shaders you want. FuryGPU is a more “traditional” GPU: it has dedicated hardware for all the functions it needs to perform and doesn’t support “shader code” the way an AMD or NVIDIA GPU does. According to [Dylan], the hardest part of the whole thing was writing Windows drivers for it.

On his blog, [Dylan] tells us all about how he went from the obligatory [Ben Eater] breadboard CPU, to playing with FPGAs, to FPGAs large enough to bear the weight of this mighty GPU. While this project isn’t exactly revolutionary in the GPU world, it certainly is impressive, and we eagerly await what comes next.

Continue reading “Homebrew GPU Tackles Quake”

The Raspberry Pi 5 Can Use External Graphics Cards Now

The Raspberry Pi line is full of capable compact computers, but they’ve never been the strongest in the bunch when it comes to graphical output. Nor have they been particularly expandable in that regard. However, that’s all beginning to change, with [Jeff Geerling] reporting success getting external GPUs to work on the Raspberry Pi 5.

Unlike previous Raspberry Pis, the Raspberry Pi 5 has a less quirky implementation of its PCI Express bus. Previous editions threw up issues when working with GPUs, but [Jeff] has found much more success this time around. He’s gotten an AMD RX 460 working with the setup, running a good chunk of the glmark2 test suite. He’s working on a variety of other AMD cards too, but suspects Nvidia parts could be harder due to some initialization issues that are proving difficult to quash.

It still takes some funky adapters and a fair bit of work, but GPUs are finally starting to work with the platform. Keep up with his list of card trials on the PiPCI website. We’ve seen [Jeff]’s work with earlier iterations of the Raspberry Pi before, too. Video after the break.

Continue reading “The Raspberry Pi 5 Can Use External Graphics Cards Now”

Here’s Why GPUs Are Deep Learning’s Best Friend

If you’re curious about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers]’s explanation of why this is so. It’s not a terribly long read, and while it does get technical there are also car analogies, so there’s something for everyone!

He starts off by noting that most people know GPUs are scarily efficient at matrix multiplication and convolution, but argues that what really makes them useful is their ability to work with large amounts of memory very efficiently.

Essentially, a CPU is a latency-optimized device, while a GPU is a bandwidth-optimized device. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once.

Continue reading “Here’s Why GPUs Are Deep Learning’s Best Friend”
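One way to appreciate the cargo-truck point is to measure it. Here’s a small CUDA sketch of our own (not from [Tim]’s article) that times a simple element-wise pass over a big array and reports effective memory bandwidth, the number that gates most deep-learning kernels well before raw compute does:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Reads 4 bytes and writes 4 bytes per element: bandwidth-bound, not compute-bound.
__global__ void copyScale(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

int main() {
    const int n = 1 << 26;  // 64M floats = 256 MB per buffer
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    copyScale<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // Total traffic: one read plus one write per element.
    double gb = 2.0 * n * sizeof(float) / 1e9;
    printf("Effective bandwidth: %.1f GB/s\n", gb / (ms / 1e3));

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```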

The Tale Of The Final EVGA GPU Overclocking Record

It’s not news that EVGA is getting out of the GPU card game after a ‘little falling out’ with Nvidia. It’s sad news nonetheless, as this enthusiastic band of hardware hackers has a solid following in certain overclocking and custom PC circles. The Gamers Nexus gang decided to fly over to meet up with the EVGA team in Zhonghe, Taiwan, and follow them around a bit as they tried for one last overclocking record on the latest (unreleased, RTX 4090-based) GPU card. As you will note early on in the video, things didn’t go smoothly, with their hand-lapped GPU burning out the PCB after a small setup error.

Continue reading “The Tale Of The Final EVGA GPU Overclocking Record”