A plugged-in 12VHPWR cable, with two thermistors inserted into the connector shell, monitoring for heat

12VHPWR Watchdog Protects You From Nvidia Fires

The 12VHPWR connector is a hot topic once again – Nvidia has really let us down on this one. New 5080 and 5090 GPUs come with this connector, and they’re once again fire-prone. Well, what if you’re stuck with a newly-built 5080, unwilling to give it up, still hoping to play the newest games or run LLMs locally? [Timo Birnschein] has a simple watchdog solution for you, and it’s super easy to build.

All it takes is an Arduino, three resistors, and three thermistors. Place the thermistors onto the connector’s problematic spots, download the companion software from GitHub, and plug the Arduino into your PC. If a temperature anomaly is detected – say, one of the thermistors approaching 100 °C – the Arduino will simply shut down your PC. The software also includes a tray icon, temperature graphing, and stability features. It’s all open-source – breadboard it, flash it, and you can even add more thermistors to the mix if you’d like!
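The firmware and companion app live in [Timo Birnschein]’s GitHub repo, but to make the idea concrete, here’s a minimal host-side sketch of how such a watchdog could work – assuming, hypothetically, that the Arduino streams comma-separated temperatures over USB serial. The port name and threshold below are placeholders, not values from the actual project.

```python
# Minimal host-side watchdog sketch (not the actual companion software).
# Assumes the Arduino prints lines like "42.1,40.8,39.5" over USB serial.
import os
import platform
import serial  # pyserial

PORT = "COM5"      # placeholder; e.g. "/dev/ttyACM0" on Linux
BAUD = 115200
LIMIT_C = 90.0     # act well before the connector plastic nears 100 °C

def shutdown():
    # Ask the OS to power off; swap in a log or alert if you prefer.
    if platform.system() == "Windows":
        os.system("shutdown /s /t 5")
    else:
        os.system("shutdown -h now")

with serial.Serial(PORT, BAUD, timeout=2) as link:
    while True:
        line = link.readline().decode(errors="ignore").strip()
        if not line:
            continue
        try:
            temps = [float(t) for t in line.split(",")]
        except ValueError:
            continue  # ignore garbled lines
        print("12VHPWR temperatures:", temps)
        if any(t >= LIMIT_C for t in temps):
            print("Over-temperature detected, shutting down!")
            shutdown()
            break
```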

This hack doesn’t just protect you from Nvidia’s latest creation – it can watch over any sort of potentially hot mod, and it’s very easy to build. Want to keep an eye on connectors on your 3D printer? Build one of these! We’ve seen 12VHPWR cause plenty of problems on Nvidia’s cards in the past – it looks like there are quite a few lessons Nvidia has yet to learn.

Import GPU: Python Programming With CUDA

Every few years or so, a development in computing results in a sea change and a need for specialized workers to take advantage of the new technology. Whether that’s COBOL in the 60s and 70s, HTML in the 90s, or SQL in the past decade or so, there’s always something new to learn in the computing world. The introduction of graphics processing units (GPUs) for general-purpose computing is perhaps the most important recent development in computing, and if you want to develop some new Python skills to take advantage of it, take a look at this introduction to CUDA, the platform that lets developers use Nvidia GPUs for general-purpose computing.

Of course, CUDA is a proprietary platform and requires one of Nvidia’s supported graphics cards to run, but assuming that barrier to entry is met, it’s not too much more effort to use it for non-graphics tasks. The guide takes a closer look at the open-source library PyTorch, which allows a Python developer to quickly get up to speed with the features of CUDA that make it so appealing to researchers and developers in artificial intelligence, machine learning, big data, and other frontiers of computer science. The guide describes how threads are created, how they travel within the GPU and work together with other threads, how memory can be managed on both the CPU and the GPU, how CUDA kernels are written, and how everything else involved is handled, largely through the lens of Python.
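To give a flavor of the workflow the guide walks through, here’s a minimal sketch of PyTorch’s side of things – allocating memory on the device, dispatching work to CUDA kernels, and copying results back to the host. It assumes a CUDA-capable Nvidia card and a CUDA build of PyTorch; the sizes are arbitrary.

```python
# Minimal PyTorch/CUDA sketch: explicit host<->device memory management
# and a matrix multiplication dispatched to GPU kernels.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Running on:", device)

# Allocate directly in GPU memory...
a = torch.randn(4096, 4096, device=device)
# ...or create on the CPU and copy over explicitly.
b = torch.randn(4096, 4096).to(device)

# The matmul runs as CUDA kernels under the hood; launches are
# asynchronous, so synchronize before trusting any timing.
c = a @ b
if device.type == "cuda":
    torch.cuda.synchronize()

# Copy the result back to host memory for CPU-side work.
c_host = c.cpu()
print("Result shape:", tuple(c_host.shape), "mean:", c_host.mean().item())
```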

Getting started with something like this is almost a requirement to stay relevant in the fast-paced realm of computer science, as machine learning has taken center stage in almost everything computer-related these days. It’s worth noting that, strictly speaking, an Nvidia GPU is not required for GPU programming like this; AMD has its own GPU computing platform, ROCm, but despite being open-source it still trails Nvidia in adoption and arguably in performance as well. Some other learning tools for GPU programming we’ve seen in the past include this puzzle-based tool which illustrates some of the specific problems GPUs excel at.

Laptop GPU Upgrade With Just A Little Reballing

Modern gaming laptops are in an uncomfortable spot – often too underpowered for the newest titles, but too bulky to be genuinely portable. It doesn’t help that they’re not often upgradeable, so you’re stuck with what you’ve bought – unless, say, you’re a hacker equipped with some tools for PCB reflow? If that’s the case, welcome to [TechModLab]’s video showing you the process of upgrading a laptop’s soldered-on NVIDIA GPU, replacing the 3070 chip with a 3080.

You don’t need much – the most exotic tool is a BGA rework station, which holds the mainboard steady and stiff while heating a specific large chip on the board with an infrared lamp from above. This one is definitely a specialty tool, but we’ve seen hackers build their own. From there, some general soldering tools like flux and solder wick, a stencil for your chip, BGA balls, and a $20 USB-C hotplate are all you need for reballing chips – tools you ought to have anyway.

Reballing was perhaps the hardest step of the journey – essential for preparing the GPU before the transplant. Afterwards, only a few steps were needed – poking a BGA ball that didn’t connect, changing board straps to account for the new VRAM our enterprising hacker added alongside the upgrade, and fiddling with the driver install a little. Use this method to upgrade from a lower-end binned GPU you’re stuck with, or perhaps to repair your laptop if artifacts start appearing – it’s a worthwhile reminder of the methods that laptop repair shops use on the daily.

Itching to learn more about BGAs? You absolutely should read this article series by our own [Robin Kearey]. We’ve mostly seen reballing used for upgrading RAM on laptop and Raspberry Pi boards, but seeing it used for a whole laptop GPU is nice – it’s the same technique, just scaled up, and you can always start by practicing at a smaller scale. Now, it might feel like we’ve left the era of upgradable laptop GPUs behind, and today’s project might not necessarily ease your worries – but the Framework 16 definitely bucks the trend.

Continue reading “Laptop GPU Upgrade With Just A Little Reballing”

Hacking An NVIDIA CMP 170HX Crypto GPU For EM Sim Work

A few years back, NVIDIA created a dedicated cryptocurrency mining GPU, the CMP 170HX. This was a heavily restricted version of its flagship A100 datacenter accelerator, using the same GA100 chip. It was intended for accelerating Ethash, the Ethereum proof-of-work algorithm, and nothing else. [niconiconi] bought one to use for accelerating PCB electromagnetic simulations and put a lot of effort into repairing the card, converting it to water-cooling, and figuring out how best to use this nobbled GPU.

Typically, the GA100 silicon sits at the heart of the mighty A100 GPU card and would be found in a server rack, cooled by forced air. That was not an option at home, so an off-the-shelf water-cooling block was wedged in. During this process, [niconiconi] found that the board wouldn’t power on, so they went on a deep dive into the power supply tree with the help of a leaked A100 schematic. The repair and modifications are documented in the appendix at the very end of the article – it’s a long read to get there.

Continue reading “Hacking An NVIDIA CMP 170HX Crypto GPU For EM Sim Work”

A standard-compliant MXM card installed in a laptop, without a heatsink

MXM: Powerful, Misused, Hackable

Today, we’ll look into yet another standard in the embedded space: MXM. It stands for “Mobile PCI Express Module”, and it’s basically intended as a PCIe GPU interface for laptops, but there’s way more to it – it can work for any high-power, high-throughput PCIe device, with a fair few DisplayPort links if you need them!

You will see MXM sockets in older generations of laptops, barebones desktop PCs, servers, and even automotive computers – certain generations of Tesla cars used to ship with MXM-socketed Nvidia GPUs! Given that GPUs are in vogue today, it pays to know how you can get one in a low-profile form factor and avoid putting a giant desktop GPU inside your device.

I only had a passing knowledge of the MXM standard until a little while ago, but my friend [WifiCable] has been playing with it for a fair bit now. On a long Discord call, she guided me through all the cool things we should know about the MXM standard: its history, compatibility woes, and hackability potential. I’ve summed it all up into this article – let’s take a look!

This article has been written based on info that [WifiCable] has given me, and it’s certainly not the last one where I interview a hacker and condense their knowledge into a writeup. If you are interested, let’s chat!

Continue reading “MXM: Powerful, Misused, Hackable”

NVIDIA Trains Custom AI To Assist Chip Designers

AI is big news lately, but as with all new technology moves, it’s important to pierce through the hype. Recent news about NVIDIA creating a custom large language model (LLM) called ChipNeMo to assist in chip design is tailor-made for breathless hyperbole, so it’s refreshing to read exactly how such a thing is genuinely useful.

ChipNeMo is trained on the highly specific domain of semiconductor design via internal code repositories, documentation, and more. The result is a vast 43-billion-parameter LLM running on a single A100 GPU that plays no direct role in designing chips, but focuses instead on making designers’ jobs easier.

For example, it turns out that senior designers spend a lot of time answering questions from junior designers. If a junior designer can ask ChipNeMo a question like “what does signal x from memory unit y do?” and that saves a senior designer’s time, then NVIDIA says the tool is already worth it. Another big time sink for designers is dealing with bugs: bugs are extensively documented in a variety of ways, and designers spend a lot of time reading documentation just to grasp the basics of a particular bug. Acting as a smart interface to such narrowly-focused repositories is something a tool like ChipNeMo excels at, because it can provide not just summaries but also concrete references and sources. Saving developer time in this way is a clear and easy win.
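NVIDIA hasn’t released ChipNeMo itself, but the “smart interface to a bug database” idea boils down to the familiar retrieve-then-generate pattern. Here’s a toy sketch of that pattern – the bug reports are made up, the keyword retrieval is deliberately crude, and ask_llm() is a hypothetical stand-in for whatever model endpoint you actually have.

```python
# Toy retrieval-plus-LLM sketch of the "smart interface to a bug database"
# idea -- not ChipNeMo, just the general pattern.
# ask_llm() is a hypothetical stand-in for your own model endpoint.

BUG_REPORTS = {
    "BUG-1042": "Signal mem_rd_valid deasserts one cycle early when ...",
    "BUG-1177": "ECC scrubber stalls the write queue under back-pressure ...",
}

def retrieve(question, reports, top_n=2):
    # Crude keyword-overlap scoring; a real tool would use embeddings.
    q_words = set(question.lower().split())
    scored = sorted(reports.items(),
                    key=lambda kv: -len(q_words & set(kv[1].lower().split())))
    return scored[:top_n]

def ask_llm(prompt):
    raise NotImplementedError("plug in your own model here")

def answer(question):
    # Assemble the retrieved reports into a prompt that demands citations.
    context = "\n".join(f"[{bug_id}] {text}"
                        for bug_id, text in retrieve(question, BUG_REPORTS))
    prompt = (f"Answer using only the bug reports below and cite their IDs.\n"
              f"{context}\n\nQuestion: {question}")
    return ask_llm(prompt)
```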

It’s an internal tool and partly a research project, but it’s easy to see the benefits ChipNeMo can bring. Using LLMs trained on internal information for internal use is something organizations have experimented with (for example, Mozilla did so, while explaining how to do it for yourself), but it’s interesting to see a clear roadmap to assisting developers in concrete ways.

Here’s Why GPUs Are Deep Learning’s Best Friend

If you have a curiosity about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers]’ explanation of why this is so. It’s not a terribly long read, and while it does get technical, there are also car analogies, so there’s something for everyone!

He starts off by saying that most people know GPUs are scarily efficient at matrix multiplication and convolution, but what really makes them so useful is their ability to work with large amounts of memory very efficiently.

Essentially, a CPU is a latency-optimized device while GPUs are bandwidth-optimized devices. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once. Continue reading “Here’s Why GPUs Are Deep Learning’s Best Friend”
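To put rough numbers on the race-car-versus-truck analogy, here’s a small sketch (assuming a CUDA build of PyTorch) that runs the same bandwidth-bound, element-wise work on the CPU and on the GPU. The exact speedup depends entirely on your hardware, but the truck usually wins by a wide margin once the cargo is on board.

```python
# Bandwidth-bound toy benchmark: the same element-wise work on CPU and GPU.
# Numbers vary wildly between machines; this is only an illustration.
import time
import torch

x_cpu = torch.randn(100_000_000)  # ~400 MB of float32 "cargo"

t0 = time.perf_counter()
_ = x_cpu * 2.0 + 1.0             # limited by how fast RAM can feed the CPU
cpu_s = time.perf_counter() - t0
print(f"CPU: {cpu_s:.3f} s")

if torch.cuda.is_available():
    x_gpu = x_cpu.cuda()          # one-time haul over PCIe
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    _ = x_gpu * 2.0 + 1.0         # limited by much faster GPU memory bandwidth
    torch.cuda.synchronize()      # kernels launch asynchronously, so wait
    gpu_s = time.perf_counter() - t0
    print(f"GPU: {gpu_s:.3f} s ({cpu_s / gpu_s:.1f}x faster)")
```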