Hacking An NVIDIA CMP 170HX Crypto GPU For EM Sim Work

A few years back NVIDIA created a dedicated cryptocurrency mining GPU, the CMP 170HX. This was a heavily restricted version of its flagship A100 datacenter accelerator, using the same GA100 chip. It was intended for accelerating Ethash, the Ethereum proof-of-work algorithm, and nothing else. [niconiconi] bought one to use for accelerating PCB electromagnetic simulations and put a lot of effort into repairing the card, converting it to water-cooling, and figuring out how best to use this nobbled GPU.

Typically, the GA100 silicon sits at the heart of the mighty A100 GPU card and would be found in a server rack, cooled by forced air. This was not an option at home, so an off-the-shelf water-cooling block was wedged in. During this process, [niconiconi] found that the board wouldn’t power on, so they went on a deep dive into the power supply tree with the help of a leaked A100 schematic. The repair and modifications are documented in an appendix at the very end of the article, and it’s a long read to get there.

Continue reading “Hacking An NVIDIA CMP 170HX Crypto GPU For EM Sim Work”

A standard-compliant MXM card installed in a laptop, without a heatsink

MXM: Powerful, Misused, Hackable

Today, we’ll look into yet another standard in the embedded space: MXM. It stands for “Mobile PCI Express Module”, and it’s basically intended as a PCIe GPU interface for laptops, but there’s way more to it – it can work for any high-power, high-throughput PCIe device, with a fair few DisplayPort links if you need them!

You will see MXM sockets in older generations of laptops, barebones desktop PCs, servers, and even automotive computers – certain generations of Tesla cars used to ship with MXM-socketed Nvidia GPUs! Given that GPUs are in vogue today, it pays to know how you can get one in a low-profile form factor and avoid putting a giant desktop GPU inside your device.

I only had a passing knowledge of the MXM standard until a little while ago, but my friend [WifiCable] has been playing with it for a fair bit now. On a long Discord call, she guided me through all the cool things we should know about the MXM standard, its history, compatibility woes, and hackability potential. I’ve summed all of it up in this article – let’s take a look!

This article is based on info that [WifiCable] has given me, and it certainly won’t be the last one where I interview a hacker and condense their knowledge into a writeup. If you are interested, let’s chat!

Continue reading “MXM: Powerful, Misused, Hackable”

NVIDIA Trains Custom AI To Assist Chip Designers

AI is big news lately, but as with all new technology moves, it’s important to pierce through the hype. Recent news about NVIDIA creating a custom large language model (LLM) called ChipNeMo to assist in chip design is tailor-made for breathless hyperbole, so it’s refreshing to read exactly how such a thing is genuinely useful.

ChipNeMo is trained on the highly specific domain of semiconductor design via internal code repositories, documentation, and more. The result is a vast 43-billion parameter LLM running on a single A100 GPU that actually plays no direct role in designing chips, but focuses instead on making designers’ jobs easier.

For example, it turns out that senior designers spend a lot of time answering questions from junior designers. If a junior designer can ask ChipNeMo a question like “what does signal x from memory unit y do?” and that saves a senior designer’s time, then NVIDIA says the tool is already worth it. In addition, it turns out another big time sink for designers is dealing with bugs. Bugs are extensively documented in a variety of ways, and designers spend a lot of time reading documentation just to grasp the basics of a particular bug. Acting as a smart interface to such narrowly-focused repositories is something a tool like ChipNeMo excels at, because it can provide not just summaries but also concrete references and sources. Saving developer time in this way is a clear and easy win.
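
NVIDIA hasn’t detailed ChipNeMo’s plumbing in this summary, but the “smart interface that answers with sources” idea maps loosely onto retrieval: find the most relevant internal notes, then hand them to the model along with their references. Below is a toy, hypothetical Python sketch of just that retrieval step; the bug database, IDs, and keyword scoring are invented for illustration and are not ChipNeMo’s actual pipeline.

```python
# Hypothetical sketch: retrieve relevant internal bug notes for a question,
# keeping their IDs so the eventual LLM answer can cite concrete sources.
# The database and scoring here are toy stand-ins, not ChipNeMo's pipeline.
from collections import Counter

BUG_DB = {
    "BUG-1042": "Signal mem_rd_valid from memory unit MU3 glitches when the clock gate closes early.",
    "BUG-2187": "Timing violation on signal x in memory unit y under the fast process corner.",
    "BUG-3310": "Scan chain reorder broke the reset sequencing of the fabric arbiter.",
}

def score(question: str, text: str) -> int:
    """Crude keyword-overlap score between the question and a bug note."""
    q, t = Counter(question.lower().split()), Counter(text.lower().split())
    return sum((q & t).values())

def retrieve(question: str, k: int = 2):
    """Return the top-k bug notes, ranked by overlap with the question."""
    ranked = sorted(BUG_DB.items(), key=lambda kv: score(question, kv[1]), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    question = "what does signal x from memory unit y do?"
    for bug_id, note in retrieve(question):
        print(f"[{bug_id}] {note}")
    # In a real system these snippets and their IDs would go into the LLM
    # prompt, so the generated answer can quote references and sources.
```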

It’s part internal tool and part research project, but it’s easy to see the benefits ChipNeMo can bring. Using LLMs trained on internal information for internal use is something organizations have experimented with (for example, Mozilla did so, while explaining how to do it for yourself), but it’s interesting to see a clear roadmap to assisting developers in concrete ways.

Here’s Why GPUs Are Deep Learning’s Best Friend

If you have a curiosity about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers]’s explanation of why this is so. It’s not a terribly long read, but while it does get technical there are also car analogies, so there’s something for everyone!

He starts off by saying that most people know GPUs are scarily efficient at matrix multiplication and convolution, but what really sets them apart is their ability to work with large amounts of memory very efficiently.
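
To get a feel for that gap yourself, a rough experiment along these lines works. This is just a sketch, assuming PyTorch and a CUDA-capable GPU are available; the exact numbers will vary wildly by hardware, but the ratio on large matrices makes the point.

```python
# Rough CPU vs. GPU matrix multiplication timing (illustrative only).
import time
import torch

N = 4096
a = torch.randn(N, N)
b = torch.randn(N, N)

t0 = time.perf_counter()
_ = a @ b                              # CPU matmul
cpu_s = time.perf_counter() - t0

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()  # move the "cargo" into GPU memory
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    _ = a_gpu @ b_gpu                  # GPU matmul
    torch.cuda.synchronize()           # wait for the asynchronous kernel to finish
    gpu_s = time.perf_counter() - t0
    print(f"CPU: {cpu_s:.3f} s, GPU: {gpu_s:.3f} s")
else:
    print(f"CPU: {cpu_s:.3f} s (no CUDA device available)")
```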

Essentially, a CPU is a latency-optimized device while GPUs are bandwidth-optimized devices. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once. Continue reading “Here’s Why GPUs Are Deep Learning’s Best Friend”

A Dedicated GPU For Your Favorite SBC

The Raspberry Pi is famous for its low cost, versatile and open Linux environment, and plentiful I/O, making it a perfect device not only for its originally-intended educational purposes but for basically every hobbyist from gardeners to roboticists to amateur radio operators. Most builds tend to make use of the GPIO pins which allow easy connections to various peripherals and sensors, but the Pi also supports PCIe devices, which means that, in theory, it could use a GPU in much the same way that a modern computer would. After plenty of testing and development, [Jeff Geerling] brings us this custom graphics card interface for the Raspberry Pi.

The testing for all of these graphics cards has been done with a Pi Compute Module 4, and the end result is an interface device which looks much like a graphics card itself. It breaks the PCIe bus out onto a more familiar x16 slot connector and adds physical connections for power, USB, and Ethernet. When plugged into the carrier board, the Compute Module can be attached to any of a number of graphics cards, including the latest and highest-end Nvidia and AMD offerings.

Perhaps unsurprisingly, though, the 4090 and 7900 cards don’t work with the Raspberry Pi. This is partially due to the 32-bit limitations of the Pi and other memory mapping issues, but even after attempting some workarounds, Nvidia’s cards aren’t open-source enough to test properly (although the card is recognized by the Pi), and AMD’s drivers crash the system even after compiling a custom kernel. [Jeff] did find an Nvidia card that worked, although it requires using the USB interface and second-hand cards are selling for around $3000 USD. For a more economical choice there are some other graphics cards that he was eventually able to get working, albeit not with perfect performance, including some of the ones we’ve seen him test already.
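
If you want to see how much memory-mapped BAR space a card asks for, which is roughly where a modern GPU collides with the Pi’s limited 32-bit address window, a small sketch like this reads it straight out of Linux sysfs. The PCI address below is a placeholder; substitute whatever lspci reports on your own setup.

```python
# Sum up the memory windows (BARs) a PCIe device requests, via Linux sysfs.
# Run on the Pi (or any Linux box); DEVICE is a hypothetical PCI address.
from pathlib import Path

DEVICE = "0000:01:00.0"  # placeholder; check yours with lspci

def resource_windows(device: str):
    """Return (index, size_in_bytes) for each non-empty resource of the device."""
    lines = Path(f"/sys/bus/pci/devices/{device}/resource").read_text().splitlines()
    windows = []
    for index, line in enumerate(lines):
        start, end, _flags = (int(field, 16) for field in line.split())
        if end > start:
            windows.append((index, end - start + 1))
    return windows

if __name__ == "__main__":
    total = 0
    for index, size in resource_windows(DEVICE):
        print(f"resource {index}: {size / 2**20:.1f} MiB")
        total += size
    print(f"total requested: {total / 2**20:.1f} MiB")
```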

Continue reading “A Dedicated GPU For Your Favorite SBC”

The Tale Of The Final EVGA GPU Overclocking Record

It’s not news that EVGA is getting out of the GPU card game, after a ‘little falling out’ with Nvidia. It’s sad news nonetheless, as this enthusiastic band of hardware hackers has a solid following in certain overclocking and custom PC circles. The Gamers Nexus gang decided to fly over to meet up with the EVGA team in Zhonghe, Taiwan, and follow them around a bit as they tried for one last overclocking record on the latest (unreleased, RTX 4090-based) GPU card. As you will note early on in the video, things didn’t go smoothly, with their hand-lapped GPU burning out the PCB after a small setup error. Continue reading “The Tale Of The Final EVGA GPU Overclocking Record”

The connector torn down, seen from the side with the wire solder points, showing how thin the metal pads are, and that one wire has already broken off

NVIDIA Power Cables Are Melting, This May Be Why

NVIDIA has recently released their lineup of 40-series graphics cards, with a novel generation of power connectors called 12VHPWR. See, the previous-generation 8-pin connectors were no longer enough to satiate the GPU’s hunger. Once cards started getting into the hands of users, surprisingly, we began seeing pictures of melted 12VHPWR plugs and sockets online — specifically, involving ATX 8-pin GPU power to 12VHPWR adapters that NVIDIA provided with their cards.

Now, [Igor Wallossek] of igor’sLAB proposes a theory about what’s going on, with convincing teardown pictures to back it up. After an unscheduled release of plastic-scented magic smoke, one of the NVIDIA-provided connectors was destructively disassembled. It turned out that these connectors weren’t crimped like we’re used to; instead, they had flat metal pads meant for the wires to be soldered onto. For power-carrying connectors, there are good reasons this isn’t the norm. You can make it work, but the odds were not in favor of this particular design.

The metal pads in question seem to be far too thin and structurally unsound; as one can readily spot, their cross-section is dwarfed by the cross-section of the cables soldered to them. This creates a segment of increased resistance and localized heating, exacerbated by any flexing of the thick and unwieldy cabling. The metal is so thin that the stress points seem quite flimsy, and one of the pads straight up broke off during disassembly of the connector.
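
Some rough, assumed numbers show why a few extra milliohms at a joint matter; none of these figures come from the teardown itself, they are just a plausible load and joint resistance plugged into P = I²R.

```python
# Back-of-the-envelope heating in one marginal 12VHPWR solder joint.
# All values are assumptions for illustration, not measurements.
POWER_W = 450.0               # assumed card power draw
VOLTAGE_V = 12.0
EXTRA_RESISTANCE_OHM = 0.005  # assumed extra resistance of one poor joint

def joint_heat_w(power_w: float, sharing_pins: int) -> float:
    """Heat in one joint, P = I^2 * R, with current split across the pins."""
    current_per_pin = (power_w / VOLTAGE_V) / sharing_pins
    return current_per_pin**2 * EXTRA_RESISTANCE_OHM

# 12VHPWR has six 12 V current-carrying pins; assume some joints crack.
print(f"All 6 pins sharing:  {joint_heat_w(POWER_W, 6):.2f} W per joint")
print(f"Only 3 pins sharing: {joint_heat_w(POWER_W, 3):.2f} W per joint")
# Halving the number of good joints doubles the per-pin current and, because
# heating goes with the square of the current, quadruples the heat in each
# remaining joint - all inside a small plastic housing.
```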

If this theory is true, the situation is a blunder to blame on NVIDIA. On the upside, the 12VHPWR standard itself seems to be viable, as there are examples of PSUs with native 12VHPWR connections that don’t exhibit this problem. It seems gamers with top-of-the-line GPUs can now empathize with the problems that we hackers have been seeing in very cheap 3D printers.