Train A GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older, but it is perfectly relevant as the granddaddy of modern LLMs (large language models), with a clear line of descent to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on the PyTorch deep learning library and its Python ecosystem. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is an elegant process that accomplishes the same thing the bigger, clunkier methods do, and it can run entirely on a CPU or take advantage of GPU acceleration where available.
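To give a sense of what that looks like in practice, here is a minimal sketch in the same spirit (illustrative only, not lifted from llm.c itself): the tanh approximation of the GELU activation that GPT-2 uses, written as a plain loop over float arrays.

```c
/* Illustrative sketch in the spirit of llm.c: a forward pass written against
 * plain float arrays, no framework. Names are ours, not the project's. */
#include <math.h>
#include <stdio.h>

/* GPT-2 uses the tanh approximation of the GELU activation. */
void gelu_forward(float *out, const float *in, int n) {
    const float k = 0.7978845608f; /* sqrt(2/pi) */
    for (int i = 0; i < n; i++) {
        float x = in[i];
        float cube = 0.044715f * x * x * x;
        out[i] = 0.5f * x * (1.0f + tanhf(k * (x + cube)));
    }
}

int main(void) {
    float in[4] = {-2.0f, -0.5f, 0.5f, 2.0f};
    float out[4];
    gelu_forward(out, in, 4);
    for (int i = 0; i < 4; i++)
        printf("gelu(%+.2f) = %+.4f\n", in[i], out[i]);
    return 0;
}
```

Multiply that pattern across layernorm, matmul, attention, and the matching backward passes, and it is easy to see where roughly a thousand lines of C go.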

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.
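As a much simpler stand-in (this is not the GPT from that project, just a way to make “predicts the next bit” concrete), a count-based bigram predictor in the same plain-C style might look like this:

```c
/* Toy stand-in for the next-bit prediction task: tally how often each bit
 * follows each bit, then predict the likelier successor. A real GPT does the
 * same job with attention over a longer context instead of a count table. */
#include <stdio.h>

int main(void) {
    const char *seq = "0110110110110110"; /* toy training sequence */
    int counts[2][2] = {{0}};             /* counts[prev][next] */

    /* "Training": count bit-to-bit transitions in the sequence. */
    for (int i = 0; seq[i + 1] != '\0'; i++)
        counts[seq[i] - '0'][seq[i + 1] - '0']++;

    /* "Inference": for each possible previous bit, predict the likelier next bit. */
    for (int prev = 0; prev < 2; prev++) {
        int total = counts[prev][0] + counts[prev][1];
        float p1 = total ? (float)counts[prev][1] / total : 0.5f;
        printf("after %d: P(next=1) = %.2f -> predict %d\n",
               prev, p1, p1 >= 0.5f ? 1 : 0);
    }
    return 0;
}
```

The transformer replaces the count table with learned attention over a longer context, but the split between training on a sequence and predicting its next element is the same idea.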

AI Makes Linux Do What You Mean, Not What You Say

We are always envious of the Star Trek Enterprise computers. You can just sort of ask them a hazy question and they will, usually, figure out what you want. Even the automatic doors seemed to know the difference between someone walking into a turbolift and someone being thrown into the door during a fight. [River] decided to try his new API keys for the private beta of an AI service to generate Linux commands based on a description. How does it work? Watch the video below and find out.

Some examples work fairly well. In response to “email the Rickroll video to Jeff Bezos,” the system produced a curl command and an e-mail to what we assume is the right place. “Find all files in the current directory bigger than 1 GB” works, too.

Continue reading “AI Makes Linux Do What You Mean, Not What You Say”

I’m Sorry Dave, You Shouldn’t Write Verilog

We were always envious of Star Trek for its computers. No programming needed. Just tell the computer what you want and it does it. Of course, HAL-9000 had the same interface and that didn’t work out so well. Some researchers at NYU have taken a natural language machine learning system, GPT-2, and taught it to generate Verilog code for use in FPGA systems. Ironically, they called it DAVE (Deriving Automatically Verilog from English). Sounds great, but we have to wonder if it is more than a parlor trick. You can try it yourself if you like.

For example, DAVE can take input like “Given inputs a and b, take the nor of these and return the result in c.” Fine. A more complex example from the paper isn’t quite so easy to puzzle out:

Write a 6-bit register ‘ar’ with input
defined as ‘gv’ modulo ‘lj’, enable ‘q’, synchronous
reset ‘r’ defined as ‘yxo’ greater than or equal to ‘m’,
and clock ‘p’. A vault door has three active-low secret
switch pressed sensors ‘et’, ‘lz’, ‘l’. Write combinatorial
logic for a active-high lock ‘s’ which opens when all of
the switches are pressed. Write a 6-bit register ‘w’ with
input ‘se’ and ‘md’, enable ‘mmx’, synchronous reset
‘nc’ defined as ‘tfs’ greater than ‘w’, and clock ‘xx’.

Continue reading “I’m Sorry Dave, You Shouldn’t Write Verilog”