Train A GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older, but it is perfectly relevant as the granddaddy of modern LLMs (large language models), with a clear line of descent to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on the PyTorch deep learning library and its Python ecosystem. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is an elegant process that accomplishes the same thing the bigger, clunkier methods do, and it can run entirely on a CPU or take advantage of GPU acceleration where available.
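To give a sense of what that looks like in practice, here is a minimal sketch in the same spirit (illustrative only, not lifted from llm.c itself): the tanh approximation of the GELU activation that GPT-2 uses, written as a plain loop over float arrays.

```c
/* Illustrative sketch in the spirit of llm.c: a forward pass written against
 * plain float arrays, no framework. Names are ours, not the project's. */
#include <math.h>
#include <stdio.h>

/* GPT-2 uses the tanh approximation of the GELU activation. */
void gelu_forward(float *out, const float *in, int n) {
    const float k = 0.7978845608f; /* sqrt(2/pi) */
    for (int i = 0; i < n; i++) {
        float x = in[i];
        float cube = 0.044715f * x * x * x;
        out[i] = 0.5f * x * (1.0f + tanhf(k * (x + cube)));
    }
}

int main(void) {
    float in[4] = {-2.0f, -0.5f, 0.5f, 2.0f};
    float out[4];
    gelu_forward(out, in, 4);
    for (int i = 0; i < 4; i++)
        printf("gelu(%+.2f) = %+.4f\n", in[i], out[i]);
    return 0;
}
```

Multiply that pattern across layernorm, matmul, attention, and the matching backward passes, and it is easy to see where roughly a thousand lines of C go.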

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.
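As a much simpler stand-in (this is not the GPT from that project, just a way to make “predicts the next bit” concrete), a count-based bigram predictor in the same plain-C style might look like this:

```c
/* Toy stand-in for the next-bit prediction task: tally how often each bit
 * follows each bit, then predict the likelier successor. A real GPT does the
 * same job with attention over a longer context instead of a count table. */
#include <stdio.h>

int main(void) {
    const char *seq = "0110110110110110"; /* toy training sequence */
    int counts[2][2] = {{0}};             /* counts[prev][next] */

    /* "Training": count bit-to-bit transitions in the sequence. */
    for (int i = 0; seq[i + 1] != '\0'; i++)
        counts[seq[i] - '0'][seq[i + 1] - '0']++;

    /* "Inference": for each possible previous bit, predict the likelier next bit. */
    for (int prev = 0; prev < 2; prev++) {
        int total = counts[prev][0] + counts[prev][1];
        float p1 = total ? (float)counts[prev][1] / total : 0.5f;
        printf("after %d: P(next=1) = %.2f -> predict %d\n",
               prev, p1, p1 >= 0.5f ? 1 : 0);
    }
    return 0;
}
```

The transformer replaces the count table with learned attention over a longer context, but the split between training on a sequence and predicting its next element is the same idea.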

AI Makes Linux Do What You Mean, Not What You Say

We are always envious of the Star Trek Enterprise computers. You can just sort of ask them a hazy question and they will, usually, figure out what you want. Even the automatic doors seemed to know the difference between someone walking into a turbolift and someone being thrown into the door during a fight. [River] decided to try his new API keys for the private beta of an AI service to generate Linux commands based on a description. How does it work? Watch the video below and find out.

Some examples work fairly well. In response to “email the Rickroll video to Jeff Bezos,” the system produced a curl command and an e-mail to what we assume is the right place. “Find all files in the current directory bigger than 1 GB” works, too.

Continue reading “AI Makes Linux Do What You Mean, Not What You Say”

I’m Sorry Dave, You Shouldn’t Write Verilog

We were always envious of Star Trek for its computers. No programming needed. Just tell the computer what you want and it does it. Of course, HAL-9000 had the same interface and that didn’t work out so well. Some researchers at NYU have taken a natural language machine learning system, GPT-2, and taught it to generate Verilog code for use in FPGA systems. Ironically, they called it DAVE (Deriving Automatically Verilog from English). Sounds great, but we have to wonder if it is more than a parlor trick. You can try it yourself if you like.

For example, DAVE can take input like “Given inputs a and b, take the nor of these and return the result in c.” Fine. A more complex example from the paper isn’t quite so easy to puzzle out:

Write a 6-bit register ‘ar’ with input
defined as ‘gv’ modulo ‘lj’, enable ‘q’, synchronous
reset ‘r’ defined as ‘yxo’ greater than or equal to ‘m’,
and clock ‘p’. A vault door has three active-low secret
switch pressed sensors ‘et’, ‘lz’, ‘l’. Write combinatorial
logic for a active-high lock ‘s’ which opens when all of
the switches are pressed. Write a 6-bit register ‘w’ with
input ‘se’ and ‘md’, enable ‘mmx’, synchronous reset
‘nc’ defined as ‘tfs’ greater than ‘w’, and clock ‘xx’.

Continue reading “I’m Sorry Dave, You Shouldn’t Write Verilog”