Why LLaMa Is A Big Deal

You might have heard about LLaMa or maybe you haven’t. Either way, what’s the big deal? It’s just some AI thing. In a nutshell, LLaMa is important because it allows you to run large language models (LLM) like GPT-3 on commodity hardware. In many ways, this is a bit like Stable Diffusion, which similarly allowed normal folks to run image generation models on their own hardware with access to the underlying source code. We’ve discussed why Stable Diffusion matters and even talked about how it works.

LLaMa is a transformer language model from Facebook/Meta research, which is a collection of large models from 7 billion to 65 billion parameters trained on publicly available datasets. Their research paper showed that the 13B version outperformed GPT-3 in most benchmarks and LLama-65B is right up there with the best of them. LLaMa was unique as inference could be run on a single GPU due to some optimizations made to the transformer itself and the model being about 10x smaller. While Meta recommended that users have at least 10 GB of VRAM to run inference on the larger models, that’s a huge step from the 80 GB A100 cards that often run these models.

While this was an important step forward for the research community, it became a huge one for the hacker community when [Georgi Gerganov] rolled in. He released llama.cpp on GitHub, which runs the inference of a LLaMa model with 4-bit quantization. His code was focused on running LLaMa-7B on your Macbook, but we’ve seen versions running on smartphones and Raspberry Pis. There’s even a version written in Rust! A rough rule of thumb is anything with more than 4 GB of RAM can run LLaMa. Model weights are available through Meta with some rather strict terms, but they’ve been leaked online and can be found even in a pull request on the GitHub repo itself. Continue reading “Why LLaMa Is A Big Deal”

Understanding AI Chat Bots With Stanford Online

The news is full of speculation about chatbots like GPT-3, and even if you don’t care, you are probably the kind of person that people will ask about it. The problem is, the popular press has no idea what’s going on with these things. They aren’t sentient or alive, despite some claims to the contrary. So where do you go to learn what’s really going on? How about Stanford? Professor [Christopher Potts] knows a lot about how these things work and he shares some of it in a recent video you can watch below.

One of the interesting things is that he shows some questions that one chatbot will answer reasonably and another one will not. As a demo or a gimmick, that’s not a problem. But if you are using it as, say, your search engine, getting the wrong answer won’t amuse you. Sure, you can do a conventional search and find wrong things, but it will be embedded in a lot of context that might help you decide it is wrong and, hopefully, some other things that are not wrong. You have to decide.
Continue reading “Understanding AI Chat Bots With Stanford Online”

Hackaday Links Column Banner

Hackaday Links: January 22, 2023

The media got their collective knickers in a twist this week with the news that Wyoming is banning the sale of electric vehicles in the state. Headlines like that certainly raise eyebrows, which is the intention, of course, but even a quick glance at the proposed legislation might have revealed that the “ban” was nothing more than a non-binding resolution, making this little more than a political stunt. The bill, which would only “encourage” the phase-out of EV sales in the state by 2035, is essentially meaningless, especially since it died in committee before ever coming close to a vote. But it does present a somewhat lengthy list of the authors’ beefs with EVs, which mainly focus on the importance of the fossil fuel industry in Wyoming. It’s all pretty boneheaded, but then again, outright bans on ICE vehicle sales by some arbitrary and unrealistically soon deadline don’t seem too smart either. Couldn’t people just decide what car works best for them?

Speaking of which, a man in neighboring Colorado might have some buyer’s regret when he learned that it would take five days to fully charge his brand-new electric Hummer at home. Granted, he bought the biggest battery pack possible — 250 kWh — and is using a standard 120-volt wall outlet and the stock Hummer charging dongle, which adds one mile (1.6 km) to the vehicle’s range every hour. The owner doesn’t actually seem all that surprised by the results, nor does he seem particularly upset by it; he appears to know enough about the realities of EVs to recognize the need for a Level 2 charger. That entails extra expense, of course, both to procure the charger and to run the 240-volt circuit needed to power it, not to mention paying for the electricity. It’s a problem that will only get worse as more chargers are added to our creaky grid; we’re not sure what the solution is, but we’re pretty sure it’ll be found closer to the engineering end of the spectrum than the political end.

Continue reading “Hackaday Links: January 22, 2023”

Giving An Old Typewriter A Mind Of Its Own With GPT-3

There was an all-too-brief period in history where typewriters went from clunky, purely mechanical beasts to streamlined, portable electromechanical devices. But when the 80s came around and the PC revolution started, the typewriting was on the wall for these machines, and by the 90s everyone had a PC, a printer, and Microsoft Word. And thus the little daisy-wheel typewriters began to populate thrift shops all over the world.

That’s fine with us, because it gave [Arvind Sanjeev] a chance to build “Ghostwriter”, an AI-powered automatic typewriter. The donor machine was a clapped-out Brother electronic typewriter, which needed a bit of TLC to bring it back to working condition. From there, [Arvind] worked out the keyboard matrix and programmed an Arduino to drive the typewriter, both read and write. A Raspberry Pi running the OpenAI Python API for GPT-3 talks to the Arduino over serial, which basically means you can enter a GPT writing prompt with the keyboard and have the machine spit out a dead-tree version of the results.

To mix things up a bit, [Arvind] added a pair of pots to control the creativity and length of the response, plus an OLED screen which seems only to provide some cute animations, which we don’t hate. We also don’t hate the new paint job the typewriter got, but the jury is still out on the “poetry” that it typed up. Eye of the beholder, we suppose.

Whatever you think of GPT’s capabilities, this is still a neat build and a nice reuse of otherwise dead-end electronics. Need a bit more help building natural language AI into your next project? Our own [Donald Papp] will get you up to speed on that.

Continue reading “Giving An Old Typewriter A Mind Of Its Own With GPT-3”

What’s Old Is New Again: GPT-3 Prompt Injection Attack Affects AI

What do SQL injection attacks have in common with the nuances of GPT-3 prompting? More than one might think, it turns out.

Many security exploits hinge on getting user-supplied data incorrectly treated as instruction. With that in mind, read on to see [Simon Willison] explain how GPT-3 — a natural-language AI —  can be made to act incorrectly via what he’s calling prompt injection attacks.

This all started with a fascinating tweet from [Riley Goodside] demonstrating the ability to exploit GPT-3 prompts with malicious instructions that order the model to behave differently than one would expect.

Continue reading “What’s Old Is New Again: GPT-3 Prompt Injection Attack Affects AI”

Blog Title Optimizer Uses AI, But How Well Does It Work?

[Max Woolf] sometimes struggles to create ideal headlines for his blog posts, and decided to apply his experience with machine learning to the problem. He asked: could an AI be trained to optimize his blog titles? It is a fascinating application of natural language processing, and [Max] explains all about what it does and how it works.

The machine learning framework [Max] uses is GPT-3, a language model that works with natural-seeming human language that is capable of being tweaked in different ways. [Max] uses OpenAI’s GPT-3 API (which, by the way, is much easier to experiment with than one might think) and here is the basic workflow for his title optimizer:

  1. The optimizer takes as input a blog post title to optimize.
  2. OpenAI’s pre-trained GPT-3 engine is used to generate six alternate titles.
  3. For each of those alternate titles, a fine-tuned version of GPT-3 is consulted to judge how “good” they are based on custom training data. (“Good” in this context means “similar to titles of successful submissions on Hacker News“, but more on that in a moment.)
  4. Print the results.

Continue reading “Blog Title Optimizer Uses AI, But How Well Does It Work?”

AI Creates Your Spreadsheets, Sometimes

We’ve been interested in looking at how AI can process things other than silly images. That’s why the “Free AI Bot that Generates the Excel Formula for Any Problem” caught our eye. Based on GPT-3, it supposedly transforms your problem description into a formula suitable for Excel or Google Sheets.

Our first prompt didn’t work out very well. But that was sort of our fault. When they say “Excel formula” they mean that quite literally. So trying to describe the actual result you want in terms of columns or rows seems to be beyond it. Not realizing that, we asked:

If the sum of column H is greater than 50, multiply column A by 0.33

And got:

=IF(SUM(H:H)>50,A*0.33,0)

A Better Try

Which is close, but not really how anyone even mildly proficient with Excel would interpret that request. But that’s not fair. It really needs to be a y=f(x) sort of problem, we suppose.

Continue reading “AI Creates Your Spreadsheets, Sometimes”