The Right Benchmark For GPT

July 29, 2023 by Elliot Williams 41 Comments

Dan Maloney wanted to design a part for 3D printing. OpenSCAD is a coding language for generating 3D objects. ChatGPT can write code. What could possibly go wrong? You should go read his article because it’s enlightening and hilarious, but the punchline is that ChatGPT’s code ran afoul of syntax errors, yet it gave him enough of a foothold that he could teach himself enough OpenSCAD to get the project done anyway. As with many people who have asked the AI to create some code, Dan finds that it’s not as good as asking someone who knows what they’re doing, but that it’s also better than nothing.

And this is where I start grumbling. When you type your desires into the word-follower machine, your alternative isn’t nothing. Your alternative is to fire up a search engine instead and type “openscad tutorial”. That, for nearly any human endeavor, will get you a few good guides, written by humans who are probably expert in the subject in question, and which are aimed at teaching you the thing that you want to learn. It doesn’t get better than that. You’ll be up and running with your design in no time.

Indeed, if you think about the relevant source material that the LLM was trained on, it’s exactly these tutorials. It can’t possibly do better than the best of them, although the resulting average tutorial might be better than the worst you’ll find. (Some have speculated on what happens when the entire Internet is filled with these generated texts – what will future AIs learn from?)

In Dan’s case, though, he didn’t necessarily want to learn OpenSCAD – he just wanted the latch designed. But in the end, he had to learn enough OpenSCAD to get the AI code compiling without error. He spent an hour learning OpenSCAD and now he’s good to go on his next project too.

So the next time you hear someone say that they got an answer back from a large language model that wasn’t perfect, but it was “better than nothing”, think critically about whether “nothing” is really the right benchmark.

Do you really want to learn nothing? Do you really have no resources to get started with? I would claim that we have the most amazing set of tutorial resources the world has ever known at our fingertips. Compared to the ability to teach millions of humans to achieve their own goals, that makes the LLM party tricks look kinda weak, in my opinion.

ChatGPT, The Worst Summer Intern Ever

July 26, 2023 by Dan Maloney 71 Comments

Back when I used to work in the pharma industry, I had the opportunity to hire summer interns. This was a long time ago, long enough that the fresh-faced college students who applied for the gig are probably now creeping up to retirement age. The idea, as I understood it, was to get someone to help me with my project, which at the time was standing up a distributed data capture system with a large number of nodes all running custom software that I wrote, reporting back to a central server running more of my code. It was more work than I could manage on my own, so management thought they’d take mercy on me and get me some help.

The experience didn’t turn out quite like I expected. The interns were both great kids, very smart, and I learned a lot from them. But two months is a very tight timeframe, and getting them up to speed took up most of that time. Add in the fact that they were expected to do a presentation on their specific project at the end of the summer, and the whole thing ended up being a lot more work for me than if I had just done the whole project myself.

I thought about my brief experience with interns recently when I needed a little help with a project. It’s nothing that would justify hiring anyone, but still, having someone to outsource specific jobs to would be a blessing, especially now that it’s summer and there’s so much else to do. But this is the future, and the expertise and the combined wisdom of the Internet are but a few keystrokes away, right? Well, maybe, but as you’ll see, even the power of large language models has its limits, and trying to loop ChatGPT in as a low-effort summer intern leaves a lot to be desired.

Continue reading “ChatGPT, The Worst Summer Intern Ever” →

Text-to-Speech Model Can Do Music, Background Noises, And Sound Effects

July 24, 2023 by Donald Papp 8 Comments

Bark is a universal text-to-audio model that can not only create realistic speech, it can incorporate music, background noises, and sound effects. It can even include non-speech sounds like laughter, sighs, throat clearings, and similar elements. But despite the fact that it can deliver such complex results, it’s important to understand some of the peculiarities.

The model takes a prompt and generates the resulting sound from scratch. Results might sometimes be unexpected.

Bark is not a conventional text-to-speech program, and how it works has a lot more in common with large language model AI chatbots. This means that results can deviate from expectations, and outputs aren’t necessarily going to be studio-quality speech. As the project’s README points out, “(generated outputs can) be anything from perfect speech to multiple people arguing at a baseball game recorded with bad microphones.” That being said, there is some support for voice presets as a way to help guide the model with some consistency.

Bark was designed by a company called Suno for research purposes and is available under the MIT License. It can be installed and run locally, and has some demos available as well as an online implementation.

The ability to install and run Bark locally is promising territory for incorporating it into projects. And should you be more interested in speech-to-text instead, don’t forget about this plain C/C++ implementation of AI-powered speech recognition.

Bridging A Gap Between LLMs And Programming With TypeChat

July 22, 2023 by Bryan Cockfield 33 Comments

By now, large language models (LLMs) like OpenAI’s ChatGPT are old news. While not perfect, they can assist with all kinds of tasks like creating efficient Excel spreadsheets, writing cover letters, asking for music references, and putting together functional computer programs in a variety of languages. One thing these LLMs don’t do yet though is integrate well with existing app interfaces. However, that’s where the TypeChat library comes in, bridging the gap between LLMs and programming.

TypeChat is an experimental MIT-licensed library from Microsoft which sits in between a user and an LLM and formats the AI’s responses so that they are type-safe and can easily be plugged back into the original interface. It does this by generating JSON responses based on user input, making it easier to take the user input directly, run it through the LLM, and then use the output directly in another piece of code. It can be used for things like prototyping prompts, validating responses, and handling errors. It’s also not limited to a single LLM and can be fairly easily modified to work with many of the existing models.
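To make the idea concrete, here is a minimal sketch in Python of the general pattern described above: ask for JSON, check it against the shape the application expects, and only then hand it to the rest of the program. This is not TypeChat’s API (TypeChat itself is a TypeScript library). The schema, the function names, the stand-in model, and the retry-with-error-message step are all assumptions made up for illustration.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical schema: the fields and types the application expects back.
SCHEMA = {"item": str, "quantity": int}


def validate(raw: str) -> Tuple[Optional[dict], str]:
    """Return (parsed, "") if raw matches SCHEMA, else (None, reason)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    for key, expected in SCHEMA.items():
        if key not in data:
            return None, f"missing field '{key}'"
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return None, f"field '{key}' should be {expected.__name__}"
    return data, ""


def translate(request: str, model: Callable[[str], str],
              max_attempts: int = 2) -> dict:
    """Ask the model for JSON; on failure, feed the error back and retry."""
    prompt = ("Reply only with JSON of the form "
              '{"item": string, "quantity": integer}.\n'
              f"Request: {request}")
    error = ""
    for _ in range(max_attempts):
        data, error = validate(model(prompt))
        if data is not None:
            return data
        prompt += f"\nYour last reply was rejected ({error}). Try again."
    raise ValueError(f"no valid reply after {max_attempts} attempts: {error}")


def make_fake_model(replies):
    """Stand-in for a real LLM call: returns canned replies in order."""
    pending = iter(replies)
    return lambda prompt: next(pending)


if __name__ == "__main__":
    # The first canned reply is missing a field, the second is well-formed.
    model = make_fake_model(['{"item": "latte"}',
                             '{"item": "latte", "quantity": 2}'])
    print(translate("two lattes please", model))
```

The point of the design is that downstream code never sees free-form text, only a dictionary that has already passed the check, which is roughly what “type-safe” buys you here.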

The software is still in its infancy, but it aims to make it somewhat easier to connect user inputs within existing pieces of software to LLMs, which have quickly become all the rage in the computer science world. We expect to see plenty more tools like this become available as more people take up using LLMs, which have plenty of applications beyond just writing code.

ChatGPT V. The Legal System: Why Trusting ChatGPT Gets You Sanctioned

May 29, 2023 by Maya Posch 67 Comments

Recently, an amusing anecdote pertaining to a lawyer’s use of ChatGPT made the news headlines. This all started when a Mr. Mata sued an airline over an incident years prior in which, he claims, a metal serving cart struck his knee. When the airline filed a motion to dismiss the case on the basis of the statute of limitations, the plaintiff’s lawyer filed a submission in which he argued that the statute of limitations did not apply here due to circumstances established in prior cases, which he cited in the submission.

Unfortunately for the plaintiff’s lawyer, the defendant’s counsel pointed out that none of these cases could be found, leading the judge to request that the plaintiff’s counsel submit copies of these purported cases. Although the plaintiff’s counsel complied with this request, the judge’s response (full court order PDF) was curt and rather irate, pointing out that none of the cited cases were real, and that the purported case texts were bogus.

The defense that the plaintiff’s counsel appears to lean on is that ChatGPT ‘assisted’ in researching these submissions, and had assured the lawyer – Mr. Schwartz – that all of these cases were real. The lawyers trusted ChatGPT enough to allow it to write an affidavit that they submitted to the court. With Mr. Schwartz likely to be sanctioned for this performance, it should also be noted that this is hardly the first time that ChatGPT and kin have been involved in such mishaps.

Continue reading “ChatGPT V. The Legal System: Why Trusting ChatGPT Gets You Sanctioned” →

Leaked Internal Google Document Claims Open Source AI Will Outcompete Google And OpenAI

May 5, 2023 by Maya Posch 37 Comments

In the world of large language models (LLMs), the focus has for the longest time been on proprietary technologies from companies such as OpenAI (GPT-3 & 4, ChatGPT, etc.) and, increasingly, everyone from Google to Meta and Microsoft. What’s remained underexposed in this whole discussion about which LLM will do more things better are the efforts by hobbyists, unaffiliated researchers and everyone else you may find in Open Source LLM projects. According to a leaked document from a researcher at Google (anonymous, but apparently verified), Google is very worried that Open Source LLMs will wipe the floor with both Google’s and OpenAI’s efforts.

According to the document, after the open source community got their hands on the leaked LLaMA foundation model, motivated and highly knowledgeable individuals set to work to take a fairly basic model to new levels where it could begin to compete with the offerings by OpenAI and Google. The major innovations address scaling, allowing these LLMs to run on far less powerful systems (like a laptop or even a smartphone).

An important factor here is Low-Rank Adaptation (LoRA), which massively cuts down the effort and resources required to fine-tune a model. Ultimately, as this document phrases it, Google and by extension OpenAI do not have a ‘secret sauce’ that makes their approaches better than anything the wider community can come up with. The document also notes that Meta has essentially won out here by having their LLM leak, as it has meant that the OSS community has been improving on the Meta foundations, allowing Meta to benefit from those improvements in their products.
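For a feel of where the savings come from, here is a rough sketch of the low-rank idea in plain Python. Instead of updating every entry of a large weight matrix W, the technique freezes W and trains two skinny matrices A and B whose product is added to it. The layer size (4096 by 4096) and rank (8) below are illustrative assumptions, not figures from the leaked document, and the helper names are made up for this example.

```python
def full_params(d: int, k: int) -> int:
    """Trainable values when updating a whole d-by-k weight matrix."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable values for the pair B (d-by-r) and A (r-by-k)."""
    return r * (d + k)


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x):
    """Compute W x + B (A x); W stays frozen, only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + u for b, u in zip(base, update)]


if __name__ == "__main__":
    d = k = 4096  # assumed layer size, for illustration only
    r = 8         # assumed rank, for illustration only
    print(full_params(d, k))                          # 16777216
    print(lora_params(d, k, r))                       # 65536
    print(full_params(d, k) // lora_params(d, k, r))  # 256

    # Tiny worked example with rank 1.
    W = [[1, 2], [3, 4]]
    A = [[1, 1]]
    x = [1, 1]
    print(lora_forward(W, A, [[0], [0]], x))  # [3, 7], same as W x alone
    print(lora_forward(W, A, [[1], [2]], x))  # [5, 11]
```

With those assumed numbers, the trainable part of that one layer shrinks by a factor of 256, which is the sort of reduction that makes tinkering on modest hardware plausible.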

The dire prediction is thus that in the end the proprietary LLMs by Google, OpenAI and others will cease to be relevant, as the open source community will have steamrolled them into fine, digital dust. Whether this will indeed work out this way remains to be seen, but things are not looking up for proprietary LLMs.

(Thanks to [Mike Szczys] for the tip)

ChatGPT Makes A 3D Model: The Secret Ingredient? Much Patience

May 4, 2023 by Donald Papp 44 Comments

ChatGPT is an AI large language model (LLM) which specializes in conversation. While using it, [Gil Meiri] discovered that one way to create models in FreeCAD is with Python scripting, and ChatGPT could be encouraged to create a 3D model of a plane in FreeCAD by expressing the model as a script. The result is just a basic plane shape, and it certainly took a lot of guidance on [Gil]’s part to make it happen, but it’s not bad for a tool that can’t see what it is doing.

The first step was getting ChatGPT to create code for a 10 mm cube, and plug that into FreeCAD to see the results. After that basic workflow was shown to work, [Gil] asked it to create a simple airplane shape. The resulting code had objects for wing, fuselage, and tail, but that’s about all that could be said because the result was almost, but not quite, completely unlike a plane. Not an encouraging start, but at least the basic building blocks were there.

Continue reading “ChatGPT Makes A 3D Model: The Secret Ingredient? Much Patience” →