Robot Races A Little Smarter To Go Faster

[Steven Gong] is attending the University of Waterloo and found himself with a 1/10th scale F1TENTH autonomous RC car. What better use of a fast RC car with some smarts than to race itself around your computer science building?

Onboard is an Nvidia Jetson NX (not the new Nvidia Jetson Orin), a lidar module, and a depth camera. The code runs on top of ROS2, and the results were impressive. [Steven] mapped out the fifth floor of his building at 6 am using SLAM and the onboard sensors. With a map in hand, he created a rough track for the car to follow; the first order of business was teaching the car when to brake and when to hit the gas. With the basics out of the way, [Steven] moved on to the fun part: writing code to generate a faster racing line. Every turn has an optimal speed and approach, but each turn also affects the next one, which makes for a rather exciting optimization problem.
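
To get a feel for why the corners couple together, here's a toy velocity-profile sketch in Python. It is not [Steven]'s code, and the friction, acceleration, and spacing numbers are made up for illustration: each point on the track gets a grip-limited speed from its curvature, then forward and backward passes make sure the car never has to accelerate or brake harder than it physically can.

import math

MU, G = 0.7, 9.81            # assumed tire friction and gravity
A_ACCEL, A_BRAKE = 3.0, 5.0  # assumed acceleration/braking limits, m/s^2

def curvature(p_prev, p, p_next):
    """Approximate curvature from three consecutive (x, y) track points."""
    ax, ay = p[0] - p_prev[0], p[1] - p_prev[1]
    bx, by = p_next[0] - p[0], p_next[1] - p[1]
    cross = ax * by - ay * bx
    a, b = math.hypot(ax, ay), math.hypot(bx, by)
    c = math.hypot(p_next[0] - p_prev[0], p_next[1] - p_prev[1])
    return 2 * abs(cross) / (a * b * c + 1e-9)

def speed_profile(points, ds=0.1):
    """Grip-limited corner speeds, smoothed by accel/brake limits along the lap."""
    kappas = [curvature(points[i - 1], points[i], points[i + 1])
              for i in range(1, len(points) - 1)]
    v = [math.sqrt(MU * G / max(k, 1e-6)) for k in kappas]
    for i in range(1, len(v)):            # forward pass: limited acceleration out of corners
        v[i] = min(v[i], math.sqrt(v[i - 1] ** 2 + 2 * A_ACCEL * ds))
    for i in range(len(v) - 2, -1, -1):   # backward pass: brake early enough for the next corner
        v[i] = min(v[i], math.sqrt(v[i + 1] ** 2 + 2 * A_BRAKE * ds))
    return v

Because the backward pass pulls braking points earlier, a change at one corner ripples into its neighbors, which is exactly the coupling that makes finding the fastest line interesting.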

Along the way, [Steven] fixed the gearbox, tuned the PID steering loop, and removed the software speed limits. It’s impressive engineering, and we love seeing the car zoom around faster and faster. The car eventually hit 25 km/h, which seems pretty fast for indoors. The code and more details are up on GitHub.

However, if you’re curious about playing around with self-driving, perhaps a much smaller scale Pi Zero-based racer might be more your speed. Video after the break.

Continue reading “Robot Races A Little Smarter To Go Faster”

Wolfram Alpha With ChatGPT Looks Like A Killer Combo

Ever looked at Wolfram Alpha and the development of Wolfram Language and thought that perhaps Stephen Wolfram was a bit ahead of his time? Well, maybe the times have finally caught up because Wolfram plus ChatGPT looks like an amazing combo. That link goes to a long blog post from Stephen Wolfram that showcases exactly how and why the two make such a wonderful match, with loads of examples. (If you’d prefer a video discussion, one is embedded below the page break.)

OpenAI’s ChatGPT is a large language model (LLM) neural network, or, put more plainly, an AI system capable of conversing in natural language. Thanks to a recently announced plugin system, ChatGPT can now interact with remote APIs and therefore draw on external resources.

ChatGPT’s natural language processing ability makes for some pretty impressive interactions with Wolfram, like the exchange shown here.

This is meaningful because LLMs are very good at processing natural language and generating plausible-sounding output, but whether or not that output is factually correct can be another matter. It’s not so much that ChatGPT is especially prone to confabulation; it’s more that the nature of an LLM neural network makes it difficult to ask “why exactly did you come up with that answer, and not something else?” In addition, asking ChatGPT to do things like perform nontrivial calculations is a bit of a square-peg-in-a-round-hole situation.

So how does the Wolfram plugin change that? When asked to produce data or perform computations, ChatGPT can now hand the job off to Wolfram Alpha instead of attempting to generate the answer by itself. Each side plays to its strengths in this arrangement: ChatGPT interprets the user’s question and formulates it as a query, Wolfram Alpha performs the computation, and ChatGPT structures its response around what it gets back. In short, ChatGPT can now ask for help to get data or perform a computation, and it can show the receipts when it does.
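
As a rough sketch of that division of labor, the snippet below sends a precise query to Wolfram Alpha’s public Short Answers API and returns the plain-text result. This isn’t the plugin itself (that plumbing lives on OpenAI’s and Wolfram’s side); it just illustrates the “formulate a query, let Wolfram compute, phrase the answer” loop, and the app ID is a placeholder you’d have to supply yourself.

import requests

WOLFRAM_APPID = "YOUR-APPID"  # placeholder; sign up at the Wolfram developer portal

def compute_with_wolfram(query: str) -> str:
    """Hand a well-formed query to Wolfram Alpha and return its short answer."""
    resp = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": WOLFRAM_APPID, "i": query},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.text

# In the plugin arrangement, ChatGPT turns a conversational question into a
# query string like the one below, then weaves the returned value back into
# its natural-language reply.
print(compute_with_wolfram("distance from Earth to the Moon in kilometers"))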

Continue reading “Wolfram Alpha With ChatGPT Looks Like A Killer Combo”

Need To Pick Objects Out Of Images? Segment Anything Does Exactly That

Segment Anything, recently released by Facebook Research, does something that most people who have dabbled in computer vision have found daunting: reliably figuring out which pixels in an image belong to an object. Making that easier is the goal of the Segment Anything Model (SAM), just released under the Apache 2.0 license.

The online demo has a bank of examples, but also works with uploaded images.

The results look fantastic, and there’s an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be segmented automatically. It’s frankly very impressive to see SAM make masking out the different objects in an image look so effortless. What makes this possible is machine learning; the model behind the system was trained on a huge dataset of high-quality images and masks, which makes it very effective at what it does.
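
For the curious, using the released model from Python looks roughly like the sketch below. The checkpoint path and the image are placeholders, and the exact API is best confirmed against the repository, but it shows the point-prompt mode the demo exposes (there is also an automatic mask generator, not shown here).

import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a checkpoint (the file name/path is a placeholder you download separately).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a real RGB image
predictor.set_image(image)

# One foreground click at pixel (320, 240), like the demo's point-and-click
# mode; SAM returns several candidate masks with confidence scores.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),   # 1 = foreground, 0 = background
    multimask_output=True,
)
print(masks.shape, scores)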

Continue reading “Need To Pick Objects Out Of Images? Segment Anything Does Exactly That”

The Hello World Of GPT?

Someone wants to learn about Arduino programming. Do you suggest they blink an LED first? Or should they go straight for a 3D laser scanner with galvos, a time-of-flight sensor, and multiple networking options? Most of us need to start with the blinking light and move forward from there. So what if you want to learn about the latest wave of GPT — generative pre-trained transformer — programs? Do you start with a language model that looks at thousands of possible tokens in large contexts? Or should you start with something simple? We think you should start simple, and [Andrej Karpathy] agrees. He has a notebook that builds a tiny GPT able to predict the next bit in a sequence. It isn’t any more practical than a blinking LED, but it is a manageable place to start.

The simple example starts with a vocabulary of two: the only tokens are 0 and 1. It also uses a context size of 3, so the model looks at three bits and uses them to infer the fourth bit. To further simplify things, the examples assume you will always get a fixed-size sequence of tokens, in this case eight, and then build up a little from there.
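
To see just how small that problem is, here’s a quick sketch (ours, not [Andrej]’s notebook) of the “dataset” such a model learns from: every 3-bit window of an 8-token sequence paired with the bit that follows it. A real GPT does the same thing, just with an enormous vocabulary and context.

# Vocabulary of two, context size of three, fixed eight-token sequence.
sequence = [0, 1, 1, 0, 1, 1, 1, 0]
context_size = 3

pairs = [
    (tuple(sequence[i:i + context_size]), sequence[i + context_size])
    for i in range(len(sequence) - context_size)
]
for context, next_bit in pairs:
    print(f"{context} -> {next_bit}")  # the model's job: predict next_bit from context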

Continue reading “The Hello World Of GPT?”

Wolverine Gives Your Python Scripts The Ability To Self-Heal

[BioBootloader] combined Python and a hefty dose of AI for a fascinating proof of concept: self-healing Python scripts. He shows things working in a video, embedded below the break, but we’ll also describe what happens right here.

The demo Python script is a simple calculator that works from the command line, and [BioBootloader] introduces a few bugs to it. He misspells a variable used as a return value, and deletes the subtract_numbers(a, b) function entirely. Running this script by itself simply crashes, but using Wolverine on it has a very different outcome.

In a short time, error messages are analyzed, changes proposed, those same changes applied, and the script re-run.

Wolverine is a wrapper that runs the buggy script, captures any error messages, then sends those errors to GPT-4 to ask it what it thinks went wrong with the code. In the demo, GPT-4 correctly identifies the two bugs (even though only one of them directly led to the crash), but that’s not all! Wolverine actually applies the proposed changes to the buggy script, and re-runs it. This time around there is still an error… because GPT-4’s previous changes included an out-of-scope return statement. No problem, because Wolverine once again consults with GPT-4, creates and formats a change, applies it, and re-runs the modified script. This time the script runs successfully and Wolverine’s work is done.
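
The loop itself is simple enough to sketch in a few lines of Python. This is not Wolverine’s actual source (its real prompt and patch handling live in the repository), and ask_gpt_for_fix() is just a placeholder for the GPT-4 call, but it captures the run, diagnose, patch, re-run cycle described above:

import subprocess
import sys

def run_script(path):
    """Run the target script and capture its output and any traceback."""
    return subprocess.run([sys.executable, path], capture_output=True, text=True)

def ask_gpt_for_fix(source: str, error: str) -> str:
    """Placeholder: send the source plus traceback to an LLM, get fixed source back."""
    raise NotImplementedError("wire this up to the LLM of your choice")

def self_heal(path, max_attempts=3):
    for attempt in range(max_attempts):
        result = run_script(path)
        if result.returncode == 0:
            print(result.stdout)
            return True                  # script ran cleanly, we're done
        print(f"Attempt {attempt + 1} crashed:\n{result.stderr}")
        source = open(path).read()
        fixed = ask_gpt_for_fix(source, result.stderr)
        with open(path, "w") as handle:
            handle.write(fixed)          # apply the proposed changes, then loop and re-run
    return False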

LLMs (Large Language Models) like GPT-4 are “programmed” in natural language, and these instructions are referred to as prompts. A large chunk of what Wolverine does is thanks to a carefully-written prompt, and you can read it here to gain some insight into the process. Don’t forget to watch the video demonstration just below if you want to see it all in action.

While AI coding capabilities definitely have their limitations, some of the questions they raise are becoming more urgent. Heck, consider that GPT-4 is barely even four weeks old at this writing.

Continue reading “Wolverine Gives Your Python Scripts The Ability To Self-Heal”

Tired Of Web Scraping? Make The AI Do It

[James Turk] has a novel approach to the problem of scraping web content in a structured way without needing to write the kind of page-specific code web scrapers usually have to deal with. How? Just enlist the help of a natural language AI. Scrapeghost relies on OpenAI’s GPT API to parse a web page’s content, pull out and classify any salient bits, and format it in a useful way.

What makes Scrapeghost different is how the data gets organized: when instantiating scrapeghost, one defines the data one wishes to extract. For example:

from scrapeghost import SchemaScraper

scrape_legislators = SchemaScraper(
    schema={
        "name": "string",
        "url": "url",
        "district": "string",
        "party": "string",
        "photo_url": "url",
        "offices": [{"name": "string", "address": "string", "phone": "string"}],
    }
)

The kicker is that this format is entirely up to you! The GPT models are very, very good at processing natural language, and scrapeghost uses GPT to process the scraped data and find (using the example above) whatever looks like a name, district, party, photo, and office address and format it exactly as requested.
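
Actually running the scraper should then be a one-liner along these lines, though treat the exact call and response attribute as assumptions and check the project’s README, since the library is young and its API may shift (the URL here is just a placeholder):

resp = scrape_legislators("https://example.gov/legislators/jane-doe")  # hypothetical URL
print(resp.data)  # structured result matching the schema defined above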

It’s an experimental tool and you’ll need an API key from OpenAI to use it, but it has useful features and is certainly a novel approach. There’s a tutorial and even a command-line interface, so check it out.

How Much Programming Can ChatGPT Really Do?

By now we’ve all seen articles where the entire copy has been written by ChatGPT. It’s essentially a trope of its own at this point, so we will start out by assuring you that this article is being written by a human. AI tools do seem poised to be extremely disruptive to certain industries, but that doesn’t necessarily have to be a bad thing as long as they continue to be viewed as tools rather than direct replacements. ChatGPT can assist with plenty of tasks and can help augment processes like programming (rather than becoming the programmer itself), and this article shows a few examples of what it might be used for.

AI comments are better than nothing…probably.

While it can write some programs on its own, in some cases quite capably, for specialized or complex tasks it might not be quite up to the challenge yet. It will often appear extremely confident in its solutions even when it’s providing poor or false information, but that doesn’t mean it can’t or shouldn’t be used at all.

The article goes over a few of the ways it can function more as an assistant than a programmer, including generating filler content for something like an SQL database, converting data from one format to another, converting programs from one language to another, and even helping with a program’s debugging process.
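
As a made-up example of that assistant role (ours, not the article’s), a prompt like the one below is the sort of thing it handles well, whether pasted into the chat window or sent through the API; the table definition is invented, and anything it returns should be reviewed before use:

# A filler-data prompt for a hypothetical customers table.
prompt = """
Generate 20 SQL INSERT statements for the table
  customers(id INTEGER, name TEXT, email TEXT, signup_date DATE)
using plausible but fake data, unique ids, and ISO dates in 2022.
"""
print(prompt)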

Other uses we’ve come up with include asking it to recommend libraries we didn’t know existed, or even music to play in the background while working. Tools like these are extremely impressive, and while they likely aren’t taking over anyone’s job right now, that might not always be the case.