Training A Neural Network To Play A Driving Game

Often, when we think of getting a computer to complete a task, we contemplate creating complex algorithms that take in the relevant inputs and produce the desired behaviour. For some tasks, like navigating a car down a road, the sheer multitude of input data and its relationship to the desired output is so complex that it becomes near-impossible to code a solution. In these cases, it can make more sense to create a neural network and train the computer to do the job, as one would a human. On a more basic level, [Gigante] did just that, teaching a neural network to play a basic driving game with a genetic algorithm.

The game itself is a basic top-down 2D driving game. The AI is given the distance to the edge of the track along five lines projected at different angles from the front of the vehicle. The AI also knows its speed and direction. Given these seven numbers, it calculates the outputs for steering, braking, and acceleration to drive the car.
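To make the setup concrete, here’s a minimal sketch of what such a control network might look like in Python, assuming a single hidden layer and tanh activations (the write-up doesn’t describe the actual architecture):

```python
import numpy as np

def drive(inputs, w1, b1, w2, b2):
    """Map the seven sensor readings to three control outputs.

    inputs: five ray distances to the track edge, plus speed and direction.
    Returns steering, braking, and acceleration values. The single hidden
    layer is an assumption; the post doesn't specify the network's shape.
    """
    hidden = np.tanh(w1 @ inputs + b1)   # hidden-layer activations
    return np.tanh(w2 @ hidden + b2)     # [steer, brake, throttle]

# Example with random weights: an untrained "generation zero" driver
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(8, 7)), rng.normal(size=8)
w2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
sensors = np.array([4.2, 7.5, 9.9, 7.1, 3.8, 0.5, 0.1])  # distances, speed, heading
print(drive(sensors, w1, b1, w2, b2))
```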

To train the AI, [Gigante] started with 650 AIs and picked the best performer, which just barely managed to navigate the first two corners. This AI was marked as the parent of the next generation, which was created by applying random mutations to it. Each generation showed some improvement, with [Gigante] picking the best performers each time to parent the next generation. Within just four iterations, some of the cars are able to complete a full lap. With enough training, the cars can complete the course at great speed without hitting the walls at all.
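The training scheme described here, keeping the single best performer and spawning a new population from it with random mutations, is about the simplest genetic algorithm there is. A hedged sketch, with a placeholder fitness function standing in for actually driving the car around the track:

```python
import numpy as np

rng = np.random.default_rng(1)
POPULATION, MUTATION_SCALE = 650, 0.1

def fitness(genome):
    # Placeholder: in the real project this would run the car around the
    # track and score how far it got before hitting a wall.
    return -np.sum(genome ** 2)

# Genomes here are flat weight vectors for the driving network.
population = [rng.normal(size=20) for _ in range(POPULATION)]
for generation in range(10):
    parent = max(population, key=fitness)             # best performer
    # Keep the parent (elitism, our assumption) plus mutated offspring.
    population = [parent] + [
        parent + rng.normal(scale=MUTATION_SCALE, size=parent.shape)
        for _ in range(POPULATION - 1)
    ]
    print(generation, fitness(parent))
```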

It’s a great example of machine learning and the use of genetic algorithms to improve fitness over time. [Gigante] points out that there’s no need for a human in the loop either, if the software is coded to self-measure the fitness of each generation. We’ve seen similar techniques used to play Mario, too. Video after the break.

Continue reading “Training A Neural Network To Play A Driving Game”

Attempting To Generate Photorealistic Video With Neural Networks

Over the past decade, we’ve seen great strides made in the area of AI and neural networks. When trained appropriately, they can be coaxed into generating impressive output, whether it be in text, images, or simply in classifying objects. There’s also much fun to be had in pushing them outside their prescribed operating region, as [Jon Warlick] attempted recently.

[Jon]’s work began with NVIDIA’s GauGAN tool. It’s capable of generating pseudo-photorealistic images of landscapes from segmentation maps, where different colors of a 2D image represent things such as trees, dirt, mountains, or water. After spending much time toying with the software, [Jon] decided to see if it could be pressed into service to generate video instead.
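Conceptually, a segmentation map is just an image whose pixel colors act as class labels. A quick illustration in Python (the colors here are made-up placeholders; GauGAN expects its own specific palette):

```python
import numpy as np

# Hypothetical class colors. GauGAN keys off specific palette values,
# so these RGB triples are placeholders for illustration only.
SKY, MOUNTAIN, WATER = (134, 193, 46), (60, 110, 168), (114, 205, 154)

seg_map = np.zeros((256, 256, 3), dtype=np.uint8)
seg_map[:96, :] = SKY          # top third: sky
seg_map[96:160, :] = MOUNTAIN  # middle band: mountains
seg_map[160:, :] = WATER       # bottom: water
```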

The GauGAN tool is only capable of taking in a single segmentation map and outputting a single image, so [Jon] had to get creative. Experiments were undertaken wherein a video was generated and exported as individual frames, with these frames fed to GauGAN as individual segmentation maps. The output frames from GauGAN were then reassembled into a video.
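In code, that pipeline might look something like the following sketch, where run_gaugan is a hypothetical stand-in for however one submits a single map to the tool, since we’re not aware of an official batch API:

```python
import cv2

def stylize_video(src_path, dst_path, run_gaugan):
    """Split a video into frames, push each through GauGAN, reassemble.

    run_gaugan is a user-supplied function: one segmentation-map frame
    in, one rendered image out.
    """
    reader = cv2.VideoCapture(src_path)
    fps = reader.get(cv2.CAP_PROP_FPS)
    writer = None
    while True:
        ok, frame = reader.read()
        if not ok:
            break
        out = run_gaugan(frame)          # one map in, one image out
        if writer is None:               # size the writer from first result
            h, w = out.shape[:2]
            writer = cv2.VideoWriter(dst_path,
                                     cv2.VideoWriter_fourcc(*"mp4v"),
                                     fps, (w, h))
        writer.write(out)
    reader.release()
    if writer:
        writer.release()
```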

The results are somewhat psychedelic, as one would expect. GauGAN’s single-image workflow means there is only coincidental coherence between consecutive frames, creating a wild, shifting vista. While it’s not a technique we expect to see used for serious purposes anytime soon, it’s a great experiment in seeing how far the technology can be pushed. It’s not the first time we’ve seen such technology used to create full-motion video, either. Video after the break.

Continue reading “Attempting To Generate Photorealistic Video With Neural Networks”

Tube Amp Is Modeled With The Power Of AI

There is a certain magic and uniqueness to hardware, particularly when it comes to audio. Tube amplifiers are well-known and well-loved by audio enthusiasts and musicians alike. However, that uniqueness comes at a price: gear takes up space and cannot be configured beyond the bounds of what it was designed to do. [keyth72] decided to take it upon themselves to recreate the smooth sound of the Fender Blues Jr. small tube guitar amp. But rather than using hardware or standard audio software, the magic of AI was thrown at it.

In some ways, recreating a transformation is exactly what AI is designed for: there’s a clear, recordable input and a corresponding output. In this case, [keyth72] recorded several guitar sessions with the guitar audio sent through the device they wanted to recreate. Using WaveNet, they created a model that applies the transform to input audio in real time. The Gain and EQ knobs were handled outside the model itself to keep things simple. Instructions on how to train your own model are included on the GitHub page.
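As a rough illustration of the idea (not [keyth72]’s actual code), here’s a toy stack of dilated 1D convolutions trained on paired dry and processed audio, which is the core trick behind WaveNet-style amp modeling:

```python
import torch
import torch.nn as nn

# A heavily simplified stand-in for the WaveNet-style model: dilated 1D
# convolutions trained on (dry, amp-processed) audio pairs. The real
# project's architecture and hyperparameters will differ.
class TinyWaveNet(nn.Module):
    def __init__(self, channels=16, layers=8):
        super().__init__()
        convs, in_ch = [], 1
        for i in range(layers):
            convs += [nn.Conv1d(in_ch, channels, kernel_size=2,
                                dilation=2 ** i, padding=2 ** i),
                      nn.Tanh()]
            in_ch = channels
        self.net = nn.Sequential(*convs, nn.Conv1d(channels, 1, 1))

    def forward(self, x):
        # Trim the padding growth so output length matches the input.
        return self.net(x)[..., :x.shape[-1]]

model = TinyWaveNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
dry = torch.randn(1, 1, 4096)   # stand-in for the recorded guitar signal
wet = torch.randn(1, 1, 4096)   # stand-in for the amp's actual output
for step in range(100):
    loss = nn.functional.mse_loss(model(dry), wet)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```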

While the model is simply approximating the real hardware, it still sounds quite impressive, and perhaps the next time you need the particular sound of a home-built amp or guitar pedal, you might reach for your computer instead.

Continue reading “Tube Amp Is Modeled With The Power Of AI”

Community Testing Suggests Bias In Twitter’s Cropping Algorithm

Social media and online services are now such a huge part of daily life that our entire world is being shaped by algorithms. Arcane in their workings, they are responsible for the content we see and the adverts we’re shown. Just as importantly, they decide what is hidden from view as well.

Important: Much of this post discusses the performance of a live website algorithm. Some of the links in this post may not perform as reported if viewed at a later date. 

The initial Zoom problem that brought Twitter’s issues to light.

Recently, [Colin Madland] posted some screenshots of a Zoom meeting to Twitter, pointing out how Zoom’s background detection algorithm had improperly erased the head of a colleague with darker skin. In doing so, [Colin] noticed a strange effect — although the screenshot he submitted shows both of their faces, Twitter would always crop the image to show just his light-skinned face, no matter the image orientation. The Twitter community raced to explore the problem, and the fallout was swift.

Continue reading “Community Testing Suggests Bias In Twitter’s Cropping Algorithm”

Let This Crying Detecting Classifier Offer Some Much Needed Reprieve

Baby monitors are cool, but [Ish Ot Jr.] wanted his to transmit only sounds that required immediate attention, filtering out any non-emergency background noise. Faced with this problem, he made a baby monitor that would only send alerts when his baby was crying.

For his project, [Ish] used an Arduino Nano 33 BLE Sense due to its built-in microphone, sizeable RAM for storing large chunks of data, and its BLE capabilities for later connecting with an app. He began by collecting background noise using Edge Impulse Studio’s data acquisition functionality, and emphasized that Edge Impulse did most of the work for him: he mostly just needed to collect some test data, while running and testing the neural network was handled by Edge Impulse. Sounds handy, if you don’t mind offloading your data to the cloud.
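For the curious, the kind of model Edge Impulse builds for audio classification looks roughly like this Keras sketch; the feature shape and layer sizes are guesses on our part, not what the service actually generated:

```python
import tensorflow as tf

# A rough sketch of a small audio classifier: spectrogram-style features
# in, two classes ("crying" vs. "noise") out.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(49, 13)),   # ~1 s of MFCC frames (assumed)
    tf.keras.layers.Conv1D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(16, 3, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```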

[Ish] ended up with an 86.3% accurate classifier, which he thought was good enough for a first pass. To make his prototype a bit more “finished”, he added some status LEDs to provide immediate visual feedback from the classifier and to notify the caregiver. Eventually, he wants to add BLE support and push notifications, alerting him whenever his baby needs attention.

We’ve seen a couple of baby monitor projects on Hackaday over the years. [Ish’s] project will most certainly be a nice addition to the list.

The Interactive Storytelling Radio

[8BitsAndAByte] are back, and this time they’re using AI to create an interactive storyteller. With the help of a Raspberry Pi, they upcycled an old Cold War-era radio they dug up, and the results are pretty impressive.

The main controller board of the radio was intact, so it was easy to use all the preexisting hardware to control the speaker and to trigger a few of the Pi’s GPIO pins using the buttons and switches on the radio’s front panel. To add some artificial intelligence, they used Google’s AIY Voice Kit, allowing them to tap into Google’s seemingly endless artificial intelligence platform. This could be a “tables have turned” moment, but we’re probably being a bit too hopeful.
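Wiring panel buttons to GPIO events is straightforward. Here’s a minimal sketch using the RPi.GPIO library, with pin numbers that are our own placeholder choices since the write-up doesn’t list them:

```python
import signal
import RPi.GPIO as GPIO

GPIO.setmode(GPIO.BCM)
BUTTONS = {17: "start story", 27: "stop story"}  # hypothetical pin mapping

def pressed(pin):
    # Hand off to the voice assistant here.
    print(BUTTONS[pin])

for pin in BUTTONS:
    GPIO.setup(pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)
    GPIO.add_event_detect(pin, GPIO.FALLING, callback=pressed, bouncetime=200)

signal.pause()  # keep the script alive waiting for button presses
```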

They also used a pretty interesting piece of software called Dialogflow that creates a somewhat natural conversational interaction akin to a chatbot. Dialogflow processes speech to text, as you would expect, but can also interpret contextual speech and provide contextual responses. Pretty neat…but maybe also a little creepy. Who knows? The jury is still out.
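If you’d like to poke at Dialogflow yourself, the official Python client makes a basic text query fairly painless. A minimal sketch follows; the project and session IDs are placeholders, and [8BitsAndAByte]’s radio may well talk to its agent through the AIY stack instead:

```python
from google.cloud import dialogflow

def ask_agent(text, project_id="my-project", session_id="radio-session"):
    """Send one text query to a Dialogflow agent and return its reply."""
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code="en-US"))
    response = client.detect_intent(
        request={"session": session, "query_input": query_input})
    return response.query_result.fulfillment_text

print(ask_agent("Tell me a story"))
```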

Anyway, if you’re like us and sometimes in need of a break from humans, then this project just might be for you.

Continue reading “The Interactive Storytelling Radio”

OAK Vision Modules Help You See The Forest And The Trees

OpenCV is an open source library of computer vision algorithms whose power and flexibility have made many machine vision projects possible. But even with code highly optimized for maximum performance, we always wish for more. That’s why our ears perk up whenever we hear about a hardware-accelerated vision module, and the latest buzz is coming out of the OpenCV AI Kit (OAK) Kickstarter campaign.

Two vision modules launched with this campaign: the OAK-1, with a single color camera for two-dimensional vision applications, and the OAK-D, which adds stereo cameras for that third dimension. The onboard brain is a Movidius Myriad X processor which, according to team members who have dug through its datasheet, has been massively underutilized in other products. They believe OAK modules will help the chip fulfill its potential for vision applications, delivering high performance while consuming little power in a small form factor. Reading over the spec sheet, we think it’s fair to call these “Ultimate Myriad X Dev Boards”, but we must concede “OpenCV AI Kit” sounds better. It does not provide hardware acceleration for the entire OpenCV library (likely an impossible task), but it does cover the highly demanding subset suited to Myriad X acceleration.
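For a taste of what driving these modules looks like, here’s a minimal sketch using the depthai Python library that backs them. API details follow current depthai releases and may differ from what ships to backers:

```python
import depthai as dai

# Build a pipeline that streams color camera previews to the host.
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(300, 300)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("preview")
cam.preview.link(xout.input)

# Run the pipeline on the attached OAK device and grab one frame.
with dai.Device(pipeline) as device:
    queue = device.getOutputQueue("preview")
    frame = queue.get().getCvFrame()
    print(frame.shape)
```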

Since the campaign launched a few weeks ago, some additional information has been released to help assure backers that this project has real substance. It turns out OAK is an evolution of a project we covered almost exactly one year ago that became a real product, DepthAI, so at least this is not their first rodeo. It is also encouraging that their invitation to the open hardware community has already borne fruit. Check out this thread discussing OAK for robot vision, where a question was met with an honest “we don’t have expertise there” from the OAK team, but then ArduCam pitched in with their camera module experience to help.

We wish them success with their planned December 2020 delivery. They have already far surpassed their funding goals, they’ve shipped hardware before, and we see a good start to a development community. We look forward to the OAK-1 and OAK-D joining the ranks of other hacking-friendly vision modules like OpenMV, JeVois, StereoPi, and AIY Vision.