Google's Piano Genie

Piano Genie Trained A Neural Net To Play 88-Key Piano With 8 Arcade Buttons

Want to sound great on a piano using only your coding skills? Enter Piano Genie, the result of a research project from Google AI and DeepMind. You press any of eight buttons while a neural network makes sure the piano plays something cool, compensating in real time for what's already been played.

Almost anyone new to playing music who sits down at a piano will produce a sound similar to that of a cat chasing a mouse through a tangle of kitchen pots. Who can blame them, given the sea of 88 inexplicable keys sitting before them? But they'll quickly realize that playing keys in succession in one direction produces sounds of consistently increasing or decreasing pitch. They'll also learn that pressing keys for different lengths of time can improve the melody. But there are still 88 of them and plenty more to learn, such as which keys will sound harmonious when played together.

Piano Genie training architecture

With Piano Genie, gone are the daunting 88 keys, replaced with a 3D-printed box of eight arcade-style buttons which they made by following this Adafruit tutorial. A neural network maps those eight buttons to something meaningful on the 88-key piano keyboard. Being a neural network, the mapping isn't a fixed one-to-one or even one-to-many. Instead, it's trained to play something that should sound good given what was played previously, and it won't necessarily be the same each time.

To train it, they used data from approximately 1,400 performances from the International Piano-e-Competition. The result can be quite good, as you can see and hear in the video below. The buttons feed into a computer, but the computer plays the result on an actual piano.

For training, the system really consists of two networks. One is an encoder, in this case a recurrent neural network (RNN), which takes piano sequences and learns to output a vector. In the diagram, the vector is in the middle and has one element for each of the eight buttons. The second network is the decoder, also an RNN. It's trained to turn that eight-element vector back into the same music that was fed into the encoder.
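
To make that concrete, here's a rough sketch of the training setup in PyTorch. It is not the authors' code: the paper uses a more careful quantization scheme, and all the sizes here are made up. The encoder RNN squeezes each note down to an eight-way button choice, and the decoder RNN tries to reconstruct the original note from that choice plus what came before.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_KEYS = 88      # piano keys
NUM_BUTTONS = 8    # arcade buttons

class PianoGenieSketch(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(NUM_KEYS, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.to_button = nn.Linear(hidden, NUM_BUTTONS)
        self.decoder = nn.LSTM(hidden + NUM_BUTTONS, hidden, batch_first=True)
        self.to_key = nn.Linear(hidden, NUM_KEYS)

    def quantize(self, logits):
        # Snap each time step to a one-hot button choice; the straight-through
        # trick keeps gradients flowing back into the encoder during training.
        soft = torch.softmax(logits, dim=-1)
        hard = F.one_hot(soft.argmax(dim=-1), NUM_BUTTONS).float()
        return soft + (hard - soft).detach()

    def forward(self, keys):                        # keys: (batch, time) ints in [0, 88)
        h = self.embed(keys)
        enc_out, _ = self.encoder(h)
        buttons = self.quantize(self.to_button(enc_out))       # (batch, time, 8)
        # Teacher forcing: the decoder sees the previous key plus the current button.
        prev = torch.cat([torch.zeros_like(h[:, :1]), h[:, :-1]], dim=1)
        dec_out, _ = self.decoder(torch.cat([prev, buttons], dim=-1))
        return self.to_key(dec_out)                 # logits over the 88 keys

# Training objective: reconstruct the original notes from the button stream, e.g.
#   loss = F.cross_entropy(model(keys).transpose(1, 2), keys)
```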

Once trained, only the decoder is used. The eight-button keyboard feeds into the vector, and the decoder outputs suitable notes. The fact that they're RNNs means that rather than learning a fixed one-to-many mapping, the network takes into account what was previously played in order to come up with something which hopefully sounds pleasing. To give the user a little more creative control, they also trained it to recognize when the user is playing a rising or falling melody and to output the same. See their paper for how they turned polyphonic music into monophonic and back again.
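
Continuing the sketch above, a hypothetical play-time loop might look like this: only the decoder runs, turning each button press (0-7) into one of the 88 keys while conditioning on whatever it has already played.

```python
@torch.no_grad()
def play(model, button_presses):
    state = None
    prev = torch.zeros(1, 1, model.embed.embedding_dim)     # "nothing played yet"
    notes = []
    for b in button_presses:                                 # each b is an int 0..7
        button = F.one_hot(torch.tensor([[b]]), NUM_BUTTONS).float()   # (1, 1, 8)
        out, state = model.decoder(torch.cat([prev, button], dim=-1), state)
        key = model.to_key(out).argmax(dim=-1)               # or sample from the softmax
        notes.append(int(key))
        prev = model.embed(key)
    return notes

# e.g. notes = play(PianoGenieSketch(), [0, 1, 2, 3, 4, 5, 6, 7])
# (only meaningful after training; an untrained model emits arbitrary keys)
```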

If you prefer a different style of music you can train it on a MIDI collection of your own choosing using their open-sourced model. Or you can try it out as is right now through their web interface. I’ll admit, I started out just banging on it, producing the same noise I would get if I just hammered away randomly on a piano. Then I switched to thinking of making melodies and the result started sounding better. So some music background and practice still helps. For the video below, the researcher admits to having already played for a few hours.

This isn't the first project we've covered by these Google researchers. Another was this music synthesizer, again using neural networks, but this time with a Raspberry Pi. And if our discussion of recurrent neural networks went a bit over your head, check out our overview of neural networks.


Jump Into AI With A Neural Network Of Your Own

One of the difficulties in learning about neural networks is finding a problem that is complex enough to be instructive but not so complex as to impede learning. [ThomasNield] had an idea: Create a neural network to learn if you should put a light or dark font on a particular colored background. He has a great video explaining it all (see below) and code in Kotlin.

[Thomas] is very interested in optimization, so his approach is very much based on the mathematics and algorithms of optimization. One handy thing is that there is already a simple algorithm for making this determination. He found it on Stack Exchange, but we're sure it's in a textbook or paper somewhere. That existing algorithm makes the neural network somewhat unnecessary in practice, but it makes training easy, since you can use it to algorithmically generate a training set of data.
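
Here's a sketch of that idea, though not [Thomas]'s Kotlin code. The rule of thumb below is the usual perceived-luminance formula, which we're assuming is essentially the Stack Exchange algorithm he mentions; because it's cheap to evaluate, it can label as much training data as you like for a tiny network.

```python
import random
import torch
import torch.nn as nn

def dark_font_is_better(r, g, b):
    # Rec. 601 luma weights; channel values in 0-255.
    return 0.299 * r + 0.587 * g + 0.114 * b > 128

# Algorithmically generated training set: random background colors, rule-based labels.
colors = [[random.randrange(256) for _ in range(3)] for _ in range(10_000)]
x = torch.tensor(colors, dtype=torch.float32) / 255.0
y = torch.tensor([[1.0] if dark_font_is_better(*c) else [0.0] for c in colors])

net = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
opt = torch.optim.Adam(net.parameters(), lr=0.01)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(net(x), y)
    loss.backward()
    opt.step()

# After training, a positive logit means "use a dark font" for that background.
print(torch.sigmoid(net(torch.tensor([[1.0, 1.0, 0.0]]))))   # yellow background
```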

Once trained, the neural network works well. He wrote a small GUI and you can even select among various models.

Don't let the Kotlin put you off. It runs on the same JVM as Java, and the code is very similar, other than that it infers types and adds functional programming tools. In any case, the libraries and the principles employed will work with Java and, in many cases, the concepts will apply no matter what you are doing.

If you want to hardware accelerate your neural networks, there’s a stick for that. If you prefer C and you want something lean and mean, try TINN.


Artistic Collaboration With AI

Ever since Google’s Deep Dream results were made public several years ago, there has been major interest in the application of AI and neural network technologies to artistic endeavors. [Helena Sarin] has been experimenting in just this field, exploring the possibilities of collaborating with the ghost in the machine.

This image was generated with a landscape model using a dataset containing covers of Japanese poetry books.

The work is centered around the use of Generative Adversarial Networks, or GANs. [Helena] describes using a GAN to create artworks as a sort of game. An apprentice attempts to create new works in the style of their established master, while a critic attempts to determine whether the artworks are created by the master or the apprentice. As the apprentice improves, the critic must become more discerning; as the critic becomes more discerning, the apprentice must improve further. It is through this mechanism that the model improves itself.
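
For the curious, that game can be boiled down to a generic, minimal training step in PyTorch. This is not [Helena]'s CycleGAN setup, and the network shapes and hyperparameters below are made up, but it shows the apprentice/critic back-and-forth.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())   # the apprentice
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))       # the critic
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                      # real: (batch, 784) flattened "master" images
    batch = real.size(0)
    fake = G(torch.randn(batch, 64))

    # The critic learns to label the master's work 1 and the apprentice's work 0.
    d_loss = bce(D(real), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # The apprentice improves by trying to make the critic say 1 for its work.
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```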

[Helena] has spent time experimenting with CycleGAN in the artistic realm after first using it in a work project, and has primarily trained it on her own original artworks to create new pieces with wild and exciting results. She shares several tips on how best to work with the technology, covering the necessary computing and storage requirements, as well as ways to step outside the box to create more diverse outputs.

Neural networks are hot lately, with plenty of research going on in the field. There are plenty of fun projects, too, like this cartoonifying camera we featured recently.

LEGO bricks sorter

Sorting LEGO Is Like Making A Box Of Chocolates

Did you know that chocolate candy production and sorting LEGO bricks have something in common? They both use the same techniques for turning clumps of chocolates or bricks into individual ones moving down a conveyor belt. At least that’s what [Paco Garcia] found out when making his LEGO Sorter.

Sorting LEGO bricks using guides

However, he didn't find that out right away. He first experimented with his own techniques, learning that if he fed bricks to the conveyor belt by dropping a batch of them in a line perpendicular to the direction of belt travel, then no subsequent separation attempt of his worked. He then turned to [akiyuky's] LEGO sorter for inspiration and dropped them onto the belt at an angle, ensuring that some bricks would be in front of others. A further trick he found is very well demonstrated in the chocolate sorting video below and shown in the image here: use guides on the belt to create speed differentials. Bricks move slower than the conveyor belt while pressed against a guide, but when a brick leaves the guide, it accelerates to the speed of the belt, pulling away from the bricks still at the guide and thus separating them.

A further discovery had nothing to do with chocolate production, unless maybe for quality control. Once an individual brick had been separated out, it had to be classified. To do that he used Google's Inception v3 neural network. But first, he had to retrain it to recognize different types of LEGO bricks, something we've seen done before for recognizing playing cards. And to do the retraining, he needed many images of different bricks, all separated into their different types. That's where he came up with a clever trick: he used his own sorter for that. For example, to get a bunch of images of 1×1 bricks of different colors and orientations, he simply ran them through the sorter, saving the images to files and assigning them to the 1×1 brick class. He then used his desktop machine with a GeForce GT 730 GPU for the retraining, taking around 2.7 seconds per brick. For sorting, though, he runs the trained neural network on a Raspberry Pi, taking 3.8 seconds for each brick. The resulting sorter works quite well, sorting with 89% accuracy. Watch it in action in the video below.
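
The same transfer-learning idea can be sketched with today's Keras API. To be clear, this is not [Paco]'s actual retraining script, and the directory layout, image counts, and hyperparameters here are assumptions: a pretrained Inception v3 acts as a frozen feature extractor and only a new classification head is trained on folders of brick images, one folder per brick class.

```python
import tensorflow as tf

base = tf.keras.applications.InceptionV3(include_top=False, pooling="avg",
                                          input_shape=(299, 299, 3), weights="imagenet")
base.trainable = False          # keep the pretrained features, train only the new head

data = tf.keras.utils.image_dataset_from_directory(
    "brick_images/", image_size=(299, 299), batch_size=32)   # hypothetical folder layout
num_classes = len(data.class_names)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),       # Inception expects [-1, 1]
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(data, epochs=5)
```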

Hummingbirds, 3D Printing, And Deep Learning

Setting camera traps in your garden to see what local wildlife is around is quite popular. But [Chris Lam] has just one subject in mind: the hummingbird. He devised a custom setup to capture the footage he wanted using some neat tech.

To attract the hummingbirds, [Chris] used an off-the-shelf feeder — no need to re-invent the wheel there. To obtain the closeup footage required, a 4K action cam was used. This was attached to the feeder with a 3D-printed mount that [Chris] designed.

When it came to detecting the presence of a hummingbird in the video, there were various approaches that could have been considered. On the hardware side, PIR and ultrasonic distance sensors are popular for projects of this kind, but [Chris] wanted a pure software solution. The commonly used motion detection libraries for this type of project might have fallen over here, since the whole feeder was swinging in the air on a string, so [Chris] opted for machine learning.

A ResNet architecture was used to run classification on each frame, to determine whether the image contained a hummingbird or not. The initial attempt was not greatly successful, but after cropping the image to a smaller area around the feeder, classification accuracy greatly increased. After a bit of FFmpeg magic, the selected snippets were concatenated to make one video containing all the interesting parts; you can see the result in the clip after the break.
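
A hypothetical frame-by-frame pass of that kind might look like the sketch below. The file names, crop box, ResNet variant, and class index are all made up; the point is simply: crop each frame to the feeder, classify it, and keep the timestamps where a bird shows up for later cutting with FFmpeg.

```python
import cv2
import torch
from torchvision import models, transforms

# Hypothetical: a ResNet-18 with a two-class head, trained elsewhere on feeder crops.
model = models.resnet18(num_classes=2)
model.load_state_dict(torch.load("hummingbird_resnet.pt", map_location="cpu"))
model.eval()

prep = transforms.Compose([transforms.ToPILImage(),
                           transforms.Resize((224, 224)),
                           transforms.ToTensor()])

cap = cv2.VideoCapture("feeder_4k.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
hits, frame_idx = [], 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    crop = frame[800:1600, 1200:2000]                 # area around the feeder (example values)
    x = prep(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)).unsqueeze(0)
    with torch.no_grad():
        is_bird = model(x).argmax(dim=1).item() == 1  # class 1 = hummingbird (assumption)
    if is_bird:
        hits.append(frame_idx / fps)                  # timestamps to cut later with FFmpeg
    frame_idx += 1
```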

It seems that machine learning and wildlife cams are a match made in heaven. We’ve already written about a proof-of-concept project which identifies different animals in the footage when motion is detected.


Solar Pi Cluster Scours Internet For Nudes

There seems to be a universal truth on the Internet: if you open up a service to the world, eventually somebody will come in and try to mess it up. If you have a comment section, trolls will come in and fill it with pedantic complaints (so we’ve heard anyway, naturally we have no experience with such matters). If you have a service where people can upload files, then it’s a guarantee that something unsavory is eventually going to take up residence on your server.

Unfortunately, that’s exactly what [Christian Haschek] found while developing his open source image hosting platform, PictShare. He was alerted to some unsavory pictures on PictShare, and after he dealt with them he realized these could be the proverbial tip of the iceberg. But there were far too many pictures on the system to check manually. He decided to build a system that could search for NSFW images using a trained neural network.

The nude-sniffing cluster is made up of a trio of Raspberry Pi computers, each with its own Movidius neural compute stick to perform the heavy lifting. [Christian] explains how he installed the compute stick SDK and Yahoo's open source learning module for identifying questionable images, the aptly named open_nsfw. The system can be scaled up by adding more Pis, and since it's all ARM processors and compute sticks, it's energy efficient enough that the whole system can run off a 10-watt solar panel.

After opening up the system with a public web interface where users can scan their own images, he offered his system’s services to a large image hosting provider to see what it would find. Shockingly, the system was able to find over 3,000 images that contained suspected child pornography. The appropriate authorities were notified, and [Christian] encourages anyone else looking to search their servers for this kind of content to drop him a line. Truly hacking for good.

This isn't the first time we've seen Intel's Movidius compute stick in the wild, and of course we've seen our fair share of Raspberry Pi clusters, from 750-node monsters down to builds which are far more show than go.

Modular robot legs from Disney

Disney’s New Robot Limbs Trained Using Neural Networks

Disney is working on modular, intelligent robot limbs that snap into place with magnets. The intelligence comes from a reasonably sized neural network that also incorporates some modularity. The robot is their Snapbot, whose base unit can fit up to eight limbs, and so far they've trained it with up to three attached at once.

The modularity further extends to a choice of three types of limb: one with roll and pitch, another with yaw and pitch, and a third with roll, yaw, and pitch. Interestingly, of the three types, the yaw-pitch one seems most effective.

Learning environment for Disney's modular robot legs

In this age of massive, deep neural networks requiring GPUs or even online services for training in a reasonable amount of time, it's refreshing to see that this one is only two layers deep and can be trained in three hours on a single-core, 3.4 GHz Intel i7 processor. Three hours may still seem long, but remember, this isn't a simulation in a silicon virtual world. This is real life, where the servo motors have to actually move. Of course, they didn't want to sit around and reset it after each attempt to move across the table, so they built in an automatic mechanism to pull the robot back to the starting position before trying to cross the table again. To further speed up training, they found that once they'd trained for one limb, they could copy the last of the network's layers to get a head start on the training for two limbs.
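
That warm-start trick is easy to picture in code. The following is a minimal sketch with made-up dimensions and a made-up prediction target, not Disney's code: build a fresh two-layer network for the two-limb configuration, then copy the final layer's weights over from the already-trained one-limb network before training continues on hardware.

```python
import torch
import torch.nn as nn

HIDDEN = 64
PREDICTION_DIM = 2          # e.g. predicted body displacement (dx, dy); an assumption

def make_net(input_dim):
    # Two-layer network, matching the article's "only two layers deep".
    return nn.Sequential(nn.Linear(input_dim, HIDDEN), nn.Tanh(),
                         nn.Linear(HIDDEN, PREDICTION_DIM))

one_limb = make_net(input_dim=6)    # gait parameters for one limb (made-up size)
# ... train one_limb with real rollouts on the robot ...

two_limb = make_net(input_dim=12)   # twice the gait parameters for two limbs
# Copy the last layer's weights to warm-start the two-limb training.
two_limb[2].load_state_dict(one_limb[2].state_dict())
```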

Why do the training? After all, we've seen pretty awesome multi-limbed robots working with manual coding, an example being this hexapod tank based on one from the movie Ghost in the Shell. They tried that too, and when they compared the results of the manual approach with those of the trained one, the trained one moved further in the same amount of time. At a minimum, we can learn a trick or two from this modular crawler.

Check out their article for the details and watch it in action in its learning environment below.
