Open Data Cam Combines Camera, GPU, And Neural Network In An Artisanal DIY Cereal Box

The engineers and product designers at [moovel lab] have created the Open Data Cam – an AI camera platform that can identify and count objects as they move through its field of view – along with an open source guide for making your own.

Step one: get out your ruler and utility knife. In this world of ubiquitous 3D printers, they’ve taken a decidedly low-tech approach to the project’s enclosure: a cut, folded, and zip-tied plastic box, with a cardboard frame inside to hold the electronic bits. It’s “splash proof” and certainly cheap to make, but we’re a little worried about cooling and physical protection for the electronics inside, as they’re neither cheap nor rugged components.

So what’s inside? An Nvidia Jetson TX2 board, a LiPo battery with some charging circuitry, and a standard webcam. The special sauce, however, is the software, which is available on GitHub. [Moovel lab]’s engineers have put together a nice-looking, WiFi-accessible mobile UI for marking the areas where you’d like the software to identify and tally objects. The actual object detection and identification tasks are performed by the speedy YOLO neural network, a task the Nvidia board’s GPU is of course well suited for.

As the Open Data Cam’s unblinking glass eye gazes upon our urban environments, it will log its observations in an ancient and mysterious language: CSV. It’s up to you, human, to interpret this information and use it for good.
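The counting idea itself is simple enough to sketch. This isn't the project's actual code, but assuming a tracker on top of the YOLO detector hands us per-frame detections as (object_id, label, x, y) tuples, tallying whatever crosses a counting line and logging it to CSV could look something like this in Python:

```python
import csv
import time

COUNT_LINE_Y = 240  # hypothetical counting line, in pixels from the top

def count_and_log(detections, last_y, writer):
    """Tally objects whose centers cross the counting line between frames.

    `detections` is assumed to be a list of (object_id, label, x, y)
    tuples from a tracker sitting on top of the YOLO detector.
    """
    for obj_id, label, x, y in detections:
        prev = last_y.get(obj_id)
        if prev is not None and prev < COUNT_LINE_Y <= y:
            # Center moved across the line since the last frame: log it.
            writer.writerow([time.strftime("%Y-%m-%dT%H:%M:%S"), label, obj_id])
        last_y[obj_id] = y

with open("observations.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "label", "object_id"])
    last_y = {}
    # In the real loop you'd grab a frame, run YOLO plus tracking, then:
    # count_and_log(detections, last_y, writer)
```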

A summary video and build time lapse are embedded after the break.

Continue reading “Open Data Cam Combines Camera, GPU, And Neural Network In An Artisanal DIY Cereal Box”

Brain Cell Electronics Explains Wetware Computing Power

Neural networks use electronic analogs of the neurons in our brains. But it doesn’t seem likely that simply wiring up enough electronic neurons would create a human-brain-like thinking machine. Consider that animal brains are sometimes larger than ours (a sperm whale’s brain weighs 17 pounds), yet we don’t consider whales as smart as humans, or even as dogs, which have much smaller brains. MIT researchers have discovered differences between human brain cells and animal ones that might help clear up some of that mystery. You can see a video about the work they’ve done below.

Neurons have long finger-like structures known as dendrites. These act like comparators, taking input from other neurons and firing if the inputs exceed a threshold. As with any conductor, the longer the dendrite, the weaker the signal it delivers. Naively, this seems bad for humans. To understand why, consider a rat. A rat’s cortex has six layers, just like ours. However, whereas the rat’s brain is tiny and 30% cortex, our brains are much larger and 75% cortex. So a dendrite reaching from layer 5 to layer 1 has to be much longer than the analogous dendrite in the rat’s brain.

These longer dendrites do lead to more signal loss in human brains, and the MIT study confirmed this using human brain cells (healthy ones removed during surgery to get access to diseased tissue). The researchers think this greater loss is actually a benefit, however, because it helps isolate neurons from one another, increasing the computing capability of a single neuron. One of the researchers called this “electrical compartmentalization.” Dig into the conclusions in the research paper.

We couldn’t help but wonder if this research offers new insights for neural network computing. We already use numeric weights to simulate dendrite threshold action, so presumably learning algorithms already settle on weaker links where that helps. The takeaway, perhaps, is that less interaction between neurons and groups of neurons may sometimes be more useful than more.
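To make the comparator analogy concrete (this is our toy illustration, not anything from the study), here's a single thresholded neuron in Python, with a made-up attenuation factor standing in for dendrite length:

```python
import numpy as np

def neuron(inputs, weights, threshold, attenuation=1.0):
    """A crude dendrite-as-comparator model (illustrative, not the study's).

    `attenuation` loosely stands in for dendrite length: a longer dendrite
    (smaller factor) weakens the summed input before the threshold test,
    isolating the neuron from its inputs.
    """
    activation = attenuation * np.dot(inputs, weights)
    return 1.0 if activation > threshold else 0.0

inputs = np.array([0.9, 0.2, 0.7])
weights = np.array([0.5, 0.3, 0.8])
print(neuron(inputs, weights, threshold=0.8))                   # 1.0: fires
print(neuron(inputs, weights, threshold=0.8, attenuation=0.5))  # 0.0: damped below threshold
```

Damping the summed input is all it takes to keep the same stimulus from firing the neuron, which is roughly the isolation effect the researchers are describing.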

Watching them probe neurons under the microscope reminded us of probing on an IC die. There’s a close tie between understanding the brain and building better machines so we try to keep an eye on the research going on in that area.

Continue reading “Brain Cell Electronics Explains Wetware Computing Power”

Piano Genie Trained A Neural Net To Play 88-Key Piano With 8 Arcade Buttons

Want to sound great on a piano using only your coding skills? Enter Piano Genie, the result of a research project from Google AI and DeepMind. You press any of eight buttons, and a neural network makes sure the piano plays something cool, compensating in real time for what’s already been played.

Almost anyone new to playing music who sits down at a piano will produce a sound similar to that of a cat chasing a mouse through a tangle of kitchen pots. Who can blame them, given the sea of 88 inexplicable keys sitting before them? But they’ll quickly realize that playing keys in succession in one direction produces sounds with consistently increasing or decreasing pitch. They’ll also learn that pressing keys for different lengths of time can improve the melody. But there are still 88 of them and plenty more to learn, such as which keys sound harmonious when played together.

Piano Genie training architecture

With Piano Genie, gone are the daunting 88 keys, replaced with a 3D-printed box of eight arcade-style buttons which they made by following this Adafruit tutorial. A neural network maps those eight buttons to something meaningful on the 88-key piano keyboard. Being a neural network, the mapping isn’t a fixed one-to-one or even one-to-many. Instead, it’s trained to play something that should sound good given what was played previously, and it won’t necessarily be the same each time.

To train it, they used data from approximately 1400 performances of the International Piano e-Competition. The result can be quite good, as you can see and hear in the video below. The buttons feed into a computer, but the computer plays the result on an actual piano.

For training, the neural network really consists of two networks. One is an encoder, in this case a recurrent neural network (RNN) which takes piano sequences and learns to output a vector. In the diagram, the vector is in the middle and has one element for each of the eight buttons. The second network is the decoder, also an RNN. It’s trained to turn that eight-element vector back into the same music which was fed into the encoder.

Once trained, only the decoder is used. The eight-button keyboard feeds into the vector, and the decoder outputs suitable notes. Because they’re RNNs, rather than learning a fixed one-to-many mapping, the network takes into account what was previously played in order to come up with something that hopefully sounds pleasing. To give the user a little more creative control, they also trained it to recognize when the user is playing a rising or falling melody and to output the same. See their paper for how they turned polyphonic sound into monophonic and back again.
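The real model is built in TensorFlow and quantizes the encoder's output rather than emitting button logits directly, but the encoder-decoder shape is easy to sketch. Here's a loose PyTorch rendition of ours, using the 8-button and 88-key vocabularies from the paper:

```python
import torch
import torch.nn as nn

NUM_BUTTONS, NUM_KEYS, HIDDEN = 8, 88, 128

class Encoder(nn.Module):
    """Training-time half: maps note sequences to 8-way button choices."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NUM_KEYS, HIDDEN)
        self.rnn = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, NUM_BUTTONS)

    def forward(self, notes):                 # notes: (batch, time) key indices
        out, _ = self.rnn(self.embed(notes))
        return self.head(out)                 # (batch, time, 8) button logits

class Decoder(nn.Module):
    """Performance-time half: maps button presses back to 88-key notes."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NUM_BUTTONS, HIDDEN)
        self.rnn = nn.LSTM(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, NUM_KEYS)

    def forward(self, buttons, state=None):   # buttons: (batch, time) in 0..7
        out, state = self.rnn(self.embed(buttons), state)
        return self.head(out), state          # note logits + carried RNN state
```

At playing time only the decoder runs, one button press per step, and carrying `state` forward between steps is what lets earlier notes shape later ones.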

If you prefer a different style of music you can train it on a MIDI collection of your own choosing using their open-sourced model. Or you can try it out as is right now through their web interface. I’ll admit, I started out just banging on it, producing the same noise I would get if I just hammered away randomly on a piano. Then I switched to thinking of making melodies and the result started sounding better. So some music background and practice still helps. For the video below, the researcher admits to having already played for a few hours.

This isn’t the first project we’ve covered by these Google researchers. Another was this music synthesizer again using neural networks but this time with a Raspberry Pi. And if our discussion of recurrent neural networks went a bit over your head, check out our overview of neural networks.

Continue reading “Piano Genie Trained A Neural Net To Play 88-Key Piano With 8 Arcade Buttons”

Jump Into AI With A Neural Network Of Your Own

One of the difficulties in learning about neural networks is finding a problem that is complex enough to be instructive but not so complex as to impede learning. [ThomasNield] had an idea: Create a neural network to learn if you should put a light or dark font on a particular colored background. He has a great video explaining it all (see below) and code in Kotlin.

[Thomas] is very interested in optimization, so his approach is very much based on the mathematics and algorithms of optimization. One handy thing is that there is already a deterministic algorithm for making this determination. He found it on Stack Exchange, but we’re sure it’s in a textbook or paper somewhere. The existing algorithm makes the neural network rather redundant in practice, but it makes training easy, since you can use it to algorithmically generate a labeled training set.
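The usual form of that algorithm is a perceived-luminance cutoff. Here's a hedged Python sketch, with the 186 threshold being one conventional choice among the Stack Exchange answers:

```python
import random

def ideal_font_shade(r, g, b):
    """Pick black or white text for an (r, g, b) background, 0-255 channels."""
    luminance = 0.299 * r + 0.587 * g + 0.114 * b
    return "black" if luminance > 186 else "white"

# Because the labels come from a formula, a training set is free to generate:
training_set = []
for _ in range(1000):
    rgb = tuple(random.randrange(256) for _ in range(3))
    training_set.append((rgb, ideal_font_shade(*rgb)))
```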

Once trained, the neural network works well. He wrote a small GUI and you can even select among various models.

Don’t let the Kotlin put you off. It runs on the same JVM as Java, and the code is very similar, other than that it infers types and adds functional programming tools. The libraries and the principles employed will work with Java and, in many cases, the concepts will apply no matter what you are doing.

If you want to hardware accelerate your neural networks, there’s a stick for that. If you prefer C and you want something lean and mean, try TINN.

Continue reading “Jump Into AI With A Neural Network Of Your Own”

Artistic Collaboration With AI

Ever since Google’s Deep Dream results were made public several years ago, there has been major interest in the application of AI and neural network technologies to artistic endeavors. [Helena Sarin] has been experimenting in just this field, exploring the possibilities of collaborating with the ghost in the machine.

This image was generated with a landscape model using a dataset containing covers of Japanese poetry books.

The work is centered around the use of Generative Adversarial Networks, or GANs. [Helena] describes using a GAN to create artworks as a sort of game. An apprentice attempts to create new works in the style of their established master, while a critic attempts to determine whether the artworks are created by the master or the apprentice. As the apprentice improves, the critic must become more discerning; as the critic becomes more discerning, the apprentice must improve further. It is through this mechanism that the model improves itself.
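In code, that game is just two models trained in alternation. Below is a deliberately tiny PyTorch sketch of the basic loop, purely for illustration; [Helena]'s CycleGAN work adds a second generator-discriminator pair and a cycle-consistency loss on top of this scheme:

```python
import torch
import torch.nn as nn

# Toy 64x64 grayscale setup; [Helena]'s real workflow uses CycleGAN.
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                  nn.Linear(256, 64 * 64), nn.Tanh())          # the apprentice
D = nn.Sequential(nn.Linear(64 * 64, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                           # the critic

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(real_images):                  # real_images: (batch, 4096)
    batch = real_images.size(0)
    fake_images = G(torch.randn(batch, 100))

    # Critic's turn: tell the master's work from the apprentice's.
    opt_d.zero_grad()
    d_loss = (loss_fn(D(real_images), torch.ones(batch, 1)) +
              loss_fn(D(fake_images.detach()), torch.zeros(batch, 1)))
    d_loss.backward()
    opt_d.step()

    # Apprentice's turn: try to fool the critic.
    opt_g.zero_grad()
    g_loss = loss_fn(D(fake_images), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
```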

[Helena] has spent time experimenting with CycleGAN in the artistic realm after first using it in a work project, and has primarily trained it on her own original artworks to create new pieces with wild and exciting results. She shares several tips on how best to work with the technology, covering the necessary computing and storage requirements as well as ways to step outside the box and create more diverse outputs.

Neural networks are hot lately, with plenty of research going on in the field. There are plenty of fun projects, too – like this cartoonifying camera we featured recently.

Sorting LEGO Is Like Making A Box Of Chocolates

Did you know that chocolate candy production and sorting LEGO bricks have something in common? They both use the same techniques for turning clumps of chocolates or bricks into individual ones moving down a conveyor belt. At least that’s what [Paco Garcia] found out when making his LEGO Sorter.

Sorting LEGO bricks using guides

However, he didn’t find that out right away. He first experimented with his own techniques, learning that if he fed bricks onto his conveyor belt by dropping a batch of them in a line perpendicular to the direction of belt travel, no subsequent separation attempt of his worked. He then turned to [akiyuky’s] LEGO sorter for inspiration and dropped them onto the belt at an angle, ensuring that some bricks would be in front of others. A further trick he found, shown in the image here and demonstrated very well in the chocolate sorting video below, is to use guides on the belt to create speed differentials. A brick moves slower than the conveyor belt while pressed against a guide, but once it leaves the guide it accelerates to the speed of the belt, pulling away from the bricks still at the guide and thus separating from them.

A further discovery had nothing to do with chocolate production, unless maybe for quality control. Once an individual brick had been separated out, it had to be classified. To do that, he used Google’s Inception v3 neural network. But first, he had to retrain it to recognize different types of LEGO bricks, something we’ve seen done before for recognizing playing cards. And to do the retraining, he needed many images of different bricks, all separated into their different types. That’s where he came up with a clever trick: he used his own sorter for that. For example, to get a bunch of images of 1×1 bricks in different colors and orientations, he simply ran them through the sorter, saving the images to files and assigning them to the 1×1 brick class. He then used his desktop machine with a GeForce GT 730 GPU for the retraining, taking around 2.7 seconds per brick. For sorting, though, he runs the trained neural network on a Raspberry Pi, taking 3.8 seconds for each brick. The resulting sorter works quite well, sorting with 89% accuracy. Watch it in action in the video below.
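The per-brick classification step is the easiest to sketch. This isn't [Paco]'s code (his build predates current tooling), but running one cropped brick image through a retrained Inception-style classifier boils down to something like the following, with the model file and class list here being hypothetical:

```python
import numpy as np
import tensorflow as tf

# Hypothetical model file and classes, saved after transfer learning on bricks.
model = tf.keras.models.load_model("lego_inception_retrained.h5")
class_names = ["1x1", "1x2", "2x2", "2x4"]

def classify_brick(path):
    """Classify one cropped brick image, the per-brick step run on the Pi."""
    img = tf.keras.utils.load_img(path, target_size=(299, 299))  # Inception v3 input size
    x = tf.keras.utils.img_to_array(img)[np.newaxis] / 255.0
    probs = model.predict(x)[0]
    return class_names[int(np.argmax(probs))], float(probs.max())

label, confidence = classify_brick("brick_0001.jpg")
print(f"{label} ({confidence:.0%})")
```
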
Continue reading “Sorting LEGO Is Like Making A Box Of Chocolates”

Hummingbirds, 3D Printing, And Deep Learning

Setting camera traps in your garden to see what local wildlife is around is quite popular. But [Chris Lam] has just one subject in mind: the hummingbird. He devised a custom setup to capture the footage he wanted using some neat tech.

To attract the hummingbirds, [Chris] used an off-the-shelf feeder — no need to re-invent the wheel there. To obtain the closeup footage required, a 4K action cam was used. This was attached to the feeder with a 3D-printed mount that [Chris] designed.

When it came to detecting the presence of a hummingbird in the video, there were various approaches that could have been considered. On the hardware side, PIR and ultrasonic distance sensors are popular for projects of this kind, but [Chris] wanted a pure software solution. The commonly used motion detection libraries for this type of project might have fallen over here, since the whole feeder was swinging in the air on a string, so [Chris] opted for machine learning.

A ResNet architecture was used to run classification on each frame, determining whether the image contained a hummingbird or not. The initial attempt was not greatly successful, but after cropping each frame to a smaller area around the feeder, classification accuracy greatly increased. After a bit of FFmpeg magic, the selected snippets were concatenated into one video containing all the interesting parts; you can see the result in the clip after the break.
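That FFmpeg magic is worth a sketch of its own. Assuming the classifier has produced a list of timestamped snippets (the times below are made up), cutting them out and splicing them together can be done with stream copies and the concat demuxer:

```python
import subprocess

# Hypothetical (start, duration) snippets flagged by the classifier, in seconds.
snippets = [(12.0, 4.5), (87.3, 6.0), (203.8, 5.2)]

# Cut each hummingbird visit out of the source footage without re-encoding...
for i, (start, dur) in enumerate(snippets):
    subprocess.run(["ffmpeg", "-ss", str(start), "-i", "feeder_4k.mp4",
                    "-t", str(dur), "-c", "copy", f"clip_{i:03d}.mp4"],
                   check=True)

# ...then splice the clips together with FFmpeg's concat demuxer.
with open("clips.txt", "w") as f:
    for i in range(len(snippets)):
        f.write(f"file 'clip_{i:03d}.mp4'\n")
subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0", "-i", "clips.txt",
                "-c", "copy", "highlights.mp4"], check=True)
```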

It seems that machine learning and wildlife cams are a match made in heaven. We’ve already written about a proof-of-concept project which identifies different animals in the footage when motion is detected.

Continue reading “Hummingbirds, 3D Printing, And Deep Learning”