Here’s Why GPUs Are Deep Learning’s Best Friend

If you have a curiosity about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers] explain why this is so. It’s not a terribly long read, but while it does get technical there are also car analogies, so there’s something for everyone!

He starts off by saying that most people know that GPUs are scarily efficient at matrix multiplication and convolution, but what really makes them most useful is their ability to work with large amounts of memory very efficiently.

Essentially, a CPU is a latency-optimized device while GPUs are bandwidth-optimized devices. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once.
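To put rough numbers on the race car versus truck analogy, here is a minimal sketch that models a memory fetch as a fixed startup latency plus the time to stream the data. The latency and bandwidth figures are hypothetical and chosen only to show the trade-off; they are not measurements of any real CPU or GPU.

```python
def transfer_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver num_bytes: fixed startup latency plus streaming time."""
    return latency_s + num_bytes / bandwidth_bytes_per_s

# Hypothetical figures, chosen only to illustrate the trade-off.
RACE_CAR = {"latency_s": 1e-7, "bandwidth_bytes_per_s": 50e9}   # CPU-like: quick start, small load
TRUCK = {"latency_s": 1e-6, "bandwidth_bytes_per_s": 500e9}     # GPU-like: slow start, huge load

for size in (64, 1_000_000_000):
    cpu = transfer_time(size, **RACE_CAR)
    gpu = transfer_time(size, **TRUCK)
    winner = "race car" if cpu < gpu else "truck"
    print(f"{size:>13,} bytes: CPU-like {cpu:.2e} s, GPU-like {gpu:.2e} s -> {winner}")
```

With these made-up figures the race car wins the tiny 64-byte fetch, while the truck wins easily once a gigabyte has to move, which is the situation deep learning workloads are usually in.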

Neuromorphic Computing: What Is It And Where Are We At?

For the last hundred or so years, collectively as humanity, we’ve been dreaming, thinking, writing, singing, and producing movies about a machine that could think, reason, and be intelligent in a similar way to us. Stories such as Samuel Butler’s “Erewhon” (published in 1872), Edgar Allan Poe’s “Maelzel’s Chess Player,” and the 1927 film “Metropolis” explored the idea that a machine could think and reason like a person, and not in a magical or fantastical way. They drew from the automata of ancient Greece and Egypt and combined the notions of philosophers such as Aristotle, Ramon Llull, Hobbes, and thousands of others.

Their notions of the human mind led them to believe that all rational thought could be expressed as algebra or logic. Later, the arrival of circuits, computers, and Moore’s law led to continual speculation that human-level intelligence was just around the corner. Some have heralded it as the savior of humanity, while others portray a calamity as a second intelligent entity rises to crush the first (humans).

The flame of computerized artificial intelligence has burned brightly a few times before, such as in the 1950s, 1980s, and 2010s. Unfortunately, both prior AI booms were followed by an “AI winter” in which the field fell out of fashion for failing to deliver on expectations. These winters are often blamed on a lack of computer power, inadequate understanding of the brain, or hype and over-speculation. In the midst of our current AI summer, most AI researchers focus on using the steadily increasing computer power available to increase the depth of their neural nets. Despite their name, neural nets are only loosely inspired by the neurons in the brain and share only surface-level similarities with them.

Some researchers believe that human-level general intelligence can be achieved by simply adding more and more layers to these simplified convolutional systems fed by an ever-increasing trove of data. This point is backed up by the incredible things these networks can produce, and they get a little better every year. However, despite the wonders deep neural nets produce, they still specialize and excel at just one thing. A superhuman Atari-playing AI cannot make music or think about weather patterns without a human adding those capabilities. Furthermore, the quality of the input data dramatically impacts the quality of the net, and the ability to make an inference is limited, producing disappointing results in some domains. Some think that recurrent neural nets will never gain the sort of general intelligence and flexibility that our brains offer.

However, some researchers are trying to create something more brainlike by, you guessed it, more closely emulating a brain. Given that we are in a golden age of computer architecture, now seems the time to create new hardware. This type of hardware is known as neuromorphic hardware.


Shhh… Robot Vacuum Lidar Is Listening

There are millions of IoT devices out there in the wild and though not conventional computers, they can be hacked by alternative methods. From firmware hacks to social engineering, there are tons of ways to break into these little devices. Now, four researchers at the National University of Singapore and one from the University of Maryland have published a new hack to allow audio capture using lidar reflective measurements.

The hack revolves around the fact that audio waves, or any mechanical waves in a room, cause objects inside the room to vibrate slightly. When a lidar device bounces a beam off an object, the accuracy of the receiving system allows for measurement of the slight vibrations caused by the sound in the room. The experiment used a human voice transmitted from a simple speaker as well as a sound bar, and the surfaces used for reflection were common household items such as a trash can, cardboard box, takeout container, and polypropylene bags. Robot vacuum cleaners will usually be facing such objects on a day-to-day basis.
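To get a feel for how small the signal is, here is a toy sketch that treats lidar range readings as a large static distance plus a tiny vibration, then subtracts a moving average to leave only the fast-changing part. This is not the researchers’ pipeline. The moving-average approach, the function name, and every number below are assumptions made purely for illustration.

```python
import math

def remove_baseline(samples, window):
    """Subtract a centered moving average so only fast variations remain."""
    half = window // 2
    out = []
    for i, value in enumerate(samples):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        baseline = sum(samples[lo:hi]) / (hi - lo)
        out.append(value - baseline)
    return out

# Synthetic readings: a target 1.5 m away vibrating by 10 micrometres at 200 Hz,
# sampled at 2 kHz. All of these numbers are made up for illustration.
SAMPLE_RATE_HZ = 2000
readings = [
    1.5 + 10e-6 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE_HZ)
    for n in range(200)
]
vibration = remove_baseline(readings, window=11)
print(f"largest residual: {max(abs(v) for v in vibration):.2e} m")
```

On clean synthetic data this is easy. Real readings bring noise, poorly reflective targets, and a moving robot along with them.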

The bigger issue is writing the filtering algorithm that is able to extract the relevant information and separate out the noise, and this is where the bulk of the research paper is focused (PDF). Current developments in Deep Learning assist in making the hack easier to implement. Commercial lidar is designed for mapping, and is therefore optimized for reflecting off of non-reflective surfaces. This is the opposite of what you want for a laser microphone, which usually targets a reflective surface like a window to pick up latent vibrations from sound inside of a room.

Deep Learning algorithms are employed to get around this shortfall, identifying speech as well as audio sequences despite the sensor itself being less than ideal, and the team reports achieving an accuracy of 90%. This lidar-based spying is even possible when the robot in question is docked, since the system can be configured to turn on specific sensors. The exploit does depend on the ability to alter the firmware, something the team accomplished using the Dustcloud exploit which was presented at DEF CON in 2018.

You don’t need to tear down your robot vacuum cleaner for this experiment since there are a lot of lidar-based rovers out there. We’ve even seen open source lidar sensors that are even better for experimental purposes.

Thanks for the tip [Qes]

How To Run ML Applications On Particle Hardware

With the release of TensorFlow Lite at Google I/O 2019, the accessible machine learning library is no longer limited to applications with access to GPUs. You can now run machine learning algorithms on microcontrollers much more easily, improving on-board inference and computation.

[Brandon Satrom] published a demo on how to run TFLite on Particle devices (tested on Photon, Argon, Boron, and Xenon), making it possible to make predictions on live data with pre-trained models. While some of the easier computation that occurs on MCUs requires manipulating data with existing equations (mapping analog inputs to a percentage range, for instance), many applications require understanding large, complex sets of sensor data gathered in real time. It’s often more difficult to get accurate results from a simple equation.

The current method is to train ML models on specialty hardware, deploy the models on cloud infrastructure, and backhaul sensor data to the cloud for inference. By running the inference and decision-making on-board, MCUs can simply take action without backhauling any data.

He starts off by constructing a simple TFLite model for MCU execution, using mean squared error for loss and stochastic gradient descent for the optimization. After training the model on sample data, you can save the model and convert it to a C array for the MCU. On the MCU, you can load the model, TFLite libraries, and operations resolver, as well as instantiate an interpreter and tensors. From there you invoke the model on the MCU and see your results!
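The “convert it to a C array” step is less mysterious than it sounds: the saved model is just a blob of bytes that gets pasted into the firmware source. As a minimal sketch (not the tooling used in the demo), the hypothetical helper below renders bytes in much the same shape that `xxd -i` produces. The function name, the array name `g_model`, and the placeholder bytes are all assumptions for illustration.

```python
def to_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render raw bytes as a C unsigned char array plus a length constant."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# Placeholder bytes standing in for the contents of a converted model file.
fake_model = bytes(range(16))
print(to_c_array(fake_model, "g_model"))
```

In practice you would read the bytes from the converted model file rather than make them up, and the resulting array is what the interpreter on the MCU gets pointed at.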

[Thanks dcschelt for the tip!]

Largest Chip Ever Holds 1.2 Trillion Transistors

We get it, press releases are full of hyperbole. Cerebras recently announced they’ve built the largest chip ever. The chip has 400,000 cores and contains 1.2 trillion transistors on a die over 46,000 square mm in area. That’s roughly the same as a square about 8.5 inches on each side. But honestly, the WSE — Wafer Scale Engine — is just most of a wafer not cut up. Typically a wafer will have lots of copies of a device on it and it gets split into pieces.

According to the company, the WSE is 56 times larger than the largest GPU on the market. The chip boasts 18 gigabytes of storage spread around the massive die. The problem isn’t making such a beast — although a normal wafer is allowed to have a certain number of bad spots. The real problems come through things such as interconnections and thermal management.
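The headline figures are easy to sanity-check. The short sketch below uses only the numbers quoted above (46,000 square mm and the 56x ratio) to work out the side length of an equivalent square and the GPU die area that the ratio implies. The variable names are ours, and the implied GPU area is derived from those two figures rather than taken from the announcement.

```python
import math

DIE_AREA_MM2 = 46_000   # "over 46,000 square mm", so treat this as a lower bound
SIZE_RATIO = 56         # claimed size relative to the largest GPU
MM_PER_INCH = 25.4

side_mm = math.sqrt(DIE_AREA_MM2)
side_in = side_mm / MM_PER_INCH
implied_gpu_area_mm2 = DIE_AREA_MM2 / SIZE_RATIO

print(f"equivalent square: {side_mm:.1f} mm per side ({side_in:.2f} in)")
print(f"implied largest GPU die: about {implied_gpu_area_mm2:.0f} square mm")
```

The side length comes out just under 8.5 inches, which matches the figure above once you remember that 46,000 square mm is a floor rather than the exact area.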


Blisteringly Fast Machine Learning On An Arduino Uno

Even though machine learning AKA ‘deep learning’ / ‘artificial intelligence’ has been around for several decades now, it’s only recently that computing power has become fast enough to do anything useful with the science.

However, to fully understand how a neural network (NN) works, [Dimitris Tassopoulos] has stripped the concept down to pretty much the simplest example possible – a 3 input, 1 output network – and run inference on a number of MCUs, including the humble Arduino Uno. Miraculously, the Uno processed the network in an impressively fast prediction time of 114.4 μsec!

Whilst we did not test the code on an MCU, we just happened to have Jupyter Notebook installed so ran the same code on a Raspberry Pi directly from [Dimitris’s] bitbucket repo.

He explains in the project pages that, now that the hype about AI has died down a bit, it’s the right time for engineers to get into the nitty-gritty of the theory and start using some of the ‘tools’ such as Keras, which have now matured into something fairly useful.

In part 2 of the project, we get to see the guts of a more complicated NN with 3 inputs, a hidden layer with 32 nodes, and 1 output, which runs on an Uno at a much slower speed of 5600 μsec.
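To see why the bigger network is so much slower, it helps to count the work involved. The sketch below is a plain Python forward pass for both network shapes described above, along with a count of multiply-accumulate (MAC) operations per inference. It is not [Dimitris’s] code: the sigmoid activation, the helper names, and the weights are placeholder assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j][i] links input i to output j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Run inputs through each (weights, biases) layer in turn."""
    for weights, biases in layers:
        inputs = dense(inputs, weights, biases)
    return inputs

def mac_count(layers):
    """Multiply-accumulate operations needed for one inference."""
    return sum(len(weights) * len(weights[0]) for weights, _ in layers)

# Placeholder weights, not trained values from the project.
small = [([[0.5, -0.25, 0.75]], [0.0])]            # 3 inputs -> 1 output
large = [([[0.1, 0.2, 0.3]] * 32, [0.0] * 32),     # 3 inputs -> 32 hidden nodes
         ([[0.05] * 32], [0.0])]                   # 32 hidden nodes -> 1 output

x = [1.0, 0.0, 1.0]
print(forward(x, small), mac_count(small))
print(forward(x, large), mac_count(large))
```

The small network needs 3 MACs and one activation per inference, while the larger one needs 128 MACs and 33 activations. That is roughly a 43x jump in multiplies, in the same ballpark as the roughly 49x jump between the reported 114.4 μsec and 5600 μsec timings.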

This exploration of ML in the embedded world is NOT ‘high level’ research stuff that tends to be inaccessible and hard to understand. We have covered Machine Learning On Tiny Platforms Like Raspberry Pi And Arduino before, but not with such an easy and thoroughly practical example.

AI And Art Appreciation

In 2019, using AI to evaluate artwork is finally more productive than foolish. We all hope that someday soon our Roomba will judge our living habits and give unsolicited advice on how we could spruce things up with a few pictures and some natural light. There is already an extensive amount of Deep Learning work dedicated to photo recognition, but a team in Croatia is adapting those techniques for use on fine art. It makes sense that everything is geared toward cameras since most of us have a vast photographic portfolio, but fine art takes longer to render. Even so, the collection on Wikiart.org is vast and already a hotbed for computer classification work, so they set to work there.

As they modify existing convolutional neural networks, they check themselves by comparing results with human ratings to keep what works and discard what flops. Fortunately, fine art has a lot of existing studies and commentary, whereas the majority of photographs in the public domain have nothing more than a file name and maybe some EXIF data. The difference here is that photograph-parsing AI can say, “That is a STOP sign,” while the fine art AI can say, “That is a memorable painting of a sign.”