Understanding Deep Learning: Free MIT Press EBook For Instructors And Students

The recently published book Understanding Deep Learning by [Simon J. D. Prince] is notable not only because it focuses primarily on the concepts behind Deep Learning, which should make it highly accessible to most readers, but also because it can be either purchased as a hardcover from MIT Press or downloaded for free from the Understanding Deep Learning website. If you intend to use it for coursework, a separate instructor answer booklet and other resources can be purchased, but student resources like Python notebooks are also freely available. In the book’s preface, the author invites readers to send feedback whenever they find an issue.

Continue reading “Understanding Deep Learning: Free MIT Press EBook For Instructors And Students”

Generating 3D Scenes From Just One Image

The LucidDreamer project ties a variety of functions into a pipeline that can take a source image (or generate one from a text prompt) and “lift” its content into 3D, creating highly detailed Gaussian splats that look great and can even be navigated.

Gaussian splatting is a method for rendering NeRFs (Neural Radiance Fields), which are themselves a way of reconstructing complex scenes from sparse 2D sources, and for doing it quickly. If that is all news to you, that’s probably because this field has sprung up with dizzying speed since the original NeRF concept was introduced barely a handful of years ago.
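For a feel of what a radiance field actually computes, here is a toy sketch (in Python/NumPy, with entirely made-up density and color values) of the volume-rendering step at the heart of NeRF: samples along a camera ray are alpha-composited into a single pixel color. A real system gets those samples from a trained network or a cloud of Gaussians rather than hard-coded arrays.

```python
import numpy as np

# Toy volume rendering along a single camera ray, the core step of NeRF.
# In a real NeRF the densities and colors come from a trained MLP queried
# at each sample point; here they are hard-coded purely for illustration.

# Distances of the sample points along the ray, and the spacing between them
t = np.linspace(0.0, 4.0, 64)
delta = np.diff(t, append=t[-1] + (t[1] - t[0]))

# Pretend the ray passes through a reddish blob centred at t = 2.0
sigma = 5.0 * np.exp(-((t - 2.0) ** 2) / 0.1)             # volume density
color = np.tile(np.array([0.9, 0.1, 0.1]), (len(t), 1))    # RGB per sample

# Alpha compositing: alpha_i = 1 - exp(-sigma_i * delta_i),
# transmittance T_i = prod_{j<i} (1 - alpha_j)
alpha = 1.0 - np.exp(-sigma * delta)
transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
weights = transmittance * alpha

pixel_rgb = (weights[:, None] * color).sum(axis=0)
print("rendered pixel colour:", pixel_rgb)
```

Gaussian splatting skips the per-sample network queries and instead rasterizes many small Gaussians directly, which is a large part of why it renders so much faster.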

What makes LucidDreamer neat is the fact that it does so much with so little. The project page has interactive scenes to explore, but there is also a demo for those who would like to try generating scenes from scratch (some familiarity with the basic tools is expected, however).

In addition to the source code itself, the research paper is available for those with a hunger for the details. Read it quick, because at the pace this field is expanding, it honestly might be obsolete if you wait too long.

Explore Neural Radiance Fields In Real-time, Even On A Phone

Neural Radiance Fields (NeRF) is a method of reconstructing complex 3D scenes from sparse 2D inputs, and the field has been growing by leaps and bounds. Viewing a reconstructed scene is still nontrivial, but there’s a new innovation on the block: SMERF is a browser-based method of enabling full 3D navigation of even large scenes, efficient enough to render in real time on phones and laptops.

Don’t miss the gallery of demos, which will run on anything from powerful desktops to smartphones. Notable is the distinct lack of the blurry, cloudy, or distorted patches that tend to appear in under-observed parts of a NeRF scene (such as indoor corners and ceilings). The technical paper explains SMERF’s approach in more detail.

NeRFs as a concept first hit the scene in 2020 and the rate of advancement has been simply astounding, especially compared to demos from just last year. Watch the short video summarizing SMERF below, and marvel at how it compares to other methods, some of which are themselves only months old.

Continue reading “Explore Neural Radiance Fields In Real-time, Even On A Phone”

Here’s Why GPUs Are Deep Learning’s Best Friend

If you’re curious about how fancy graphics cards actually work, and why they are so well-suited to AI-type applications, then take a few minutes to read [Tim Dettmers]’s explanation of why this is so. It’s not a terribly long read, but while it does get technical there are also car analogies, so there’s something for everyone!

He starts off by saying that most people know GPUs are scarily efficient at matrix multiplication and convolution, but that what really makes them so useful is their ability to work with large amounts of memory very efficiently.
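To put a rough number on that claim, here is a quick and dirty benchmark that times a large matrix multiplication on each device. It is only a sketch and assumes PyTorch and a CUDA-capable GPU are available; the exact figures will vary enormously between machines.

```python
import time
import torch

def time_matmul(device, n=4096, repeats=10):
    """Time an n-by-n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    # Warm-up so one-off setup costs don't pollute the measurement
    torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

print(f"CPU: {time_matmul('cpu'):.4f} s per matmul")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.4f} s per matmul")
```

On typical hardware the GPU comes out well ahead, often by one or two orders of magnitude, which is the gap the cargo analogy below is getting at.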

Essentially, a CPU is a latency-optimized device while GPUs are bandwidth-optimized devices. If a CPU is a race car, a GPU is a cargo truck. The main job in deep learning is to fetch and move cargo (memory, actually) around. Both devices can do this job, but in different ways. A race car moves quickly, but can’t carry much. A truck is slower, but far better at moving a lot at once.

Continue reading “Here’s Why GPUs Are Deep Learning’s Best Friend”

Neuromorphic Computing: What Is It And Where Are We At?

For the last hundred or so years, collectively as humanity, we’ve been dreaming, thinking, writing, singing, and producing movies about a machine that could think, reason, and be intelligent in a similar way to us. Stories such as “Erewhon,” published in 1872 by Samuel Butler, Edgar Allan Poe’s “Maelzel’s Chess Player,” and the 1927 film “Metropolis” explored the idea that a machine could think and reason like a person, and not in a magical or fantastical way. They drew from the automata of ancient Greece and Egypt and combined them with the notions of philosophers such as Aristotle, Ramon Llull, Hobbes, and thousands of others.

Their notions of the human mind led them to believe that all rational thought could be expressed as algebra or logic. Later, the arrival of circuits, computers, and Moore’s law led to continual speculation that human-level intelligence was just around the corner. Some have heralded it as the savior of humanity, while others portray it as a calamity, with a second intelligent entity rising to crush the first (humans).

The flame of computerized artificial intelligence has burned brightly a few times before, such as in the 1950s, the 1980s, and the 2010s. Unfortunately, both prior AI booms were followed by an “AI winter,” in which the field fell out of fashion for failing to deliver on expectations. These winters are often blamed on a lack of computing power, an inadequate understanding of the brain, or hype and over-speculation. In the midst of our current AI summer, most AI researchers focus on using the steadily increasing computing power available to increase the depth of their neural nets. Despite their name, neural nets share only surface-level similarities with the neurons in the brain that inspired them.

Some researchers believe that human-level general intelligence can be achieved by simply adding more and more layers to these simplified convolutional systems and feeding them an ever-increasing trove of data. This point is backed up by the incredible things these networks can produce, and they get a little better every year. However, despite the wonders deep neural nets produce, they still specialize and excel at just one thing. A superhuman Atari-playing AI cannot make music or think about weather patterns without a human adding those capabilities. Furthermore, the quality of the input data dramatically impacts the quality of the net, and the ability to make inferences is limited, producing disappointing results in some domains. Some think that recurrent neural nets will never gain the sort of general intelligence and flexibility that our brains offer.

However, some researchers are trying to create something more brainlike by, you guessed it, more closely emulating the brain. Given that we are in a golden age of computer architecture, now seems like the right time to create new hardware. This type of hardware is known as neuromorphic hardware.
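To give a flavor of what “more closely emulating the brain” tends to mean in practice, neuromorphic designs are usually built around spiking neurons rather than the continuous-valued units of a deep net. Below is a toy leaky integrate-and-fire neuron in Python, a software stand-in for dynamics that real neuromorphic chips implement directly in analog or digital circuitry; all constants are made up for illustration.

```python
import numpy as np

def lif_neuron(input_current, dt=1e-3, tau=0.02, v_rest=0.0,
               v_thresh=1.0, v_reset=0.0):
    """Toy leaky integrate-and-fire neuron: returns spike times in seconds."""
    v = v_rest
    spikes = []
    for step, i_in in enumerate(input_current):
        # Membrane potential leaks toward rest while integrating the input
        v += dt / tau * (v_rest - v + i_in)
        if v >= v_thresh:          # threshold crossed: emit a spike, reset
            spikes.append(step * dt)
            v = v_reset
    return spikes

# A constant drive for half a second produces a regular spike train
current = np.full(500, 1.5)
print(lif_neuron(current))
```

Real neuromorphic chips implement large numbers of neurons like this one in parallel and communicate with spikes rather than dense matrix math.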

Continue reading “Neuromorphic Computing: What Is It And Where Are We At?”

Shhh… Robot Vacuum Lidar Is Listening

There are millions of IoT devices out there in the wild and, though they are not conventional computers, they can be hacked by alternative methods. From firmware hacks to social engineering, there are tons of ways to break into these little devices. Now, four researchers at the National University of Singapore and one from the University of Maryland have published a new hack that allows audio capture using lidar reflection measurements.

The hack revolves around the fact that sound waves, which are mechanical waves, cause objects in a room to vibrate slightly. When a lidar device bounces a beam off an object, the accuracy of the receiving system allows for measurement of the slight vibrations caused by the sound in the room. The experiment used a human voice played through a simple speaker as well as a sound bar, and the reflecting surfaces were common household items such as a trash can, a cardboard box, a takeout container, and polypropylene bags. Robot vacuum cleaners will usually be facing such objects on a day-to-day basis.
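A crude way to picture the effect: the measured distance to a vibrating surface is the nominal range plus a micrometer-scale wiggle at the audio frequency, so stripping out the slow-moving part of the range signal leaves a (very noisy) copy of the sound. The Python sketch below fakes this with a single 440 Hz tone and a basic high-pass filter; it is only an illustration of the principle, not the researchers’ actual processing chain, and the sample rate and amplitudes are invented.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 8000                       # pretend the lidar reports range at 8 kHz
t = np.arange(0, 1.0, 1 / fs)

# Simulated range signal: ~1.5 m to a trash can, plus a ~1 um vibration
# caused by a 440 Hz tone, plus measurement noise.
nominal_range = 1.5 + 0.001 * np.sin(2 * np.pi * 0.5 * t)    # slow drift
vibration = 1e-6 * np.sin(2 * np.pi * 440 * t)                # audio-induced
measured = nominal_range + vibration + 2e-7 * np.random.randn(len(t))

# High-pass filter to throw away the static range and slow drift,
# keeping only the audio-band wiggle.
b, a = butter(4, 100 / (fs / 2), btype="highpass")
recovered = filtfilt(b, a, measured)

# The 440 Hz component should now dominate the spectrum.
spectrum = np.abs(np.fft.rfft(recovered))
peak_hz = np.fft.rfftfreq(len(recovered), 1 / fs)[np.argmax(spectrum)]
print(f"strongest recovered frequency: {peak_hz:.0f} Hz")
```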

The bigger issue is writing the filtering algorithm that is able to extract the relevant information and separate it from the noise, and this is where the bulk of the research paper is focused (PDF). Current developments in Deep Learning help make the hack easier to implement. Commercial lidar is designed for mapping and is therefore optimized for bouncing off of non-reflective surfaces. This is the opposite of what you want for a laser microphone, which usually targets a reflective surface like a window to pick up latent vibrations from the sound inside a room.

Deep Learning algorithms are employed to get around this shortfall, identifying speech as well as other audio sequences despite the sensor itself being less than ideal, and the team reports achieving an accuracy of 90%. This lidar-based spying is even possible when the robot in question is docked, since the system can be configured to turn on specific sensors, but the exploit depends on the ability to alter the firmware, something the team accomplished using the Dustcloud exploit, which was presented at DEF CON in 2018.

You don’t need to tear down your robot vacuum cleaner for this experiment, since there are a lot of lidar-based rovers out there. We’ve also seen open-source lidar sensors that are even better suited to experimental purposes.

Thanks for the tip, [Qes]!

How To Run ML Applications On Particle Hardware

With the release of TensorFlow Lite at Google I/O 2019, the accessible machine learning library is no longer limited to applications with access to GPUs. You can now run machine learning algorithms on microcontrollers much more easily, improving on-board inference and computation.

[Brandon Satrom] published a demo on how to run TFLite on Particle devices (tested on the Photon, Argon, Boron, and Xenon), making it possible to make predictions on live data with pre-trained models. While some of the easier computation that occurs on MCUs only requires manipulating data with existing equations (mapping analog inputs to a percentage range, for instance), many applications require understanding large, complex sets of sensor data gathered in real time. In those cases it’s difficult to get accurate results from a simple equation.

The current method is to train ML models on specialty hardware, deploy the models on cloud infrastructure, and backhaul sensor data to the cloud for inference. By running the inference and decision-making on-board, MCUs can simply take action without backhauling any data.

He starts off by constructing a simple TFLite model for MCU execution, using mean squared error for the loss and stochastic gradient descent for the optimizer. After training the model on sample data, you can save it and convert it to a C array for the MCU. On the MCU, you load the model, the TFLite libraries, and an operations resolver, and instantiate an interpreter and tensors. From there you invoke the model on the MCU and see your results!
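The training-and-conversion half of that workflow looks roughly like the sketch below. This is a minimal stand-in rather than [Brandon Satrom]’s actual code: a one-neuron model fitted to made-up data with MSE loss and SGD as described, then converted to a TensorFlow Lite flat buffer and dumped into a C array that the firmware can compile in.

```python
import numpy as np
import tensorflow as tf

# Made-up training data: y = 2x - 1, the classic "hello world" regression
x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=np.float32).reshape(-1, 1)
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=np.float32).reshape(-1, 1)

# Single-neuron model, trained with SGD on mean squared error
model = tf.keras.Sequential([tf.keras.Input(shape=(1,)),
                             tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mean_squared_error")
model.fit(x, y, epochs=200, verbose=0)

# Convert to a TensorFlow Lite flat buffer...
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# ...and emit it as a C array the MCU firmware can compile in
with open("model_data.h", "w") as f:
    f.write("const unsigned char g_model_data[] = {\n")
    f.write(", ".join(str(b) for b in tflite_model))
    f.write("\n};\n")
    f.write(f"const unsigned int g_model_data_len = {len(tflite_model)};\n")
print(f"model is {len(tflite_model)} bytes")
```

On the Particle side, the generated array is what gets handed to the TFLite Micro interpreter when the model is loaded.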

[Thanks dcschelt for the tip!]