Imagine that you’re serving on a jury, and you’re given an image taken from a surveillance camera. It looks pretty much like the suspect, but the image has been “enhanced” by an AI from the original. Do you convict? How does this weigh out on the scales of reasonable doubt? Should you demand to see the original?
AI-enhanced, upscaled, or otherwise modified images are tremendously realistic. But what they’re showing you isn’t reality. When we wrote about this last week, [Denis Shiryaev], one of the authors of one of the methods we highlighted, weighed in in the comments to point out that these modifications aren’t “restorations” of the original. While they might add incredibly fine detail, for instance, they don’t recreate or restore reality. The neural net creates its own reality, out of millions and millions of faces that it’s learned.
And for the purposes of identification, that’s exactly the problem: the facial features of millions of other people have been used to increase the resolution. Can you identify the person in the pixelized image? Can you identify that same person in the resulting up-sampling? If the question put before the jury was “is the defendant a former president of the USA?” you’d answer the question differently depending on which image you were presented. And you’d have a misleading level of confidence in your ability to judge the AI-retouched photo. Clearly, informed skepticism on the part of the jury is required.
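To see why up-sampling can't recover the original, it helps to notice that shrinking an image throws information away: many different high-resolution images collapse to exactly the same handful of pixels. The sketch below is only an illustration with made-up pixel values and a simple block-averaging `downsample` helper (our own assumption, not any particular enhancement tool's method).

```python
def downsample(image, factor):
    """Average each factor x factor block of a 2D grayscale image into one pixel."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# Two clearly different 4x4 "high-resolution" images (hypothetical values).
image_a = [
    [0, 200, 50, 50],
    [200, 0, 50, 50],
    [100, 100, 255, 0],
    [100, 100, 0, 255],
]
image_b = [
    [100, 100, 0, 100],
    [100, 100, 100, 0],
    [0, 200, 127, 128],
    [200, 0, 128, 127],
]

low_a = downsample(image_a, 2)
low_b = downsample(image_b, 2)

print(low_a)           # [[100.0, 50.0], [100.0, 127.5]]
print(low_a == low_b)  # True: the low-res pixels cannot tell the two apart
```

Any up-sampler handed those four pixels has to pick one of the many possible originals, and a neural net picks the one that looks most like its training data.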
Unfortunately, we’ve all seen countless examples of “zoom, enhance” in movies and TV shows being successfully used to nab the perps and nail their convictions. We haven’t seen nearly as much detailed analysis of how adversarial neural networks create faces out of a scant handful of pixels. This, combined with the almost magical resolution of the end product, would certainly sway a jury of normal folks. On the other hand, the popularity of intentionally misleading “deep fakes” might help educate the public to the dangers of believing what they see when AI is involved.
This is just one example. Keeping the public interested in and educated on the deep workings and limitations of the technology that’s running our world is more important than ever before, but some of the material is truly hard. How do we separate the science from the magic?
It was a trope all too familiar in the 1990s — law enforcement in movies and TV taking a pixellated, blurry image, and hitting the magic “enhance” button to reveal suspects to be brought to justice. Creating data where there simply was none before was a great way to ruin immersion for anyone with a modicum of technical expertise, and spoiled many movies and TV shows.
Of course, technology marches on and what was once an utter impossibility often becomes trivial in due time. These days, it’s expected that a sub-$100 computer can easily differentiate between a banana, a dog, and a human, something that was unfathomable at the dawn of the microcomputer era. This capability is rooted in the technology of neural networks, which can be trained to do all manner of tasks formerly considered difficult for computers.
With neural networks and plenty of processing power at hand, there has been a flood of projects aiming to “enhance” everything from low-resolution human faces to old film footage, increasing resolution and filling in for the data that simply isn’t there. But what’s really going on behind the scenes, and is this technology really capable of accurately enhancing anything?
Continue reading ““Enhance” Is Now A Thing, But Don’t Believe What You See”
Often, when we think of getting a computer to complete a task, we contemplate creating complex algorithms that take in the relevant inputs and produce the desired behaviour. For some tasks, like navigating a car down a road, the sheer multitude of input data and its relationship to the desired output is so complex that it becomes near-impossible to code a solution. In these cases, it can make more sense to create a neural network and train the computer to do the job, as one would a human. On a more basic level, [Gigante] did just that, teaching a neural network to play a basic driving game with a genetic algorithm.
The game is a basic top-down 2D driving game. The AI is given the distance to the edge of the track along five lines at different angles projected from the front of the vehicle. The AI also knows its speed and direction. Given these 7 numbers, it calculates the outputs for steering, braking and acceleration to drive the car.
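A network with seven inputs and three outputs is small enough to sketch in a few lines. What follows is a minimal illustration, not [Gigante]'s code: the hidden layer size, the `tanh` activation, the missing bias terms, and the sample sensor values are all our own assumptions.

```python
import math
import random

N_INPUTS = 7    # five distance readings plus speed and direction
N_HIDDEN = 8    # assumed hidden layer size; the write-up doesn't say
N_OUTPUTS = 3   # steering, braking, acceleration

def random_weights(rng):
    """Random weight matrices for a 7 -> 8 -> 3 network (biases omitted)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    w2 = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUTPUTS)]
    return w1, w2

def forward(weights, inputs):
    """Run one forward pass; every output lands between -1 and 1."""
    w1, w2 = weights
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w2]

rng = random.Random(0)
weights = random_weights(rng)

# Hypothetical sensor snapshot: five wall distances, then speed and direction.
sensors = [0.9, 0.7, 0.5, 0.7, 0.9, 0.3, 0.0]
steering, braking, acceleration = forward(weights, sensors)
print(steering, braking, acceleration)
```

With random weights the car drives about as well as you'd expect, which is where the training comes in.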
To train the AI, [Gigante] started with 650 AIs, and picked the best performer, which just barely managed to navigate the first two corners. Marking this AI as the parent of the next generation, the AIs were iterated with random mutations. Each generation showed some improvement, with [Gigante] picking the best performers each time to parent the next generation. Within just four iterations, some of the cars are able to complete a full lap. With enough training, the cars are able to complete the course at great speed without hitting the walls at all.
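The select-and-mutate loop described above can be sketched roughly as follows. This is a simplified stand-in rather than the project's actual code: `toy_fitness` replaces "how far the car got around the track" with a made-up numeric target, and the mutation scale, genome length, and generation count are arbitrary assumptions. Only the single-best-parent selection and the starting population of 650 come from the write-up.

```python
import random

def evolve(fitness, genome_length, population_size=650,
           generations=30, mutation_scale=0.1, seed=0):
    """Keep the best genome each generation and breed mutated copies of it."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # The parent survives unchanged; every child is a noisy copy of it.
        children = [[gene + rng.gauss(0, mutation_scale) for gene in best]
                    for _ in range(population_size - 1)]
        population = [best] + children
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history

def toy_fitness(genome):
    """Hypothetical fitness: higher is better, peaking when every gene is 0.5."""
    return -sum((gene - 0.5) ** 2 for gene in genome)

best, history = evolve(toy_fitness, genome_length=10,
                       population_size=50, generations=20)
print(history[0], history[-1])  # fitness of the best genome, first vs. last
```

Because the parent is carried over unchanged, the best score can never get worse from one generation to the next. In the real game, the genome would be the network's weights and the fitness would be measured by actually driving the track.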
It’s a great example of machine learning and the use of genetic algorithms to improve fitness over time. [Gigante] points out that there’s no need for a human in the loop either, if the software is coded to self-measure the fitness of each generation. We’ve seen similar techniques used to play Mario, too. Video after the break.
Continue reading “Training A Neural Network To Play A Driving Game”
Ever since [Ian Goodfellow] and his colleagues invented the generative adversarial network (GAN) in 2014, hundreds of projects, from style transfers to poetry generators, have been produced using the concept of contesting neural networks. Unlike traditional neural networks, GANs can generate new data that fits statistically within the same set as the training set.
[Bernat Cuni], the one-man design team behind [cunicode], came up with the idea to generate beetles using this technique. Inspired by material published on Machine Learning for Artists, he decided to deploy some visual experiments with zoological illustrations. The training data came from a public domain book hosted at archive.org, found through the Biodiversity Heritage Library. A combination of OpenCV and ImageMagick helped with individually extracting illustrations to squared images.
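The "squared images" step matters because GAN training generally expects every image to have the same shape. As a rough idea of what squaring involves, here is a pure-Python sketch that centers an illustration on a square canvas. It is an assumption-laden stand-in for whatever OpenCV and ImageMagick commands [Cuni] actually used: the `pad_to_square` helper and the white fill value are hypothetical.

```python
def pad_to_square(image, fill=255):
    """Center a 2D grayscale image (a list of rows) on a square canvas."""
    height, width = len(image), len(image[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    canvas = [[fill] * side for _ in range(side)]
    for r, row in enumerate(image):
        # Overwrite the matching slice of the canvas row with the image row.
        canvas[top + r][left:left + width] = row
    return canvas

# A wide, dark 2x4 "illustration" placed on a white 4x4 canvas.
illustration = [[0, 0, 0, 0],
                [0, 0, 0, 0]]
square = pad_to_square(illustration)
for row in square:
    print(row)
```

Padding keeps the whole beetle in frame, where cropping to a square would risk cutting off legs and antennae.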
[Cuni] then ran a DCGAN with the data set, generating the first set of quasi-beetles after some tinkering with epochs and settings. After the failed first experiment, he went with StyleGAN, setting up a machine at PaperSpace with 1 GPU and running the training for >3 days on 128 px images. The results were much better but fairly small, and running the machine was quite expensive (>€125).
Given the success of the previous experiment, he decided to transfer over to Google CoLab, using their 12 hours of K80 GPU per run for free to generate some more beetles. With the intent of producing more HD beetles, he used Runway trained on 1024 px beetles, discovering much better results after 3000 steps. The model was moved over to Google CoLab to produce HD outputs.
He has since continued to experiment with the beetles, producing some confusing generated images and fun collectibles.
Continue reading “Generating Beetles From Public Domain Images”
Asking machines to make music by themselves is kind of a strange notion. They’re machines, after all. They don’t feel happy or hurt, and as far as we know, they don’t long for the affections of other machines. Humans like to think of music as being a strictly human thing, a passionate undertaking so nuanced and emotion-based that a machine could never begin to understand the feeling that goes into the process of making music, or even the simple enjoyment of it.
The idea of humans and machines having a jam session together is even stranger. But oddly enough, the principles of the jam session may be exactly what machines need to begin to understand musical expression. As Sara Adkins explains in her enlightening 2019 Hackaday Superconference talk, Creating with the Machine, humans and machines have a lot to learn from each other.
To a human musician, a machine’s speed and accuracy are enviable. So is its ability to make instant transitions between notes and chords. Humans are slow to learn these transitions and have to practice going back and forth repeatedly to build muscle memory. If the machine were capable, it would likely envy the human in terms of passionate performance and musical expression.
Continue reading “Sara Adkins Is Jamming Out With Machines”
Getting exact statistics on one’s physical activities at the gym is not an easy feat. While most people these days are familiar with or even regularly use one of those motion-based trackers on their wrist, there’s a big question as to their accuracy. After all, it’s all based on the motions of just one’s wrist, which as we know leads to amusing results in the tracker app when one does things like waving or clapping one’s hands, and cannot track leg exercises at the gym.
To get around the issue of limited sensor data, researchers at Carnegie Mellon University (Pittsburgh, USA) developed a system based around a camera and machine vision algorithms. While other camera solutions that attempt this suffer from occlusion while trying to track individual people as accurately as possible, this new system doesn’t try to track people’s joints. It merely tracks motion at specific exercise machines by looking for repetitive motion in the scene.
The basic concept is that repetitive motion usually indicates forms of exercise, and that no two people at the same type of machine will ever be fully in sync with their motions, so that merely a handful of pixels suffice to track motion at that machine by a single person. This also negates many privacy issues, as the resolution doesn’t have to be high enough to see faces or track joints with any degree of accuracy.
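One common way to spot repetition in a signal is autocorrelation: compare the signal with a delayed copy of itself and see which delay lines up best. The sketch below applies that idea to the brightness of a single patch of pixels over time. It is not the researchers' algorithm, just an illustration. The synthetic signals, the 20-frame repetition period, and the 0.5 threshold are all assumptions.

```python
import math
import random

def autocorrelation(signal, lag):
    """Normalized autocorrelation of a signal at the given lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return 0.0
    overlap = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return overlap / energy

def dominant_period(signal, min_lag, max_lag):
    """Return the (lag, score) pair with the highest autocorrelation."""
    scores = ((lag, autocorrelation(signal, lag))
              for lag in range(min_lag, max_lag + 1))
    return max(scores, key=lambda pair: pair[1])

FRAMES = 120
# Patch brightness at a machine in use: one repetition every 20 frames.
exercising = [math.sin(2 * math.pi * t / 20) for t in range(FRAMES)]
# Patch brightness with no repetitive motion: seeded random noise.
rng = random.Random(1)
idle = [rng.uniform(-1, 1) for _ in range(FRAMES)]

period, score = dominant_period(exercising, 5, 40)
_, idle_score = dominant_period(idle, 5, 40)

THRESHOLD = 0.5  # hypothetical cutoff for "this looks like exercise"
print(period, score > THRESHOLD)  # 20 True
print(idle_score > THRESHOLD)     # False
```

Counting repetitions then comes down to dividing the elapsed frames by the detected period, and none of it needs enough resolution to recognize a face.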
In experiments at the university’s gym, the system was evaluated over 5 days and 42 hours of video. It detected exercise activities in the scene with 99.6% accuracy, disambiguated between simultaneous activities with 84.6% accuracy, and recognized exercise types with 93.6% accuracy. Ultimately, repetition counts for specific exercises were within 1.7 counts.
Maybe an extended version of this would be a flying drone that captures one’s outdoor activities, finally giving one that 100% accurate exercise account while jogging?
Thanks to [Qes] for sending this one in!
When it comes to something as futuristic-sounding as brain-computer interfaces (BCI), our collective minds tend to zip straight to scenes from countless movies, comics, and other works of science-fiction (including more dystopian scenarios). Our mind’s eye fills with everything from the Borg and neural interfaces of Star Trek, to the neural recording devices with parent-controlled blocking features from Black Mirror, and of course the enslavement of the human race by machines in The Matrix.
And now there’s this Elon Musk guy, proclaiming that he’ll be wiring up people’s brains to computers starting next year, as part of this other company of his: Neuralink. Here the promises and imaginings are truly straight from the realm of sci-fi, ranging from ‘reading and writing’ to the brain, curing brain diseases and merging human minds with artificial intelligence. How much of this is just investor speak? Please join us as we take a look at BCIs, neuroprosthetics and what we can expect of these technologies in the coming years.
Continue reading “Brain-Computer Interfaces: Separating Fact From Fiction On Musk’s Brain Implant Claims”