DOOM will forever be remembered as one of the founding games of the entire FPS genre. It also stands as a game which has long been a fertile ground for hackers and modders. [Nick Bild] decided to bring gesture control to iD’s classic shooter, courtesy of machine learning.
The setup consists of a Jetson Nano fitted with a camera, which films the player and uses a convolutional neural network to recognise the player’s various gestures. Once recognised, an API request is sent to a laptop playing Doom which simulates the relevant keystrokes. The laptop is hooked up to a projector, creating a large screen which allows the wildly gesturing player to more easily follow the action.
The neural network was trained on 3300 images – 300 per gesture. [Nick] found that using a larger data set actually performed less well, as he became less diligent in reliably performing the gestures. This demonstrates that quality matters in training networks, as well as quantity.
Reports are that the network is fairly reliable, and it appears to work quite well. Unfortunately, playability is limited as it’s not possible to gesture for more than one key at once. Overall though, it serves as a tidy example of how to do gesture recognition with CNNs.
If you’re not convinced by this demonstration, you might be interested to learn that neural networks can also be used to name tomatoes. If you don’t want to roll your own pose detection, check out this selfie drone that uses CMU’s OpenPose library. Video after the break.
Continue reading “Gesture Controlled Doom”
People take their tabletop games very, very seriously. [Andrew Lauritzen], though, has gone far above and beyond in pursuit of a fair game. The game in question is Star War: X-Wing, a strategy wargame where miniature pieces are moved according to rolls of the dice. [Andrew] suspected that commercially available dice were skewing the game, and the automated machine-vision dice tester shown in the video after the break was the result.
The rig is a very clever design that maximizes the data set with as little motion as possible. The test chamber is a box with clear ends that can be flipped end-for-end by a motor; walls separate the chamber into four channels to test multiple dice on each throw, and baffles within the channels assure randomization. A webcam is positioned below the chamber to take a snapshot of each “throw”, which is then analyzed in OpenCV. This scheme has the unfortunate effect of looking at the dice from the table’s perspective, but [Andrew] dealt with that in true hacker fashion: he ignored it since it didn’t impact the statistics he was interested in.
And speaking of statistics, he generated a LOT of them. The 62-page report of results from his study is an impressive piece of work, which basically concludes that the dice aren’t fair due to manufacturing variability, and that players could use this fact to cheat. He recommends pooled sets of dice to eliminate advantages during competitive play.
This isn’t the first automated dice roller we’ve seen around these parts. There was the tweeting dice-bot, the Dice-O-Matic, and all manner of electronic dice throwers. This one goes the extra mile to keep things fair, and we appreciate that.
Continue reading “Automated Dice Tester Uses Machine Vision To Ensure A Fair Game”
Adversarial attacks are not something new to the world of Deep Networks used for image recognition. However, as the research with Deep Learning grows, more flaws are uncovered. The team at the University of KU Leuven in Belgium have demonstrated how, by simple using a colored photo held near the torso of a man can render him invisible to image recognition systems based on convolutional neural networks.
Convolutional Neural Networks or CNNs are a class of Deep learning networks that reduces the number of computations to be performed by creating hierarchical patterns from simpler and smaller networks. They are becoming the norm for image recognition applications and are being used in the field. In this new paper, the addition of color patches is seen to confuse the image detector YoLo(v2) by adding noise that disrupts the calculations of the CNN. The patch is not random and can be identified using the process defined in the publication.
This attack can be implemented by printing the disruptive pattern on a t-shirt making them invisible to surveillance system detection. You can read the paper[PDF] that outlines the generation of the adversarial patch. Image recognition camouflage that works on Google’s Inception has been documented in the past and we hope to see more such hacks in the future. Its a new world out there where you hacking is colorful as ever.
Continue reading “The Cloak Of Invisibility Against Image Recognition”
Nvidia is back at it again with another awesome demo of applied machine learning: artificially transforming standard video into slow motion – they’re so good at showing off what AI can do that anyone would think they were trying to sell hardware for it.
Though most modern phones and cameras have an option to record in slow motion, it often comes at the expense of resolution, and always at the expense of storage space. For really high frame rates you’ll need a specialist camera, and you often don’t know that you should be filming in slow motion until after an event has occurred. Wouldn’t it be nice if we could just convert standard video to slow motion after it was recorded?
That’s just what Nvidia has done, all nicely documented in a paper. At its heart, the algorithm must take two frames, and artificially create one or more frames in between. This is not a manual algorithm that interpolates frames, this is a fully fledged deep-learning system. The Convolutional Neural Network (CNN) was trained on over a thousand videos – roughly 300k individual frames.
Since none of the parameters of the CNN are time-dependent, it’s possible to generate as many intermediate frames as required, something which sets this solution apart from previous approaches. In some of the shots in their demo video, 30fps video is converted to 240fps; this requires the creation of 7 additional frames for every pair of consecutive frames.
The video after the break is seriously impressive, though if you look carefully you can see the odd imperfection, like the hockey player’s skate or dancer’s arm. Deep learning is as much an art as a science, and if you understood all of the research paper then you’re doing pretty darn well. For the rest of us, get up to speed by wrapping your head around neural networks, and trying out the simplest Tensorflow example.
Continue reading “Nvidia Transforms Standard Video Into Slow Motion Using AI”
This past semester I added research to my already full schedule of math and engineering classes, as any masochistic student eagerly would. Packed schedule aside, how do you pass up the chance to work on implementing 360° virtual teleportation to anywhere in the world, in real-time. Yes, it is indeed the same concept as the cult worshipped Star Trek transporter, minus the ability to physically be at the location. Perhaps we can add a, “beam me up, Scotty” command when shutting down.
The research lab I was working with is the Laboratory for Immersive CommunicatiON (LION). It’s funded by NSF, Microsoft, and Adobe and has been on the pursuit of VR teleportation for some time now. There’s a lot of cool technologies at work here, like drones which are used as location collection devices. A network of drones will survey landscape anywhere in the world and build the collection assets needed for recreating it in VR. Okay, so a swarm of drones might seem a little intimidating at first, but when has emerging technology not?
Continue reading “360 Live VR Teleportation Uses Drones, Neural Networks, And Perseverance”
Identifying ham radio signals used to be easy. Beeps were Morse code, voice was AM unless it sounded like Donald Duck in which case it was sideband. But there are dozens of modes in common use now including TV, digital data, digital voice, FM, and more coming on line every day. [Randaller] used CUDA to build a neural network that could interface with an RTL-SDR dongle and can classify the signals it hears. Since it is a neural network, it isn’t so much programmed to do it as it is trained. The proof of concept has training to distinguish FM, SECAM, and tetra. However, you can train it to recognize other modulation schemes if you want to invest the time into it.
Continue reading “Neural Network Learns SDR Ham Radio”
[Eric] has put together a simple python script to scrape election results from CNN.com. It uses urllib2 to return the popular and electoral votes for each party and throws an ElectionWon exception when CNN calls the race. He’s planning on hooking this to DMX controlled RGB LED lighting that will shift to either blue or red as the night progresses. It’s a great starting point if you want to pull off something similar.
You may remember [Eric] for building the IKEA MAME table and the TRS-80 wireless terminal.
UPDATE: [Garrett] of macetech is putting the finishing touches on his version which uses 32 ShiftBrite modules and 2 4-digit displays controlled by a CuBLOC.