Do you have a great application for computer vision, but can’t spare the cost of the hardware needed to build it? Or perhaps you just need a deadline to pull you away from endless doom scrolling? Either way, the OpenCV team wants you to enter their OpenCV AI Competition 2021 and they’re willing to pitch in hardware to make it happen.
This competition is part of OpenCV’s 20th anniversary celebration, and the field of machine vision has changed a lot in those two decades. OpenCV started within Intel, harnessing the power of their high-end CPUs, but today the excitement is around specialized acceleration hardware for vision processing. Which is why OpenCV threw their support behind, and lent their name to, the OpenCV AI Kit (OAK) Kickstarter we covered a few months ago. Since then, the hardware has been produced and has started to arrive in project backers’ hands. (Barring pandemic-related shipping restrictions…)
This shiny new hardware is the competition’s focus. Phase one solicits team proposals for putting an OAK-D’s power to novel use. University teams may have up to ten members; general teams are limited to four. Each team’s geographic home will put them in one of six global regions. Proposals must be submitted by January 27th, 2021. By February 11th, judges will select the best twenty-five general and ten university team proposals from each region, and every member of those teams gets an OAK-D unit to turn their idea into reality by the phase two deadline of June 27th. That adds up to as many as 1,200 OAK-D modules (25 general teams of four plus 10 university teams of ten is 200 units per region, across six regions) available to anyone who can convince the judges they have a great idea and are capable of bringing it to fruition. Is that you? Of course it is!
Teams will also receive additional resources such as an allotment of cloud compute credits to train their models, and naturally all of the tutorials and sample code released as part of the OAK Kickstarter. No explicit resource for project team organization is mentioned, but of course our own Hackaday.io is available to support you. Best of luck to everyone who enters, and we look forward to seeing all the projects this contest will bring to life.
The leap to self-driving cars could be as game-changing as the one from horse power to engine power. If cars prove able to drive themselves better than humans do, the safety gains could be enormous: auto accidents were the #8 cause of death worldwide in 2016. And who doesn’t want to turn travel time into something either truly restful or alternatively productive?
But getting there is a big challenge, as Alfred Jones knows all too well. As the Head of Mechanical Engineering at Lyft’s Level 5 self-driving division, he leads the team building the roof racks and other gear that give the vehicles their sensors and computational hardware. In his keynote talk at Hackaday Remoticon, Alfred Jones walks us through what each level of self-driving means, how the problem is being approached, and where the sticking points are found between what’s being tested now and a truly steering-wheel-free future.
Check out the video below, and take a deeper dive into the details of his talk.
[James Bruton] is an impressive roboticist, building all kinds of robots from tracked exploration robots to Boston Dynamics-esque legged robots. However, many of those robots are proof-of-concept builds that explore machine learning, computer vision, or unique movements and characteristics. This latest build makes use of everything he’s learned from those projects but strives to be useful on a day-to-day basis as well, and it marks the beginning of a series he is doing on building a Really Useful Robot. (Video, embedded below.)
While the robot isn’t quite finished yet, his first video in the series explores the idea behind the build and the construction of the robot’s base. He wants this robot to be able to navigate its environment and also carry out instructions such as retrieving a small object from a table. For that it needs a heavy base, built from large 3D-printed panels, with two brushless motors (with encoders) driving the custom wheels, along with a suspension built from casters and a special hinge. Also included in the base is an Nvidia Jetson which runs the robot and handles some of the heavier lifting such as image recognition.
As of this writing, [James] has also released the second video in the series, which goes into detail about the robot’s mapping and navigation functions, and we’re excited to see the finished product. Of course, if you want to see some of [James]’s other projects, be sure to check out his tracked rover or his investigations into legged robots.
Today it is pretty easy to build a robot with an onboard camera and have fun manually driving it through that first-person view. But builders with dreams of autonomy quickly learn there is a lot of work between installing the camera and autonomously executing a “go to chair” command. Fortunately we can draw upon work such as the View Parsing Network by [Bowen Pan, Jiankai Sun, et al].
When a camera image comes into a computer, it is merely a large array of numbers representing red, green, and blue color values, and our robot has no idea what that image represents. Over the past few years, computer vision researchers have found pretty good solutions for the problems of image classification (“is there a chair?”) and segmentation (“which pixels correspond to the chair?”). While useful for building an online image search engine, this is not quite enough for robot navigation.
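For the curious, here’s roughly what those two tasks look like in practice. This isn’t code from the paper, just a minimal sketch using pretrained torchvision models (assuming a recent torchvision with the weights-enum API); the `room.jpg` filename and the chair class index are only for illustration.

```python
# Minimal sketch: image classification and semantic segmentation with
# pretrained torchvision models. Filenames and class indices are examples.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((520, 520)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("room.jpg").convert("RGB")   # hypothetical input image
batch = preprocess(img).unsqueeze(0)

# Classification: "is there a chair somewhere in this image?"
classifier = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
with torch.no_grad():
    class_scores = classifier(batch)
print("Top class index:", class_scores.argmax(dim=1).item())

# Segmentation: "which pixels correspond to the chair?"
segmenter = models.segmentation.deeplabv3_resnet50(
    weights=models.segmentation.DeepLabV3_ResNet50_Weights.DEFAULT).eval()
with torch.no_grad():
    seg_logits = segmenter(batch)["out"]      # shape: (1, classes, H, W)
per_pixel_class = seg_logits.argmax(dim=1)    # class index for every pixel
chair_mask = (per_pixel_class == 9)           # 9 is "chair" in the Pascal VOC labels
print("Chair pixels:", chair_mask.sum().item())
```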
A robot needs to translate those pixel coordinates into a real-world layout, and this is the problem the View Parsing Network sets out to solve. Detailed in Cross-view Semantic Segmentation for Sensing Surroundings (DOI 10.1109/LRA.2020.3004325), the system takes in multiple camera views looking all around the robot. The results of image segmentation are then synthesized into a 2D top-down segmented map of the robot’s surroundings. (“Where is the chair located?”)
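The paper’s own code is the place to go for the real thing, but the overall shape of the idea can be sketched in a few lines of PyTorch. To be clear, this is our own simplification, not the authors’ network, and the grid size, feature width, and class count below are arbitrary: each view is encoded, re-mapped onto a shared top-down grid by a learned per-view transform, and the fused result is decoded into a semantic map.

```python
# Rough structural sketch of cross-view semantic segmentation (not the
# authors' code): per-view features are projected into one top-down frame,
# summed, and decoded into a 2D semantic map of the surroundings.
import torch
import torch.nn as nn

class ViewParsingSketch(nn.Module):
    def __init__(self, num_views=4, feat=64, grid=32, classes=20):
        super().__init__()
        # Shared encoder turns each camera image into a small feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(grid),
        )
        # One learned transform per view: re-arranges the flattened
        # first-person feature map onto the top-down grid.
        self.view_transforms = nn.ModuleList(
            [nn.Linear(grid * grid, grid * grid) for _ in range(num_views)]
        )
        # Decoder produces a per-cell class map ("where is the chair located?").
        self.decoder = nn.Conv2d(feat, classes, 1)

    def forward(self, views):               # views: (batch, num_views, 3, H, W)
        b, v, c, h, w = views.shape
        fused = 0
        for i in range(v):
            f = self.encoder(views[:, i])   # (b, feat, grid, grid)
            flat = f.flatten(2)             # (b, feat, grid*grid)
            topdown = self.view_transforms[i](flat).view_as(f)
            fused = fused + topdown         # accumulate features from all views
        return self.decoder(fused)          # (b, classes, grid, grid) top-down map

# Example: four surround cameras at 128x128 resolution.
maps = ViewParsingSketch()(torch.rand(1, 4, 3, 128, 128))
print(maps.shape)  # torch.Size([1, 20, 32, 32])
```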
The authors documented how to train a view parsing network in a virtual environment, and described the procedure for transferring the trained network to run on a physical robot. Today this process demands a significantly higher skill level than “download an Arduino sketch,” but we hope such modules will become more plug-and-play in the future, for better and smarter robots.
With the Jetson Nano, NVIDIA has done a fantastic job of bringing GPU-accelerated machine learning to the masses. For less than the cost of a used graphics card, you get a turn-key Linux computer that’s ready and able to handle whatever AI code you throw at it. But if you’re trying to set up a lab for 30 students, the cost of even relatively affordable development boards can really add up.
Which is why [Tea Vui Huang] has developed jetson-emulator. This Python library provides a work-alike environment for NVIDIA’s own “Hello AI World” tutorials designed for the Jetson family of devices, with one big difference: you don’t need the actual hardware. In fact, it doesn’t matter what kind of computer you’ve got; with this library, anything that can run Python 3.7.9 or newer can take you through NVIDIA’s getting-started tutorial.
So what’s the trick? Well, if you haven’t guessed already, it’s all fake. Obviously it can’t actually run GPU-accelerated code without a GPU, so the library [Tea] has developed simply pretends. It provides virtual images and even “live” camera feeds to which randomly generated objects have been assigned.
The original NVIDIA functions have been rewritten to work with these feeds, so when you call something like net.Classify(img) against one of them you’ll get a report of what faux objects were detected. The output will look just like it would if you were running on a real Jetson, down to providing fictitious dimensions and positions for the bounding boxes.
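To give a flavor of what students would actually type, here’s a minimal classification example in the style of NVIDIA’s Hello AI World tutorial. On real hardware the imports would be `jetson.inference` and `jetson.utils`; the `jetson_emulator` module path shown here is our assumption about the work-alike library, so check the project’s README for the exact spelling.

```python
# Minimal sketch in the style of NVIDIA's "Hello AI World" image
# classification example. The jetson_emulator import path is assumed;
# on a real Jetson you would import jetson.inference and jetson.utils.
import jetson_emulator.inference as inference
import jetson_emulator.utils as utils

net = inference.imageNet("googlenet")        # emulated GoogLeNet classifier
img = utils.loadImage("example_0.jpg")       # "virtual" image from the emulator
class_id, confidence = net.Classify(img)     # reports a randomly assigned object
class_desc = net.GetClassDesc(class_id)
print("image is recognized as '{:s}' (class #{:d}) with {:.2f}% confidence".format(
    class_desc, class_id, confidence * 100))
```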
If you’re a hacker looking to dive into machine learning and computer vision, you’d be better off getting a $59 Jetson Nano and a webcam. But if you’re putting together a workshop that shows a dozen people the basics of NVIDIA’s AI workflow, jetson-emulator will allow everyone in attendance to run code and get results back regardless of what they’ve got under the hood.
NVIDIA kicked off their line of GPU-accelerated single board computers back in 2014 with the Jetson TK1, a $200 USD development system for those looking to get involved with the burgeoning world of so-called “edge computing”. It was designed to put high-performance computing in a package small and energy-efficient enough to be integrated directly into products, rather than connecting to a data center half-way across the world.
The TK1 was an impressive piece of hardware, but not something the hacker and maker community was necessarily interested in. For one thing, it was fairly expensive. But perhaps more importantly, it was clearly geared more towards industry types than consumers. We did see the occasional project using the TK1 and the subsequent TX1 and TX2 boards, but they were few and far between.
Then came the Jetson Nano. Its 128-core Maxwell GPU still packed plenty of power and was fully compatible with NVIDIA’s CUDA architecture, but its smaller size and $99 price tag made it far more attractive for hobbyists. According to the company’s own figures, the number of active Jetson developers has more than tripled since the Nano’s introduction in March of 2019. With the platform accessible to a larger and more diverse group of users, new and innovative applications for machine learning started pouring in.
Cutting the price of the entry level Jetson hardware in half was clearly a step in the right direction, but NVIDIA wanted to bring even more developers into the fray. So why not see if lightning can strike twice? Today they’ve officially announced that the new Jetson Nano 2GB will go on sale later this month for just $59. Let’s take a close look at this new iteration of the Nano to see what’s changed (and what hasn’t) from last year’s model.
Ever wonder what your favorite board game sounds like? Neither did we. Thankfully [Sara Adkins] did, and created a step sequencer called Let’s Go that uses the classic board game Go as input.
In the game Go, two players place black and white tokens on a grid, vying for control of the board. As the game progresses, the configuration of game pieces gets more complex and coincidentally begins to resemble Conway’s Game of Life (or a weird QR Code). [Sara] saw music in the evolving arrangement of circles and transformed the ancient board game into a modern instrument so others could hear it too.
To an observer, [Sara’s] adaptation looks nearly indistinguishable from the version played in China 2,500 years ago, with the exception of an overhead webcam and a nearby laptop, of course. The laptop uses OpenCV to digitize the board layout. It feeds that information via Open Sound Control (OSC) into the popular music creation software Max/MSP (though an open-source version could probably be implemented in Pure Data), where it’s used to control a step sequencer. Each row on the board represents an instrumental voice (melodic for white pieces, percussive for black ones), and each column corresponds to a beat.
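We can imagine the front end of that pipeline looking something like the sketch below. To be clear, this isn’t [Sara]’s code, just an illustration of the idea using OpenCV and python-osc: sample each grid intersection, decide whether it holds a black stone, a white stone, or nothing, and ship each column off as a sequencer step. The brightness thresholds, OSC address, and port are placeholders.

```python
# Illustrative sketch (not the project's actual code): read the Go board with
# OpenCV and send one OSC message per column for a sequencer patch to play.
# Assumes the webcam frame is already cropped to just the board.
import cv2
import numpy as np
from pythonosc.udp_client import SimpleUDPClient

GRID = 19                                    # a full Go board is 19x19 lines
client = SimpleUDPClient("127.0.0.1", 8000)  # wherever the Max/MSP patch listens

def read_board(frame):
    """Return a GRID x GRID array: 0 empty, 1 black stone, 2 white stone."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    board = np.zeros((GRID, GRID), dtype=int)
    for row in range(GRID):
        for col in range(GRID):
            # Sample a small patch centred on each grid intersection.
            y = int((row + 0.5) * h / GRID)
            x = int((col + 0.5) * w / GRID)
            patch = gray[max(y - 5, 0):y + 5, max(x - 5, 0):x + 5]
            mean = patch.mean()
            if mean < 60:        # dark patch: black stone (threshold is a guess)
                board[row, col] = 1
            elif mean > 190:     # bright patch: white stone
                board[row, col] = 2
    return board

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    board = read_board(frame)
    # One message per column: each column is a beat, each row a voice.
    for col in range(GRID):
        client.send_message("/lets_go/step", [col] + board[:, col].tolist())
cap.release()
```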
Every new game is a new piece of music that starts out simple and gradually increases in complexity. The music evolves with the board, and adds a new dimension for players to interact with the game. If you want to try it out yourself, [Sara] has the project fully documented on her website, and all of the code is available on GitHub. Now we’re just left wondering what other games sound like — [tinkartank] already answered that question for chess, but what about Settlers of Catan?