How To Roll Your Own Custom Object Detection Neural Network

Real-time object detection, which uses neural networks and deep learning to rapidly identify and tag objects of interest in a video feed, is a handy feature with great hacker potential. Happily, it’s also possible to make customized CNNs (convolutional neural networks) tailored for one’s own needs, and that process just got easier thanks to some new documentation for the Vizy “AI camera” by Charmed Labs.

Raspberry Pi-based Vizy camera

Charmed Labs has been making hacker-friendly machine vision devices for a long time, and the Vizy camera impressed us mightily when we checked it out last year. Out of the box, Vizy has a perfectly functional object detector application that runs locally on the device, and can detect and tag many common everyday objects in real time. But what if that default application doesn’t quite meet one’s project needs? Good news, because it’s possible to create a custom-trained CNN, and that process got a lot more accessible thanks to step-by-step examples of training a model to recognize hands doing rock-paper-scissors.

Person and cat with machine-generated tags identifying them
Default object detection works well, but sometimes one needs custom results.

The basic process is this: Start with a variety of images that show the item of interest. Then identify and label the item of interest in each photo. These photos (a “training set”) are then sent to Google Colab, which will be used to generate a neural network. The resulting CNN model can then be downloaded and used, to see how well it performs.

Of course things rarely work perfectly the first time around, so at this point it’s pretty common for some refinement to be needed to increase accuracy. Luckily there are a number of tools to help do this without creating a new model from scratch, so it’s just a matter of tweaking until things perform acceptably.

Google Colab is free and the resulting CNNs are implemented in the TensorFlow Lite framework, meaning it’s possible to use them elsewhere. So if custom object detection has been holding up a project idea of yours, this might be what gets you over that hump.

Vizy “AI Camera” Wants To Make Machine Vision Less Complex

Vizy, a new machine vision camera from Charmed Labs, has blown through their crowdfunding goal on the promise of making machine vision projects both easier and simpler to deploy. The camera, which starts around $250, integrates a Raspberry Pi 4 with built-in power and shutdown management, and comes with a variety of pre-installed applications so one can dive right in.

The Sony IMX477 camera sensor is the same one found in the Raspberry Pi high quality camera, and supports capture rates of up to 300 frames per second (under the right conditions, anyway.) Unlike the usual situation faced by most people when a Raspberry Pi is involved, there’s no need to worry about adding a real-time clock, enclosure, or ensuring shutdowns happen properly; it’s all taken care of.

‘Birdfeeder’ application can automatically identify and upload images of visitors.

Charmed Labs are the same folks behind the Pixy and Pixy 2 cameras, and Vizy goes further in the sense that everything required for a machine vision project has been put onboard and made easy to use and deploy, even the vision processing functions work locally and have no need for a wireless data connection (though one is needed for things like automatic uploading or sharing.) For outdoor or remote applications, there’s a weatherproof enclosure option, and wireless connectivity in areas with no WiFi can be obtained by plugging in a USB cellular modem.

A few of the more hacker-friendly hardware features are things like a high-current I/O header and support for both C/CS and M12 lenses for maximum flexibility. The IR filter can also be enabled or disabled via software, so no more swapping camera modules for ones with the IR filter removed. On the software side, applications are all written in Python and use open software like Tensorflow and OpenCV for processing.

The feature list looks good, but Vizy also seems to have a clear focus. It looks best aimed at enabling projects with the following structure:

Detect Things (people, animals, cars, text, insects, and more) and/or Measure Things (size, speed, duration, color, count, angle, brightness, etc.)

Perform an Action (for example, push a notification or enable a high-current I/O) and/or Record (save images, video, or other data locally or remotely.)

The Motionscope application tracking balls on a pool table. (Click to enlarge)

A good example of this structure is the Birdfeeder application which comes pre-installed. With the camera pointed toward a birdfeeder, animals coming for a snack are detected. If the visitor is a bird, Vizy identifies the species and uploads an image. If the animal is not a bird (for example, a squirrel) then Vizy can detect that as well and, using the I/O header, could briefly turn on a sprinkler to repel the hungry party-crasher. A sample Birdfeeder photo stream is here on Google Photos.

Motionscope is a more unusual but very interesting-looking application, and its purpose is to capture moving objects and measure the position, velocity, and acceleration of each. A picture does a far better job of explaining what Motionscope does, so here is a screenshot of the results of watching some billiard balls and showing what it can do.

Vizy The AI Camera Aims To Ease Machine Vision

Cameras are getting smarter and more capable than ever, able to run embedded machine vision algorithms and pull off tricks far beyond what something like a serial camera and microcontroller board would be capable of, and the upcoming Vizy aims to be even smarter and easier to use yet. Vizy is the work of Charmed Labs, and this isn’t their first foray into accessible machine vision. Charmed Labs are the same folks behind the Pixy and Pixy 2 cameras. Vizy’s main goal is to make object detection and classification easy, with thoughtful hardware features and a browser-based interface.

Vizy can identify common birds with “Birdfeeder”, one of the several built-in applications that uses local processing only.

The usual way to do machine vision is to get a USB camera and run something like OpenCV on a desktop machine to handle the processing. But Vizy leverages a Raspberry Pi 4 to provide a tightly-integrated unit in a small package with a variety of ready-to-run applications. For example, the “Birdfeeder” application comes ready to take snapshots of and identify common species of bird, while also identifying party-crashers like squirrels.

The demonstration video on their page shows off using the built-in high-current I/O header to control a sprinkler, repelling non-bird intruders with a splash of water while uploading pictures and video clips. The hardware design also looks well thought out; not only is there a safe shutdown and low-power mode for the Raspberry Pi-based hardware, but the lens can be swapped and the camera unit itself even contains an electrically-switched IR filter.

Vizy has a Kickstarter campaign planned, but like many others, Charmed Labs is still adjusting to the changes the COVID-19 pandemic has brought. You can sign up to be notified when Vizy launches; we know we’ll be keen for a closer look once it does. Easier machine vision is always a good thing, because it helps free people to focus on clever ideas like machine vision-based tool alignment.