Vizy “AI Camera” Wants To Make Machine Vision Less Complex

Vizy, a new machine vision camera from Charmed Labs, has blown through their crowdfunding goal on the promise of making machine vision projects both easier to build and simpler to deploy. The camera, which starts around $250, integrates a Raspberry Pi 4 with built-in power and shutdown management, and comes with a variety of pre-installed applications so one can dive right in.

The Sony IMX477 camera sensor is the same one found in the Raspberry Pi High Quality Camera, and supports capture rates of up to 300 frames per second (under the right conditions, anyway). Unlike most Raspberry Pi projects, there’s no need to worry about adding a real-time clock, an enclosure, or making sure shutdowns happen cleanly; it’s all taken care of.

The ‘Birdfeeder’ application can automatically identify and upload images of visitors.

Charmed Labs are the same folks behind the Pixy and Pixy 2 cameras, and Vizy goes further: everything required for a machine vision project is onboard and made easy to use and deploy. Even the vision processing functions run locally, with no need for a wireless data connection (though one is needed for things like automatic uploading or sharing). For outdoor or remote applications, there’s a weatherproof enclosure option, and in areas with no WiFi, wireless connectivity can be had by plugging in a USB cellular modem.

Hacker-friendly hardware features include a high-current I/O header and support for both C/CS and M12 lenses for maximum flexibility. The IR filter can also be enabled or disabled in software, so there’s no more swapping camera modules for ones with the IR filter removed. On the software side, the applications are all written in Python and use open-source tools like TensorFlow and OpenCV for processing.

The feature list looks good, but Vizy also seems to have a clear focus: it looks best aimed at enabling projects with the following structure:

1. Detect Things (people, animals, cars, text, insects, and more) and/or Measure Things (size, speed, duration, color, count, angle, brightness, etc.)

2. Perform an Action (for example, push a notification or enable a high-current I/O) and/or Record (save images, video, or other data locally or remotely).

The Motionscope application tracking balls on a pool table.

A good example of this structure is the Birdfeeder application which comes pre-installed. With the camera pointed toward a birdfeeder, animals coming for a snack are detected. If the visitor is a bird, Vizy identifies the species and uploads an image. If the animal is not a bird (for example, a squirrel) then Vizy can detect that as well and, using the I/O header, could briefly turn on a sprinkler to repel the hungry party-crasher. A sample Birdfeeder photo stream is here on Google Photos.
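For a sense of how little code that detect-then-act structure really needs, here’s a minimal sketch in plain Python with OpenCV. The detect() and trigger_sprinkler() helpers are hypothetical stand-ins, not Vizy’s actual API: in practice the first would run a TensorFlow model and the second would drive the high-current I/O header.

```python
# A minimal detect-then-act loop in the spirit of the Birdfeeder app.
# detect() and trigger_sprinkler() are hypothetical stand-ins, not
# Vizy's actual API.
import cv2

def detect(frame):
    """Stand-in for an object detector; returns a list of class labels."""
    return []  # run TensorFlow inference here

def trigger_sprinkler():
    """Stand-in for switching a load on the high-current I/O header."""
    print("Sprinkler on!")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    labels = detect(frame)
    if "squirrel" in labels:           # Perform an Action
        trigger_sprinkler()
    elif "bird" in labels:             # Record
        cv2.imwrite("visitor.jpg", frame)
cap.release()
```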

Motionscope is a more unusual but very interesting application whose purpose is to capture moving objects and measure the position, velocity, and acceleration of each. A picture does a far better job of explaining this, so here is a screenshot of Motionscope watching some billiard balls.
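We don’t know Motionscope’s actual internals, but measurements like these boil down to finite differences over the tracked positions. A quick sketch of the math, with made-up position data and frame rate:

```python
# Sketch: derive velocity and acceleration from per-frame positions,
# which is what Motionscope-style measurements reduce to. Assumes the
# (x, y) positions are already tracked at a known frame rate.
import numpy as np

fps = 60.0
dt = 1.0 / fps

# (x, y) of one tracked ball, one row per frame (made-up data)
pos = np.array([[0.0, 0.0], [1.9, 0.1], [3.8, 0.2], [5.5, 0.3]])

vel = np.diff(pos, axis=0) / dt      # finite-difference velocity
acc = np.diff(vel, axis=0) / dt      # finite-difference acceleration

speed = np.linalg.norm(vel, axis=1)  # scalar speed per interval
print(speed)
print(acc)
```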

3D Printed Video Terminal Dials C For Cyberpunk

Created for the Disobey 2020 hacker conference in Finland, this Blade Runner inspired communications terminal isn’t just for decoration. It was part of an interactive game that required attendees to physically connect their conference badges and “call” different characters using the functional keypad on the front of the unit.

[Purkkaviritys] was in charge of designing the 3D printed enclosure for the device, which he says takes an entire 2 kg roll of filament to print. Unfortunately he wasn’t as involved in the electronics side of things, so we don’t have a whole lot of information about the internals beyond the fact that it’s powered by a Raspberry Pi 4, features a HyperPixel 4.0 display, and uses Power over Ethernet so it could be easily set up at the con with just a single cable run.

A look at the custom keypad PCB.

The keypad is a custom input device built around an Arduino Micro and Cherry MX Blue switches, with 3D printed keycaps to get that chunky payphone look and feel. [Purkkaviritys] mentions that the keypad is also responsible for controlling the RGB LED strips built into the sides of the terminal, and that the Raspberry Pi toggles the Caps Lock, Scroll Lock, and Num Lock states to select the different lighting patterns.
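We can only guess at how that lock-key signalling is implemented, but on Linux the keyboard LED state can be set with the KDSETLED ioctl, which reaches USB keyboards as standard HID LED reports, making a crude one-way channel from host to keypad. A sketch of what the Pi side might look like (the pattern encoding here is our invention, not [Purkkaviritys]’s code):

```python
# Sketch of the Pi-side trick: set the lock-key LED state via the Linux
# console ioctl. A USB keypad sees these as ordinary HID LED reports,
# so three lock keys give a 3-bit lighting-pattern selector.
# This is a guess at the mechanism, not the project's actual code.
import fcntl
import os

KDSETLED = 0x4B32                       # from <linux/kd.h>
LED_SCR, LED_NUM, LED_CAP = 0x01, 0x02, 0x04

def set_lighting_pattern(pattern):
    """Encode a 3-bit pattern number in the Scroll/Num/Caps LED states."""
    fd = os.open("/dev/console", os.O_WRONLY)
    try:
        fcntl.ioctl(fd, KDSETLED, pattern & 0x07)
    finally:
        os.close(fd)

set_lighting_pattern(LED_CAP | LED_NUM)  # select pattern 6
```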

Naturally we’d like to see more info on how this beauty was put together, but given that it was built for such a specific purpose, it’s not like you’d really need to duplicate the original configuration anyway. Thanks to [Purkkaviritys], you have the STL files to print off your own copy of the gloriously cyberpunk enclosure; all you’ve got to do now is figure out how to make video calls with it.

Continue reading “3D Printed Video Terminal Dials C For Cyberpunk”

Boost Your Animation To 60 FPS Using AI

The uses of artificial intelligence and machine learning continue to expand, with one of the more recent implementations being video processing. A new method can “fill in” frames to smooth out the appearance of video, which [LegoEddy] was able to use in one of his animated LEGO movies with some astonishing results.

His original animation of LEGO figures and sets was created at 15 frames per second. As an animator, he notes that getting more frames than this with traditional methods is orders of magnitude more work, at least in his studio. This is where the artificial intelligence comes in: the program interpolates between frames, creating new frames to fill the spaces between the originals. This allowed [LegoEddy] to increase his frame rate from 15 fps to 60 fps without having to actually create the additional frames.
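The real tool uses a trained network for the in-between frames, but the mechanics of going from 15 fps to 60 fps (three new frames per original pair) can be shown with naive cross-fading in OpenCV. Consider this a baseline sketch, not the AI method, and the filename is made up:

```python
# Naive 15 -> 60 fps interpolation by cross-fading, just to show the
# mechanics: three in-between frames per original pair. The AI method
# replaces the blend with learned, depth-aware motion estimation.
import cv2

reader = cv2.VideoCapture("animation_15fps.mp4")  # hypothetical input
ok, prev = reader.read()
frames = []
while ok:
    ok, nxt = reader.read()
    if not ok:
        break
    frames.append(prev)
    for t in (0.25, 0.5, 0.75):                   # 3 interpolated frames
        frames.append(cv2.addWeighted(prev, 1 - t, nxt, t, 0))
    prev = nxt
frames.append(prev)
reader.release()
```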

While we’ve seen AI create art before, the improvement on traditionally produced video is a dramatic advancement, especially since the AI is aware of depth and preserves information about the distance of objects from the camera. The software is also free, runs on any computer with an appropriate graphics card, and is available on GitHub.

Continue reading “Boost Your Animation To 60 FPS Using AI”

Mirror Turns Webcam Into Document Camera

This is one of those so-simple-I-wish-I-invented-it hacks. Professor [Michael Peshkin] is teaching his engineering students remotely. While he has a nice second camera that he can use to transmit whatever he doodles on paper, most of his students just have the single webcam built into their laptops.

The solution is to put a mirror in front of the laptop cam, and flip the image left-to-right in software. They use Zoom, which has a mirror mode. Done.
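If your conferencing tool doesn’t have a mirror mode, flipping the feed yourself is a one-liner per frame. A minimal OpenCV sketch (preview window only; piping it back into a call would need a virtual camera):

```python
# Un-mirror a webcam feed in software: flip each frame horizontally.
import cv2

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("document cam", cv2.flip(frame, 1))  # 1 = horizontal flip
    if cv2.waitKey(1) & 0xFF == 27:                 # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```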

The trick is making a nice frame. [Michael] has bent one out of wire, but suggests that a mirror compact works about as well in a pinch. It’s super important that his students can ask him questions backed up by drawings, and this reduces the startup cost to nearly nothing, making it universally useful.

[Prof. Peshkin] is not a stranger to mirror-based pedagogical hacks. Seven years ago, he showed us how to make a transparent whiteboard for video lectures, and it blew up on Hackaday. Since then, there are hundreds or thousands of Lightboards in the wild. We hope this idea catches on as well!

Decoding The Netflix Announcement: Explaining Optimized Shot-Based Encoding For 4K

Netflix has recently announced that they now stream 4K content with optimized shot-based encoding. When I read that news title I thought to myself: “Well, that’s great! Sounds good, but… what exactly does that mean? And what’s shot-based encoding anyway?”

These questions are basically how I ended up in the rabbit hole of encoding optimization history, in an effort to thoroughly dissect the announcement and properly understand it, so I can share it with you. Before I get into it, let’s take a trip down memory lane. Continue reading “Decoding The Netflix Announcement: Explaining Optimized Shot-Based Encoding For 4K”

Video Compression Explainer — Like We’re Five-Year-Olds

[Ottverse] has an interesting series in progress to demystify video compression. The latest installment promises to explain discrete cosine transforms as though you were five years old.

We’ll be honest. At five, we probably didn’t know how to interpret this sentence:

…the Discrete Cosine Transform takes a set of N correlated (similar) data-points and returns N de-correlated (dis-similar) data-points (coefficients) in such a way that the energy is compacted in only a few of the coefficients M where M << N.

Still, the explanation is pretty clear and we really liked the analogy with the spheres and the stars in a constellation.
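The energy-compaction claim is easy to verify yourself. Feeding a smooth, correlated signal through SciPy’s DCT puts nearly all of the energy into the first coefficient or two (the sample values here are made up, but any smooth ramp behaves the same way):

```python
# Energy compaction in action: a smooth (correlated) 8-point signal
# ends up with nearly all of its energy in the first few DCT coefficients.
import numpy as np
from scipy.fft import dct

x = np.array([50, 52, 55, 57, 60, 61, 63, 64], dtype=float)
c = dct(x, norm="ortho")

energy = c**2 / np.sum(c**2)
print(np.round(c, 1))       # first coefficient dominates
print(np.round(energy, 3))  # ~99% of the energy sits in c[0] and c[1]
```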

Continue reading “Video Compression Explainer — Like We’re Five-Year-Olds”

Russell Kirsch: Pixel Pioneer And The Father Of Digital Imaging

It’s true what they say — you never know what you can do until you try. Russell Kirsch, who developed the first digital image scanner and subsequently invented the pixel, was a firm believer in this axiom. And if Russell had never tried to get a picture of his three-month-old son into a computer back in 1957, you might be reading Hackaday in print right now. Russell’s work laid the foundation for the algorithms and storage methods that make digital imaging what it is today.

Russell reads SEAC’s last printout. Image via TechSpot

Russell A. Kirsch was born June 20, 1929 in New York City, the son of Russian and Hungarian immigrants. He got quite an education, beginning at the Bronx High School of Science. He went on to earn a bachelor’s degree in Electrical Engineering from NYU and a Master of Science from Harvard, and he attended American University and MIT.

In 1951, Russell went to work for the National Bureau of Standards, now known as the National Institute of Standards and Technology (NIST). He spent nearly 50 years at NIST, and started out by working with one of the first programmable computers in America, known as SEAC (Standards Eastern Automatic Computer). This room-sized computer, built in 1950, was developed as an interim solution for the Census Bureau to do research (PDF).

Standards Eastern Automatic Computer (SEAC) was the first programmable computer in the United States. Credit: NIST via Wikimedia

Like the other computers of its time, SEAC spoke the language of punch cards, mercury memory, and wire storage. Russell Kirsch and his team were tasked with finding a way to feed pictorial data into the machine without any prior processing. Since the computer was supposed to be temporary, its use wasn’t as tightly controlled as that of other machines. Although it ran 24/7 and got plenty of use, SEAC was more accessible than its contemporaries, which left time for bleeding-edge experimentation. NIST ended up keeping SEAC around for the next thirteen years, until 1963.

The Original Pixel Pusher

This photo of Russell’s son Walden is the first digitized image. Public Domain via Wikimedia

The term ‘pixel’ is a shortened form of picture element. Technically speaking, pixels are the smallest addressable elements of a digital image. They are the building blocks for anything that can be displayed on a computer screen, so they’re kind of the first addressable blinkenlights.

In 1957, Russell brought in a picture of his son Walden, which would become the first digital image (PDF). He mounted the photo on a rotating drum scanner that had a motor on one end and a strobing disk on the other. The drum was paired with a photo-multiplier vacuum tube that traveled back and forth on a lead screw. Photo-multipliers are used to detect very low levels of light.

As the drum slowly rotated, the photo-multiplier moved back and forth, scanning the image through a square viewing hole in the wall of a box. The tube digitized the picture by transmitting ones and zeros to SEAC describing what it saw through the viewing hole: 1 for white, and 0 for black. The digital image of Walden is 176 x 176 pixels, which was the maximum allowed by SEAC.
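Recreating that style of scan in software takes only a few lines today. A sketch with Pillow, thresholding a photo (filename hypothetical) down to one bit per pixel at the same 176 x 176 resolution:

```python
# SEAC-style scan in software: grayscale, downsample to 176 x 176,
# then threshold to one bit per pixel (white or black).
from PIL import Image

img = Image.open("walden.jpg").convert("L").resize((176, 176))
bits = img.point(lambda p: 255 if p > 127 else 0).convert("1")
bits.save("walden_1bit.png")
```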

Variable-Shaped Pixels

If Russell Kirsch had any regrets, it’s that he designed pixels to be square. Ten years ago, at the age of 81, he started working on variable-shaped pixels with the hope of improving the future of digital imaging. He wrote a LISP program to explore the idea, and simulated triangular and rectangular pixels using a 6×6 array of square pixels for each.

Alternative pixel geometries. Image via Cloudseed Films

In the video below, Russell discusses the idea and demonstrates that variable pixels make a better image with more information than square pixels do, and with significantly fewer pixels overall. It takes some finagling, as pixel pairs of triangles and rectangles must be carefully chosen, rotated, and mixed together to best represent the image, but the image quality is definitely worth the effort. Following that is a video of Russell discussing SEAC’s hardware.
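As a rough illustration of the simulation approach, here’s how one might split a 6×6 block of square pixels into a pair of triangular pixels with NumPy. This is our reconstruction of the idea, not Kirsch’s LISP code:

```python
# Approximate one "triangular pixel" pair with a 6x6 block of square
# pixels, split along the diagonal: upper-left triangle vs lower-right.
import numpy as np

def triangle_block(upper, lower, n=6):
    """Return an n x n block holding a pair of triangular pixel values."""
    block = np.full((n, n), lower, dtype=float)
    for r in range(n):
        block[r, : n - r] = upper  # rows shrink toward the diagonal
    return block

print(triangle_block(1.0, 0.0))
```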

Russell retired from NIST in 2001 and moved to Portland, Oregon. As of 2012, he could be found in the occasional coffeehouse, discussing technology with anyone he could engage. Unfortunately, Russell developed Alzheimer’s and died from complications on August 11, 2020. He was 91 years old.

Continue reading “Russell Kirsch: Pixel Pioneer And The Father Of Digital Imaging”