The uses of artificial intelligence and machine learning continue to expand, with one of the more recent implementations being video processing. A new method can “fill in” frames to smooth out the appearance of the video, which [LegoEddy] was able to use this in one of his animated LEGO movies with some astonishing results.
His original animation of LEGO figures and sets was created at 15 frames per second. As an animator, he notes that it’s orders of magnitude more difficult to get more frames than this with traditional methods, at least in his studio. This is where the artificial intelligence comes in. The program is able to interpolate between frames and create more frames to fill the spaces between the original. This allowed [LegoEddy] to increase his frame rate from 15 fps to 60 fps without having to actually create the additional frames.
While we’ve seen AI create art before, the improvement on traditionally produced video is a dramatic advancement. Especially since the AI is aware of depth and preserves information about the distance of objects from the camera. The software is also free, runs on any computer with an appropriate graphics card, and is available on GitHub.
This is one of those so-simple-I-wish-I-invented-it hacks. Professor [Michael Peshkin] is teaching his engineering students remotely. While he has a nice second camera that he can use to transmit whatever he doodles on paper, most of his students just have the single webcam built into their laptops.
The trick is making a nice frame. [Michael] has bent one out of wire, but suggests that a mirror compact works about as well in a pinch. It’s super important that his students can ask him questions backed up by drawings, and this reduces the startup cost to nearly nothing, making it universally useful.
Netflix has recently announced that they now stream optimized shot-based encoding content for 4K. When I read that news title I though to myself: “Well, that’s great! Sounds good but… what exactly does that mean? And what’s shot-based encoding anyway?”
[Ottverse] has an interesting series in progress to demystify video compression. The latest installment promises to explain discrete cosine transforms as though you were five years old.
We’ll be honest. At five, we probably didn’t know how to interpret this sentence:
…the Discrete Cosine Transform takes a set of N correlated (similar) data-points and returns N de-correlated (dis-similar) data-points (coefficients) in such a way that the energy is compacted in only a few of the coefficients M where M << N.
Still, the explanation is pretty clear and we really liked the analogy with the spheres and the stars in a constellation.
It’s true what they say — you never know what you can do until you try. Russell Kirsch, who developed the first digital image scanner and subsequently invented the pixel, was a firm believer in this axiom. And if Russell had never tried to get a picture of his three-month-old son into a computer back in 1957, you might be reading Hackaday in print right now. Russell’s work laid the foundation for the algorithms and storage methods that make digital imaging what it is today.
Russell A. Kirsch was born June 20, 1929 in New York City, the son of Russian and Hungarian immigrants. He got quite an education, beginning at Bronx High School of Science. Then he earned a bachelor’s of Electrical Engineering at NYU, a Master of Science from Harvard, and attended American University and MIT.
In 1951, Russell went to work for the National Bureau of Standards, now known as the National Institutes of Science and Technology (NIST). He spent nearly 50 years at NIST, and started out by working with one of the first programmable computers in America known as SEAC (Standards Eastern Automatic Computer). This room-sized computer built in 1950 was developed as an interim solution for the Census Bureau to do research (PDF).
Like the other computers of its time, SEAC spoke the language of punch cards, mercury memory, and wire storage. Russell Kirsch and his team were tasked with finding a way to feed pictorial data into the machine without any prior processing. Since the computer was supposed to be temporary, its use wasn’t as tightly controlled as other computers. Although it ran 24/7 and got plenty of use, SEAC was more accessible than other computers, which allowed time for bleeding edge experimentation. NIST ended up keeping SEAC around for the next thirteen years, until 1963.
The Original Pixel Pusher
The term ‘pixel’ is a shortened portmanteau of picture element. Technically speaking, pixels are the unit of length for digital imaging. Pixels are building blocks for anything that can be displayed on a computer screen, so they’re kind of the first addressable blinkenlights.
As the drum slowly rotated, a photo-multiplier moved back and forth, scanning the image through a square viewing hole in the wall of a box. The tube digitized the picture by transmitting ones and zeros to SEAC that described what it saw through the square viewing hole — 1 for white, and 0 for black. The digital image of Walden is 76 x 76 pixels, which was the maximum allowed by SEAC.
In in the video below, Russell discusses the idea and proves that variable pixels make a better image with more information than square pixels do, and with significantly fewer pixels overall. It takes some finagling, as pixel pairs of triangles and rectangles must be carefully chosen, rotated, and mixed together to best represent the image, but the image quality is definitely worth the effort. Following that is a video of Russell discussing SEAC’s hardware.
Russell retired from NIST in 2001 and moved to Portland, Oregon. As of 2012, he could be found in the occasional coffeehouse, discussing technology with anyone he could engage. Unfortunately, Russell developed Alzheimer’s and died from complications on August 11, 2020. He was 91 years old.
Cameras are getting smarter and more capable than ever, able to run embedded machine vision algorithms and pull off tricks far beyond what something like a serial camera and microcontroller board would be capable of, and the upcoming Vizy aims to be even smarter and easier to use yet. Vizy is the work of Charmed Labs, and this isn’t their first foray into accessible machine vision. Charmed Labs are the same folks behind the Pixy and Pixy 2 cameras. Vizy’s main goal is to make object detection and classification easy, with thoughtful hardware features and a browser-based interface.
The usual way to do machine vision is to get a USB camera and run something like OpenCV on a desktop machine to handle the processing. But Vizy leverages a Raspberry Pi 4 to provide a tightly-integrated unit in a small package with a variety of ready-to-run applications. For example, the “Birdfeeder” application comes ready to take snapshots of and identify common species of bird, while also identifying party-crashers like squirrels.
The demonstration video on their page shows off using the built-in high-current I/O header to control a sprinkler, repelling non-bird intruders with a splash of water while uploading pictures and video clips. The hardware design also looks well thought out; not only is there a safe shutdown and low-power mode for the Raspberry Pi-based hardware, but the lens can be swapped and the camera unit itself even contains an electrically-switched IR filter.
Vizy has a Kickstarter campaign planned, but like many others, Charmed Labs is still adjusting to the changes the COVID-19 pandemic has brought. You can sign up to be notified when Vizy launches; we know we’ll be keen for a closer look once it does. Easier machine vision is always a good thing, because it helps free people to focus on clever ideas like machine vision-based tool alignment.
Video conferencing is nothing new, but the recent world events have made it much more mainstream than it has been in the past. Luckily, web camera technology is nothing new and most software can also show your screen. But what about your paper documents? Turns out that [John Nelson] can show you how to spend $5 for an old laptop camera module and put your documents center stage on your next Zoom, Skype or other video conferences.
This is especially good for things that would be hard to draw in real time during a conference like a quick sketch, a schematic, or as you can see in the post and the video demo below, chemical molecule diagrams.