Camera held in hand

Review: Vizy Linux-Powered AI Camera

Vizy is a Linux-based “AI camera” based on the Raspberry Pi 4 that uses machine learning and machine vision to pull off some neat tricks, and has a design centered around hackability. I found it ridiculously simple to get up and running, and it was just as easy to make changes of my own, and start getting ideas.

Person and cat with machine-generated tags identifying them
Out of the box, Vizy is only a couple lines of Python away from being a functional Cat Detector project.

I was running pre-installed examples written in Python within minutes, and editing that very same code in about 30 seconds more. Even better, I did it all without installing a development environment, or even leaving my web browser, for that matter. I have to say, it made for a very hacker-friendly experience.

Vizy comes from the folks at Charmed Labs; this isn’t their first stab at smart cameras, and it shows. They also created the Pixy and Pixy 2 cameras, of which I happen to own several. I have always devoured anything that makes machine vision more accessible and easier to integrate into projects, so when Charmed Labs kindly offered to send me one of their newest devices, I was eager to see what was new.

I found Vizy to be a highly-polished platform with a number of truly useful hardware and software features, and a focus on accessibility and ease of use that I really hope to see more of in future embedded products. Let’s take a closer look.

Continue reading “Review: Vizy Linux-Powered AI Camera”

Amazing “Connect Fore!” Robot Challenges Your Putting Practice

We’ve just come across [Bithead]’s amazing, robotically-automated mashup of miniature golf and Connect Four, which also includes an AI opponent who pulls no punches in its drive to win. Connect Fore! celebrates Scotland — the birthplace of golf, after all — and looks absolutely fantastic.

Scotty the AI opponent uses this robotic turret to make their moves in a game of Connect Fore!

The way it works is this: players take turns putting colored balls into one of seven different holes at the far end of the table. Each hole feeds to a clear tube — visible in the middle of the table — which represent each of the columns in a game of Connect Four.

Each player attempts to stack balls in such a way that they create an unbroken line of four in their color, either horizontally, vertically, or diagonally. In a one-player game, a human player faces off against “Scotty”, the computer program that chooses its moves with intelligence and fires balls from a robotic turret.

[Bithead] started this project as a learning experience, and being such a complex project, the write-up is extensive. We really recommend reading through the whole thing if you are at all interested in what goes into making such a project work.

What’s particularly interesting is all of the ways in which things nearly worked, or needed nudging or fine adjustment. One might think that reliably getting a ball to enter a hole and roll down a PVC tube wouldn’t be a particularly finicky task, but it turns out that all kinds of things can go wrong.

Even finding the right play surface was a challenge. [Bithead]’s first purchase from Amazon was a total waste: it looked bad, smelled bad, and balls didn’t roll well on it. There are high-quality artificial turfs out there, but the good stuff gets shockingly expensive, and such a small project pretty much pigeonholes one as a nuisance customer when it comes to vendors. The challenges [Bithead] overcame serve as a reminder to keep the 80/20 rule (or Pareto principle) in mind when estimating what will get a project to the finish line.

Right under the page break below is a brief video tour of the completed table, and after that, you can watch a game in action as [Bithead] faces off against Scotty the AI. Curious about the inner workings? The last video has some build details that fill in a few blanks from the write-up.

We’ve seen an automated Chess table before, but this is an entirely other, utterly fantastic level of work.
Continue reading “Amazing “Connect Fore!” Robot Challenges Your Putting Practice”

AI-Generated Sleep Podcast Urges You To Imagine Pleasant Nonsense

[Stavros Korokithakis] finds the experience of falling asleep to fairy tales soothing, and this has resulted in a fascinating project that indulges this desire by using machine learning to generate mildly incoherent fairy tales and read them aloud. The result is a fantastic sort of automated, machine-generated audible sleep aid. Even the logo is machine-generated!

The Deep Dreams Podcast is entirely machine-generated, including the logo.

The project leverages the natural language generation abilities of OpenAI’s GPT-3 to create fairytale-style content that is just coherent enough to sound natural, but not quite coherent enough to make a sensible plotline. The quasi-lucid, dreamlike result is perfect for urging listeners to imagine pleasant nonsense (thanks to Nathan W Pyle for that term) as they drift off to sleep.

We especially loved reading about the methods and challenges [Stavros] encountered while creating this project. For example, he talks about how there is more to a good-sounding narration than just pointing a text-to-speech engine at a wall of text and mashing “GO”. A good episode has things like strategic pauses, background music, and audio fades. That’s where pydub — a Python library for manipulating audio — came in handy. As for the speech, text-to-speech quality is beyond what it was even just a few years ago (and certainly leaps beyond machine-generated speech in the 80s) but it still took some work to settle on a voice that best suited the content, and the project gradually saw improvement.

Deep Dreams Podcast has a GitLab repository if you want to see the code that drives it all, and you can go to the podcast itself to give it a listen.

Hackaday Links Column Banner

Hackaday Links: March 13, 2022

As Russia’s war on Ukraine drags on, its knock-on effects are being felt far beyond the eastern Europe theater. And perhaps nowhere is this more acutely felt than in the space launch industry, seeing that at least until recently, Russia was pretty much everyone’s go-to ride to orbit. All that has changed now, at least temporarily, and has expanded to include halting sales of rocket engines used in other nations’ launch vehicles. Specifically, Roscosmos has put an end to exports of the RD-180 engine used in the US Atlas V launch vehicle, along with the RD-181 thrusters found in the Antares rocket. The loss of these engines may be more symbolic than practical, at least for the RD-180 — United Launch Alliance stopped selling launches on Atlas V back last year, and had secured the engines it needed for the 29 flights it has booked by that April. Still, there’s some irony that the Atlas V, which started life as an ICBM aimed at the USSR in the 1950s, has lost its Russian-made engines.

Bad news for Jan Mrázek’s popular open-source parametric search utility which made JLCPCB’s component library easier to use. We wrote about it back in 2020, and things seemed to be going fine up until this week, when Jan got a take-down request for his service. When we first heard about this, we checked the application’s web page, which bore a big red banner that included what were apparently unpleasant accusations Jan had received, including the words “reptile” and “parasitic.” The banner is still there, but the text has changed to a more hopeful tone, noting that LCSC, the component supplier for JLC’s assembly service, objected to the way Jan was pulling component data, and that they are now working together on something that everyone can be happy with. Here’s hoping that the service is back in action again soon.

Good news, everyone: Epson is getting into the 3D printer business. Eager to add a dimension to the planar printing world they’ve mostly worked in, they’ve announced that they’ll be launching a direct-extrusion printer sometime soon. Aimed at the industrial market, the printer will use a “flat screw extruder,” which is supposed to be similar to what the company uses on its injection molding machines. We sure didn’t know Epson was in the injection molding market, so it’ll be interesting to see if expertise there results in innovation in 3D printing, especially if it trickles down to the consumer printing market. Just as long as they don’t try to DRM the pellets, of course.

You can’t judge a book by its cover, but it turns out that there’s a lot you can tell about a person’s genetics just by looking at their face. At least that’s according to an AI startup called FDNA, which makes an app called “Face2Gene” that the company claims can identify 300 genetic disorders by analyzing photos of someone’s face. Some genetic disorders, like Down Syndrome, leave easily recognizable facial features, but some changes are far more subtle and hard to recognize. We had heard of cases where photos of toddlers posted on social media were used to diagnose retinoblastoma, a rare cancer of the retina. But this is on another level entirely.

And finally, working in an Amazon warehouse has got to be a tough gig, and if some of the stories are to be believed, it borders on being a horror show. But one Amazonian recently shared a video that showed what it’s like to get trapped by his robotic coworkers. The warehouse employee somehow managed to get stuck in a maze created by Amazon’s pods, which are stacks of shelves that hold merchandise and are moved around the warehouse floor by what amounts to robotic pallet jacks. Apparently, the robots know enough to not collide with their meat-based colleagues, but not enough to not box them in. To be fair, the human eventually found a way out, but it was a long search and it seems like another pod could have moved into position to block the exit at any time. You could see it as a scary example of human-robot interaction gone awry, but we prefer to look at it as the robots giving their friend a little unscheduled break away from the prying eyes of his supervisor.

AI Maybe Revives Dead Languages

While Star Trek’s transporter is hard to imagine — perfect matter movement across vast distances with no equipment on one end — it may not be the most far-fetched piece of tech on the Enterprise. While there are several contenders, I strongly suspect the universal translator is the most unlikely MacGuffin. After all, how would you decipher a totally unknown language in real-time? Of course, no one wants to watch 30 episodes of TV about how we finally figured out what Klingons call clouds, so pretty much every science fiction movie has some hand-waving explanation for speaking the viewer’s language. Farscape had microbes, some aliens have telepathy that works with alien brains of any kind, and still others study English from afar for decades off camera. Babelfish anyone?

I was thinking about this because of an article I read by [Alizeh Kohari] about [Jiaming Luo’s] work using AI to decode dead languages. While this might seem to be similar to Spock’s translator, it really isn’t. Human languages change over time and distance. You only have to watch the BBC or read something written by Thomas Jefferson to see that. But there is still a lot in common, at least within certain domains.

Continue reading “AI Maybe Revives Dead Languages”

Display Your Speech In Realtime To Help Lipreaders In The Mask Era

Masks are all well and good when it comes to reducing the spread of deadly pathogens, but they can make it harder to understand people when they speak. They also make lipreading impossible. [Kevin Lewis] set about building something to help.

The system consists of a small screen that can be worn on the chest or other part of the body, and a lapel microphone to record the wearer’s speech. Using the Deepgram AI speech recognition API running on a Raspberry Pi Zero W, the system decodes the speech and displays it on the Hyperpixel screen.

The API is quite capable, and can be set to only respond to the wearer’s voice, or in a group mode, display speech from multiple people in the area, displaying other voices in another colour. There’s also a translation feature using the iTranslateApp API as well.

It’s a neat tool that could be of great use in conferences or in situations where a quick simple machine translation could majorly ease communication. Video after the break.
Continue reading “Display Your Speech In Realtime To Help Lipreaders In The Mask Era”

Robot with glowing eyes

Spatial AI And CV Hack Chat

Join us on Wednesday, December 1 at noon Pacific for the Spatial AI and CV Hack Chat with Erik Kokalj!

A lot of what we take for granted these days existed only in the realm of science fiction not all that long ago. And perhaps nowhere is this more true than in the field of machine vision. The little bounding box that pops up around everyone’s face when you go to take a picture with your cell phone is a perfect example; it seems so trivial now, but just think about what’s involved in putting that little yellow box on the screen, and how it would not have been plausible just 20 years ago.

Erik Kokalj

Perhaps even more exciting than the development of computer vision systems is their accessibility to anyone, as well as their move into the third dimension. No longer confined to flat images, spatial AI and CV systems seek to extract information from the position of objects relative to others in the scene. It’s a huge leap forward in making machines see like we see and make decisions based on that information.

To help us along the road to incorporating spatial AI into our projects, Erik Kokalj will stop by the Hack Chat. Erik does technical documentation and support at Luxonis, a company working on the edge of spatial AI and computer vision. Join us as we explore the depths of spatial AI.

join-hack-chatOur Hack Chats are live community events in the Hackaday.io Hack Chat group messaging. This week we’ll be sitting down on Wednesday, December 1st at 12:00 PM Pacific time. If time zones have you tied up, we have a handy time zone converter.