Using Moondream AI To Make Your Pi “See” Like A Human

[Jaryd] from Core Electronics shows us human-like computer vision with Moondream on the Pi 5.

Using the Moondream vision language model, which runs directly on your Raspberry Pi rather than in the cloud, you can get answers to questions such as “are the clothes on the line?”, “is there a package on the porch?”, “did I leave the fridge open?”, or “is the dog on the bed?” [Jaryd] also compares Moondream to an alternative computer vision system, You Only Look Once (YOLO).

Processing a question with Moondream on your Pi can take anywhere from a few moments to 90 seconds, depending on the model used and the nature of the question. Moondream comes in two sizes: one with two billion parameters and the other with five hundred million. The larger model is more capable and more accurate, but it takes longer to process, with the fastest possible response time coming in at about 22 to 25 seconds. The smaller model is faster, at about 8 to 10 seconds, but as you might expect its results are not as good. Indeed, [Jaryd] says the answers can be infuriatingly bad.
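To make the trade-off concrete, here is a minimal sketch that picks a model size from how often you want to check the camera. It uses only the best-case timings quoted above; the names `REPORTED_SECONDS` and `pick_model` are made up for illustration and are not part of Moondream. Real queries can run as long as 90 seconds, so treat the result as optimistic.

```python
from typing import Optional

# Best-case seconds per question on a Pi 5, as quoted in the article.
# The keys and the function below are illustrative, not Moondream's API.
REPORTED_SECONDS = {
    "2B": (22, 25),    # larger model: more accurate, slower
    "0.5B": (8, 10),   # smaller model: faster, less reliable answers
}


def pick_model(check_interval_s: float) -> Optional[str]:
    """Return the most capable model whose quoted time fits the interval.

    Returns None when even the smaller model is too slow.
    """
    for name in ("2B", "0.5B"):  # try the most capable model first
        _, slowest = REPORTED_SECONDS[name]
        if slowest <= check_interval_s:
            return name
    return None
```

Checking the porch once a minute leaves room for the larger model, while a 15 second interval only fits the smaller one, and anything under about 10 seconds fits neither.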

In the write-up, [Jaryd] runs you through how to use Moondream on your Pi 5, and the video (embedded below) shows it in action. Fair warning though: Moondream is quite RAM-intensive, so you will need at least 8 GB of memory in your Pi if you want to play along.
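If you are not sure which Pi 5 you have, a quick check along these lines can tell you before you install anything. This is only a sketch: the function names are hypothetical, and the 7.0 GiB threshold is an assumption chosen because an 8 GB board reports somewhat less than 8 GiB to Linux once firmware and GPU reservations are taken out.

```python
from typing import Optional


def mem_total_gib(meminfo_text: str) -> Optional[float]:
    """Parse the MemTotal line from /proc/meminfo text and return GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB
            return kib / (1024 ** 2)
    return None  # no MemTotal line found


def has_enough_ram(meminfo_text: str, min_gib: float = 7.0) -> bool:
    """True if the reported memory meets the (assumed) threshold for an 8 GB Pi."""
    total = mem_total_gib(meminfo_text)
    return total is not None and total >= min_gib
```

On the Pi itself you would pass in the contents of `/proc/meminfo`, for example `has_enough_ram(open("/proc/meminfo").read())`.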

If you’re interested in machine vision you might also like to check out Machine Vision Automates Trainspotting With Unique Full-Length Portraits.
