Object Detection, With TensorFlow

Getting computers to recognize objects has been a historically difficult problem in computer science, but with the rise of machine learning it is becoming easier to solve. One of the tools that can be put to work in object recognition is an open source library called TensorFlow, which [Evan] aka [Edje Electronics] has put to work for exactly this purpose.

His object recognition software runs on a Raspberry Pi equipped with a webcam, and also makes use of Open CV. [Evan] notes that this opens up a lot of creative low-cost detection applications for the Pi, such as setting up a camera that detects when a pet is waiting at the door to be let inside or outside, counting the number of bees entering and exiting a beehive, or monitoring parking spaces at an office.

This project uses a number of other toolkits as well, including Protobuf. It also makes extensive use of Python scripts, but if you’re comfortable with that and you have an application for computer vision, [Evan]’s tutorial will get you started.

11 thoughts on “Object Detection, With TensorFlow

  1. Can anyone please point me to tutorial for cross compiling opencv and installing on pi downloading and building opencv on pi takes forever. I have RPi zero which does not have network connection.

    1. I installed it over several hours today.

      Tip: If you are starting from a fresh NOOBs image, as I did, you should type
      sudo apt-get install libxml2-dev libxslt-dev
      after the initial update, to avoid some downstream errors with the lxml and cython installs.

      While the system seemed reasonably good at recognizing people (probably because this was a key use case), it was comically bad at recognizing other things. It mistakenly reported donuts, toilets, TVs, ovens, suitcases, trucks, birds, handbags, and other seemingly random stuff as I moved the camera around my workbench.

      Not that it is fair to expect a neural net that was trained on relatively small set 2D images over a few hours or days to recognize 3D objects as well as a human being, who typically has had years to internalize spatial understanding of thousands of real-world 3D objects, due to years of experience processing visual stimuli in stereo.

      The AI training process can only be expected to produce an image classification heuristic, and not achieve actual contextual understanding. In other words, this is not true intelligence, but merely a quick and dirty hack that will produce acceptable results only in situations where the scope, expectations, and/or environmental conditions are sufficiently limited.

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.