I’ve always been fascinated by AI and machine learning. Google TensorFlow offers tutorials and has been on my ‘to-learn’ list since it was first released, although I always seem to neglect it in favor of the shiniest new embedded platform.
Last July, I took note when Intel released the Neural Compute Stick. It looked like an oversized USB stick, and acted as an accelerator for local AI applications, especially machine vision. I thought it was a pretty neat idea: it allowed me to test out AI applications on embedded systems at a power cost of about 1W. It requires pre-trained models, but there are enough of them available now to do some interesting things.
I wasn’t convinced I would get great performance out of it, and forgot about it until last November when they released an improved version. Unambiguously named the ‘Neural Compute Stick 2’ (NCS2), it was reasonably priced and promised a 6-8x performance increase over the last model, so I decided to give it a try to see how well it worked.
I took a few days off work around Christmas to set up Intel’s OpenVino Toolkit on my laptop. The installation script provided by Intel wasn’t particularly user-friendly, but it worked well enough and included several example applications I could use to test performance. I found that face detection was possible with my webcam in near real-time (something like 19 FPS), and pose detection at about 3 FPS. So in accordance with the holiday spirit, it knows when I am sleeping, and knows when I’m awake.
That was promising, but the NCS2 was marketed as allowing AI processing on edge computing devices. I set about installing it on the Raspberry Pi 3 Model B+ and compiling the application samples to see if it worked better than previous methods. This turned out to be more difficult than I expected, and the main goal of this article is to share the process I followed and save some of you a little frustration.
First off, Intel provides a separate install process for the Raspberry Pi. The normal installer won’t work (I tried). Very generally, there are three steps to getting the NCS2 running with some application samples: initial configuration of the Raspberry Pi, installing OpenVino, and finally compiling some application samples. The last step will take 3+ hours and some of the samples will fail to compile, so pace yourself accordingly. If you’re not installing it right this moment, it’s still worth your time to read through the Compiling Other Examples section below to get a feel for what is possible.
Preparing the Raspberry Pi
First, download Noobs, unzip to a microSD card (I used 16GB), and boot the Raspberry Pi off it. Install the default graphical environment, connect to the Internet, and update all software on the device. When done, open a terminal and run
sudo raspi-config. Select interfaces→enable camera. Shut down, remove power, plug in your camera, and boot up.
Open a terminal again, and run
sudo modprobe bcm2835-v4l2 (note that’s a lowercase L, not a 1), then confirm /dev/video0 now exists by navigating to /dev and running
ls. You’ll need to run this modprobe command each time you want the camera to be accessible this way, so consider adding this to startup.
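One way to load the camera module at every boot is to list it in /etc/modules, which Debian-based systems such as Raspbian read at startup. That approach is my assumption, not part of Intel's instructions. The helper name add_boot_module is hypothetical; it appends the module name only if it is not already listed, so it is safe to run more than once.

```shell
# Hypothetical helper: list a kernel module in a modules file so it loads at boot.
# Assumes a Debian-style /etc/modules with one module name per line.
add_boot_module() {
    module="$1"
    modules_file="${2:-/etc/modules}"
    # -q: quiet, -x: match the whole line, -F: treat the name as a literal string
    if ! grep -qxF -- "$module" "$modules_file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$modules_file"
    fi
}
```

Writing to /etc/modules needs root, so define the function in a root shell (for example after sudo -s) and call add_boot_module bcm2835-v4l2. After the next reboot, /dev/video0 should be present without running modprobe by hand.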
Now, some of the applications we are going to compile will run out of memory, because the default swap file size is 100 megabytes. Run
sudo nano /etc/dphys-swapfile and increase the CONF_SWAPSIZE value. I changed it from 100 to 1024 and this proved sufficient. Save, reboot and run
free -h to confirm the swap size is increased. Finally, install cmake with
sudo apt-get install cmake, as we’ll need that later on when compiling.
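If you would rather script the swap change than edit the file by hand, something like the following should work. It is a minimal sketch that assumes the file contains an uncommented CONF_SWAPSIZE=... line, as a stock Raspbian install does; the function name set_swap_size is made up for illustration.

```shell
# Hypothetical helper: set CONF_SWAPSIZE (in megabytes) in a dphys-swapfile config.
# Assumes the file already contains an uncommented CONF_SWAPSIZE=... line.
set_swap_size() {
    size_mb="$1"
    swap_cfg="${2:-/etc/dphys-swapfile}"
    swap_tmp="$(mktemp)" || return 1
    # Rewrite only the CONF_SWAPSIZE line; every other line passes through unchanged.
    sed "s/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=${size_mb}/" "$swap_cfg" > "$swap_tmp" &&
        cat "$swap_tmp" > "$swap_cfg"
    swap_status=$?
    rm -f "$swap_tmp"
    return "$swap_status"
}
```

Run it as root (set_swap_size 1024), reboot, and check the result with free -h as described above.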
At this stage you’re ready to begin Intel’s OpenVino install process.
Installing OpenVino Toolkit
In this section, we’ll be roughly following the instructions from Intel. I’ll assume you’re installing to a folder on the desktop for simplicity. Download OpenVino for Raspberry Pi (.tgz file), then copy it to /home/pi/Desktop and untar it with
tar xvf filename.tgz.
The install scripts need to explicitly know where they are located, so in the OpenVino folder, enter the bin directory and open setupvars.sh in any text editor. Replace the <INSTALLDIR> placeholder with the full path to your OpenVino folder, e.g. /home/pi/Desktop/inference_engine_vpu_arm/ and save.
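The same edit can be made with sed instead of a text editor. This sketch assumes the placeholder appears literally as <INSTALLDIR> in setupvars.sh and that GNU sed is available (it is on Raspbian); the function name set_install_dir is hypothetical.

```shell
# Hypothetical helper: substitute the <INSTALLDIR> placeholder in setupvars.sh.
# Assumes the placeholder appears literally as <INSTALLDIR> and that sed is GNU sed.
set_install_dir() {
    install_dir="$1"
    setup_script="$2"
    # "|" is used as the sed delimiter so the slashes in the path need no escaping
    sed -i "s|<INSTALLDIR>|${install_dir}|" "$setup_script"
}
```

For the layout used in this article, the call would be set_install_dir /home/pi/Desktop/inference_engine_vpu_arm /home/pi/Desktop/inference_engine_vpu_arm/bin/setupvars.sh.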
The later scripts need this script loaded, so enter
sudo nano /home/pi/.bashrc and add ‘source /home/pi/Desktop/inference_engine_vpu_arm/bin/setupvars.sh’ to the end of the file. This will load setupvars.sh every time you open a terminal. Close your terminal window and open it again to apply this change.
Next we’ll set up the USB rules that will allow the NCS2 to work. First add yourself to the user group that the hardware requires with
sudo usermod -a -G users "$(whoami)". Log out, then back in.
Enter the install_dependencies folder of your OpenVino install. Run
sh install_NCS_udev_rules.sh. Now if you plug in your NCS2 and run
dmesg, you should see it correctly detected at the end of the output.
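Scrolling through dmesg gets tedious, so a quicker check is to filter the output of lsusb. The sketch below assumes the stick enumerates under USB vendor ID 03e7 (the Movidius vendor ID, as far as I can tell); the function name has_ncs is hypothetical, and it reads lsusb-style text on standard input.

```shell
# Hypothetical helper: succeed if lsusb-style text on stdin lists a device
# with USB vendor ID 03e7 (assumed here to be the Movidius / NCS vendor ID).
has_ncs() {
    # -q: no output, only an exit status; -i: ignore case in the hex digits
    grep -qi 'ID 03e7:'
}
```

Typical use would be lsusb | has_ncs && echo "NCS2 detected".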
Intel’s documentation now shows us how to compile a single example application. We’ll compile more later. For now, enter /deployment_tools/inference_engine/samples and run:
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a"
make -j2 object_detection_sample_ssd
Compiling Other Examples
Compiling the other examples is less straightforward, but due to our initial setup, we can expect some success. My goal was to get face detection and pose estimation working, so I stopped there. Object detection, classification, and some type of speech recognition also appear to have compiled correctly.
Before we try to compile the samples, it’s important to note that the pretrained AI models for the samples aren’t included in the Raspberry Pi OpenVino installer. In the normal installer, there’s a script that will automatically download them all for you – however no such luck with the Raspberry Pi version. Luckily you can download the relevant models for the application samples. In case that link breaks one day, all I did was look for URLs in all the scripts located in the model_downloader folder in the laptop/desktop version of the OpenVino installer. Alternatively, if you have OpenVino installed on another computer, you can copy the models over. I installed them to a folder named intel_models on the desktop, and the commands below assume you’ve done the same.
With that out of the way, enter /home/pi/Desktop/inference_engine_vpu_arm/deployment_tools/inference_engine/samples and open build_samples.sh in any text editor. Replace everything after the last if block (after the last “fi”) with:
build_dir=/home/pi/Desktop/
mkdir -p $build_dir
cd $build_dir
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a"
make -j8
printf "\nBuild completed.\n\n"
Save the file and run ./build_samples.sh. For me, this ran for about 3 hours before failing at 54% complete. However, several of the sample applications had compiled correctly by then. At this point, you should be able to enter the directory deployment_tools/inference_engine/samples/build/armv7l/Release and run:
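Because the build can stop partway through, it helps to see which samples actually made it before trying to run them. Here is a small sketch that lists the executable files directly inside a directory; the function name list_built_samples is hypothetical, and it assumes the finished binaries sit at the top level of the Release folder.

```shell
# Hypothetical helper: print the names of executable files directly inside a build directory.
list_built_samples() {
    release_dir="${1:-.}"
    for sample in "$release_dir"/*; do
        # Keep regular files with the executable bit set; skip subdirectories and plain files.
        if [ -f "$sample" ] && [ -x "$sample" ]; then
            basename "$sample"
        fi
    done
}
```

Called with no argument from inside the Release directory, it prints one name per line for whatever compiled successfully.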
./interactive_face_detection_demo -d MYRIAD -i /dev/video0 -m /home/pi/Desktop/intel_models/face-detection-retail-0004/FP16/face-detection-retail-0004.xml
Or for pose estimation:
./human_pose_estimation_demo -d MYRIAD -i /dev/video0 -m /home/pi/Desktop/intel_models/human-pose-estimation-0001/FP16/human-pose-estimation-0001.xml
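Both commands follow the same intel_models/<name>/FP16/<name>.xml pattern for the model path, so a small helper can save some typing. This is only a convenience sketch; the function name model_path is hypothetical and the default folder is the intel_models location assumed throughout this article.

```shell
# Hypothetical helper: build the path to a model's .xml file from its name,
# following the intel_models/<name>/FP16/<name>.xml layout used in the commands above.
model_path() {
    model_name="$1"
    models_root="${2:-/home/pi/Desktop/intel_models}"
    printf '%s/%s/FP16/%s.xml\n' "$models_root" "$model_name" "$model_name"
}
```

With that defined, the pose estimation command becomes ./human_pose_estimation_demo -d MYRIAD -i /dev/video0 -m "$(model_path human-pose-estimation-0001)".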
As for silly mistakes I seem to keep making, remember to use modprobe as described earlier to make the Raspberry Pi camera accessible as /dev/video0, and remember to actually plug in the NCS2.
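To catch the first of those mistakes before a demo fails with a less helpful message, a quick pre-flight check along these lines can help. It is a sketch only: the function name camera_ready is hypothetical, and it just tests that the device node exists as a character device.

```shell
# Hypothetical helper: report whether the camera device node exists before launching a demo.
camera_ready() {
    camera_dev="${1:-/dev/video0}"
    # -c is true only if the path exists and is a character device
    if [ -c "$camera_dev" ]; then
        return 0
    fi
    echo "Camera device $camera_dev not found; try: sudo modprobe bcm2835-v4l2" >&2
    return 1
}
```

For example, camera_ready && ./interactive_face_detection_demo ... will only launch the demo once /dev/video0 is present.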
Overall, performance is something like 18 FPS for face detection and 2.5 FPS for pose estimation, very similar to performance on my laptop. That’s good enough to open up a few applications I had in mind.
Other than that, I’ve learned that while AI taking over the world mainly makes for very entertaining stories, with only a few afternoons of careful assistance, it is presently able to take over a sizable proportion of my desk.