Video Streaming Like Your Raspberry Pi Depended On It

The Raspberry Pi is an incredibly versatile computing platform, particularly when it comes to embedded applications. Pis are used in all kinds of security and monitoring projects to take still shots over time, or record video footage for later review. It’s remarkably easy to do, and there’s a wide variety of tools available to get the job done.

However, if you need live video with as little latency as possible, things get more difficult. I was building a remotely controlled vehicle that uses the cellular data network for communication. Minimizing latency was key to making the vehicle easy to drive. Thus I set sail for the nearest search engine and began researching my problem.

My first approach to the challenge was the venerable VLC Media Player. Initial experiments were sadly fraught with issues. Getting the software to recognize the webcam plugged into my Pi Zero took forever, and when I did eventually get the stream up and running, it was far too laggy to be useful. Streaming over WiFi and waving my hands in front of the camera showed I had a delay of at least two or three seconds. While I could possibly have optimized it further, I decided to move on and try to find something a little more lightweight.

Native MJPEG Streaming — If Your Network is Fast

My webcam, a Microsoft Lifecam VX-2000, natively provides an MJPEG output at up to 640×480 resolution. This led me to a tool known as MJPEGstreamer. By using a tool dedicated to streaming in that format, I would avoid any time-intensive transcoding of the webcam video that could introduce latency into the stream. At first, it looked like this might be a perfect solution to my issues. Once installed on the Raspberry Pi, MJPEGstreamer can be accessed remotely through a web interface, where it presents the stream.

To measure latency, I downloaded a stopwatch app on my phone. I held the phone next to the screen displaying the video stream over WiFi. By then filming the phone and the stream with another camera, I could check the recording and see the difference between the time on the stopwatch and the time displayed on the stream.
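
The arithmetic behind this measurement is one subtraction per still frame pulled from the recording. As a minimal sketch, assuming you have noted a few pairs of readings in milliseconds, a small shell function can average them. The function name and the sample values are made up for illustration; they are not measurements from this project.

```shell
# Average latency from pairs of stopwatch readings, in milliseconds.
# Each argument is "phone_ms:stream_ms": the time shown on the phone itself
# and the (older) time shown on the streamed image in the same still frame.
# The sample values at the bottom are illustrative, not real measurements.
average_latency_ms() {
    total=0
    count=0
    for pair in "$@"; do
        phone=${pair%%:*}     # reading on the real stopwatch
        stream=${pair##*:}    # reading visible in the video stream
        total=$((total + phone - stream))
        count=$((count + 1))
    done
    # Refuse to divide by zero when no readings were given
    [ "$count" -gt 0 ] || return 1
    echo $((total / count))
}

average_latency_ms 12340:11900 15020:14610 20500:20050   # prints 433
```

Taking several readings rather than one helps, since the result for any single frame depends on where the display refresh and camera shutter happened to land.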

Latency was under 500ms, which I considered acceptable. However, there was a problem. MJPEGstreamer didn’t handle low speed connections well. The software insisted on sending every frame no matter what, so if the connection hit a rough patch, MJPEGstreamer would wait until the frames were sent. This meant that while initial latency was 500ms, it would blow out to several seconds in just a few minutes of running the stream. This simply wouldn’t do, so I resumed my search.
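
To see why the delay keeps growing, it helps to model the stream as a queue. The following is a rough sketch under a simplifying assumption: the camera produces frames at a fixed rate, the link delivers them at a slightly lower fixed rate, and nothing is ever dropped. The function name and the frame rates are hypothetical, chosen only to show the effect.

```shell
# Extra latency caused by frames piling up when none are dropped.
# Usage: extra_latency_ms PRODUCED_FPS DELIVERED_FPS SECONDS
# Simplified model (an assumption, not a measurement): constant rates,
# every frame is queued, and the queue drains at the delivered rate.
extra_latency_ms() {
    produced=$1; delivered=$2; seconds=$3
    backlog=$(( (produced - delivered) * seconds ))   # frames still waiting
    [ "$backlog" -gt 0 ] || backlog=0                 # a fast link has no backlog
    # Time needed to drain the backlog at the delivered rate, in ms
    echo $(( backlog * 1000 / delivered ))
}

extra_latency_ms 30 29 120   # prints 4137
```

In this toy model, a link that falls just one frame per second short of the camera adds about four seconds of delay after two minutes, which is the same order as the drift described above.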

GStreamer is the Swiss Army Knife of Streaming

I eventually came across a utility called GStreamer. It’s a real Swiss Army knife of video streaming tools, rolled up into an arcane command-line utility. GStreamer is configured through “pipelines”, which are chains of elements, separated by “!”, specifying where to get video from, how to process and encode it, and then where to send it out. To understand how this works, let’s take a look at the pipelines I used in my low-latency application below.

Let’s look at the Raspberry Pi side of things first. This is the GStreamer command I used:

gst-launch -v v4l2src ! "image/jpeg,width=160,height=120,framerate=30/1" ! rtpjpegpay ! udpsink host= port=5001

Let’s break this down into sections. “gst-launch” refers to the GStreamer executable. The “-v” enables verbose mode, which tells GStreamer to tell us absolutely everything that’s going on. It’s useful when you’re trying to troubleshoot a non-functioning stream. “v4l2src” tells GStreamer we want it to grab video from a video capture source, in our case, a webcam using the Video4Linux2 drivers. The following section indicates we want to grab the JPEG frames straight from the camera in 160×120 resolution, at 30 frames per second. “rtpjpegpay” packs the JPEG frames into RTP packets. RTP stands for “Real-time Transport Protocol”, and is a standard for streaming audio and video over IP networks. Finally, the “udpsink” element indicates we want to send the RTP stream out over UDP to the receiving machine, whose address goes after “host=”, on port 5001.
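
Since only the caps string and the udpsink settings change between experiments, it can be handy to generate the sender command from parameters. This is a minimal sketch, not the exact command used above: it assumes a GStreamer 1.x install where the launcher is named gst-launch-1.0, the function name sender_cmd is made up, and 192.0.2.10 is a placeholder documentation address standing in for your receiver.

```shell
# Print the sender command for a given receiver and frame size.
# Usage: sender_cmd HOST PORT WIDTH HEIGHT FPS
# Assumes the GStreamer 1.x launcher name (gst-launch-1.0).
sender_cmd() {
    host=$1; port=$2; width=$3; height=$4; fps=$5
    printf 'gst-launch-1.0 -v v4l2src ! "image/jpeg,width=%s,height=%s,framerate=%s/1" ! rtpjpegpay ! udpsink host=%s port=%s\n' \
        "$width" "$height" "$fps" "$host" "$port"
}

# 192.0.2.10 is a placeholder address, not a real receiver.
sender_cmd 192.0.2.10 5001 320 240 30
```

The function only prints the command so you can inspect it first; piping the output to sh runs it. The camera has to support the requested size and frame rate as MJPEG, otherwise the pipeline will fail to negotiate.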

Now let’s take a look at the receiving end of things. For viewing the stream, I used a Windows 10 laptop, and initial testing was done over WiFi. Here’s the pipeline:

gst-launch-1.0.exe -e -v udpsrc port=5001 ! application/x-rtp, encoding-name=JPEG, payload=26 ! rtpjpegdepay ! jpegdec ! autovideosink

gst-launch-1.0.exe refers to the executable, and the “-v” switch functions identically here. “-e” tells GStreamer to send an end-of-stream event through the pipeline before shutting it down when we stop viewing; it doesn’t signal the sending machine, which keeps transmitting over UDP regardless. This time, we are using “udpsrc”, as we want to grab the RTP stream that’s coming in on port 5001. We have to tell GStreamer that it’s looking at an RTP stream in JPEG format, which is what the application/x-rtp type, encoding name, and payload settings do. “rtpjpegdepay” then unpacks the RTP stream into individual JPEG frames, which are then decoded by “jpegdec” and sent out to the screen by “autovideosink”.

This might seem straightforward, but it can take a lot of experimentation to get a working pipeline on both ends. It can often be very confusing to figure out if an issue is on the sending or receiving end. Something as simple as using a receiving computer with a different graphics card can completely stop a pipeline from working, and different plugins or video sinks will be required. I had this issue when my laptop switched between its Intel and NVIDIA graphics cards. Thankfully, plugins like “videotestsrc” and “testsink” can be used for troubleshooting.
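
One way to narrow down which end is at fault is to test each half against a known-good partner. The sketch below prints two such commands: one that checks whether a video sink works on the viewing machine at all, and one that sends a synthetic RTP/JPEG stream to the same machine so the receiving pipeline can be tested without the Pi. It assumes GStreamer 1.x and the same port 5001 as above; the function name and the stage labels are invented for this example.

```shell
# Print test commands that isolate each half of the link.
# Usage: diag_cmd display|loopback
# Stage names are labels made up for this sketch.
diag_cmd() {
    case $1 in
        display)
            # Does a video sink open on this machine at all?
            echo 'gst-launch-1.0 -v videotestsrc ! autovideosink' ;;
        loopback)
            # Known-good RTP/JPEG sender aimed at this same machine
            echo 'gst-launch-1.0 -v videotestsrc ! jpegenc ! rtpjpegpay ! udpsink host=127.0.0.1 port=5001' ;;
        *)
            echo "usage: diag_cmd display|loopback" >&2
            return 1 ;;
    esac
}

diag_cmd display
diag_cmd loopback
```

If the display test shows the test pattern but the loopback stream does not appear in the receiving pipeline, the problem is likely in the receiver's caps or depayloading rather than on the Pi.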

GStreamer in Practice

Overall, performance is great. Measured latency was consistently under 300ms, even when switching to a 4G cellular data connection. On a good day, I can comfortably use the stream at up to 320×240 resolution without issues. I suspect that sending JPEG frames is highly inefficient, and I’d like to try sending an H264 stream instead.
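
The cost of sending every frame as a standalone JPEG is easy to estimate, because the bitrate is simply frame size times frame rate. As a rough sketch, ignoring RTP and UDP header overhead, and with frame sizes that are placeholder assumptions rather than measurements from this camera:

```shell
# Approximate MJPEG stream bitrate in kbit/s.
# Usage: mjpeg_kbit_per_s FRAME_BYTES FPS
# Ignores packet header overhead; frame sizes below are assumed values.
mjpeg_kbit_per_s() {
    frame_bytes=$1; fps=$2
    echo $(( frame_bytes * 8 * fps / 1000 ))
}

mjpeg_kbit_per_s 4000 30    # prints 960
mjpeg_kbit_per_s 12000 30   # prints 2880
```

Because each JPEG is compressed on its own, the bitrate scales directly with frame size, whereas a codec like H264 also exploits the similarity between consecutive frames.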

The resolution of the stream is very low. It’s still very usable for remote piloting, but it begs for an upgrade.

While some Raspberry Pis do have hardware H264 encoding on board, I’d prefer to start with a native stream for maximum performance. I’ve ordered a 1080P camera that uses the Pi camera interface, and I can’t wait to start experimenting.

Fundamentally, GStreamer turned out to be the right tool for the job for a number of reasons. Firstly, it’s open source and based around a highly flexible plugin architecture. Secondly, it’s lightweight, and readily drops frames so the receiving machine is always displaying the latest image. It’s difficult to get to grips with, but when it works, it works.

For a more thorough tutorial on setting up GStreamer with various setups, I can highly recommend this article by Einar Sundgren, which taught me much of what I needed to know. Below, I’ve attached a video of my remote vehicle escapades, courtesy of the Pi Zero, GStreamer, and long nights spent hacking with a few cold beers.

Lastly, I’d also love to hear from the wider community – what’s your favoured way to run a low latency webcam stream with the Raspberry Pi? Tell us in the comments below!

20 thoughts on “Video Streaming Like Your Raspberry Pi Depended On It”

  1. There’s this little project called Wifibroadcast, which uses a Pi with its camera and streams over WiFi in monitor mode…
    I was able to reach 200ms when viewing on a laptop and it’s even faster if the other end is another Pi and the display is not buffering the picture, the GPU HW decoding/encoding offers very low latency…

    1. Your picture is crap not because of the resolution, but because of the shit optics.
      If you’re going to buy a Pi camera, get the one with large lenses (they have a M12 fine thread mount, security cams use these lenses), the tiny ones have atrocious low-light performance.

  2. I highly suggest using the Raspberry Pi cameras for this – because of the hardware-accelerated encoding, plugins available for most popular video toolkits and their “just works” principle in general. Other than that, Logitech cameras usually “just work” for me, too – and if I need a microphone built-in, this is usually the way to go.

    BTW, do check out v4l2-ctl, it’s a command-line tool that allows you to set up internal parameters of your camera (exposure, brightness, power line compensation, focus if it’s not automatic and similar things). Sometimes it can really make a difference =)

    1. Thanks for the tip! I did have a little tinker with v4l2-ctl as when I first plugged my camera in, it wasn’t working, displaying only a flat white image. Only after pulling my hair out did I realise it was just overexposed!

  3. I also needed a live stream for a remote-controlled car, but it had to be cross-platform and easy to set up (so no apps/tools to be installed and configured), and ended up modifying a project I found on Github. It uses HTML5, websockets for real-time streaming and a video decoder compiled with asm.js (the HTML5 video element is too slow).
    Link to the original project for anyone interested:

  4. Good comparison of various streaming protocols I have been confused about ! So Gstreamer is the winner.

    Now if you really want to go cheapo, how about using a $5/9 Omega2/+ ( 64/128 MB RAM and 16/32MB storage with OpenWrt, no SD card expense) with a $7 PS3Eye (from Amazon) which only outputs YUV but not Mjpeg frames, as the source device streaming over WiFi to an RPi or other capable device.

    First Omega2 will have to encode YUV to h264/jpeg for streaming requiring its 580Mhz MIPS CPU to work hard. Next Gstreamer will do the RTP/UDP streaming over its capable WiFi. Omega2 does have D+/D- pins for USB connection to the PS3EYE. Both can share a PSU with 3.3V buck for Omega2.

    Is above possible? What WiFi throughput is needed with what approximate range achievements ?

    What possible frame rate and resolution e.g., 240x 120 or so?

    1. Even MJPEG on such a CPU is starting to get painful, unfortunately – last time I tried, it was, at least. But I have to say – go ahead and try! I’d pick something like a security-camera-processor-based-devboard, but I don’t think those get as much support for their hardware encoding features, at least not as open-source drivers (as opposed to binary blobs, I think we have plenty of those)

  5. Back when I was building my HD FPV system for my hex-copter I used the RPI with its camera over wifi. I got around 70ms latency using raspivid over netcat to mplayer. Not bad for just flying around doing photography, but still a little too much for super fast stunt flying. I ripped apart a UBNT loco M5 (5GHz) to feed to my ground station running Linux on a laptop. From my notes:

    1st, from laptop:
    nc.traditional -l -p 5000 | mplayer -noautosub -noass -nosound -framedrop -nocorrect-pts -nocache -fps 60 -demuxer h264es -

    Then from RPI (FPV):
    raspivid -vs -n -w 1024 -h 576 -t 0 -b 5000000 -o - | nc.traditional 5000 &

    Of course insert your appropriate IP. My last piece of puzzle since I did this in scripts and python/perl was to rewrite the netcat to auto connect since you needed to enter the laptop commands first to enter monitor mode, then connect from RPI. Always bugged me you couldn’t reverse the connection on netcat.

    Then I bought a DJI Mavic, and well… That project got shelved. lol

  6. An IP cctv camera would be a much better starting out point.
    wifi ones are available.
    Better optics
    Aimed for low latency live streaming
    Hardware h264 or h265 encoding
    I/O ports
    some even have telemetry output on 485/232 – so you know, you could control something in 3 axis with a joystick…

    Armed with an RPi, everything looks like a nail. Or something like that.
