Process 4 Billion Pixels Per Second From 16 DIY Cameras For The Best V-Tubing Rig Ever

June 10, 2026

[Dennis] is on YouTube with his channel “Made By Dennis,” but for the record he is a maker, not a V-tuber. On the other hand, his latest project – creating a professional-level tracking rig with DIY IR cameras and a whole lot of moxie – does mean he’s now equipped to make the move to the prestigious, high-status world of pretending to be an anime girl.

That is of course not why he did it. Like most projects around here, the motivation was more a case of “I wonder if I can…” – in this case [Dennis] wondered what it would take for him to pull off the same sort of optical motion capture, or MoCap, that is used in Hollywood studios. Optical MoCap has the advantage of being very precise, able to track things at high speeds, and not being in any way limited to the human form like the slew of AI-assisted methods hitting the market right now. The disadvantage is that you need to place markers on any part of your subject you want tracked, film them from all angles, and process a whole lot of pixels. In [Dennis]’s case, it ended up being about four billion per second. Keep in mind that actually locating those points in 3D space depends on knowing exactly where your cameras are: if you want sub-millimeter precision, your cameras need to be fixed with sub-millimeter tolerance. It’s a big project, hence a long video, which is embedded below.
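To get a feel for why camera position matters so much, here is a minimal sketch of two-camera triangulation. It is not [Dennis]’s code: the camera layout, the marker position, and the 1 mm bump are all made-up numbers, and the midpoint-of-two-rays method is just one simple way to triangulate. It assumes the bumped camera keeps its orientation and only slides sideways.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points between two rays.

    c1, c2 are camera centers; d1, d2 are ray directions toward the marker.
    """
    w = sub(c1, c2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero only when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel, cannot triangulate")
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = add(c1, scale(d1, t1))
    p2 = add(c2, scale(d2, t2))
    return scale(add(p1, p2), 0.5)

# Hypothetical setup, all coordinates in millimeters.
marker = (0.0, 0.0, 2000.0)
cam1 = (-1000.0, 0.0, 0.0)
cam2_believed = (1000.0, 0.0, 0.0)  # where calibration says camera 2 is
cam2_actual = (1001.0, 0.0, 0.0)    # where it really is after a 1 mm bump

ray1 = sub(marker, cam1)
ray2_ideal = sub(marker, cam2_believed)
ray2_seen = sub(marker, cam2_actual)  # direction camera 2 actually observes

# With correct camera positions the marker is recovered.
good = triangulate_midpoint(cam1, ray1, cam2_believed, ray2_ideal)

# With the stale position for camera 2, the estimate drifts.
bad = triangulate_midpoint(cam1, ray1, cam2_believed, ray2_seen)
error_mm = math.dist(bad, marker)
print(f"marker error after a 1 mm camera shift: {error_mm:.2f} mm")
```

In this toy geometry a 1 mm shift in one camera produces a marker error on the order of a millimeter, which is why the rig has to hold still once it is calibrated.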

The DIY cameras use an AR0234 MIPI camera on a custom PCB with M12 lenses and IR filters. To improve the signal-to-noise ratio on optical MoCap, it’s standard to use near-IR light. The camera boards, as you might expect given the MIPI interface, hook into Raspberry Pi compute modules – the cheapest CM4 should work, though he’s using CM5s. The compute modules sit on custom boards that provide PoE and some other niceties, like a small microcontroller driven by the pulse-per-second pin to help trigger the cameras in sync.
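The headline pixel rate can be sanity-checked with a bit of arithmetic. The camera count comes from the article; the 1920 x 1200 resolution and 120 frames per second are assumptions based on the AR0234’s full-resolution mode, not figures stated here, so treat the result as a rough check rather than [Dennis]’s actual configuration.

```python
CAMERAS = 16   # from the article

# Assumed sensor mode (not stated in the article).
WIDTH = 1920
HEIGHT = 1200
FPS = 120

def pixels_per_second(cameras, width, height, fps):
    """Total pixels produced by all cameras each second."""
    return cameras * width * height * fps

total = pixels_per_second(CAMERAS, WIDTH, HEIGHT, FPS)
print(f"{total / 1e9:.2f} billion pixels per second")
```

Under those assumptions the total lands a little over four billion, which is consistent with the figure in the title.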

Each camera gets a ring light of near-IR LEDs that pulse at 160 W, which would be way more than PoE is specced to provide, but since the LEDs are only on when the camera is taking a frame, the average power is well within allowable limits. With 16 cameras each having their own ring light, that’s a lot of near-IR photons. Don’t forget your safety squints!
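The duty-cycle argument is easy to put numbers on. In this sketch only the 160 W peak comes from the article; the frame rate and the LED on-time per frame are hypothetical values picked for illustration. The PoE limits are the standard power-at-the-device figures for IEEE 802.3af and 802.3at.

```python
PEAK_LED_W = 160.0       # from the article

# Assumptions for illustration only.
FPS = 120                # frames per second
PULSE_S = 250e-6         # LED on-time per frame, in seconds

# Power available at the powered device under each PoE standard.
POE_AF_DEVICE_W = 12.95  # IEEE 802.3af
POE_AT_DEVICE_W = 25.5   # IEEE 802.3at (PoE+)

def average_power(peak_w, fps, pulse_s):
    """Average power of a light that is on for pulse_s seconds every frame."""
    duty = fps * pulse_s
    if not 0.0 <= duty <= 1.0:
        raise ValueError("pulse is longer than the frame period")
    return peak_w * duty

avg_w = average_power(PEAK_LED_W, FPS, PULSE_S)
print(f"duty cycle: {FPS * PULSE_S:.1%}, average LED power: {avg_w:.1f} W")
print(f"fits 802.3af budget: {avg_w < POE_AF_DEVICE_W}")
```

With these made-up timings the LEDs average only a few watts. The sketch ignores what the compute module and camera draw, so the real budget is tighter than the LED figure alone suggests.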

Rather than process the images with OpenCV, [Dennis] wrote his own custom solution optimized for this use case, which he reports is 300x faster. Luckily, he’s put his implementation on GitHub, along with the rest of the project. Even if you don’t have any v-tubing ambitions, this project is very impressive and worth checking out in its entirety.
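The article doesn’t describe how [Dennis]’s pipeline actually finds markers, so the following is not his method. It is just a minimal sketch of one common building block in marker tracking: an intensity-weighted centroid, which locates a bright blob to a fraction of a pixel. The function name, the threshold, and the tiny test patch are all hypothetical, and the sketch assumes only one blob is in view.

```python
def weighted_centroid(image, threshold):
    """Intensity-weighted centroid of all pixels at or above threshold.

    image is a list of rows of pixel intensities.
    Returns (x, y) in pixel coordinates, or None if nothing is bright enough.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0.0:
        return None
    return (sum_x / total, sum_y / total)

# Made-up 4x4 patch: a blob that is brighter on its right-hand side.
patch = [
    [0,   0,   0, 0],
    [0, 100, 200, 0],
    [0, 100, 200, 0],
    [0,   0,   0, 0],
]

cx, cy = weighted_centroid(patch, threshold=50)
print(f"blob center: x={cx:.3f}, y={cy:.3f}")
```

Because the brighter pixels pull the result toward themselves, the x coordinate lands between pixel columns rather than snapping to one of them, which is where the sub-pixel precision comes from.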

Optical MoCap isn’t the only game in town, of course. If you want to do this cheap and easy, you can strap a bunch of IMU sensors to yourself – just don’t expect the same precision.

Thanks to [Dennis] for the tip!

24 thoughts on “Process 4 Billion Pixels Per Second From 16 DIY Cameras For The Best V-Tubing Rig Ever”

Andrew says:

June 10, 2026 at 10:58 pm

“professional”

sweethack says:

June 11, 2026 at 1:20 am

if you want sub-millimeter precision, your cameras need to be fixed with sub-millimeter tolerance

No you don’t. At worst, you want precise angles, not position. But in reality, you need to calibrate your cameras; they can be anywhere, whatever the tolerance. Put a single object in the scene and observe it with all your cameras. Move your object, do the same (you don’t even need to know how you moved your object).
There’s a single solution that would match all the observations. In practice, it’s a least-squares solution you’ll find by solving the linear system (of the projection by the pinhole equation). You’ll know both the object and camera pose (that is, attitude and position in 3D space). Bonus point: you can redo your calibration before each shot, since it’s likely the rig will move when you bump it anyway.

1. Christian says:
  
  June 11, 2026 at 7:40 am
  
  Yes. There should be some well-defined calibration triangle that is presented to all cameras at the same time. Trying to set up all cameras mechanically beforehand seems futile.
  
  1. Christian says:
    
    June 11, 2026 at 7:55 am
    
    Should be a tetrahedron with IR LEDs and well defined lengths. That should allow for camera calibration, as it is moved around, and calibration of the cameras against each other.
    
    1. Christian says:
      
      June 11, 2026 at 7:58 am
      
      And the input should not be filtered down to just pixel on/off. Maybe do that to find the lights/reflectors, but after that use the intensity of the pixels for subpixel accuracy. Don’t just throw it away.
      
2. Tyler August says:
  
  June 11, 2026 at 8:06 am
  
  That’s a much fuller explanation. That said: the cameras must stay fixed to within that fraction of a millimeter from one another during filming, or you’ll lose precision. That’s what the article was trying to say.
  
  1. Miles says:
    
    June 11, 2026 at 8:14 am
    
    Seems like it would make sense to use accelerometer data and average across the cameras. No, it won’t be sub-millimeter, but asking a rig to be that stable is a big ask. I suppose you could use dampened mounts?
    
    1. S says:
      
      June 12, 2026 at 12:25 am
      
      The point of this was to answer how big an ask it really is. Turns out it’s actually achievable.
      
3. jpa says:
  
  June 11, 2026 at 10:36 am
  
  One way is to have a few markers stationary in the surroundings, so the errors can be compensated for each frame.
  
4. Iqn says:
  
  July 6, 2026 at 10:34 pm
  
  You misunderstand the statement in the article.
  It is not saying you need to PLACE the cameras with submillimeter accuracy.
  It is saying they need to be CALIBRATED to that precision, and “fixed” in place so they don’t move around and become uncalibrated.
  
  Precision and accuracy are not the same thing.
  
Pheebe says:

June 11, 2026 at 2:16 am

It looks like it can track features with markers on really well, but I’m not sure it’ll be that useful for Vtubing. The biggest challenges I’ve come across with Vtubing are tracking hands, fingers, thumbs and facial features including eyes – the bits that, on the whole, seem to benefit from having some level of AI model to track.

1. Tyler August says:
  
  June 11, 2026 at 8:09 am
  
  The bit about v-tubing was supposed to be a joke. That said, you can put a small marker on each finger for hand tracking and it should work great. Gluing retroreflective spheres to your pupils for eye tracking would be much less comfortable.
  
  1. Anonymous says:
    
    June 11, 2026 at 9:36 am
    
    Need a way to get an eye shine like Riddick then it’ll be built in and you can save on electricity by needing less lighting.
    
    1. HaHa says:
      
      June 11, 2026 at 11:35 am
      
      Sputter just enough aluminum to be half transparent onto hard contact lenses.
      
      Like coating a telescope mirror, but cut short.
      
  2. cplamb says:
    
    June 11, 2026 at 12:55 pm
    
    Retroreflective prisms have been glued to contact lenses and used in eyes.
    
RunnerPack says:

June 11, 2026 at 4:27 am

Sounds like a DIY version of the system StuffMadeHere uses for his robotic sports equipment.

1. eldesconocido says:
  
  June 11, 2026 at 5:13 am
  
  It’s called OptiTrack. It’s a well-known piece of equipment in robotics labs. You can see it in almost every robotics lab video.
  
purplepeopleated says:

June 11, 2026 at 5:14 am

freemocap project? It’s working towards realtime.
https://freemocap.org/

1. S says:
  
  June 12, 2026 at 12:27 am
  
  This is real-time.
  
DanielF says:

June 11, 2026 at 6:20 am

Useful pieces of in-depth hardware knowledge in there.

heathdutton says:

June 11, 2026 at 12:23 pm

Instructions unclear. Watched video 15 times to learn about pain.

Michael O says:

June 11, 2026 at 2:30 pm

“but for the record he is a maker, not a V-tuber. ”
A critical bit of information for this audience. 😂

Zozo says:

June 12, 2026 at 2:20 am

Excellent equipment he owns.
Clever design! A bit overcomplicated with everything over LAN, would be easier with an RF sync.
Also a great opportunity for a Kalman filter to eliminate the final jitter in the dataset.

exoHD says:

June 14, 2026 at 12:32 pm

I was honestly surprised to see Vtubing mentioned in a Hackaday article. That being said, I appreciate the video and the project.


Hackaday
