Re-imagining Telepresence With Humanoid Robots And VR Headsets

July 30, 2024

Don’t let the name of the Open-TeleVision project fool you; it’s a framework for improving telepresence and making robotic teleoperation far more intuitive than it otherwise would be. It accomplishes this in part by taking advantage of the remarkable technology packed into modern VR headsets like the Apple Vision Pro and Meta Quest. There are loads of videos on the project page, many of which demonstrate successful teleoperation across vast distances.

Teleoperation of robotic effectors typically takes some getting used to. The camera views are unusual, the limbs don’t move the same way arms do, and intuitive human things like looking around to get a sense of where everything is don’t translate well.

A stereo camera with gimbal streaming to a VR headset complete with head tracking seems like a very hackable design.

To address this, researchers gave the user a robot-mounted, real-time stereo video stream (through which the user can turn their head and look around normally) and mapped the user’s arm and hand movements to humanoid robotic counterparts. This provides the feedback needed to manipulate objects and perform tasks in a much more intuitive way. In short, when our eyes, bodies, and hands look and work more or less the way we expect, it turns out it’s far easier to perform tasks.

The research paper goes into detail about the different systems, but in essence, a stereo depth and RGB camera is perched with a 3D printed gimbal atop a humanoid robot frame like the Unitree H1 equipped with high dexterity hands. A VR headset takes care of displaying a real-time stereoscopic video stream and letting the user look around. Hand tracking for the user is mapped to the dexterous hands and fingers. This lets a person look at, manipulate, and handle things without in-depth training. Perhaps slower and more clumsily than they would like, but in an intuitive way all the same.
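To get a feel for the head-tracking half of that design, here is a minimal Python sketch of turning a headset orientation into two gimbal angles. This is not the project's code. The function name `head_to_gimbal`, the axis convention (x forward, y left, z up), the `(w, x, y, z)` quaternion order, and the travel limits are all assumptions made for illustration; a real headset SDK will have its own conventions. The idea is simply to rotate the "forward" vector by the head orientation, read off yaw and elevation, and ignore roll, since a two-motor gimbal cannot reproduce it.

```python
import math

def head_to_gimbal(q, yaw_limit=math.radians(120), elev_limit=math.radians(45)):
    """Map a head orientation quaternion to (yaw, elevation) in radians.

    Assumptions (hypothetical, not taken from the project):
    - q is (w, x, y, z), with the head frame x forward, y left, z up.
    - The gimbal has two motors, so head roll is discarded.
    - yaw_limit and elev_limit are made-up mechanical travel limits.
    """
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("quaternion must be non-zero")
    w, x, y, z = w / norm, x / norm, y / norm, z / norm

    # Direction the head is facing: the forward axis (1, 0, 0) rotated by q.
    dx = 1.0 - 2.0 * (y * y + z * z)
    dy = 2.0 * (x * y + w * z)
    dz = 2.0 * (x * z - w * y)

    yaw = math.atan2(dy, dx)                        # positive = looking left
    elevation = math.atan2(dz, math.hypot(dx, dy))  # positive = looking up

    # Clamp to the travel limits so the motors are never commanded past them.
    yaw = max(-yaw_limit, min(yaw_limit, yaw))
    elevation = max(-elev_limit, min(elev_limit, elevation))
    return yaw, elevation

if __name__ == "__main__":
    # Head turned 30 degrees to the left (rotation about the z axis).
    half = math.radians(30) / 2
    yaw, elevation = head_to_gimbal((math.cos(half), 0.0, 0.0, math.sin(half)))
    print(math.degrees(yaw), math.degrees(elevation))
```

Working from the facing direction rather than converting to full Euler angles keeps the two outputs well behaved for a camera mount, at the cost of throwing away roll entirely.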

Interested in taking a closer look? The GitHub repository has the necessary code, and while most of us will never be mashing ADD TO CART on something like the Unitree H1, the reference design for a stereo camera streaming to a VR headset and mirroring head tracking with a two-motor gimbal looks like the sort of thing that would be useful for a telepresence project or two.

20 thoughts on “Re-imagining Telepresence With Humanoid Robots And VR Headsets”

eswan says:

July 30, 2024 at 9:00 am

Okay. Now put one on the moon.

1. ono says:
  
  July 30, 2024 at 1:46 pm
  
  and some other ones on the minefield
  
2. Panondorf says:
  
  July 31, 2024 at 5:53 am
  
  Stolen from Google who probably generated it from text stolen from elsewhere:
  
  “Radio waves propagate in vacuum at the speed of light c, exactly 299,792,458 m/s. Propagation time to the Moon and back ranges from 2.4 to 2.7 seconds, with an average of 2.56 seconds (the average distance from Earth to the Moon is 384,400 km).”
  
  So…. not entirely unusable but it’s probably not going to be as easy to use one that is on the moon.
  
3. TG says:
  
  July 31, 2024 at 11:39 am
  
  They will be putting it in Bangalore. The operator’s station, that is.
  
Joe says:

July 30, 2024 at 9:09 am

Shameless plug for my own VR telepresence project: https://hackaday.io/project/188718-panobot the next version will have a 3d panorama view.

1. recook says:
  
  July 31, 2024 at 2:08 am
  
  Hi Joe,
  FYI, i could only access that through the internet [archived](https://web.archive.org/web/20230606132620/https://hackaday.io/project/188718-panobot) version.
  
deshipu says:

July 30, 2024 at 9:48 am

There was a movie about this https://www.youtube.com/watch?v=nbJGQl-dJ6c

1. TG says:
  
  July 31, 2024 at 11:41 am
  
  I was wondering if it would be Sleep Dealer!
  
k-ww says:

July 30, 2024 at 10:38 am

If it looks like a duck, and quacks like a duck, it’s a waldo.

make piece not war says:

July 30, 2024 at 10:54 am

Yeah, I need one of these but mobile, to crawl under desks to replace/lay network cables, to carry computers to and from users, to replace printer toner, to be in the server room to do maintenance, to carry parcels. It would be a nice thing for it to do it while I’m taking a nap.
Boss, can I use it to do my shopping? Pretty pleaseeee?

Not to mention that you can strap a gun to it, and all those movies with Arnold have a chance to come to pass.

1. TG says:
  
  July 31, 2024 at 11:51 am
  
  And you think they’d still pay you to do it after all that?
  
jbx says:

July 30, 2024 at 11:04 am

Giving as an example a supermarket cashier, come on ! What great added value.
It is pictured here just to show the possibilities of the project, that’s all.

It was obviously developed for risk situations or dangerous environments, where the operator has great skills and long training and cannot be replaced easily… for example a specialized military.

Also perfect to get rid of my mother-in-law remotely without leaving any clues. /s

1. Garth says:
  
  July 30, 2024 at 11:58 am
  
  Oh now there would be a video… Coupon Karen vs Android Cashier… “What do you mean this coupon is expired? Everyone else takes it! … I want to see your programmer!”
  
  1. TG says:
    
    July 31, 2024 at 12:58 pm
    
    “Robot” cashier: DOOO_NOT_REDEEEEM beep bob
    
2. TG says:
  
  July 31, 2024 at 12:57 pm
  
  >It was obviously developed for risk situations or dangerous environments
  Such as countries where the people stubbornly demand a wage that isn’t third-world
  
Garth says:

July 30, 2024 at 12:05 pm

VSI….Life But Only Better. (Virtual Self Industries)
In the near future, people live their lives free of pain, danger and complications through robotic representations of themselves…..Surrogates (Bruce Willis)

Jan says:

July 30, 2024 at 2:24 pm

Combine this with the previous human skin and you have that doctor at the hospital that doesn’t speak your language. :)

The Commenter Formerly Known As Ren says:

July 31, 2024 at 2:28 pm

Why have it pick up a barcode scanner when one could be integrated into the vision software or mounted on the robot (such as a wrist or chest)?

1. The Commenter Formerly Known As Ren says:
  
  July 31, 2024 at 2:30 pm
  
  Or mount a rotary tool 🔧 (e.g. Dremel) on a wrist?
  
psuedonymous says:

August 1, 2024 at 2:10 am

I’m not sure where the ‘reimagining’ comes in, this is the same sort of telepresence and teleoperation that’s been experimented with for many decades (e.g. the old LEEP Telehead from the early 90s, the many waldo systems from the mid 20th century for radioactive material handling, etc).

Typically the hardest problem to solve with teleoperation is the bidirectional feedback loop: you need to feed pose and haptic sensing both ways across the link without inducing hysteresis and ‘haptic hammering’, and without adding so much smoothing that it impacts latency.
This work sidesteps that problem by just not attempting to solve it at all: there is no feedback at all, only unidirectional pose replication. Pose errors and latency are handled by ignoring them and just operating the system really really slowly, then speeding up the video for the demo.
