Project Starline Realizes Asimov’s 3D Vision

May 20, 2021

Isaac Asimov wrote The Caves of Steel in 1953. In it, he mentions something called trimensional personification. In an age before WebEx and Zoom, imagining that people would have remote meetings replete with 3D holograms was pretty far-sighted. We don’t know if any Google engineers read the book, but they are trying to create a very similar experience with Project Starline.

The system is one of those that seems simple on the face of it, but we are sure the implementation isn’t easy. You sit facing something that looks like a window. The other person shows up in 3D as though they were on the other side of the window. Think prison visitation without the phone handset. The camera is mounted such that you look naturally at the other person through your virtual window.

Since you are sitting in a relatively fixed position, making a 3D display without headgear is much easier. From the video demonstrations, the display is awfully good, too. Of course, there are only a few Starline setups in Google offices today, but it does give you an idea of where things are probably going.

Then again, there’s no reason you couldn’t try cooking something like this up on your own. Granted, making a really good 3D display is still pretty difficult, but you could always go retro.

36 thoughts on “Project Starline Realizes Asimov’s 3D Vision”

Ostracus says:

May 20, 2021 at 7:26 pm

Ah great. Add some “agony” to that booth and realize our Trek fantasies.

Thing is these booths run smack into the desire for telecommuting because only the most monied (and hooked up) will be able to use these.

Reply
1. Gravis says:
  
  May 20, 2021 at 9:48 pm
  
  That’s how all technologies start out. Computers were exceptionally expensive machines when they got started but now you would be hard pressed to buy something that doesn’t have a computer in it.
  
  Reply
2. LordNothing says:
  
  May 21, 2021 at 12:22 am
  
  i always figured that would work with some kind of neural scanner. which is likely a bit of kit in the medical scanning field. this of course would be tied to a cnc torture machine that fires laser pulses at all your nerve endings. don’t ever vote for me.
  
  Reply
3. pelrun says:
  
  May 21, 2021 at 1:18 am
  
  Deep learning researchers are producing systems that can synthesize plausible front-on facial images from simple off-axis cameras. That’s going to show up in standard video conferencing applications soon enough, using the standard webcam or phone camera.
  
  Reply
CityZen says:

May 20, 2021 at 7:39 pm

So you need multiple cameras to capture proper depth information. This information is sent across, and you need to be rendered in 3D on the other end, projected according to how the other person is looking at you. So their eye position also needs to be calculated. You don’t really need a 3D display to be very effective, but it helps. The eye tracking and corresponding rendering is what makes the “screen” disappear and seem like glass instead.
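
As a rough illustration of the head-tracked rendering described above, the sketch below computes an off-axis (asymmetric) view frustum from a tracked eye position. It assumes the screen is a rectangle centered at the origin in the z = 0 plane, the eye sits at positive z, and every length uses the same unit. The function name `off_axis_frustum` and its parameters are hypothetical and say nothing about how Starline itself is implemented.

```python
def off_axis_frustum(eye, screen_w, screen_h, near):
    """Return (left, right, bottom, top) frustum extents at the near plane.

    Assumed setup (not from the article): the screen is centered at the
    origin in the z = 0 plane and eye = (x, y, z) with z > 0.
    """
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the screen (z > 0)")
    # Similar triangles: shrink the screen edges, as seen from the eye,
    # from the screen plane (distance ez) down to the near plane.
    scale = near / ez
    left = (-screen_w / 2 - ex) * scale
    right = (screen_w / 2 - ex) * scale
    bottom = (-screen_h / 2 - ey) * scale
    top = (screen_h / 2 - ey) * scale
    return left, right, bottom, top


# Eye centered: the frustum is symmetric.
centered = off_axis_frustum((0.0, 0.0, 1.0), 2.0, 1.0, 0.5)
# Eye moved to the right: the frustum skews to the left.
shifted = off_axis_frustum((0.5, 0.0, 1.0), 2.0, 1.0, 0.5)
```

The four values are what a `glFrustum`-style projection takes, with the scene also translated by the negative eye position. Recomputing them every frame from the tracked eye is what would make the screen behave like a window rather than a picture.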

Reply
1. CityZen says:
  
  May 20, 2021 at 7:46 pm
  
  There is another, more cheesy, way to do this: Use a screen with an array of pinhole cameras behind it. Choose one camera image to send across to the other side, according to the eye position of the remote viewer. I suppose it would “pop” when the camera changed, so to avoid this, choose the 4 closest cameras and use bilinear interpolation to compute the image to send across. I imagine you’d run into a bunch of alignment issues, so throw an AI at it to solve them. :-)
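
The "4 closest cameras" idea can be sketched as a weight calculation. This assumes the cameras sit on a regular grid and that the remote viewer's eye position has already been mapped into grid coordinates. The function name and parameters are hypothetical, and the alignment problems mentioned above are ignored entirely.

```python
import math


def bilinear_camera_weights(u, v, cols, rows):
    """Pick the 4 cameras surrounding grid position (u, v) and their weights.

    Assumptions: cameras form a regular cols x rows grid, and (u, v) is the
    viewer's eye position in grid units (column, row). Positions outside
    the grid are clamped to its edge.
    Returns a list of ((col, row), weight) pairs; the weights sum to 1.
    """
    if cols < 2 or rows < 2:
        raise ValueError("need at least a 2 x 2 camera grid")
    u = min(max(u, 0.0), cols - 1)
    v = min(max(v, 0.0), rows - 1)
    # Top-left camera of the enclosing cell, kept inside the grid.
    c0 = min(int(math.floor(u)), cols - 2)
    r0 = min(int(math.floor(v)), rows - 2)
    fu = u - c0  # fractional position inside the cell, 0..1
    fv = v - r0
    return [
        ((c0, r0), (1 - fu) * (1 - fv)),
        ((c0 + 1, r0), fu * (1 - fv)),
        ((c0, r0 + 1), (1 - fu) * fv),
        ((c0 + 1, r0 + 1), fu * fv),
    ]


weights = bilinear_camera_weights(1.25, 0.5, cols=4, rows=3)
```

Each output pixel would then be the weighted sum of the same pixel in the four selected camera images. Because the weights change smoothly as the viewer moves, the "pop" at camera boundaries goes away, at the cost of some ghosting wherever the four views disagree.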
  
  Reply
  1. Edward Nardella says:
    
    May 20, 2021 at 9:40 pm
    
    This approach breaks as the viewer changes distance from the screen.
    
    Reply
    1. CityZen says:
      
      May 20, 2021 at 11:08 pm
      
      That’s true. Making people sit in a chair helps address that.
      
      The problem could be solved by using the appropriate light field, which is probably being captured by those cameras already. You’d just need to choose the appropriate sets of pixels to combine. That’s really what was being done already, but just being simplistic about which pixels to choose.
      
      Reply
2. CityZen says:
  
  May 20, 2021 at 7:57 pm
  
  Cheesy approach #2: For the display, use an angled half-silvered mirror that shows a screen above, below, or off to one side. Then put a camera on a motion rig behind the mirror. The position and angle of the camera are controlled according to the position and gaze of the remote viewer. If you use some kind of mechanical tracking, then no computing power is needed here.
  
  Btw, you may have already noticed a limitation for these systems: one viewer (per side) only. If you wear shutter glasses, or use some other means to make sure each viewer sees a unique image, then you can have more.
  
  Reply
Truth says:

May 20, 2021 at 7:49 pm

So are they tracking the head position and modifying what is displayed based on head position?
And how does it handle 3, 4, 5 or 20 people simultaneously in each room?

Reply
1. Al Williams says:
  
  May 20, 2021 at 7:56 pm
  
  I doubt it does. That’s why the demo has the child in the Mom’s lap.
  
  Reply
2. Per Holmström says:
  
  May 21, 2021 at 2:38 am
  
  There are videos on the project page with a moving camera behind the person talking, and the 3D effect does not seem to go away just because the camera goes to the side of the person in front of the mirror. Are you thinking of something like the experiments Johnny Lee got famous for in the mid-2000s?
  I think the interesting part seems to be the display technology used.
  
  I do not know if Al is referring to the Pepper’s ghost pyramid “hologram” as a joke (as it is just a translucent plastic pyramid reflecting four different images, one on each side).
  
  (I am still sore that no one got my joke about Pepper’s ghost on the video with a similar pyramid displaying an Ironman suit.)
  
  Reply
  1. CityZen says:
    
    May 21, 2021 at 11:03 am
    
    Most likely, the videos on the project page were made by tracking the camera instead of tracking the person viewing (ie, treating the camera as the viewer).
    
    Reply
    1. Per Holmström says:
      
      May 21, 2021 at 3:49 pm
      
      Why would he call it a 3D display in that case? It would just be a normal display that changes its content depending on the viewer’s position.
      
      I have done development with multiple types of 3d cameras, Time of flight (Fotonic, Kinect), Structured Light (Orbbec) and others.
      So the 3D data collection is very familiar grounds for me.
      
      I am not as convinced as you that the display uses the point of view to create the 3d feeling, I am hoping for a 3d display with as good quality as the one in the video.
      
      Reply
      1. CityZen says:
        
        May 21, 2021 at 3:52 pm
        
        It could just be stereoscopic. If it were some amazing new 3D display technology, I’m sure the hype would focus on that instead of the remote connection part.
Truth says:

May 20, 2021 at 8:25 pm

I wonder how long until they add the overlay with exaggerated face temperature difference and neural network amplified micro expression detection. Knowing google, all the interactions using Starline are permanently recorded and will be used for training data at a later date.

Reply
Truth says:

May 20, 2021 at 8:29 pm

At 27 to 28 seconds in, you can just make out the cameras.

Reply
J. Peterson says:

May 20, 2021 at 8:31 pm

Old news: https://www.youtube.com/watch?v=9YOEEpWAXgU

Reply
1. ConsultingJoe says:
  
  May 20, 2021 at 8:35 pm
  
  Lol. But not the same
  
  Reply
Joe Sammarco says:

May 20, 2021 at 8:31 pm

This is amazing 👏. Would be great to make a diy version.
I’m on it!
…when I have time ⏲️😏

Reply
1. Adrian says:
  
  May 22, 2021 at 8:53 am
  
  A good excuse to keep that old 3D TV out of the dumpster :)
  
  Reply
Saabman says:

May 20, 2021 at 9:24 pm

We can’t even get enough bandwidth to make a “flat” image display in a semi-realistic fashion (smooth movement, etc.). This will never work in Australia – at least on our NBN :lol:

Reply
1. TacticalNinja says:
  
  May 21, 2021 at 8:57 am
  
  Try flipping your NBN modem upside down. That usually helps.
  
  Reply
Bruce Perens K6BP says:

May 20, 2021 at 9:31 pm

Heinlein wrote a 1948 story “Waldo”, which is how telemanipulation got its name.

Reply
Comedicles says:

May 20, 2021 at 9:38 pm

And in the sequel, The Naked Sun (1956), they can walk around outdoors visiting virtually. Made possible by the precision with which the robots could move with cameras and projectors.

Asimov has a cultural insight in which there is a difference between ‘viewing’ and ‘seeing’. People will view in any state of dress or situation but be appalled if seen at anything but their best. I notice a trend like this with Zoom and Facetime and QQ and WeChat, etc.

Reply
Andrew says:

May 21, 2021 at 12:30 am

Now we just need one of the participants to be holding a telephone handset with a curly cord.

Reply
Alice Lalita Heald says:

May 21, 2021 at 1:36 am

A little late for the covid party, also feels like a prison’s visitor room :D

Why do you need depth if you can have a stereo camera with a 3D screen?
Also, can’t 3D TVs display split images as 3D? In that case Skype could be used; you’d just need to merge the stereo cameras into a split screen.

Reply
Elliot Williams says:

May 21, 2021 at 2:08 am

The “compression” — from a few artifacts (hair, baby moving hand quickly) it looks like it’s making a 3D model, skinning it, and presenting that on the other side. Cool.

But somehow creepy. He says, even though phone calls have been linear speech models for decades now…

Reply
1. jive says:
  
  May 21, 2021 at 3:59 am
  
  Don’t they say it directly? They basically use something like https://hackaday.com/2012/02/27/make-any-photo-3d-using-the-gimp/ but start with real-time 3D models instead of 2D pictures. I would suspect this needs some serious hardware to encode and decode the changes in the model in real time, not to mention render the output.
  
  Reply
robomonkey says:

May 21, 2021 at 8:36 am

But what about the bandwidth requirements and latency? We’re seeing it at its best, but show me that on a connection that we can reasonably expect and I’ll be impressed. The video and site don’t go into how much data we’re seeing, compressed or not. The only way to deal with latency on a high bandwidth connection is to prioritize the bits, and if EVERYONE is prioritizing bits, they’re all prioritized.

Still, neat idea.

Reply
1. CityZen says:
  
  May 21, 2021 at 12:31 pm
  
  There are various ways to transmit a 3D representation. One way is to send a regular color image as well as a depth image. Of course, since only one viewpoint is captured, there will be artifacts when rendering the scene from a different viewpoint (where data wasn’t captured). Capturing more views can address this, but it multiplies the bandwidth issues unless some fancy compression is used to avoid sending redundant data.
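
To make the "artifacts from a different viewpoint" point concrete, here is a minimal sketch that forward-warps a single image row using its depth values. It assumes an ideal pinhole camera and a new viewpoint that only slides sideways. The function name, the parameters, and the toy numbers are all illustrative assumptions, not anything from the Starline material.

```python
def reproject_row(depths, fx, cx, shift_x):
    """Forward-warp one image row to a camera moved by shift_x along x.

    Assumptions: pinhole camera with focal length fx (in pixels) and
    principal point cx; depths[u] is the distance along the optical axis
    at column u; the new camera keeps the same orientation.
    Returns, for each target column, the source column visible there,
    or None where nothing was captured (a hole).
    """
    width = len(depths)
    target = [None] * width
    target_depth = [float("inf")] * width
    for u, z in enumerate(depths):
        x = (u - cx) * z / fx                       # back-project to 3D
        u_new = round(fx * (x - shift_x) / z + cx)  # project into new view
        if 0 <= u_new < width and z < target_depth[u_new]:
            target[u_new] = u                       # nearest surface wins
            target_depth[u_new] = z
    return target


# Toy row: a foreground object at 1 m (columns 0-3) in front of a
# background at 2 m (columns 4-7); the viewer moves 0.2 m to the right.
row = reproject_row([1.0] * 4 + [2.0] * 4, fx=10, cx=4, shift_x=0.2)
# row == [2, 3, None, 4, 5, 6, 7, None]
```

The `None` at index 2 is background that the foreground object hid from the original camera, and the one at index 7 lies outside the original field of view. These are the gaps that extra captured views, or some form of inpainting, would have to fill.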
  
  You can imagine lots of ways to process or preprocess the data to minimize bandwidth, but each has its own issues. For instance, you could try to generate and transmit an accurate initial model as a preprocessing step, then try to capture and send only differences in real-time. Figuring out the differences is an interesting problem.
  
  As far as latency, there’s a reasonable amount of leeway here, since the interaction is a conversation. Most folks will tolerate a fair fraction of a second in this, depending on the back-and-forth speed of the conversation. You can see the upper bounds of what folks tolerate when you see folks interacting over a geosynchronous satellite link, where the latency can be up to one second.
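
For a sense of scale, the propagation delay over a geosynchronous link can be worked out from the orbit altitude and the speed of light. This is a best-case figure that assumes both ground stations sit directly beneath the satellite and ignores all processing and routing delay. The function name is made up for the example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum
GEO_ALTITUDE_KM = 35_786           # geostationary orbit, above the equator


def geo_delay_s(trips=1):
    """Minimum propagation delay for `trips` ground-satellite-ground trips.

    Best case only: stations directly under the satellite, no processing,
    queuing, or terrestrial network delay included.
    """
    return trips * 2 * GEO_ALTITUDE_KM / SPEED_OF_LIGHT_KM_S


one_way = geo_delay_s()      # speaker to listener: about 0.24 s
round_trip = geo_delay_s(2)  # remark out, reaction back: about 0.48 s
```

Real links are slower than this because of the longer slant path, codec buffering, and any terrestrial hops, so perceived delays approaching a second are plausible.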
  
  Reply
Fennec says:

May 21, 2021 at 9:15 am

Well I live in London now. I was born in New Zealand. So when I catch up with my friends I should spend a couple thousand £ and endure the 2×12 hour flights with 4-8 hour stopover to do that?

Reply
Hirudinea says:

May 21, 2021 at 9:49 am

What!? You mean IRL? People don’t do that any more, do they?

Reply
Garth Bock says:

May 21, 2021 at 10:13 am

The right-hand picture looks like Guardian Bob after he got back from Mainframe to help Dot Matrix and Enzo defeat Megabyte and Hexadecimal… (ReBoot – animated series)

Reply
Ren says:

May 21, 2021 at 6:12 pm

Nice work by Evil Incorporated. I wonder how they will utilize it against us.

Reply
tetsuoii says:

May 22, 2021 at 5:52 am

I guess this is related to the ‘Glasses Free 3D’ tech of the Nintendo 3DS and some others. Using a striped film on a regular display, you serve different scanlines to each eye, like those rulers with moving dinosaurs we had in school, or 3D postcards. The display I had on a Toshiba Qosmio F750-125 used a diagonal stripe lens film, with webcam face tracking to know which eye could see which pixel.
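
The striped-film idea comes down to interleaving two views on one panel. The sketch below assumes plain vertical stripes, one pixel column per stripe, and images stored as lists of rows. A diagonal lens film like the one mentioned would need a different pixel-to-eye mapping, and the function name is hypothetical.

```python
def interleave_columns(left, right, left_on_even=True):
    """Combine a stereo pair into one frame for a striped 3D display.

    Assumptions: vertical stripes one pixel wide, and both images are
    equal-sized lists of rows. left_on_even says which eye currently sees
    the even columns; a face tracker could flip it as the head moves.
    """
    if len(left) != len(right) or any(
        len(l_row) != len(r_row) for l_row, r_row in zip(left, right)
    ):
        raise ValueError("left and right images must have the same size")
    frame = []
    for l_row, r_row in zip(left, right):
        row = []
        for x, (l_px, r_px) in enumerate(zip(l_row, r_row)):
            use_left = (x % 2 == 0) == left_on_even
            row.append(l_px if use_left else r_px)
        frame.append(row)
    return frame


frame = interleave_columns([["L"] * 4], [["R"] * 4])
# frame == [["L", "R", "L", "R"]]
```

Each eye ends up with only half the horizontal resolution, which is one reason these displays tend to look soft compared with an ordinary panel.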

Reply
