Meta Cancels Augmented Reality Headset After Apple Vision Pro Falls Flat

The history of consumer technology is littered with concepts that came and went: for whatever reason, consumers never really adopted the tech, and it eventually died off. Some of those concepts persistently hang on, however, such as augmented reality (AR). Most recently, Apple launched its Vision Pro ‘mixed reality’ headset at an absolutely astounding price, to a largely negative response and disappointing sales numbers. This apparent market flop now seems to have made Meta (née Facebook) reconsider bringing a similar AR device to market.

To most, this news will come as little of a surprise, considering that Microsoft’s AR product (HoloLens) explicitly seeks out (government) niches with substantial budgets, and Google’s smart glasses have crashed and burned despite multiple market attempts. In a consumer market where virtual reality products are already desperately trying not to become another 3D display debacle, it would seem clear that amidst a lot of this sci-fi-adjacent ‘cool technology,’ there are a lot of executives and marketing critters who seem to forgo the basic question: ‘why would anyone use this?’

Continue reading “Meta Cancels Augmented Reality Headset After Apple Vision Pro Falls Flat”

Meta Doesn’t Allow Camera Access On VR Headsets, So Here’s A Workaround

The cameras at the front of Meta’s Quest VR headsets are off-limits to developers, but developer [Michael Gschwandtner] created a workaround (LinkedIn post) and shared implementation details with a VR news site.

The view isn’t a pure camera feed (it includes virtual and UI elements), but it’s a clever workaround.

The demo shows object detection via MobileNet V2, which we’ve seen used for machine vision on embedded systems like the Raspberry Pi. In this case it is running locally on the VR headset, automatically identifying objects even though the app cannot directly access the front-facing cameras to see what’s in front of it.
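
The post doesn’t include the inference code, but for a sense of what identifying objects with MobileNet V2 looks like, here’s a minimal desktop sketch using PyTorch’s torchvision. The headset app itself presumably uses an on-device mobile runtime instead, so treat this purely as an illustration.

```python
# Minimal sketch: identify the contents of a single grabbed frame with
# MobileNet V2. Desktop PyTorch/torchvision is used for illustration only;
# the actual Quest app would use a mobile inference runtime (an assumption).
import torch
from torchvision import models
from torchvision.models import MobileNet_V2_Weights
from PIL import Image

weights = MobileNet_V2_Weights.DEFAULT
model = models.mobilenet_v2(weights=weights).eval()
preprocess = weights.transforms()  # resize, center-crop, normalize

frame = Image.open("frame.png").convert("RGB")  # a grabbed passthrough frame
with torch.no_grad():
    logits = model(preprocess(frame).unsqueeze(0))

probs = logits.softmax(dim=1)[0]
top = probs.topk(5)
for p, idx in zip(top.values, top.indices):
    print(f"{weights.meta['categories'][idx]}: {p:.2f}")
```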

The workaround is conceptually simple, and leverages the headset’s ability to cast its video feed over Wi-Fi to other devices, a feature normally used to share and spectate VR gameplay.

First, [Gschwandtner]’s app sets up passthrough video, meaning the camera feed from the front of the headset is used as the background in VR, creating a mixed-reality environment. Then the app essentially spawns a Chromium browser and casts the headset’s video feed to itself. It is this cast video that is used, in a roundabout way, to access what the cameras see.
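
The actual implementation runs on-device inside the Unity app, but the gist of “cast the feed to yourself, then grab frames” can be illustrated with a few lines of OpenCV. The stream URL below is entirely hypothetical.

```python
# Illustration of the "cast to yourself, then grab frames" idea. The real
# workaround runs on-device in Unity; here we simply pull frames from a
# hypothetical casting endpoint with OpenCV.
import cv2

STREAM_URL = "http://192.168.1.50:8080/cast"  # hypothetical cast endpoint

cap = cv2.VideoCapture(STREAM_URL)
if not cap.isOpened():
    raise RuntimeError("could not open the cast stream")

while True:
    ok, frame = cap.read()  # each frame is the through-the-headset view,
    if not ok:              # virtual and UI elements included
        break
    # Hand the frame to a detector here (see the MobileNet V2 sketch above).
    cv2.imshow("cast", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```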

The resulting view isn’t taken directly from the cameras; it’s more akin to snapshotting a through-the-headset view, which means it contains virtual elements like the UI. Still, with passthrough turned on it is a pretty clever workaround that is contained entirely on-device.

Meta is hesitant to give developers direct access to camera views on its VR headsets, and while John Carmack (former Meta consulting CTO) thinks opening them up is worthwhile and can be done safely, that access isn’t there yet.

Robust Speech-to-Text, Running Locally On Quest VR Headset

[saurabhchalke] recently released whisper.unity, a Unity package that implements Whisper locally on the Meta Quest 3 VR headset, bringing nearly real-time transcription of natural speech to the device in an easy-to-use way.

Whisper is a robust, free, open-source neural network capable of quickly recognizing and transcribing multilingual natural speech with nearly human-level accuracy. This package implements it entirely on-device, meaning it runs locally and doesn’t interact with any remote service.
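
For a sense of just how little code local transcription takes these days, here’s the desktop equivalent using OpenAI’s reference Python package; whisper.unity does the same job on the headset by way of whisper.cpp.

```python
# Local speech-to-text with OpenAI's reference Whisper package
# (pip install openai-whisper). Everything runs locally: no audio
# leaves the machine, just like whisper.unity on the headset.
import whisper

model = whisper.load_model("base")       # small enough for modest hardware
result = model.transcribe("speech.wav")  # language is auto-detected
print(result["text"])
```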

Meta Quest 3

It used to be that voice input for projects was a tricky business with iffy results and a strong reliance on speaker training and wake words, but that’s no longer the case. Reliable, nearly real-time speech recognition is easily within the average hacker’s reach nowadays.

We covered Whisper getting a plain C/C++ implementation, which opened the door to running on a variety of platforms and devices. [Macoron] turned whisper.cpp into a Unity binding, which served as inspiration for this project, in which [saurabhchalke] turned it into a Quest 3 package. So if you are doing any VR projects in Unity and want reliable speech input with a side order of easy translation, it’s never been simpler.

Re-imagining Telepresence With Humanoid Robots And VR Headsets

Don’t let the name of the Open-TeleVision project fool you; it’s a framework for improving telepresence and making robotic teleoperation far more intuitive than it otherwise would be. It accomplishes this in part by taking advantage of the remarkable technology packed into modern VR headsets like the Apple Vision Pro and Meta Quest. There are loads of videos on the project page, many of which demonstrate successful teleoperation across vast distances.

Teleoperation of robotic effectors typically takes some getting used to. The camera views are unusual, the limbs don’t move the same way arms do, and intuitive human things like looking around to get a sense of where everything is don’t translate well.

A stereo camera with gimbal streaming to a VR headset complete with head tracking seems like a very hackable design.

To address this, researchers provided the user with a robot-mounted, real-time stereo video stream (through which the user can turn their head and look around normally) and mapped the user’s arm and hand movements to their humanoid robotic counterparts. This provides the feedback needed to manipulate objects and perform tasks in a much more intuitive way. In short, when our eyes, bodies, and hands look and work more or less the way we expect, it turns out it’s far easier to perform tasks.

The research paper goes into detail about the different systems, but in essence, a stereo depth-and-RGB camera is perched on a 3D-printed gimbal atop a humanoid robot frame like the Unitree H1, which is equipped with high-dexterity hands. A VR headset takes care of displaying a real-time stereoscopic video stream and letting the user look around. The user’s hand tracking is mapped to the robot’s dexterous hands and fingers. This lets a person look at, manipulate, and handle things without in-depth training, perhaps more slowly and clumsily than they would like, but in an intuitive way all the same.
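
The head-mirroring part in particular is simple in concept: read the headset’s orientation and command the gimbal’s two servos to match. A rough sketch with stand-in functions (the real tracking and servo APIs depend entirely on your hardware) might look like this:

```python
# Sketch of a head-mirroring control loop: read headset yaw/pitch and drive
# a two-motor gimbal to match. get_head_pose() and set_gimbal() are stand-ins
# for whatever tracking API and servo driver a real build would use.
import time

YAW_LIMIT = 90.0    # gimbal mechanical limits in degrees (assumed values)
PITCH_LIMIT = 45.0

def clamp(value, limit):
    return max(-limit, min(limit, value))

def get_head_pose():
    """Stand-in for the headset tracking API: returns (yaw, pitch) in degrees."""
    return 0.0, 0.0

def set_gimbal(yaw, pitch):
    """Stand-in for the two-motor servo driver."""
    print(f"gimbal -> yaw {yaw:+.1f}, pitch {pitch:+.1f}")

while True:
    yaw, pitch = get_head_pose()
    set_gimbal(clamp(yaw, YAW_LIMIT), clamp(pitch, PITCH_LIMIT))
    time.sleep(1 / 60)  # update at roughly the display rate
```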

Interested in taking a closer look? The GitHub repository has the necessary code, and while most of us will never be mashing ADD TO CART on something like the Unitree H1, the reference design for a stereo camera streaming to a VR headset and mirroring head tracking with a two-motor gimbal looks like the sort of thing that would be useful for a telepresence project or two.

Continue reading “Re-imagining Telepresence With Humanoid Robots And VR Headsets”

Giving People An Owl-like Visual Field Via VR Feels Surprisingly Natural

We love hearing about a good experiment, and here’s a pretty neat one: researchers used a VR headset, an off-the-shelf VR360 camera, and some custom software to glue them together. The result? Owl-Vision squashes a full 360° of undistorted horizontal visual perception into 90° of neck travel to either side. One can see all around oneself, without needing to physically turn one’s head any further than is natural.
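
From the demonstration, the core trick appears to be a simple angular gain: physical head yaw is multiplied so that ±90° of neck travel covers the camera’s full ±180°. Here’s a sketch of that remap, with the gain of 2 being our reading of the demo rather than a figure from the paper:

```python
# Sketch of the angular remap as we understand it from the demo: head yaw
# is scaled so +/-90 degrees of neck travel spans the full +/-180 degrees
# of the 360 camera. The exact mapping used by the researchers may differ.
GAIN = 360.0 / 180.0  # full circle / available neck travel = 2.0

def view_yaw(head_yaw_deg: float) -> float:
    """Map physical head yaw (degrees) to the yaw rendered in the headset."""
    yaw = GAIN * head_yaw_deg
    return (yaw + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)

print(view_yaw(45.0))  # looking 45 deg right shows the view 90 deg right
print(view_yaw(90.0))  # a full neck turn shows the view directly behind
```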

It’s still a work in progress, and there’s currently no free way to access the paper, but the demonstration video at that link (also embedded below) gives a solid overview of what’s going on.

Continue reading “Giving People An Owl-like Visual Field Via VR Feels Surprisingly Natural”

Bats Can No Longer Haunt Apple VR Headsets Via Web Exploit

Bug reporting doesn’t usually have a lot of visuals. Not so with the visionOS bug [Ryan Pickren] found, which fills a user’s area with screeching bats after visiting a malicious website. Even better, closing the browser doesn’t get rid of them! Better still? Doesn’t need to be bats, it could be spiders. Fun!

The bug has been fixed, but here’s how it worked: the Safari browser build for visionOS allowed a malicious website to fill the user’s 3D space with animated objects without interaction or permission. The code to trigger this is remarkably succinct, and is actually a new twist on an old feature: Apple AR Quick Look, an HTML-based feature for rendering 3D augmented reality content in Safari.

How about spiders, instead?

Leveraging this old feature is what lets an untrusted website launch an arbitrary number of animated 3D objects — complete with sound — into a user’s virtual space without any interaction from the user whatsoever. The icing on the cake is that Quick Look is a separate process, so closing Safari doesn’t get rid of the pests.
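
The actual proof of concept is in [Pickren]’s write-up; as a rough reconstruction of the mechanism, the anchor-with-rel="ar" pattern below is Apple’s documented way of handing a USDZ model to Quick Look from a web page, and the bug amounted to a page being able to fire it with no user interaction:

```python
# Sketch of the underlying mechanism, not the actual proof of concept (that
# is in the linked write-up). An anchor with rel="ar" wrapping an <img> is
# the documented AR Quick Look trigger in Safari; the reported bug boiled
# down to a page firing it programmatically, with no tap from the user.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"""<!doctype html>
<a id="ar" rel="ar" href="bat.usdz"><img src="bat-preview.png"></a>
<script>
  // Hypothetical reconstruction: trigger Quick Look repeatedly,
  // without any user interaction whatsoever.
  for (let i = 0; i < 10; i++) document.getElementById("ar").click();
</script>"""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(PAGE)

HTTPServer(("", 8000), Handler).serve_forever()
```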

Providing immersive 3D via a web browser is a valuable way to deliver interactive content on both desktops and VR headsets; a good example is the fantastic virtual BBC Micro, which uses WebXR. But on the Apple Vision Pro, the user is always involved and there are privacy boundaries that corral such content. Things being launched into a user’s space in an interaction-free way is certainly not intended behavior.

The final interesting bit about this bug (or loophole) is that, in a way, it defies easy classification and highlights a new sort of issue. While it seems obvious from a user experience and interface perspective that a random website spawning screeching crawlies into one’s personal space is not ideal, is this a denial-of-service issue? A privilege escalation that technically isn’t? It’s certainly unexpected behavior, but that doesn’t really capture the potential psychological impact such bugs can have. Perhaps the invasion of personal space and user boundaries will become a quantifiable aspect of bugs on these new platforms. What fun.

DIY Eye And Face Tracking For The Valve Index VR Headset

The Valve Index VR headset has been around for a few years now. It doesn’t come with eye or face tracking, but that didn’t stop inspired folks like [Physics-Dude] from adding DIY solutions in elegant and effective ways, using a combination of hardware, open-source software, and 3D-printable parts.

The whole assembly integrates tightly, thanks in part to the “frunk” designed into the Index for exactly this kind of thing.

This project leverages the EyeTrackVR project (and optionally, Project Babble for mouth tracking), both of which have great applications, particularly in social VR spaces.

These are open-source, self-contained, and modular solutions intended for a variety of hardware platforms. Of course, every millimeter and gram tends to count when it’s something that gets worn on one’s head, so [Physics-Dude] tailored a solution specifically for the Valve Index. His project makes great use of the platform’s hacker-friendly hardware design.
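
Since EyeTrackVR sends its gaze estimates out over OSC, consuming the data in your own project can be as simple as a small python-osc listener. The port below is a placeholder; match it and the addresses to your EyeTrackVR configuration.

```python
# Minimal OSC listener for EyeTrackVR output (pip install python-osc).
# The port is a placeholder; the actual port and OSC addresses depend on
# your EyeTrackVR configuration, so we log everything that arrives.
from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import BlockingOSCUDPServer

def on_eye_data(address, *args):
    print(address, args)  # e.g. per-eye gaze values as floats

dispatcher = Dispatcher()
dispatcher.set_default_handler(on_eye_data)

server = BlockingOSCUDPServer(("127.0.0.1", 9000), dispatcher)
server.serve_forever()
```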

[Physics-Dude] also makes excellent use of a certain widely-available “gumstick” style USB hub as an important part of his build. Combined with the front-mounted USB port on the Index, it results in an extremely compact and tightly integrated solution that looks great. While it can be risky to rely on a particular off-the-shelf item in a build, doing so absolutely has its place here.

The documentation is fantastic, including welcome guidance on cable routing and step-by-step instructions. If you’ve been interested in adding eye tracking to a project, be sure to give it a look. Already have eye tracking in a project of your own? Tell us all about it!