Along with announcing that it is renaming the company (to Meta), Facebook also showed off some of its thinking about its new virtual world concept, which it calls the metaverse.
This is not a new concept. Indeed, Second Life has been offering most of what they describe for many years now, and, obviously, there is also the fictional Oasis from the Ready Player One novel and movie, which goes well beyond anything Facebook / Meta have talked about so far.
Just what are the challenges here though? Why has Second Life not been able to become a mainstream way of interacting, especially in the last 18 months while we have been trapped at home in the real world? Can Meta make this concept something billions use?
What Is The Metaverse
For those who have played Second Life, or even a multi-player 3D game like Roblox or Fortnite, the concept of controlling an avatar in a virtual world will not be new. Facebook bought Oculus a number of years ago, so while the current generation of 3D virtual worlds is generally accessed through a 2D screen, it is very probable that the metaverse will allow, if not require, a virtual reality headset.
There has also been discussion of having AR glasses that connect to it, but I really don’t see how that would work in a meaningful way given all the examples of the metaverse that have been shown so far. Even the limited use case of a “side conversation” that you can have during, for example, a meeting without others being aware is somewhat flawed. While you can obviously read incoming messages on your AR glasses while in a meeting, I am not sure I see how to reply secretly through glasses alone. I’m also not sure this is something we as a society should be encouraging.
“Think about what we’ve had over the last couple of years during the pandemic where everyone’s been on Zoom, and one of the things that I’ve found very productive is you can have side channel conversations or chat threads going while you’re having the main meeting. I actually think that would be a pretty useful thing to be able to have in real life too where basically you’re having a physical conversation or you’re coming together, but you can also receive incoming messages without having to take out your phone or look at your watch and even respond quickly in a way that’s discreet and private.”
An Interview with Mark Zuckerberg about the Metaverse – stratechery.com
Maybe Meta’s glasses will be able to read my mind too…
The Technical Challenges
Putting aside the whole idea of AR, and sticking with the better understood, though no less challenging, VR concept, there are still a number of technical challenges that stand in the way of this being truly mainstream, especially for business use cases such as the business meeting from the keynote video.
The first obvious challenge is getting VR headsets that are cheap enough, powerful enough and comfortable enough that people will be willing to wear them for extended periods of time. As anyone who has been doing video calls for work for the last 18 months knows, wearing even a regular audio headset for hours at a time can quickly become uncomfortable. Wearing a VR headset all day, and especially a tethered one, would not be an experience many would choose just to be in a virtual office.
The second thing, which will also be apparent to anybody who has been attending group video, or even audio, calls, is that we do not have the ability to make the audio in a meeting work the same way as the audio in a room. In any kind of call, it is imperative that people try to speak one at a time. One of the examples given was playing card games. Imagine a fast paced card game, or a game where people need to be first to answer: humans are great at handling that when together in a room, but it just doesn’t work on a video call.
A concert or a webinar/presentation works just fine, but those also work on Zoom today. It’s not clear what a virtual world is adding to the experience.
In all the demos, avatars are fluidly moving around the virtual world, not colliding with anything. In the meeting room above, they were even in a low gravity scenario where they could float and cruise around. While VR headsets can track the movement of your head (indeed, they need to do this in real time to avoid giving you motion sickness), the technology to move the avatar through the world is nowhere near as advanced. Ready Player One described multi-directional treadmills that would instantly adapt to changes in speed and direction so the player could run, walk and turn on the spot IRL, and their avatar would follow suit in the virtual world. That’s a long way from becoming mainstream.
Instead, most games use controller joysticks or keyboard controls to guide an avatar through the world. In a business setting, do you really see people trying to guide their avatar from one virtual meeting room to another? Or are we just going to teleport into the room? In which case, why not just join a video call? Maybe we’ll end up sitting in a virtual office joining video calls from there.
Beyond that, there is motion of arms, hands, legs (want to be able to kick a ball in the virtual world, or play catch?), and even things as simple as sitting down or lying down. A VR headset cannot help with those. To get that, we are entering the realm of full body suits and maybe 3D tracking camera systems. How long do we want to be wearing one of those for…
There are also some very obvious senses that we lose in the virtual world unless we add yet more specialized gear, the most obvious being touch. Experiments over the years have shown it is very hard to pick things up in a virtual world without some form of tactile feedback. In addition to tracking movement of your fingers, the suit also needs to be able to give you feedback. To be realistic, that needs to vary for hard, soft, rough, smooth, hot, cold. To be truly immersive, it also needs to be more than just your hands. You’ll want to feel the sand under your feet on the virtual beach while you’re on your virtual vacation, won’t you?
Although we mentioned audio above, that was more in the context of a call. Audio is also an important sense for other things though, like knowing when somebody is approaching, especially from behind you, or being able to determine the direction a sound came from, such as a warning beep. There are visual ways to replace a lot of these, but it all presents technical challenges to moving VR beyond the gaming world and into mainstream and business use.
Bandwidth & Computing Power
The bandwidth needed for video calls is already bad enough, but what are we going to need here, and what kind of local processing power, especially GPU power, are we going to need to make this a reality? How affordable will that be around the world?
When it comes to interacting with objects and other people, latency also starts to be important. When we stop moving an arm that is reaching out to press a button or pick something up, it needs to stop immediately. Too much latency there and we will overshoot. Talking to other people in the virtual world can’t sound like a satellite call. If I have team members all around the world meeting in a virtual room, it needs to feel like they’re all there together, with no noticeable latency.
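To put a rough number on why "no noticeable latency" is hard for a team spread around the world, here is a back-of-the-envelope sketch. It assumes signals in optical fibre travel at roughly 200,000 km/s (about two thirds of the speed of light), and the path lengths are hypothetical round numbers of my own, not measurements of any real network. The function name is also just mine for illustration.

```typescript
// Assumption: signals in optical fibre travel at roughly 200,000 km/s,
// about two thirds of the speed of light in a vacuum.
const FIBRE_SPEED_KM_PER_S = 200000;

// Lower bound on round-trip time over a fibre path of the given length.
// Real networks add routing, queuing and processing delay on top of this.
function minRoundTripMs(pathKm: number): number {
  return (2 * pathKm * 1000) / FIBRE_SPEED_KM_PER_S;
}

// Hypothetical path lengths, chosen as round numbers for illustration.
const examplePathsKm = [1000, 10000, 20000];
for (const km of examplePathsKm) {
  console.log(`${km} km path: at least ${minRoundTripMs(km)} ms round trip`);
}
```

Under that assumption, a 10,000 km path can never do better than about 100 ms round trip, before any of the rendering, tracking or audio processing has been counted. Physics alone puts a floor under how "together" a global team can feel.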
It isn’t all technical challenges either. There are people out there who struggle to play 3D games like Minecraft and Roblox without feeling sick. For these people, a space like this is very likely to be a nightmare.
Others may simply not feel comfortable with the control aspects of these worlds. Some of these same people struggle to connect to a video call today; having to take control of an avatar in a virtual world and navigate to a virtual conference room will probably be too intimidating. My kids’ school last year had a teacher who was clearly very uncomfortable with the technology. She came across as a very capable teacher, but she was totally overwhelmed by the tools needed to teach remotely.
More important than the specifics of each space though is the fabric that binds all of this together. It was the development of the (somewhat) standardized web browser that allowed the fledgling World Wide Web to flourish and become what we have today. For something like the metaverse to succeed, it cannot be built or owned by one company (no matter how much Zuckerberg would like it to be owned by his company).
Before we can have all these amazing 3D spaces to play in, there needs to be work done to define the equivalents of HTTP and HTML. That’s where we arrive at WebXR, the successor to WebVR that covers more device types, including browsers on computers without any goggles.
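As a rough illustration of what "covers more device types" can look like in practice, here is a minimal sketch of how a page might decide between a headset view and a plain 2D screen. The only real WebXR call it relies on is `isSessionSupported`, which in a browser lives on `navigator.xr`. The `XRSystemLike` interface, the `ViewMode` type and the `pickViewMode` function are names I have made up for this example, and the fall-back order is my own assumption rather than anything the standard prescribes.

```typescript
// Minimal shape of the one WebXR Device API method used here.
// In a browser, navigator.xr (when present) provides this method.
interface XRSystemLike {
  isSessionSupported(mode: string): Promise<boolean>;
}

// Hypothetical labels for how this page would present the 3D space.
type ViewMode = "immersive-vr" | "immersive-ar" | "flat-screen";

// Prefer a headset if the browser reports one, otherwise fall back to a
// regular 2D screen so that visitors without goggles can still take part.
async function pickViewMode(xr: XRSystemLike | undefined): Promise<ViewMode> {
  if (!xr) return "flat-screen"; // browser has no WebXR support at all
  for (const mode of ["immersive-vr", "immersive-ar"] as const) {
    // Treat a rejected check (for example, blocked by policy) as unsupported.
    const supported = await xr.isSessionSupported(mode).catch(() => false);
    if (supported) return mode;
  }
  return "flat-screen";
}
```

In a real page you would call something like `pickViewMode((navigator as any).xr)` and then either request an immersive session or render to an ordinary canvas. Nobody gets locked out for lack of hardware, which is the same property that let the early web spread.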
Will it work for Facebook / Meta better than it did for Linden Lab with Second Life? Maybe. Second Life was ahead of its time with a concept that was too advanced for the state of the technology when it launched. Timing is everything. I suspect that Meta might be a little early too though. For the experience to really work the way they have been suggesting it might, we are going to need some of the technology envisaged in Ready Player One. At the moment, VR and AR are something of a gimmick. Events like the Olympics had immersive VR feeds, but how many people actually watched that way?
Nearer term, I think it will need to be optimized around a less tech-dependent model, perhaps sticking with the 2D screen view and teleport options to move between spaces. There really is no reason to re-create things like a commute to the office or walking between rooms in an office building. We can build something more like Star Trek beaming to get from one place to another (with the effects being there to pre-announce the arrival of somebody new into the room). I don’t really need to even be able to walk around inside the room once I am there. I could beam into an empty seat for the meeting, and just stay there.
Where this really might work in the near term is for virtual conferences and trade shows. Creating keynotes, presentations and panel sessions in VR would allow attendees to have an experience similar to the real world one, without the lines for the bathroom between sessions. A room with virtual booths in it would allow vendors to show off their products and collect contact information.