It’s no secret that Mark Zuckerberg intends to use his firm, Meta (formerly Facebook), to lead the charge in the emerging metaverse. Following Meta’s earlier announcement that it was building a record-breaking supercomputer to power the metaverse, the recently concluded Meta event “Inside the lab: Building for the metaverse with AI” was another milestone in Meta’s mission to unlock the metaverse with AI. According to experts, AI, VR, AR, blockchain, and 5G will combine to power the metaverse, and Zuckerberg intends to build many massive AI systems to power the embryonic metaverse world.
“At Meta, we work on a wide range of technologies, from virtual reality to designing our own data centers. We’re especially interested in foundational technologies that can enable entirely new possibilities. Today, we’re going to focus on artificial intelligence, which is perhaps the most important foundational technology of our time,” said Zuckerberg.
The kinds of experiences we’ll have in the metaverse, according to Zuckerberg, will go “beyond what’s conceivable today.” He describes the metaverse as “an immersive version of the internet.” He says the metaverse will require advances in a variety of domains, from new hardware devices to software for building and exploring worlds, and that AI is the key to unlocking many of these advances.
Self-supervised learning in a new light
Following Zuckerberg’s opening remarks, Jérôme Pesenti, the head of Facebook AI, and Joelle Pineau, the co-managing director of Facebook AI Research, explained how Meta aims to unlock the metaverse using AI in a session titled “Unlocking the Metaverse with AI and Open Science.” AI, according to Pesenti, is one of the keys to the metaverse. He said that Meta AI’s mission is to bring the world closer together by advancing AI through research breakthroughs and using those breakthroughs to improve Meta’s products.
Meta AI, according to Pesenti, is making considerable progress in key areas such as embodiment and robotics, creativity, and self-supervised learning. AI has traditionally relied on direct human supervision: task-oriented systems are trained to perform a specific job by feeding them a large number of human-labeled examples. The problem with this approach, according to Pesenti, is that it is task-dependent: it’s unclear whether the system really learns anything beyond the specific job, and it requires a great deal of human effort, which can introduce unwanted biases.
Meta AI, according to Pesenti, is moving to a new self-supervised approach, in which AI can learn from raw data without the need for human supervision.
“When it comes to language, for example, the AI system may delete words from the input text and attempt to recover them by inferring patterns from the surrounding words. As the AI system advances, it gains a deeper knowledge of language’s meaning and structure. One of the most significant benefits of the self-supervised model is this: it is task-independent, meaning that a single model may be used to execute several downstream tasks with minimal fine-tuning. The same model may assist in recognizing hate speech and in keeping violating content out of your news feed and search results.”
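The word-masking setup Pesenti describes corresponds to what researchers call masked language modeling. As a rough illustration, here is a minimal sketch using the Hugging Face transformers library and a generic BERT checkpoint; both are stand-in choices for illustration, not Meta’s own models:

```python
# A minimal sketch of masked-word prediction ("fill in the blank"),
# the self-supervised setup Pesenti describes. The library and
# checkpoint here are illustrative assumptions, not Meta's stack.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# "Delete" a word from the input text and let the model infer it
# from the surrounding words.
for candidate in fill_mask("The metaverse is an immersive version of the [MASK]."):
    print(f"{candidate['token_str']!r}  score={candidate['score']:.3f}")
```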
Self-supervised learning is no longer confined to language, according to Pesenti, thanks to research breakthroughs at Meta AI. “Researchers at Meta AI and throughout the industry have shown outstanding results in understanding speech and images in the last six months,” he added.
Researchers at Meta AI have developed self-supervised algorithms for image reconstruction, in which they split an image into small patches, mask 80 percent of those patches, and ask the AI to reconstruct the image, according to Pesenti. He went on to say that Meta AI researchers have shown that this new self-supervised strategy, when paired with a small amount of annotated data, can compete with older systems that need far more human supervision.
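For a concrete sense of the setup, here is a toy sketch of the patch-and-mask step in PyTorch. The patch size and 80 percent mask ratio follow the description above, while everything else (shapes, the random masking scheme) is an illustrative assumption rather than Meta AI’s published code:

```python
# A toy sketch of masked-patch image reconstruction: split each image
# into patches, mask 80% of them, and keep only the visible patches
# that a reconstruction model would be trained on. Illustrative only.
import torch

def random_patch_mask(images, patch=16, mask_ratio=0.8):
    """Split images into patches and randomly mask `mask_ratio` of them."""
    b, c, h, w = images.shape
    patches = images.unfold(2, patch, patch).unfold(3, patch, patch)
    patches = patches.reshape(b, c, -1, patch, patch).permute(0, 2, 1, 3, 4)
    patches = patches.flatten(2)          # (B, num_patches, patch*patch*C)
    n = patches.shape[1]
    keep = int(n * (1 - mask_ratio))      # only 20% of patches stay visible
    idx = torch.rand(b, n).argsort(dim=1)[:, :keep]
    visible = torch.gather(
        patches, 1, idx.unsqueeze(-1).expand(-1, -1, patches.shape[-1])
    )
    # A reconstruction model is then trained to predict the masked patches.
    return visible, idx

imgs = torch.randn(4, 3, 224, 224)        # a dummy batch of images
visible, idx = random_patch_mask(imgs)
print(visible.shape)                      # torch.Size([4, 39, 768])
```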
He said that Meta AI is beginning to develop unified models that can grasp many modalities at once, such as reading lips while listening for improved speech recognition, or identifying policy-violating social media posts by assessing all of their components (text, image, and video) at the same time. But, according to Pesenti, Meta AI won’t stop there.
“We don’t simply want AI models that can recognize words, photos, and videos. We want AI models that can understand the whole world. And now, with the arrival of the metaverse, we have a unique challenge and a unique opportunity to do so.”
The metaverse introduces a slew of new obstacles
Pineau believes that the metaverse brings a slew of new difficulties with it. She notes that since much of the past decade’s rapid progress in AI is deeply rooted in internet data, it’s not surprising that modalities like speech, language, and vision, which are the internet’s native modalities, have seen the greatest improvement.
AR and VR, on the other hand, offer unique and expansive experiences and capabilities. “For example, movement, from the hands to the face to the whole body, becomes a primary vector for transmitting and receiving data. This opens up some exciting new possibilities, but it also necessitates significant advances in our AI models,” Pineau added.
Although Pesenti shared the objective of creating unified models, Pineau pointed out that this alone isn’t nearly enough, and that work on world models is critical. Building a world model, she continued, is a concept that AI researchers have discussed for years.
“The goal is to create a rich representation that can be used not just to generate predictions, but also to look forward in time and evaluate different options for actions or interventions,” she said. “Like supervised models, our world models will need to be trained using a combination of static pre-recorded data and a stream of interactive experiences as we progress toward building AI agents that can work fluidly across physical reality, augmented reality, and virtual reality.”
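In model-based reinforcement learning terms, the idea Pineau sketches, a learned model rolled forward in time to score candidate actions, might look something like the toy snippet below. The architecture, dimensions, and reward function are all invented for illustration; Meta AI has not published this as its method:

```python
# A heavily simplified sketch of a "world model": a learned dynamics
# model that predicts the next state, rolled forward in time to
# evaluate different candidate action sequences. Illustrative only.
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Predicts the next state from the current (state, action) pair."""
    def __init__(self, state_dim=8, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def evaluate_plans(model, state, plans, reward_fn):
    """Roll each candidate action sequence forward and score it."""
    scores = []
    for plan in plans:              # plan: tensor of shape (T, action_dim)
        s, total = state, 0.0
        for action in plan:
            s = model(s, action)    # "look forward in time"
            total += reward_fn(s).item()
        scores.append(total)
    return scores                   # pick the best-scoring plan to act on

model = WorldModel()
state = torch.zeros(8)
plans = [torch.randn(5, 2) for _ in range(3)]   # three candidate plans
print(evaluate_plans(model, state, plans, reward_fn=lambda s: -s.norm()))
```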
There are still a lot of unknowns: Pineau acknowledges that Meta AI doesn’t yet know all of the new techniques and algorithms it will develop in the coming years, but she does know that a few lines of research are about to shift dramatically. Embodiment and robotics is one of those areas. Meta AI is looking at robotics, according to Pineau, because it’s a prime example of where world models can make a big impact. The goal, she says, is to produce “unbounded robotics”: robots that can work fluidly in the home and office, interacting with people and objects as smoothly as possible, outside of the lab or more constrained environments such as factories.
“As we build robots that learn through rich interaction, one critical step is for the robot to increase its capacity to perceive the environment via touch,” she said.
Meta AI has been experimenting with novel touch sensors, collaborating with researchers at Carnegie Mellon University and MIT to develop sensors that use AI algorithms to infer contact position and quantify contact pressure from image changes captured by a camera inside the sensor. Pineau says that the DIGIT sensor, developed in collaboration with MIT, is substantially cheaper to produce than currently available commercial tactile sensors.
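The core idea behind such vision-based tactile sensors can be sketched as a small regression network that maps the in-sensor camera image to a contact estimate. The architecture below is purely an illustrative assumption, not the published DIGIT pipeline:

```python
# A hedged sketch of vision-based tactile sensing: a small CNN maps
# the sensor's internal camera image to a contact position (x, y)
# and a contact-pressure estimate. Architecture is illustrative only.
import torch
import torch.nn as nn

class TactileNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 3)    # (x, y) position + pressure

    def forward(self, sensor_image):
        return self.head(self.features(sensor_image))

frame = torch.randn(1, 3, 64, 64)       # one frame from the in-sensor camera
print(TactileNet()(frame).shape)        # torch.Size([1, 3])
```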
One of the problems Meta AI hopes to tackle is developing models that work in both the real world and virtual worlds, enabling avatars to pick up and manipulate objects in a realistic manner while maintaining consistency from one world to the other. Meta AI acknowledges that there is a significant gap between simulation and reality, and it is working to bridge that gap by using virtual environments to train and test new algorithms for robot navigation and manipulation, with realistic sensing of and interaction with spaces and objects.
While Pineau recognized that considerable work remains to build truly dependable world models, she also raised the intriguing question of whether world models need to be exact all of the time. Instead of attempting to perceive and replicate the actual environment, Meta AI is working on a project that will let people “lean into the inner kid we all have inside of us and be creative.”
“This is only the beginning,” she added. “As we investigate new ways that AI models may increase human creativity, you should expect to see a lot more.”
Open-sourcing its designs
Pineau said that Meta AI’s designs will be open source, making them available to research teams across the globe. “We built and released an open source library, in this case the PyTouch library, that includes several functionalities, such as detecting touch slip and estimating the pose of both the robot’s paw and the object itself, which can all be included as part of a broader system with navigation and other robotics capabilities,” she said.
Pineau stated that as Meta embarks on a new path to build AI for an “embodied, interactive metaverse,” the firm must raise the bar on how it does so, as well as on the principles it promotes in its design and technology. Pesenti agreed, saying Meta will raise that bar by adhering to best practices and responsible models for AI systems and technologies that are fair, inclusive, and transparent, and that give users greater control while safeguarding their privacy.
These best practices, according to Pesenti, are difficult to define, since the issues generally involve complicated social questions. “This is why it’s critical for us to be open about our work and share it with the larger responsible AI community, so that we can get their feedback and benefit from their knowledge,” he added.
By embracing feedback from its open source community as part of its path toward “responsible AI,” Meta seems to want to address some of the privacy concerns it has faced over the years.
“We’re also pleased to announce that Meta AI is making TorchRec, the recommendations library that underpins several of our products, open source,” Pesenti said. TorchRec demonstrates Meta AI’s dedication to openness and open research. It is part of the PyTorch ecosystem and includes common sparsity and parallelism primitives, allowing researchers to build the same kind of cutting-edge personalization that powers Facebook’s News Feed and Instagram Reels today. These are just a few solid steps on a long road to more responsible AI, according to Pesenti.
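To give a flavor of those sparsity primitives, here is a minimal TorchRec sketch based on the library’s public examples; the table sizes, feature names, and id values are made up for illustration:

```python
# A minimal sketch of TorchRec's sparse primitives: embedding tables
# for categorical features, fed by a jagged (variable-length) batch.
# Table sizes, feature names, and ids are illustrative assumptions.
import torch
from torchrec import EmbeddingBagCollection, EmbeddingBagConfig, KeyedJaggedTensor

ebc = EmbeddingBagCollection(
    tables=[
        EmbeddingBagConfig(name="t_user", embedding_dim=64,
                           num_embeddings=10_000, feature_names=["user_id"]),
        EmbeddingBagConfig(name="t_item", embedding_dim=64,
                           num_embeddings=50_000, feature_names=["item_id"]),
    ],
    device=torch.device("cpu"),
)

# A batch of two examples: each has one user id; the second example
# has two item ids (hence the "jagged" layout: lengths per key, per example).
features = KeyedJaggedTensor(
    keys=["user_id", "item_id"],
    values=torch.tensor([101, 202, 303, 404, 505]),
    lengths=torch.tensor([1, 1, 1, 2]),
)

pooled = ebc(features).to_dict()      # pooled embeddings per feature
print(pooled["user_id"].shape)        # torch.Size([2, 64])
```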