Artificial Intelligence | June 23, 2023
On Tuesday, Meta unveiled an AI model that can complete images using “common sense.” The company explained that, unlike other available tools, its model does not compare pixels; instead, it understands abstract representations grounded in “prior knowledge about the world.” On that basis, Meta claimed the model can complete unfinished images more accurately than other tools on the market.
They named it Image Joint Embedding Predictive Architecture (I-JEPA). The model reflects the vision of Meta’s Chief AI Scientist, Yann LeCun: to “create machines that can learn internal models of how the world works,” as stated in the company’s blog.
While systems like ChatGPT are trained using supervised learning methods, which rely on large sets of labeled data, I-JEPA learns directly from images or sounds without labels, Meta explained. The company refers to this as “self-supervised learning.”
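The distinction can be sketched in a few lines. This is purely illustrative (the array shapes, the mask, and the variable names are invented for this example, not taken from Meta’s code): in supervised learning the training target is a human-provided label, while in self-supervised learning the target is carved out of the data itself, for instance by hiding part of an image and asking the model to predict the hidden part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Supervised signal: the target is an external, human-provided label.
image = rng.random((8, 8))   # toy 8x8 "image"
label = 3                    # e.g. class "cat", supplied by an annotator

# Self-supervised signal: the target comes from the data itself.
# Hide a block of the image; the hidden pixels become the prediction target.
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 2:5] = True                   # hide a 3x3 block
context = np.where(mask, 0.0, image)    # what the model is shown
target = image[mask]                    # what the model must predict

print(target.shape)  # the 9 hidden values form the training target
```

No annotator is needed for the second signal, which is why, as the article notes, such systems can train on raw images or audio directly.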
If we show young children drawings of cows, they eventually learn to recognize any cow they see. Similarly, I-JEPA learns abstract representations by comparing parts of images rather than matching individual pixels.
Meta published examples of its AI completing images of various animals and a landscape. Based on context, the model could semantically infer which parts were missing, such as the head of a dog or the leg of a bird.
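A toy sketch can illustrate why comparing representations differs from comparing pixels. Everything here is an assumption for the sake of the example: the “encoder” is a fixed random projection (I-JEPA learns its encoders), and the shapes and names are invented. The point is only that two patches with nearly identical content can differ in every pixel yet map to almost the same abstract representation.

```python
import numpy as np

rng = np.random.default_rng(42)

def encode(patch, W):
    """Map a flattened patch to a unit-length abstract representation."""
    v = W @ patch.ravel()
    return v / np.linalg.norm(v)

W = rng.standard_normal((16, 64))        # hypothetical 64-pixel -> 16-dim encoder

patch = rng.random((8, 8))
noisy = patch + rng.normal(0, 0.01, (8, 8))  # same content, slightly perturbed

# Pixel-space comparison accumulates a nonzero error at every pixel...
pixel_error = np.abs(patch - noisy).sum()

# ...while in representation space the two patches stay nearly identical
# (cosine similarity close to 1).
similarity = encode(patch, W) @ encode(noisy, W)

print(pixel_error, similarity)
```

Comparing in representation space is what lets a model judge whether a completed region is *semantically* right, rather than penalizing every harmless pixel difference.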
“Human and non-human animals seem capable of learning huge amounts of prior knowledge about how the world works through observation and through an incomprehensibly small amount of task-independent, unsupervised interactions,” LeCun explained in a post from February 2022. It is worth hypothesizing, he said at the time, that this accumulated knowledge may “constitute the basis of what is commonly called common sense.”
This “common sense” would guide AI models to understand what is probable, possible, and impossible. That’s why, according to Meta, I-JEPA would not make errors that are common in images generated by other AIs, such as hands with more than five fingers.
AI systems based on labeled datasets (like ChatGPT) are often very good at specific tasks they were trained for. “But it’s impossible to label everything in the world,” Meta explained in another research report from 2021.
There are also some tasks for which there simply isn’t enough labeled data. If AI systems can gain a deeper understanding of reality beyond their training, they “will be more useful and ultimately bring AI closer to human-level intelligence.”
Achieving “common sense” would be like capturing the “dark matter” of AI, Meta explained in 2021. The company believes that this type of AI can learn much faster, plan how to perform complex tasks, and adapt easily to unfamiliar situations.
After focusing for a while on the Metaverse, Meta has now started to draw more attention to its AI developments. In May, they launched AI Sandbox, a “testing ground” for early versions of AI-driven advertising tools. Currently, the tests are focused on text drafting, background generation, and image overlay.
They have also introduced LLaMA, their large language model, and SAM, an AI capable of recognizing elements and meanings within an image. Furthermore, Mark Zuckerberg, CEO of the company, revealed plans to develop a virtual assistant focused on improving users’ social lives.
I-JEPA, like the other developments announced by Meta, is currently designed to be tested by the scientific community rather than the general public. This has been Meta’s distinctive approach compared to the competition.