Imagine an AI with an imagination

Understanding "why" may be the key to unlocking an AI's imagination.

July 20, 2021

Humans are really good at imagining things based on the mixing and matching of existing elements.
One of the holy grails of computer science is the development of an AI that can extrapolate from data, and USC researchers have a model for how that could happen.
Rather than focusing on the tiny details of individual samples, the model uses groups of related samples to encourage AI to figure out broader principles.


One explanation of human imagination, and of creativity, is that it’s the process of creating something new by combining existing elements in a novel way. It could be a daydream built on “what ifs,” familiar rhythms and motifs turned into a new song, or seemingly unrelated bits of knowledge brought together for the first time as the building blocks of a breakthrough insight.

Using our imaginations comes naturally to us. We do it all the time in ways big and small. For artificial intelligence, however, recombining elements of different things is the opposite of what comes “naturally.” Machines learn by breaking things down and cataloguing the existing attributes of objects in order to identify them. These traits are not treated as free-floating characteristics available for mixing and matching in new ways.

Now, a team of researchers from the University of Southern California has announced the development of something profoundly new: a model for an AI with imagination.

“Humans can separate their learned knowledge by attributes, for instance, shape, pose, position, color,” the study’s lead author Yunhao Ge tells USC Viterbi, “and then recombine them to imagine a new object. Our paper attempts to simulate this process using neural networks.”

Disentanglement

They’re all machines that fly. (Credit: yganko / Adobe Stock/Big Think)

Machine learning typically occurs through the close examination of images and the recording of up-close attributes, such as the colors of pixels. The goal is for an algorithm to correctly identify a new image of the same (or a similar) object. AI makes no attempt to understand what the object is or how it works. Machine learning is mostly pattern recognition.

Scientists have long dreamed, however, of an AI that can extrapolate from what it has learned by inferring from small details an object’s broader themes, including how it operates. Identifying a picture of an airplane is good; identifying why and how it’s a “flying machine” is much better.

The key to endowing an AI with imagination, the USC researchers said, is “disentanglement,” the idea that attributes can be unbundled, or separated, from the objects that exhibit those traits.

Indeed, disentanglement is the concept behind the increasingly amazing “deep fakes” proliferating on the internet. For instance, in a deep fake video, a face’s movement is disentangled from its identity. This allows deep fake creators to, as Ge says, “synthesize new images and videos that substitute the original person’s identity with another person, but keep the original movement.”

Teaching AI


“Controllable disentangled representation learning” is the term USC’s researchers have bestowed on the way their algorithm learns. Instead of being fed individual objects to catalogue, the AI is given a sample group of related images with the goal of analyzing them until it ultimately discovers the broader themes that unify them. The individual attributes can then be disentangled from the basic characteristics that identify an object.
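As a rough illustration of what "a sample group of related images" could look like in practice, the sketch below groups labelled samples by a shared attribute value. The sample records, attribute names, and the group_by_attribute helper are all hypothetical, and the assumption that "related" means "sharing one attribute value" is ours, not a description of the USC team's actual pipeline.

```python
from collections import defaultdict

# Hypothetical labelled samples; ids and attribute values are invented
# for illustration and do not come from the USC dataset.
samples = [
    {"id": "img0", "attrs": {"shape": "car", "color": "yellow", "background": "city"}},
    {"id": "img1", "attrs": {"shape": "car", "color": "red", "background": "desert"}},
    {"id": "img2", "attrs": {"shape": "plane", "color": "yellow", "background": "sky"}},
    {"id": "img3", "attrs": {"shape": "plane", "color": "red", "background": "city"}},
]


def group_by_attribute(samples, attribute):
    """Return {attribute value: [sample ids]} for samples sharing that value."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["attrs"][attribute]].append(sample["id"])
    return dict(groups)


# Samples in one group differ in everything except the shared attribute,
# which is the kind of contrast that could hint at what "color" or "shape" means.
color_groups = group_by_attribute(samples, "color")
shape_groups = group_by_attribute(samples, "shape")
print(color_groups)
print(shape_groups)
```

Within each group, the members agree on one trait and vary on the rest, so a learner comparing them has a reason to treat that trait as separable from the object itself.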

“For instance,” says Ge, “take the Transformer movie as an example. It can take the shape of Megatron car, the color and pose of a yellow Bumblebee car, and the background of New York’s Times Square. The result will be a Bumblebee-colored Megatron car driving in Times Square, even if this sample was not witnessed during the training session.”
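The recombination Ge describes can be sketched in a few lines, assuming each image has already been encoded into a disentangled code with one slot per attribute. Everything here is hypothetical: the slot names, the toy numbers, and the recombine function are illustrative stand-ins, and the neural encoder and decoder that would produce and render such codes are not shown.

```python
def recombine(sources, plan):
    """Build a new code by copying each attribute slot from the source named in plan."""
    new_code = {}
    for attribute, source_name in plan.items():
        # Copy the slot so the new code does not alias the source's list.
        new_code[attribute] = list(sources[source_name][attribute])
    return new_code


# Toy disentangled codes (made-up numbers), one slot per attribute.
sources = {
    "megatron": {"shape": [0.9, 0.1], "color": [0.3, 0.3],
                 "pose": [0.0, 1.0], "background": [0.1, 0.1]},
    "bumblebee": {"shape": [0.2, 0.8], "color": [1.0, 0.9],
                  "pose": [0.7, 0.2], "background": [0.4, 0.4]},
    "times_square": {"shape": [0.5, 0.5], "color": [0.6, 0.1],
                     "pose": [0.5, 0.5], "background": [0.8, 0.6]},
}

# Which source supplies which attribute, following Ge's example.
plan = {
    "shape": "megatron",
    "color": "bumblebee",
    "pose": "bumblebee",
    "background": "times_square",
}

imagined = recombine(sources, plan)
print(imagined)
```

In a full system, a decoder would turn the imagined code back into an image. The point of the sketch is only that, once attributes occupy separate slots, a combination never seen in training becomes a matter of choosing slots from different sources.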

The team has developed a dataset of 1.56 million images for training their AI.

From mimicking to understanding

According to the study’s senior author Laurent Itti, “Deep learning has already demonstrated unsurpassed performance and promise in many domains, but all too often this has happened through shallow mimicry, and without a deeper understanding of the separate attributes that make each object unique.”

Noting that their model can be applied to many different types of data, the researchers foresee AI being able to overcome its current myopia. For instance, it may help scientists discover new combinations of existing compounds based on an analysis of their disentangled properties. Autonomous driving AI could be that much more useful and safe if it could imagine, and thus anticipate, hazardous scenarios.

Itti concludes, “This new disentanglement approach, for the first time, truly unleashes a new sense of imagination in AI systems, bringing them closer to humans’ understanding of the world.”

Robby Berman

Contributing Writer
