'Experience what characters are feeling': U of T researchers use AI to add '4D' effects to movies
James Cameron's 3D film Avatar sought to revolutionize the movie-going experience when it was first released in 2009, creating an immersive world for viewers. But what if you also wanted to feel the heat and the wind while flying on a banshee, directly from your cinema seat?
While a small number of so-called "4D" movies that add a physical element already exist, researchers from the University of Toronto are working on a way to apply the feature more broadly.
“Usually the chair will shake, there can be splashing or some other kind of interaction while watching the film,” says Yuhao Zhou, a fourth-year undergraduate in the Edward S. Rogers Sr. department of electrical & computer engineering, describing the emerging form of entertainment. “Right now all these effects are created from the first phase of production. We’d like to automate this kind of process for movies that were not originally created for 4D cinemas.”
Zhou is working with Makarand Tapaswi, a U of T postdoctoral fellow of computer science, and Sanja Fidler, an assistant professor at U of T Mississauga’s department of mathematical and computational sciences and the tri-campus graduate department of computer science. They recently had their work, Now You Shake Me: Towards Automatic 4D Cinema, featured in a spotlight presentation at the Computer Vision and Pattern Recognition (CVPR) conference in Salt Lake City, Utah.
Zhou says a 4D movie is usually perceived from the first-person viewpoint, or camera. If Will Turner in Pirates of the Caribbean is feeling the wind blowing in his face, and the moviegoer wants to experience being Turner, then they, too, would have to experience wind in their face.
“We want to have a feature where you can just flip a switch and experience what characters are feeling,” Zhou says.
To take a regular or 3D movie to 4D, the researchers used a freelance website to annotate the film’s effects for their 4D prediction model.
“For example, [in Lord of the Rings: The Fellowship of the Ring] Frodo pulls Sam out of the water, but there are several effects happening simultaneously,” says Zhou, who began working with Fidler during his third year of undergraduate studies. “First, he pulls him – there's a physical interaction with the hand. When Sam goes back down into the water, he pulls Frodo, and the boat shakes.”
“The camera is your input,” adds Tapaswi. “But in this case you want to experience not only what the camera sees, but also one of the characters – relive how the characters felt shaking and so on.”
While physical interactions – that is, a hand pulling – are still beyond what 4D technology can reproduce, Tapaswi envisions pressure sensors to simulate touch as the technology advances. The model could also prove useful in other areas such as virtual reality or augmented reality.
“We're collecting these types of annotations for future studies,” Zhou says.
The researchers tackled two tasks on their dataset: effect classification and effect detection. For classification, Zhou says their neural network, a machine-learning model that learns patterns from data, extracted features, including movement and audio, from a short clip. For detection, he says, the neural net predicts what the effects are, and where they occur, in a long video clip.
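The article does not describe how the detection step works internally. As a rough illustration of the difference between the two tasks, the sketch below assumes a classifier has already labelled each short clip, then merges runs of identical labels into timed segments that say what the effect is and where it occurs. The function name, the clip length and the labels are illustrative assumptions, not the researchers' method.

```python
from itertools import groupby


def clips_to_segments(clip_labels, clip_seconds=1.0, background="none"):
    """Merge runs of identical per-clip labels into (effect, start, end) segments.

    clip_labels: one label per consecutive short clip (hypothetical classifier output).
    clip_seconds: assumed duration of each clip, in seconds.
    background: label meaning "no effect"; such runs are skipped.
    """
    segments = []
    index = 0  # position of the first clip in the current run
    for label, run in groupby(clip_labels):
        length = len(list(run))
        if label != background:
            start = index * clip_seconds
            end = (index + length) * clip_seconds
            segments.append((label, start, end))
        index += length
    return segments


# Hypothetical per-clip labels for a six-clip video, two seconds per clip.
labels = ["none", "wind", "wind", "shake", "none", "splash"]
segments = clips_to_segments(labels, clip_seconds=2.0)
print(segments)
```

Classification answers "which effect is in this clip?", while the merged segments answer the detection question for the whole video.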
“You don't only want to know what happens to a character in a particular shot. You want to be able to say, '[the effect] is wind now, not only because I see the wind right now, but [because] it was probably windy before,’” Tapaswi says.
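One simple way to picture the use of earlier context is to average each clip's score with the scores of the clips just before it. This is only a sketch under that assumption: the scores are invented, and the researchers' actual model may combine context quite differently.

```python
def smooth_with_history(scores, history=2):
    """Average each clip's score with up to `history` preceding clips."""
    smoothed = []
    for i in range(len(scores)):
        window = scores[max(0, i - history): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Invented per-clip "wind" scores; the third clip looks calm in isolation.
wind_scores = [0.9, 0.8, 0.2, 0.9]
smoothed = smooth_with_history(wind_scores, history=2)

threshold = 0.5  # assumed cut-off for switching the effect on
raw_decisions = [score > threshold for score in wind_scores]
smoothed_decisions = [score > threshold for score in smoothed]
print(raw_decisions)
print(smoothed_decisions)
```

With these numbers the third clip falls below the threshold on its own, but stays above it once the two windy clips before it are taken into account.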
The researchers found certain genres of movies tended to share similar effects – for example, movies set in space like Interstellar or Gravity. This can be seen as a novel way to cluster, says Tapaswi.
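To make the clustering idea concrete, the sketch below compares films by how often each effect occurs, using cosine similarity between effect-count vectors. The film names and counts are invented placeholders, and cosine similarity is an assumed choice of measure, not one attributed to the researchers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two effect-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented effect counts in the order (shake, wind, splash).
space_film_a = [40, 2, 0]
space_film_b = [35, 5, 1]
sea_film = [10, 20, 30]

same_genre = cosine_similarity(space_film_a, space_film_b)
cross_genre = cosine_similarity(space_film_a, sea_film)
print(same_genre > cross_genre)
```

Films whose effect profiles point in a similar direction score close to 1, so grouping by this score would place the two placeholder space films together.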
“Usually with 3D movies, film-goers wear glasses and sit in a chair,” says Zhou. “With automatic 4D cinema, the neural network would process 2D and 3D movie information, feed it into the chair, and simulate the effects.”