Google’s Genie 2 “world model” reveal leaves more questions than answers



As podcaster Ryan Zhao put it on Bluesky, “The design process has gone wrong when what you need to prototype is ‘what if there was a space.'”

Gotta go fast

When Google revealed the first version of Genie earlier this year, it also released a detailed research paper outlining the specific steps taken behind the scenes to train the model and how that model generated interactive videos. Google hasn’t published a comparable paper detailing Genie 2’s process, leaving us guessing at some important details.

One of the most important of these details is model speed. The first Genie model generated its world at roughly one frame per second, a rate orders of magnitude too slow to be tolerably playable in real time. For Genie 2, Google only says that “the samples in this blog post are generated by an undistilled base model, to show what is possible. We can play a distilled version in real-time with a reduction in quality of the outputs.”

Reading between the lines, it sounds like the full version of Genie 2 operates at something well below the real-time interactions implied by those flashy GIFs. It’s unclear how much “reduction in quality” is necessary to get a distilled version of the model to real-time controls, but given the lack of examples presented by Google, we have to assume that reduction is significant.

Oasis’ AI-generated Minecraft clone shows great potential, but still has a lot of rough edges, so to speak.


Credit: Oasis

Real-time, interactive AI video generation isn’t exactly a pipe dream. Earlier this year, AI model maker Decart and hardware maker Etched published the Oasis model, showing off a human-controllable, AI-generated video clone of Minecraft that runs at a full 20 frames per second. However, that 500 million parameter model was trained on millions of hours of footage of a single, relatively simple game, and focused exclusively on the limited set of actions and environmental designs inherent to that game.

When Oasis launched, its creators fully admitted the model “struggles with domain generalization,” showing how “realistic” starting scenes had to be reduced to simplistic Minecraft blocks to achieve good results. And even with those limitations, it’s not hard to find footage of Oasis degenerating into horrifying nightmare fuel after just a few minutes of play.

