Spell is an AI model that generates full 3D scenes, or “Worlds”, from an image in just a few minutes. The worlds are consistent with the input image and are represented as a volume that can be rendered using Gaussian Splatting (or NeRFs).
Oh, looks like someone hunted this for us!
For anyone interested: this has been a massive endeavor in terms of research and development.
Spell is a special model because it has a deeper understanding of 3D space, which means that it can approximate physical behaviors.
We wrote more about it here: https://blog.spline.design/intro...