V-JEPA 2

Meta's world model for physical world understanding

Open Source
Robots
Artificial Intelligence
GitHub

V-JEPA 2 is Meta's new world model, trained on video to understand and predict the physical world. It enables zero-shot robot planning and achieves state-of-the-art results on visual understanding benchmarks. The model, code, and new benchmarks are now open.

Top comment

Hi everyone!

V-JEPA 2 is Meta's new world model, a serious take on building AI that understands the physical world with the kind of intuition humans have. It's a foundational step towards what they call Advanced Machine Intelligence (AMI).

It learns from over a million hours of video, not just static images, to build a sense of how things move, interact, and follow basic physics. This allows it to understand and predict what might happen next in a scene.
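
If you're curious what makes the JEPA family different: "JEPA" stands for Joint Embedding Predictive Architecture, and the key idea is that prediction happens in representation space rather than pixel space, so the model only has to get the gist of what happens next right, not every pixel. Here's a rough PyTorch sketch of that training signal; the module sizes, masking, and mask tokens are purely illustrative, not V-JEPA 2's actual architecture:

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the real model is a large ViT over spatiotemporal patches.
EMBED_DIM, NUM_TOKENS = 256, 64

def tiny_transformer(layers: int) -> nn.Module:
    layer = nn.TransformerEncoderLayer(EMBED_DIM, nhead=4, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=layers)

encoder = tiny_transformer(2)         # sees only the visible (context) tokens
target_encoder = tiny_transformer(2)  # an EMA copy in the real recipe; provides targets
predictor = tiny_transformer(1)       # fills in embeddings for the hidden tokens

tokens = torch.randn(2, NUM_TOKENS, EMBED_DIM)  # stand-in video tokens (B, N, D)
mask = torch.zeros(NUM_TOKENS, dtype=torch.bool)
mask[NUM_TOKENS // 2:] = True                   # hide half the clip

context = encoder(tokens[:, ~mask])
with torch.no_grad():
    targets = target_encoder(tokens)[:, mask]   # targets are *representations*

# Mask tokens stand in for hidden positions (learnable in real training; zeros here).
n_masked = int(mask.sum())
mask_tokens = torch.zeros(2, n_masked, EMBED_DIM)
pred = predictor(torch.cat([context, mask_tokens], dim=1))[:, -n_masked:]

# The loss compares predicted vs. target embeddings -- no pixel reconstruction.
loss = nn.functional.l1_loss(pred, targets)
print(loss.item())
```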

And this isn't just theory. It's being used for zero-shot robot planning, letting a robot pick up and move objects it has never seen before. That’s a very impressive demonstration of where this technology is headed.
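
From what Meta describes, the planning loop works by encoding a goal image, rolling candidate action sequences through the action-conditioned predictor, and executing whichever first action leads closest to the goal in embedding space. A rough sketch of that idea follows; encode and world_model here are placeholder functions, not the released API:

```python
import torch

# Placeholder stand-ins for V-JEPA 2-AC's encoder and action-conditioned
# predictor; only the shapes matter for illustrating the planning loop.
def encode(image):                       # image -> latent state (D,)
    return torch.randn(256)

def world_model(state, action):          # predicts the next latent state
    return state + 0.1 * action.sum() * torch.ones_like(state)

def plan(current_image, goal_image, horizon=5, num_candidates=512):
    """Zero-shot planning as search: no task-specific training involved.
    Sample random action sequences, imagine their outcomes with the world
    model, and keep the first action of the best sequence (MPC-style)."""
    state = encode(current_image)
    goal = encode(goal_image)

    candidates = torch.randn(num_candidates, horizon, 7)  # e.g. 7-DoF arm actions
    costs = torch.empty(num_candidates)
    for i, actions in enumerate(candidates):
        s = state
        for a in actions:                # roll the sequence through the model
            s = world_model(s, a)
        costs[i] = torch.norm(s - goal)  # distance to goal in embedding space

    best = candidates[costs.argmin()]
    return best[0]                       # execute the first action, then replan

action = plan(current_image=None, goal_image=None)
print(action.shape)  # torch.Size([7])
```

The "zero-shot" part is that nothing in this loop is trained on the target task; the world model's predictions do all the work.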

On top of the model and code, Meta has also released three new benchmarks for physical reasoning, a welcome contribution that should help push the whole research community forward.
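
Getting started looks pleasantly simple via the Hugging Face integration. This snippet is from memory of the model card, so treat the checkpoint name and processor class as things to double-check there:

```python
import torch
from transformers import AutoModel, AutoVideoProcessor

# Checkpoint name recalled from the Hugging Face release; verify on the hub.
REPO = "facebook/vjepa2-vitl-fp16"

processor = AutoVideoProcessor.from_pretrained(REPO)
model = AutoModel.from_pretrained(REPO)

# Stand-in for a decoded clip: 16 RGB frames as a (T, C, H, W) uint8 tensor.
video = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)
inputs = processor(video, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Patch-level video embeddings, usable as features for downstream tasks.
print(outputs.last_hidden_state.shape)
```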

Comment highlights

Just gave it a quick look and I’m impressed by how easy it is to get started. Definitely something I could see myself using.

V-JEPA 2 feels like a major leap toward AI systems that truly understand real-world dynamics; it's especially exciting to see it applied to zero-shot robot planning. How well does it generalise to edge cases or unpredictable environments where physical rules might not be so clear?

Huge release from Meta! V-JEPA 2’s ability to learn from video and perform zero-shot planning is a major step toward more intuitive, grounded AI. Exciting to see the model, code, and benchmarks all open — can’t wait to see how the community builds on this.