Phi-4-reasoning-vision

Open-weight 15B multimodal model for thinking and GUI agents

Phi-4-reasoning-vision-15B is a compact open-weight multimodal model built on a mid-fusion architecture. It balances fast, direct perception with deep chain-of-thought reasoning, which makes building capable computer-use agents and solving complex math problems more efficient.

Top comment

Hi everyone!

Phi-4-Reasoning-Vision-15B is Microsoft's new 15B open-weight model that makes multimodal reasoning feel much more efficient.

It was trained on 200B multimodal tokens, handles high-res screens well, and stays direct on simpler tasks while switching into deeper reasoning when needed.

Looks especially strong for math, science, and computer-use agents. Weights are on Hugging Face.