MiniCPM-V 4.5

GPT-4o level vision model on the phone

MiniCPM-V 4.5 is a new 8B open-source MLLM that delivers GPT-4o-level performance on your phone. It excels at image, video, and document understanding, beating top proprietary models on key benchmarks like OCRBench.

Top comment

Hi everyone!

While on-device models still have a way to go to catch up with the cloud, recent progress has been incredibly fast. The promise has always been about running powerful capabilities locally, and MiniCPM-V 4.5 is a huge step in that direction.

It's an 8B open-source multimodal model that is outperforming giants like GPT-4o and Gemini Pro on major vision benchmarks.

Its efficiency and accessibility are awesome. It has great OCR and video understanding, and it's easy to run with tools like Ollama and llama.cpp. This is a very powerful new option for building on the edge.
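For anyone who wants to see what "easy to run with Ollama" could look like in practice, here is a minimal sketch, not an official recipe. It assumes an Ollama server is already running locally on its default port and that the model has been pulled under some tag. The tag is passed in as a parameter because the exact name it is published under is not stated here. The helper names `build_payload` and `describe_image` are made up for illustration.

```python
import base64
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.

    Images are sent as a list of base64-encoded strings.
    """
    return {
        "model": model,  # hypothetical tag; check what the model is published as
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON response instead of a stream
    }


def describe_image(model: str, prompt: str, image_path: str,
                   url: str = OLLAMA_URL) -> str:
    """Send one image plus a prompt to the local server and return the text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(model, prompt, f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A call would look like `describe_image("<model-tag>", "Read the text in this image.", "receipt.png")`, where the tag and file name are placeholders.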

You can try the model in its Gradio demo.