Run leading vision models locally with the new engine
Ollama v0.7 introduces a new engine for first-class multimodal AI, starting with vision models like Llama 4 & Gemma 3. The new engine offers improved reliability, accuracy, and memory management for running these models locally.
Hi everyone!
Ollama v0.7 is here, and it's a significant update built around a new engine for multimodal AI. This is a big step for running powerful vision models locally with Ollama!
With this new engine, Ollama now offers first-class, native support for vision models like Meta's Llama 4, Google's Gemma 3, and Qwen 2.5 VL. The aim is improved reliability, accuracy, and memory management when working with these complex models on your own machine. It also simplifies how new models can be integrated into Ollama.
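If you want to try a vision model right away, here's a minimal sketch using the official `ollama` Python client. It assumes you've installed the package (`pip install ollama`), have the Ollama server running, and have already pulled a vision-capable model (for example `ollama pull gemma3`); the model tag and image path below are placeholders, not anything specific from the release notes.

```python
# Minimal sketch: ask a locally running vision model about an image.
# Assumes the Ollama server is running and a vision-capable model has been
# pulled, e.g. `ollama pull gemma3`. Model tag and image path are placeholders.
import ollama

response = ollama.chat(
    model="gemma3",  # any vision-capable model tag should work here
    messages=[
        {
            "role": "user",
            "content": "Describe what is in this image.",
            "images": ["./photo.jpg"],  # local file path; the client handles encoding
        }
    ],
)

print(response["message"]["content"])
```

The same request shape works against the REST API at http://localhost:11434 if you'd rather not use the Python client; the client is just the most compact way to show how an image is attached to a chat message.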
Beyond supporting current vision models, this update also lays the groundwork for Ollama to handle more modalities in the future, such as speech, image generation, and video.
It's good to see Ollama expanding its core capabilities for advanced local AI.