GLM-4.6V

Open-source multimodal model with native tool use

Open Source
Artificial Intelligence
Development

GLM-4.6V is GLM's newest open-source multimodal model with a 128k context window. It features native function calling, bridging visual perception with executable actions for complex agentic workflows like web search and coding.

Top comment

Hi everyone!

GLM-4.6V is a significant iteration for the GLM multimodal series. It scales the training context window to 128k and hits SOTA visual understanding for its size.

The biggest update here is the native Function Calling. For the first time in the GLM architecture, tool use is integrated directly into the visual model. This effectively bridges the gap from "visual perception" to "executable action."

It can automatically generate high-quality interleaved image-text content and handle complete workflows independently, like viewing products, comparing prices, and generating shopping lists. The frontend replication and visual interaction capabilities are also impressive, significantly shortening the path from design to code for developers.
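To make the "perception to action" idea a bit more concrete, here is a minimal sketch of the application side of a function-calling loop for the price-comparison case. Everything in it is an assumption for illustration: the `compare_prices` tool, its schema, and the shape of `example_call` are hypothetical and not taken from the GLM-4.6V documentation. The schema follows the JSON-schema style that many chat APIs accept, and the model's output is replaced by a hard-coded stand-in, so no network call is made.

```python
import json

# Hypothetical tool schema (illustrative only, not from the GLM-4.6V docs).
# A list of such schemas is what an application would typically send to the model.
COMPARE_PRICES_TOOL = {
    "type": "function",
    "function": {
        "name": "compare_prices",
        "description": "Return the cheapest offer among candidate products.",
        "parameters": {
            "type": "object",
            "properties": {
                "offers": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "number"},
                        },
                        "required": ["name", "price"],
                    },
                }
            },
            "required": ["offers"],
        },
    },
}


def compare_prices(offers):
    """Pick the offer with the lowest price."""
    return min(offers, key=lambda offer: offer["price"])


# Maps tool names the model may emit to local Python callables.
TOOL_REGISTRY = {"compare_prices": compare_prices}


def dispatch_tool_call(tool_call):
    """Run one model-emitted tool call and return its result as a JSON string."""
    name = tool_call["name"]
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(tool_call["arguments"])
    return json.dumps(TOOL_REGISTRY[name](**args))


# Stand-in for what a model might emit after reading two product screenshots.
example_call = {
    "name": "compare_prices",
    "arguments": json.dumps({"offers": [
        {"name": "Store A", "price": 24.99},
        {"name": "Store B", "price": 19.5},
    ]}),
}
result = dispatch_tool_call(example_call)
```

In a real integration, the JSON string in `result` would be sent back to the model as the tool's output so it can continue the workflow, for example by drafting the shopping list.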

Try it on Z.ai or find the open weights on Hugging Face.

Comment highlights

Wow, Z.ai looks incredible! That 128k context window on GLM-4.6V is a game changer. How well does the function calling handle nested tools in agentic workflows?

How consistent is the function calling when handling multi-step visual tasks?

Native function calling + 128k context is huge. This could be a game changer for building actual AI agents instead of just chatbots.

Big question - how does this compare to Claude or GPT-4V in real-world tasks? Any benchmarks? Also curious about API pricing when it goes live.

Seems like solid work! 🚀

Wait wait wait ...

It can analyze images and handle tasks for me? Nice, my lazy era has officially begun 😄

Gave Z.ai a spin on the train. Simple UI, snappy. Rumination mode took a bit but made cleaner steps. 128k context for free is wild. MIT license too. Really want to see how the function calling handles web/code tools. Bookmarking to test a longer task tonight.

Their progress is amazing, and I actually love them now more than I loved ChatGPT when it came out. Somehow ChatGPT, even though they improved the GPT models, seems to stagnate, whereas Z.ai really shines with their open-source releases. (Which also shows that OpenAI should have stayed open; I am pretty sure they would be rocking the generative AI world by now.)

Cool! We used Z.ai for our hackathon in Slovakia (actually, they were our main sponsor), so happy to see how they progress even on PH :)