A 128B model for coding, reasoning, and long tasks
Mistral Medium 3.5 is a 128B dense model merging coding, reasoning, and instruction-following in one set of weights. 256k context, configurable reasoning effort. Open weights on Hugging Face for engineers and teams running self-hosted inference.
Mistral just shipped their most capable model yet, and it runs self-hosted on four GPUs.
What it is: Mistral Medium 3.5 is a 128B dense model that merges instruction-following, reasoning, and coding into a single set of weights, with a 256k context window and configurable reasoning effort per request.
Most frontier-class models either require massive infrastructure to self-host or lock you into proprietary APIs.
Mistral Medium 3.5 sits in an interesting position: it scores 77.6% on SWE-Bench Verified, ahead of models like Qwen3.5 397B A17B, while running on as few as four GPUs.
Reasoning effort is configurable per call, so you aren't paying or waiting for deep reasoning on a simple reply, yet the same model can handle a multi-step agentic run.
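As a rough illustration of what per-request effort control could look like, here is a minimal sketch against Mistral's chat completions endpoint. The model ID and the reasoning_effort field are assumptions inferred from the launch description, not confirmed API parameters.

```python
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"}

def ask(prompt: str, effort: str) -> str:
    """Send one chat request at a given reasoning effort level."""
    payload = {
        "model": "mistral-medium-3.5",  # assumed model ID, not confirmed
        "reasoning_effort": effort,     # assumed parameter name, e.g. "low" | "high"
        "messages": [{"role": "user", "content": prompt}],
    }
    resp = requests.post(API_URL, headers=HEADERS, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Cheap, fast path for a simple reply...
print(ask("Summarize this changelog in one sentence.", effort="low"))
# ...and deeper reasoning only where the task warrants the cost and latency.
print(ask("Plan a multi-step refactor of this module.", effort="high"))
```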
What makes it different: This is Mistral's first "merged" flagship model, meaning instruction-following, reasoning, and coding live in one set of weights rather than being split across specialised variants.
The open weights are released under a modified MIT license on Hugging Face, and it's already the default model in both Mistral Vibe and Le Chat.
The vision encoder was trained from scratch to handle variable image sizes and aspect ratios.
Key features:
128B dense model, 256k context window
Configurable reasoning effort per request
77.6% on SWE-Bench Verified
Open weights on Hugging Face under a modified MIT license
Self-hostable on as few as 4 GPUs (see the serving sketch below)
API priced at $1.50 per million input tokens and $7.50 per million output tokens
Powers Vibe remote coding agents and Le Chat Work mode (Pro/Team/Enterprise plans)
Available on NVIDIA build.nvidia.com and as a NIM container
Benefits:
Run a frontier-class model on your own infrastructure without a large GPU cluster
Tune reasoning depth at the API level, useful for cost-sensitive agentic pipelines
Single model handles the full range from quick chat replies to long-horizon coding tasks
Open weights mean fine-tuning, auditing, and on-prem deployment are all on the table
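To put the four-GPU claim in perspective: 128B parameters at one byte per weight (FP8) is roughly 128 GB, or about 32 GB per GPU across four cards before KV cache; BF16 would double that. A minimal serving sketch with vLLM's offline API might look like the following, where the Hugging Face repo name and the FP8 flag are assumptions rather than confirmed release details.

```python
from vllm import LLM, SamplingParams

# Shard the dense 128B model across 4 GPUs with tensor parallelism.
llm = LLM(
    model="mistralai/Mistral-Medium-3.5",  # placeholder repo name, not confirmed
    tensor_parallel_size=4,
    quantization="fp8",     # assumption: an FP8 checkpoint or kernel path exists
    max_model_len=262144,   # the advertised 256k context window
)

params = SamplingParams(max_tokens=512, temperature=0.2)
outputs = llm.generate(["Write a unit test for an LRU cache."], params)
print(outputs[0].outputs[0].text)
```

For comparison with the hosted API, at the listed rates a single agentic request with 100k input tokens and 10k output tokens works out to 0.1 × $1.50 + 0.01 × $7.50, or about $0.23.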
Who it's for: Backend and ML engineers evaluating open-weight alternatives to proprietary frontier models for agentic pipelines, coding tools, or self-hosted inference.
The interesting design choice here is the merged weights architecture.
Most labs at this capability tier still ship separate reasoning and instruction models.
Collapsing them with configurable effort per call is a practical tradeoff that's worth watching as other labs respond.
About Mistral Medium 3.5 on Product Hunt
“A 128B model for coding, reasoning, and long tasks”
Mistral Medium 3.5 launched on Product Hunt on April 30th, 2026, earning 98 upvotes and 1 comment and placing #12 on the daily leaderboard.
On the analytics side, Mistral Medium 3.5 is listed under Android, Newsletters, and Artificial Intelligence, topics that collectively have 536.4k followers on Product Hunt.
Who hunted Mistral Medium 3.5?
Mistral Medium 3.5 was hunted by Kumar Abhishek. A “hunter” on Product Hunt is the community member who submits a product to the platform, uploading the images and the link and tagging the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community.
For a complete overview of Mistral Medium 3.5, including community comment highlights and product details, visit the product overview.