Introducing Dream 7B, the most powerful open diffusion large language model to date. Matches or exceeds similar-sized autoregressive (AR) models such as LLaMA3 and Qwen2.5. Excels at planning and offers flexible inference.
As an HKU alum, I'm incredibly proud of Dream 7B, a new language model from the HKU NLP Group & Huawei Noah’s Ark Lab.
Watching a diffusion model like Dream 7B generate text is fascinating. Instead of the word-by-word output of standard autoregressive models, the text seems to emerge and refine itself all at once, almost like watching a dream form. It’s a unique feeling!
While diffusion isn't the mainstream approach for language models yet, Dream 7B clearly shows the approach's huge potential. It achieves performance comparable to top autoregressive models of its size and brings natural strengths in planning and flexible generation (like infilling or controlling the output order).
They've released the model weights and code openly, which is fantastic. This really feels like a glimpse into exciting new possibilities for AI.
It's somewhat like another reasoning mode of the brain: intuitive reasoning. "Even though I don't know why the answer is this way, I somehow just know that this is the answer."
As a non-technical person, I am very curious to see what interesting effects such a reasoning mode could produce in a real work environment. We still need precise language models that generate text word by word, but I would also be very pleased to see faster, more intuition-based language models.
Dream 7B could revolutionize my productivity with its planning capabilities. How do you see it transforming workflows in your projects?