Powerful Open Diffusion LLM, Beyond Autoregressive
Introducing Dream 7B, the most powerful open diffusion large language model to date. It matches or exceeds similarly sized autoregressive (AR) models such as LLaMA3 and Qwen2.5, excels at planning, and offers flexible inference.
As an HKU alum, I'm incredibly proud of Dream 7B, a new language model from the HKU NLP Group & Huawei Noah’s Ark Lab.
Watching a diffusion model like Dream 7B generate text is fascinating. Instead of the word-by-word output of standard autoregressive models, the text seems to emerge and refine itself all at once, almost like watching a dream form. It’s a unique feeling!
While diffusion isn't the mainstream approach for language models yet, Dream 7B clearly shows its potential. It performs on par with top autoregressive models of its size and brings natural strengths in planning and flexible generation, such as infilling or controlling the order in which the output is produced.
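To make the "fill in anywhere, in any order" idea a bit more concrete, here is a toy Python sketch of iterative unmasking. It is only an illustration of the general idea, not Dream 7B's actual decoding procedure: `toy_predict` is a hypothetical stand-in for a real model (it just looks up a fixed sentence), and its "confidence" is an assumption I made up, namely the number of neighbouring positions that are already filled in.

```python
MASK = "<mask>"
TARGET = ["the", "cat", "sat", "on", "the", "mat"]  # toy "knowledge" of the fake model


def toy_predict(tokens, i):
    """Hypothetical stand-in for a model: returns (token, confidence) for position i.

    Confidence is simply the count of already-filled neighbours (an assumption
    for illustration, not how a real model scores its predictions).
    """
    neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
    confidence = sum(tokens[j] != MASK for j in neighbours)
    return TARGET[i], confidence


def toy_denoise(tokens, predict, steps):
    """Fill masked positions over several steps, most confident guesses first."""
    tokens = list(tokens)
    masked = [i for i, t in enumerate(tokens) if t == MASK]
    per_step = max(1, -(-len(masked) // steps))  # ceiling division
    order = []  # records the positions in the order they were committed
    while masked:
        guesses = {i: predict(tokens, i) for i in masked}
        # Commit only the most confident guesses in this step.
        chosen = sorted(masked, key=lambda i: guesses[i][1], reverse=True)[:per_step]
        for i in chosen:
            tokens[i] = guesses[i][0]
        order.extend(chosen)
        masked = [i for i in masked if i not in chosen]
    return tokens, order


# Infilling: the first and last words are given, the middle is masked.
prompt = ["the", MASK, MASK, MASK, MASK, "mat"]
filled, order = toy_denoise(prompt, toy_predict, steps=2)
print(" ".join(filled))
print(order)
```

In this toy run the positions next to the given words (1 and 4) are filled first, and the middle (2 and 3) follows in the second step, so the sentence grows inward from both ends rather than left to right.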
They've released the model weights and code openly, which is fantastic. This really feels like a glimpse into exciting new possibilities for AI.