The first LLM for text-to-speech. While other TTS systems just “read” words, Octave grasps their meaning. Create any AI voice with a descriptive prompt, guide its emotional delivery (angrier! more sarcasm!), and bring your stories to life with human-like expression.
Hey Product Hunt! I’m Alan Cowen, CEO and Chief Scientist at Hume AI.
We're launching Octave, the first of a new generation of text-to-speech models. Traditional TTS models focus on the mechanical process of turning letters into sounds. Octave isn't a traditional TTS model, but a voice-enabled LLM, trained on 1000x more language. As a result, it understands the cognitive and emotional aspects of human speech. It reads your script like a human actor, delivering realistic emotions, sarcasm, pace, word emphasis, and more.
And unlike any other TTS system, it can take explicit instructions to generate any voice you describe and modify its emotional tone and speaking style.
Octave is made possible by Hume's research. We're leading the space in voice-enabled LLMs, and we run large-scale psychology studies to help fine-tune our models to generate the right voices at the right time, drawing on a decade of research at the intersection of emotion science and AI.
We’re launching both a platform for creators and an API for developers. We're also launching the Expressive TTS Arena (arena.hume.ai)—a new public benchmark for evaluating emotion-rich, long-form speech generation with instructions.
Congrats on the launch! Out of curiosity, what are the pros and cons of text-to-voice LLM vs. voice-to-voice LLM?
Great work addressing the limitations of traditional TTS! Many AI voices struggle with conveying nuanced emotions. How does Octave balance user-defined emotional guidance with maintaining a natural, unstilted flow in longer dialogues or narrations?
Congrats on the launch! Octave TTS2 sounds like a game changer for AI-generated voices. Does it support multiple languages or voice customizations?
Good project, wish you the best of luck! Which languages do you support? Do you measure accuracy?
Octave is something new in the world of voiceovers! The intonation in voiceovers has always been a challenge, and I think with the intonation feature, it will be much more interesting)
I trained a TTS system myself last year and I am 100% amazed by how good Octave sounds! :)
So exciting to see this launch! Such an epic achievement from the whole Hume team 💚
Now it’s time to use a more human-like product and see how it replies to my questions on all facets of life, from professional to personal.
I hacked together a quick prototype to get this added to Home Assistant. Works great with Home Assistant Voice 😎
🥰 Oh this is so cool! When I use 11labs or OpenAI's voice synths, I usually have to record many takes and then remix snippets to get the right tonality and feel. 11labs, please buy this company 🙏
yoooo this is really sick!! i think this is going to have a big impact on independent storytellers and videographers
This is next-level for text-to-speech! Traditional TTS often feels robotic because it lacks understanding of the emotional weight behind words. But the way Octave treats speech like a human actor, adjusting tone, pacing, and even sarcasm, makes it sound incredibly natural.
Congrats on the launch!
Best wishes and sending lots of wins to the team :) @achume