Orpheus TTS is an open-source TTS system that uses a Llama-3b backbone to produce human-like speech with natural emotion and intonation. It features zero-shot voice cloning, guided emotion, and low-latency streaming.
Sharing Orpheus TTS from Canopy Labs. It's an open-source text-to-speech system built on a Llama-3b backbone. This approach uses the LLM's understanding to create more natural and expressive speech.
The audio sounds human-like, capturing emotion and rhythm well. You can guide the emotion with simple tags embedded in the input text. The base model also shows zero-shot voice cloning capabilities.
It offers low-latency streaming suitable for real-time use. They've also released multilingual models (in research preview) and provide code so you can fine-tune your own voices.
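For anyone wondering what consuming a streaming TTS output looks like in practice, here is a rough sketch. It does not call Orpheus itself: `fake_tts_stream` is a hypothetical stand-in that yields silent audio, and the 24 kHz, 16-bit mono format is an assumption you should check against the model's docs. The point is only to show writing chunks to a WAV file as they arrive instead of waiting for the full clip.

```python
import io
import wave
from typing import Iterable, Iterator

SAMPLE_RATE = 24_000  # assumed output rate, not confirmed by the post


def fake_tts_stream(n_chunks: int = 5, chunk_ms: int = 100) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call.

    Yields silent 16-bit mono PCM chunks, one at a time.
    """
    samples_per_chunk = SAMPLE_RATE * chunk_ms // 1000
    for _ in range(n_chunks):
        yield b"\x00\x00" * samples_per_chunk


def write_stream_to_wav(chunks: Iterable[bytes], dest) -> int:
    """Write PCM chunks to a WAV target as they arrive.

    `dest` may be a file path or a seekable binary file object.
    Returns the total number of frames written.
    """
    frames = 0
    with wave.open(dest, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 16-bit samples
        wf.setframerate(SAMPLE_RATE)
        for chunk in chunks:
            wf.writeframes(chunk)    # each chunk is written immediately
            frames += len(chunk) // 2
    return frames


if __name__ == "__main__":
    buf = io.BytesIO()
    total = write_stream_to_wav(fake_tts_stream(), buf)
    print(f"wrote {total} frames ({total / SAMPLE_RATE:.2f} s of audio)")
```

In a real setup you would swap `fake_tts_stream` for whatever generator the model's inference code exposes, and for actual real-time use you would feed chunks to an audio output device rather than a file.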