Zonos offers flexible control of vocal speed, emotion, tone, and audio quality, as well as instant, unlimited, high-quality voice cloning. Zonos natively generates speech at 44 kHz. Our hybrid is the first open-source SSM hybrid audio model.
This is a super impressive launch for open-weight text-to-speech models.
Zyphra's AI & product teams include talent from Google DeepMind, Anthropic, StabilityAI, Qualcomm, Neuralink, Nvidia, and Apple.
Zyphra released both its transformer and SSM-hybrid models under the Apache 2.0 license.