Amazon Nova Sonic

AI That Hears How You Speak

Artificial Intelligence
Audio

Nova Sonic is Amazon's speech-to-speech AI model on Bedrock. It understands how you speak (tone, pace) and responds with an adaptive, expressive voice in real time.

Top comment

Hi everyone!


Sharing Amazon Nova Sonic, a new foundation model available on Bedrock that represents a really interesting step towards more natural AI voice conversations.


Traditional voice AI often stitches together separate speech-to-text, LLM, and text-to-speech models, losing important context like tone, emotion, and pacing along the way. Nova Sonic tackles this with a single, end-to-end speech-to-speech model.
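
To make that contrast concrete, here is a toy sketch. It is not Nova Sonic's implementation or any real API, and every name in it is hypothetical. It models a cascaded pipeline whose speech-to-text stage passes along only the words, so the later stages never see how they were spoken:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """Hypothetical stand-in for spoken input: the words plus prosodic cues."""
    words: str
    pace_wpm: int    # speaking rate in words per minute
    hesitant: bool   # whether the speaker paused or hesitated


def speech_to_text(utterance: Utterance) -> str:
    """Cascaded stage 1: a transcript keeps the words and nothing else."""
    return utterance.words


def cascaded_reply(utterance: Utterance) -> str:
    """Toy cascade (STT -> text-only model -> TTS); prosody is gone after STT."""
    transcript = speech_to_text(utterance)
    return f"You said: {transcript}"


def end_to_end_reply(utterance: Utterance) -> str:
    """Toy single-model path: the reply can also condition on prosodic cues."""
    prefix = "Take your time. " if utterance.hesitant else ""
    return f"{prefix}You said: {utterance.words}"


calm = Utterance("cancel my order", pace_wpm=140, hesitant=False)
unsure = Utterance("cancel my order", pace_wpm=90, hesitant=True)

# Same words, different delivery: the cascade cannot tell the two apart.
print(cascaded_reply(calm) == cascaded_reply(unsure))      # True
print(end_to_end_reply(calm) == end_to_end_reply(unsure))  # False
```

The only point is where the information gets dropped: once the input has been reduced to a transcript, nothing downstream can recover tone or pacing.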


This means it understands not just the words you say, but also how you say them. Key capabilities include:


👂 Understands Prosody: Picks up on tone, inflection, pace, pauses, hesitations, etc.

🗣️ Adaptive & Expressive Speech: Generates responses whose tone and style dynamically adapt to the input speech – making interactions feel more human.

⚡ Real-Time Streaming: Designed for low-latency, back-and-forth conversations via a bidirectional API.

🛠️ Grounding & Tool Use: Can draw on knowledge bases and call functions/APIs (it also produces a text transcript to support this).

☁️ Accessible via the Amazon Bedrock API (currently in the us-east-1 region).
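
For anyone wondering what "bidirectional" means in practice, here is a minimal sketch of the pattern using plain asyncio queues. It does not call Bedrock or use the AWS SDK. `fake_model` and the chunk and event names are hypothetical stand-ins, there only to show audio being sent and responses being received over one session at the same time:

```python
import asyncio


async def send_audio(chunks, uplink: asyncio.Queue) -> None:
    """Client side: push audio chunks as they are captured, then signal end."""
    for chunk in chunks:
        await uplink.put(chunk)
        await asyncio.sleep(0)  # yield so the other direction can make progress
    await uplink.put(None)  # sentinel: end of the input stream


async def fake_model(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Hypothetical stand-in for the service: emits one event per input chunk."""
    while (chunk := await uplink.get()) is not None:
        await downlink.put(f"response-to-{chunk}")
    await downlink.put(None)  # sentinel: end of the output stream


async def receive_events(downlink: asyncio.Queue) -> list:
    """Client side: consume response events while input is still being sent."""
    events = []
    while (event := await downlink.get()) is not None:
        events.append(event)
    return events


async def converse(chunks) -> list:
    uplink, downlink = asyncio.Queue(), asyncio.Queue()
    # Both directions run concurrently instead of request-then-response.
    _, _, events = await asyncio.gather(
        send_audio(chunks, uplink),
        fake_model(uplink, downlink),
        receive_events(downlink),
    )
    return events


events = asyncio.run(converse(["chunk-1", "chunk-2", "chunk-3"]))
print(events)
# ['response-to-chunk-1', 'response-to-chunk-2', 'response-to-chunk-3']
```

In a real integration the two queues would be replaced by the service's streaming connection. The shape stays the same: one task keeps feeding input while another drains output.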


It supports different English accents and voice styles. This focus on understanding how something is said, not just what, could make AI interactions significantly less robotic.
