
Amazon has launched a new generative AI model, Nova Sonic, which is capable of natively processing voice and generating natural-sounding speech.
Amazon said on Tuesday, April 8, 2025, that Nova Sonic’s performance is competitive with frontier voice models from OpenAI and Google on benchmarks measuring speed, speech recognition, and conversational quality.
Nova Sonic is Amazon’s answer to newer AI voice models, such as the one powering ChatGPT’s Voice Mode, that feel more natural to speak with than the more rigid models from Alexa’s early days.
Nova Sonic is available through Bedrock, Amazon’s developer platform for building enterprise AI applications, via a new bi-directional streaming API.
Amazon called Nova Sonic “the most cost-efficient” AI voice model on the market, saying it is around 80% less expensive than OpenAI’s GPT-4o.
Speaking to TechCrunch, Amazon SVP and Head Scientist of AGI Rohit Prasad said that Nova Sonic builds on Amazon’s expertise in “large orchestration systems,” the technical scaffolding that makes up Alexa.
Compared to rival AI voice models, Nova Sonic excels at routing user requests to different APIs.
Nova Sonic is part of Amazon’s broader strategy to build AGI (artificial general intelligence), which the company defines as “AI systems that can do anything a human can do on a computer.”
Prasad said Amazon plans to launch more AI models that can understand different modalities, including image, video, and voice, along with “other sensory data that are relevant if you bring things into the physical world.”