Universal-3 Pro is a new class of speech language model built for Voice AI. Control transcription using instructions and domain context like names, terminology, and topics to get accurate output at the source. No custom models, no post-processing pipelines, no hallucinations. Includes 1,000 keyterms, audio tagging, and 6-language code-switching for $0.21/hr.
We built Universal-3 Pro because we were tired of seeing developers spend 40% of their time on transcription workarounds instead of shipping features.
Today, developers are stuck with rigid solutions. They can transcribe their audio, then run an increasingly complex pipeline of regex and LLM calls to extract what they need. Company names get mangled and jargon becomes gibberish. Then they have to build correction layers on top of correction layers.
Worse, by the time they're fixing errors, they've lost acoustic information like tone, hesitation, and emphasis that would have helped get it right in the first place.
Universal-3 Pro fills this gap with the reliability of traditional ASR and the controllability of LLMs.
Tell it "This is a medical consultation about diabetes management" and it optimizes for clinical terminology. Add your company's product names as keyterms and watch accuracy jump 45%. Tag [hold music] and [beep] so you're not transcribing phone system garbage.
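Those three controls (a domain instruction, keyterms, and audio tags) could all travel in one request payload. Here's a minimal sketch in Python; the field names `context`, `keyterms`, and `audio_tags` are illustrative assumptions, not the documented AssemblyAI parameter names:

```python
# Hypothetical sketch of a prompt-controlled transcription config.
# Field names (context, keyterms, audio_tags) are assumptions for
# illustration; check the real API reference for actual parameters.

def build_transcription_config(context, keyterms, audio_tags):
    """Assemble a request payload for a promptable speech model."""
    if len(keyterms) > 1000:  # the launch copy mentions a 1,000-keyterm limit
        raise ValueError("too many keyterms (limit: 1,000)")
    return {
        "context": context,                 # domain instruction, e.g. medical
        "keyterms": sorted(set(keyterms)),  # product names and jargon, deduped
        "audio_tags": audio_tags,           # non-speech events to tag, not transcribe
    }

config = build_transcription_config(
    context="This is a medical consultation about diabetes management",
    keyterms=["metformin", "HbA1c", "GLP-1"],
    audio_tags=["[hold music]", "[beep]"],
)
```

The point of the sketch is that everything rides along with the audio in a single request: no custom model training, no post-processing pipeline.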
We're making it free to try: test it with your hardest audio and see the difference!
In AssemblyAI's Universal-3 Pro (promptable speech language model), what's fixed vs prompt-controlled in the API for keyterms prompting and speaker roles? That split keeps outputs predictable and avoids invented words, so teams can drop brittle cleanup code.
Congrats on the launch — love the developer-first Voice AI platform and robust API.