Universal-3 Pro is a new class of speech language model built specifically for Voice AI applications. It is presented as the first promptable speech language model: users steer transcription with instructions and domain context such as names, terminology, and topics, so the output is accurate at the source rather than corrected afterward.
Prompts are supplied before the audio is processed and can describe names, terminology, topics, and the desired output format. Key features include context-aware transcription, audio tagging, a verbatim mode that preserves disfluencies, keyterm recognition for up to 1,000 domain terms, speaker role labeling, and code-switching across 6 languages. The disfluencies it can capture include fillers (um, uh, er, erm, ah, hmm, mhm, like, you know, I mean), repetitions (I I I, the the), restarts (I was- I went), stutters (th-that, b-but, no-not), and informal speech (gonna, wanna, gotta).
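The prompting and keyterm features described above can be illustrated with a small request builder. This is a minimal sketch, not AssemblyAI's actual API: the class name TranscriptionRequest and the payload fields audio_url, prompt, and keyterms are hypothetical names chosen for illustration. Only the 1,000-term limit comes from the text.

```python
from dataclasses import dataclass, field

MAX_KEYTERMS = 1000  # limit stated in the text; everything else here is illustrative


@dataclass
class TranscriptionRequest:
    """Hypothetical request shape; field names are assumptions, not the real API."""

    audio_url: str
    prompt: str = ""
    keyterms: list[str] = field(default_factory=list)

    def to_payload(self) -> dict:
        # Strip whitespace, drop blank entries, and deduplicate while keeping order.
        cleaned = (term.strip() for term in self.keyterms)
        terms = list(dict.fromkeys(term for term in cleaned if term))
        if len(terms) > MAX_KEYTERMS:
            raise ValueError(
                f"{len(terms)} keyterms given; the stated limit is {MAX_KEYTERMS}"
            )
        payload: dict = {"audio_url": self.audio_url}
        if self.prompt:
            payload["prompt"] = self.prompt
        if terms:
            payload["keyterms"] = terms
        return payload


# Example: plain-language context plus a short list of domain terms.
request = TranscriptionRequest(
    audio_url="https://example.com/visit.mp3",
    prompt="Doctor-patient visit. Keep fillers and restarts verbatim.",
    keyterms=["metformin", "metformin ", "lisinopril"],
)
payload = request.to_payload()
```

Validating the term count on the client side is a design choice for this sketch; a real integration would follow whatever limits and field names the service documents.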
Its distinguishing approach is to deliver domain-specific accuracy without custom models or post-processing pipelines. Users describe their audio in plain language and receive output specialized for their application. The model also adapts to accent patterns, audio quality, and background noise according to the instructions it is given.
Stated benefits include pharmaceutical-grade accuracy for medical conversations without additional setup, up to 45% fewer errors on specialized vocabulary, and capture of meaningful audio events such as [beep] and [hold music], which can inform sentiment analysis. Use cases span medical history evaluations, legal records requiring verbatim accuracy, customer intelligence applications, and business meeting summaries.
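As a rough illustration of how audio event tags might be consumed downstream, the sketch below separates bracketed tags from spoken text. It assumes tags appear inline as lowercase words in square brackets, as in the [beep] and [hold music] examples. The function name split_audio_tags is hypothetical, and the actual output format may differ.

```python
import re
from collections import Counter

# Assumption: tags are lowercase words in square brackets, e.g. [beep], [hold music].
TAG_PATTERN = re.compile(r"\[([a-z][a-z ]*)\]")


def split_audio_tags(transcript: str) -> tuple[str, Counter]:
    """Return the spoken text with tags removed, plus a count of each tag."""
    tags = Counter(TAG_PATTERN.findall(transcript))
    # Replace each tag with a space, then collapse runs of whitespace.
    spoken = " ".join(TAG_PATTERN.sub(" ", transcript).split())
    return spoken, tags


spoken, tags = split_audio_tags(
    "Please hold. [hold music] Thanks for waiting. [beep] [beep]"
)
```

Keeping the tag counts separate from the spoken text lets a sentiment step weigh events such as long holds without treating the tags as words.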
The product targets developers building Voice AI applications across medical, legal, customer intelligence, and business domains. It is part of AssemblyAI's complete Voice AI platform and uses predictable usage-based pricing of $0.21 per hour.
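The usage-based rate lends itself to a quick cost estimate. The sketch below assumes billing is strictly proportional to audio duration, with no minimum charge or rounding increment, which the text does not specify. The function name estimate_cost is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_PER_HOUR = Decimal("0.21")  # USD per hour, as quoted in the text


def estimate_cost(audio_seconds: int) -> Decimal:
    """Estimate cost in USD, assuming purely proportional billing (an assumption)."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    hours = Decimal(audio_seconds) / Decimal(3600)
    # Round to whole cents; the real billing granularity is not stated.
    return (hours * RATE_PER_HOUR).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


one_hour = estimate_cost(3600)                # Decimal("0.21")
thousand_hours = estimate_cost(3600 * 1000)   # Decimal("210.00")
```

Decimal is used instead of float so that amounts in cents are not distorted by binary rounding.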
Universal-3 Pro serves organizations that need specialized transcription accuracy for pharmaceutical conversations, legal proceedings, customer service analysis, and business meeting documentation. It is designed for companies that require domain-specific transcription without developing custom models.