OpenAI Launches Three Real-Time Audio Models for Voice Agents

In 2026, OpenAI unveiled three new audio models for developers building voice-based applications, with a focus on real-time task completion and more interactive voice agents.

The three models each serve a distinct function. GPT-Realtime-2 is designed to handle complex requests and manage interruptions during live conversations. GPT-Realtime-Translate provides live translation across multiple languages. GPT-Realtime-Whisper delivers instant speech-to-text conversion, suited for use cases such as live captions and note-taking.

Zillow and Priceline are among the companies already testing the tools, a sign of early enterprise interest in deploying the models in real-world applications.

The release reflects OpenAI’s stated goal of making voice agents more capable of completing tasks in real time, which may expand how businesses integrate spoken-language interfaces into their products and services.

Source: Tech-Economic Times

This article was generated by AI and cites original sources.