OpenAI Realtime API GA: production-ready voice-to-voice over WebRTC

In one sentence: OpenAI promotes the Realtime API to general availability (GA), with low-latency voice-in/voice-out (~300 ms), function (tool) calling, and native WebRTC support, so production voice apps can be built on a single end-to-end API.

OpenAI has moved the Realtime API to a stable, generally available (GA) release, following the beta launched in October 2024. It is the API for building real-time voice apps: the model hears the user's speech and replies with speech directly, without the traditional three-stage pipeline (speech-to-text → LLM → text-to-speech).
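
To make the contrast with the three-stage pipeline concrete, here is a rough latency-budget sketch in Python. It assumes the worst case, where each stage waits for the previous one to finish; real pipelines stream and overlap stages, so the sum is an upper bound. The per-stage numbers and the `Stage` and `cascade_latency_ms` names are illustrative placeholders, not measurements or part of any API. Only the ~300 ms figure comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float  # placeholder value, not a measurement


def cascade_latency_ms(stages: list[Stage]) -> float:
    """Upper bound for a pipeline in which each stage waits for the previous one."""
    return sum(stage.latency_ms for stage in stages)


# Hypothetical per-stage figures, chosen only to make the arithmetic visible.
CASCADE = [
    Stage("speech-to-text", 300.0),
    Stage("LLM", 500.0),
    Stage("text-to-speech", 200.0),
]

# The single-model path has one stage; ~300 ms is the figure quoted in the article.
SPEECH_TO_SPEECH = [Stage("speech-to-speech model", 300.0)]

if __name__ == "__main__":
    print("cascade:", cascade_latency_ms(CASCADE), "ms")
    print("single model:", cascade_latency_ms(SPEECH_TO_SPEECH), "ms")
```

With these placeholder values the cascade adds up to a full second before the first audio comes back, which is why removing the intermediate hops matters more than speeding up any single stage.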

What's new at GA: native WebRTC support (which makes it easier to embed in browsers), lower prices, stable function calling integration, and a cheaper mini variant. Average latency is around 300 ms, below the perceptual threshold for natural conversation.
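
As an illustration of what function calling involves on the application side, the Python sketch below declares one tool and builds the events an app would send back after the model requests a call. It is a minimal sketch, not the official client: the `get_order_status` tool, its fields, and the `answer_function_call` helper are hypothetical, and the event shapes (`conversation.item.create` carrying a `function_call_output` item, followed by `response.create`) are assumptions modeled on the published Realtime event reference that should be checked against the current documentation. Transport (WebRTC data channel or WebSocket) is left out.

```python
import json
from typing import Any, Callable

# Tool definition in JSON Schema style. The tool itself is hypothetical.
ORDER_STATUS_TOOL: dict[str, Any] = {
    "type": "function",
    "name": "get_order_status",
    "description": "Look up the shipping status of an order.",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}


def get_order_status(order_id: str) -> dict[str, str]:
    # Stand-in for a real database or service lookup.
    return {"order_id": order_id, "status": "shipped"}


# Maps tool names the model may request to local Python callables.
HANDLERS: dict[str, Callable[..., Any]] = {"get_order_status": get_order_status}


def answer_function_call(
    name: str, call_id: str, arguments_json: str
) -> list[dict[str, Any]]:
    """Run the requested local function and build the events to send back."""
    handler = HANDLERS.get(name)
    if handler is None:
        result: Any = {"error": f"unknown function: {name}"}
    else:
        # The model supplies arguments as a JSON-encoded string.
        result = handler(**json.loads(arguments_json))
    return [
        {
            # Assumed event shape: attach the result to the conversation.
            "type": "conversation.item.create",
            "item": {
                "type": "function_call_output",
                "call_id": call_id,
                "output": json.dumps(result),
            },
        },
        # Assumed event shape: ask the model to continue speaking.
        {"type": "response.create"},
    ]
```

Keeping the handler table separate from the transport means the same dispatch logic can be unit-tested without a live audio session.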

For teams building voice assistants, customer support, or AI call centers, this is the first time "production-grade" infrastructure is available from a single provider, without orchestrating Deepgram + GPT + ElevenLabs.