OpenAI Outlines WebRTC Architecture for Low-Latency Voice AI at Scale



Posted on Wed May 20 2026 | 6:00 pm


OpenAI recently outlined how it adapted WebRTC for low-latency voice AI at global scale. The new architecture replaced a conventional media termination model with a relay-transceiver design better suited to Kubernetes and cloud load balancers. It keeps WebRTC session state in a dedicated transceiver layer while using relays to reduce public UDP exposure and keep media routing close to users.




Search
Side Widget
You can put anything you want inside of these side widgets. They are easy to use, and feature the new Bootstrap 4 card containers!