In May, OpenAI showcased an advanced voice mode for its ChatGPT platform, demonstrating eerily realistic, nearly real-time responses. Initially, the company promised that this feature would be available to paying ChatGPT users within a few weeks. However, more than a month later, OpenAI has announced a delay, stating that more time is needed to refine the feature.
On OpenAI’s official Discord server, the company revealed that it had planned to begin rolling out the advanced voice mode in alpha to a select group of ChatGPT Plus users in late June. Lingering issues, however, have forced the company to push the launch to sometime in July.
“We’re improving the model’s ability to detect and refuse certain content,” OpenAI wrote. “We’re also working on enhancing the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses. As part of our iterative deployment strategy, we’ll start the alpha with a small group of users to gather feedback and expand based on what we learn.”
The full release of the advanced voice mode to all ChatGPT Plus customers may not happen until the fall, depending on whether the feature meets specific internal safety and reliability standards. Despite this setback, the rollout of new video and screen-sharing capabilities, demonstrated separately during OpenAI’s spring press event, remains on schedule.
These capabilities include solving math problems from a photo of the problem and explaining the various settings menus on a device. They are designed to work in ChatGPT on both smartphones and desktop clients, including the macOS app, which became available to all ChatGPT users earlier today.
“ChatGPT’s advanced Voice Mode can understand and respond with emotions and nonverbal cues, moving us closer to real-time, natural conversations with AI,” OpenAI stated. “Our mission is to bring these new experiences to you thoughtfully.”
During the launch event, OpenAI employees showcased ChatGPT’s ability to respond almost instantly to requests, such as solving a math problem on a piece of paper placed in front of a researcher’s smartphone camera.
The advanced voice mode generated controversy because the default “Sky” voice sounded strikingly similar to actress Scarlett Johansson. Johansson later announced that she had hired legal counsel to investigate how the voice was developed, noting that she had refused OpenAI’s repeated requests to license her voice for ChatGPT. OpenAI denied using Johansson’s voice or a soundalike without permission, but subsequently removed the voice in question.
As OpenAI continues to refine the advanced voice mode, users can look forward to the remaining features promised at the spring event. Despite the challenges and delays, the company says it remains committed to rolling out these experiences carefully.