First, OpenAI offered a tool that allowed people to create digital images simply by describing what they wanted to see. Then, it built similar technology that generated full-motion video that looks like something out of a Hollywood movie.
Now it has unveiled technology that can recreate someone's voice.
The high-profile AI startup said Friday that a small group of companies was testing a new OpenAI system, Voice Engine, that can recreate a person's voice from a 15-second recording. If you upload a recording of yourself and a paragraph of text, it can read the text using a synthetic voice that sounds like your own.
The text does not have to be in your native language. If you speak English, for example, it can recreate your voice in Spanish, French, Chinese or many other languages.
OpenAI isn't sharing the technology more widely because it's still trying to understand its potential dangers. Like image and video generators, a voice generator could help spread misinformation on social media. It could also allow criminals to impersonate people online or during phone calls.
The company said it is particularly concerned that this type of technology could be used to crack voice authenticators that control access to online bank accounts and other personal applications.
“This is a sensitive issue and it's important to get it right,” an OpenAI product manager, Jeff Harris, said in an interview.
The company is exploring ways to watermark synthetic voices or add controls that prevent people from using the technology with the voices of politicians or other prominent figures.
Last month, OpenAI took a similar approach when it unveiled its video generator, Sora. It showed off the technology but did not release it publicly.
OpenAI is among many companies that have developed a new generation of AI technology that can generate synthetic voices quickly and easily. They include tech giants like Google and startups like New York-based ElevenLabs. (The New York Times is suing OpenAI and its partner, Microsoft, over copyright infringement claims involving AI systems that generate text.)
Businesses can use these technologies to generate audiobooks, give voice to online chatbots, or even create an automated radio DJ. Since last year, OpenAI has been using its technology to power a version of ChatGPT that talks. And it has long offered companies a set of voices that can be used for similar applications. All were built from clips provided by voice actors.
But the company has not yet offered a public tool that allows individuals and businesses to recreate voices from a short clip, as Voice Engine does. The ability to recreate any voice in this way, Harris said, is what makes the technology dangerous. It could be especially dangerous in an election year, he said.
In January, New Hampshire residents received robocall messages dissuading them from voting in the state primary in a voice that was most likely artificially generated to sound like President Biden's. The Federal Communications Commission subsequently banned such calls.
Harris said OpenAI had no immediate plans to profit from the technology. He said the tool could be particularly useful to people who have lost their voices through illness or accident.
He demonstrated how the technology had been used to recreate a woman's voice after brain cancer damaged it. She could now speak, he said, after providing a short recording of a presentation she had once given as a high school student.