AI voice scams: How to prevent and protect
But even OpenAI is wary of potential misuse of the technology and says it will not release Voice Engine publicly; it is currently available only to early testers. Australian franchises of KFC have also started testing AI-driven voice ordering, though customers there have pushed back a bit, saying they prefer human interaction and complaining that the AI window jockey misinterprets their orders. Game studio founder and voice AI advisor Mike Sorrenti is bullish on the tech. “Voice ai is an excellent thing. It can be used for translation and many other things and [is] a very natural interface for kids, and older adults especially those with mild disabilities such as arthritis,” he said in an email. You might think AI at work means typing prompts into ChatGPT or getting a slick summary from your inbox.
How does the new AI voice cloning feature work?
Generally, this type of fraud can be categorized into personalized scams and universal scams. The researchers then took a more comprehensive approach, adding spectral features, which relate more closely to the frequency domain, extracted with an ‘off-the-shelf’ audio wave analysis package. The program extracts over six thousand features, including summary statistics (mean, standard deviation, etc.) and regression coefficients, and then narrows these down to the twenty most significant. Companies with access to Voice Engine include the education technology company Age of Learning, the visual storytelling platform HeyGen, and the health system Lifespan.
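The extract-then-narrow step described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' method: the source does not name the analysis package or say how the twenty features were chosen, so the function names `summarize_clip` and `top_k_features` and the ranking score (difference of class means divided by pooled standard deviation) are assumptions made for the example.

```python
import numpy as np

def summarize_clip(frames: np.ndarray) -> np.ndarray:
    """Collapse a (n_frames, n_measurements) array into one feature vector.

    For each measurement we keep the mean, the standard deviation, and the
    slope of a least-squares line over time (a simple regression coefficient).
    """
    t = np.arange(frames.shape[0])
    slope = np.polyfit(t, frames, deg=1)[0]  # one slope per column
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0), slope])

def top_k_features(X: np.ndarray, y: np.ndarray, k: int = 20) -> np.ndarray:
    """Return indices of the k features that best separate the two classes.

    X is (n_clips, n_features); y holds 0 for real and 1 for cloned clips.
    The score is an assumed stand-in for whatever ranking the study used.
    """
    real, cloned = X[y == 0], X[y == 1]
    pooled_std = np.sqrt((real.var(axis=0) + cloned.var(axis=0)) / 2) + 1e-12
    score = np.abs(real.mean(axis=0) - cloned.mean(axis=0)) / pooled_std
    return np.argsort(score)[::-1][:k]
```

In practice, each audio clip would pass through `summarize_clip` to produce one row of `X`, and `top_k_features(X, y, k=20)` would then pick the columns to keep for a classifier.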
OpenAI’s Voice Engine was first developed in late 2022.
These features require some signal processing or neural network analysis to extract specific characteristics like Mel-frequency cepstral coefficients (MFCCs), commonly used in voice recognition systems. “We noticed that in this domain, there are often artifacts present in cloned voices—frequencies that wouldn’t naturally occur in a person’s voice. These artifacts, although not always perceptible to the human ear, can be detected through analysis, similar to how MP3 compression removes certain frequencies to reduce file size. These frequency domain artifacts can be a telltale sign of synthesized speech when analyzed closely,” Koorma added.
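To make the idea of frequency-domain artifacts concrete, here is a small sketch that measures how much of a clip's energy falls inside a chosen frequency band. It is not MFCC extraction and not the detector described in the quote; the function name `band_energy_ratio` and the choice of band are hypothetical, and a real system would combine many such measurements.

```python
import numpy as np

def band_energy_ratio(signal: np.ndarray, sample_rate: int,
                      low_hz: float, high_hz: float) -> float:
    """Fraction of the clip's spectral energy between low_hz and high_hz."""
    # A Hann window reduces leakage from the abrupt start and end of the clip.
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs < high_hz)
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0

# Illustrative use with synthetic tones standing in for speech.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 200 * t)
with_artifact = clean + 0.5 * np.sin(2 * np.pi * 7000 * t)
print(band_energy_ratio(clean, sample_rate, 6000, 8000))
print(band_energy_ratio(with_artifact, sample_rate, 6000, 8000))
```

The second clip carries noticeably more energy in the upper band than the first, even though a faint high-frequency tone like this can be hard to hear. That is the kind of difference an analysis-based check can pick up.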
The ALS patient’s voice was cloned from recordings made before the disease took away his ability to speak. But the technology is also rife with potential for abuse, as it could easily be used to commit fraud, spread misinformation and generate fake audio evidence. For more, users would have to sign up for a paid plan from the company, which starts at $29/month and goes up to $499/month.
- Further, the Senate Human Rights Subcommittee, chaired by Sen. Jon Ossoff, D-Ga., held a hearing on June 13 to hear from witnesses affected by AI.
- And the more you share your voice online, the easier it is for threat actors to find and clone your voice.
Not all AI voice cloning is bad, I promise
- ElevenLabs, another major player in the category, offers a feature called Instant Voice Cloning that needs at least a minute of clear audio to generate a clone almost instantly.
- When using Resemble’s web platform, users can create a digital replica of their voice by uploading an audio sample or recording a series of sentences.
- Workers are already adjusting how and when they speak during meetings, knowing that their words might be captured and re-contextualized by AI summaries.
- While Siri and Alexa both offer home automation capabilities and standalone devices that work without a smartphone, ChatGPT’s voice assistant is astonishing.
- Romit Barua, a machine learning engineer and researcher from UC Berkeley, explains that voice cloning leverages advances in audio signal processing and neural network technologies to replicate a person’s voice.
There is also the option of a pay-as-you-go personal plan or a larger enterprise plan with custom pricing. The company published a set of samples comparing its offering with Microsoft’s VALL-E and XTTS-v2 voice cloning models, complete with the input voice sample and the text used for the clone. However, when we created a free test account to see how the tech works in practice, there were some clear gaps. Resemble AI is launching Rapid Voice Cloning, a new feature of its platform that significantly speeds up the process of generating voice clones.