In the past year, AI programs have become available that can be instructed to produce visual content like pictures or movies based on a textual description.
AI writing also made significant strides, with the release of OpenAI’s ChatGPT inspiring equal parts hope and dread for the future of the written word.
Now, just days into 2023, a text-to-speech program that can closely mimic a person’s voice has come to the forefront of the news.
A sizable minority of people (about 5%, or nearly 360 million) are unable to speak for various reasons. Microsoft’s AI-based technology, Vall-E, could help people with speech impairments. The software needs only a three-second recording of a person’s voice to synthesize new speech from text in that same voice.
Vall-E is a text-to-speech (TTS) system that has drawn worldwide attention for the right reasons. Let’s see why.
Technological Advances in Microsoft’s AI Tool
Vall-E is known for its accuracy. Microsoft describes it as a ‘neural codec language model’ that generates speech from text. It was trained on hundreds of times more data than existing TTS systems. The tech giant has also published a detailed paper explaining how Vall-E can clone a person’s voice.
Microsoft’s researchers back this claim with numbers: the training data amounts to 60,000 hours of English speech. The model works in a ‘zero-shot’ manner, which means it can synthesize speech in the voice of a speaker it never encountered during training, given only a short sample of that voice.
Built on Meta’s EnCodec audio compression technology, Vall-E can preserve a speaker’s emotional tone and acoustic environment. It could be used to personalize speech, create audio content, and edit recordings, especially when combined with other generative AI models.
Microsoft showcases Vall-E’s ability to capture emotion on its GitHub page. The model picks up moods such as sleepy, angry, upset, or disgusted from the tone of the original sample and generates speech in a matching mood. In effect, users who want the generated speech to carry a specific emotion can supply a sample spoken in that emotion.
The accompanying research paper, posted on arXiv (the preprint server hosted by Cornell University), reports that Vall-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, Vall-E can preserve the speaker’s emotions.
Doubts Regarding The Safety Of Vall-E
However, the AI tool has not drawn praise alone. Serious concerns loom over potential misuse of the model. Spoofing voice identification systems and impersonating specific speakers are among the risks that worry technology experts.
Cybersecurity experts have warned that criminals could use Vall-E to scam people by impersonating familiar voices, such as those of a friend or a celebrity.
Moreover, limited accent diversity may hamper the tool’s effectiveness, while its voice-cloning ability could give scammers an easier way to steal data. Although Microsoft has not commented on Vall-E’s potential as a scam tool, researchers worry that it could be used to impersonate speakers or spoof voice signals.
Positioning Vall-E In The Competitive Market
One of Vall-E’s major competitors is Tacotron 2, an end-to-end TTS system proposed by Google Brain that generates natural-sounding speech from text input.
Microsoft has invested heavily in AI technology and is one of the backers of OpenAI. It has invested almost $1 billion in OpenAI since 2019, and the software giant is expected to invest another $10 billion in the near future. After the success of ChatGPT and DALL-E, it remains to be seen how Vall-E performs in real-world use.