Joshua Fagbemi
Guest
Microsoft is set to introduce a voice cloning feature that allows Microsoft Teams users to speak in different languages. At Microsoft Ignite 2024 on Tuesday, the company unveiled “Interpreter”, a tool that delivers “real-time, speech-to-speech” interpretation capabilities.
The feature, expected in early 2025, will let people using Teams for meetings have Interpreter simulate their voice in nine languages: English, French, German, Italian, Japanese, Korean, Mandarin Chinese, Portuguese, and Spanish.
In effect, Teams users will be able to have a sound-alike of their own voice speak foreign languages.
“Imagine being able to sound just like you in a different language. Interpreter in Teams provides real-time speech-to-speech translation during meetings, and you can opt to have it simulate your speaking voice for a more personal and engaging experience,” Microsoft CMO Jared Spataro said in a blog post shared with TechCrunch.
A demo of Microsoft Interpreter
On the details of the feature, Microsoft said it will only be available to Microsoft 365 subscribers. The company also clarified that the tool doesn’t store biometric data or other personal identifiers, doesn’t add sentiment beyond what’s “naturally present” in a voice, and can be disabled through Teams settings.
“Interpreter is designed to replicate the speaker’s message as faithfully as possible without adding assumptions or extraneous information. Voice simulation can only be enabled when users provide consent via notification during the meeting or by enabling ‘Voice simulation consent’ in settings,” a Microsoft spokesperson said.
Generally, Interpreter in Teams is a relatively narrow application of voice cloning, but that doesn’t mean the tool will be safe from abuse. For example, a bad actor could feed Interpreter a misleading recording, such as someone asking for bank account information, and have it translated into the language of their target.
However, Microsoft has yet to detail the safeguards that will govern Interpreter’s use.
Like Microsoft, a number of AI powerhouses have developed tech that can digitally mimic voices with reasonably natural results.
In September, at the Meta Connect 2024 developer conference in Menlo Park, Meta said it’s working on an AI translation tool that automatically translates voices in Instagram Reels. The tool dubs a creator’s speech, simulating the voice in another language, and syncs the lip movements to match.
Meta says it’s starting with “small tests” of Reels translations on Facebook and Instagram, focusing for now on videos from some creators in Latin America and the United States, in English and Spanish.
ElevenLabs, an AI voice company, likewise offers a robust platform for multilingual speech generation.
AI voice clone and deepfakes dangers
The rapid advance of AI voice cloning raises broader security concerns, chief among them deepfakes.
Over the past few years, deepfakes have spread across social media, making it harder to distinguish truth from disinformation. They have also been used to target individuals through impersonation: according to the US Federal Trade Commission (FTC), about $1 billion was lost to impersonation scams last year.
This year, widely circulated deepfakes featuring President Joe Biden, singer-songwriter Taylor Swift, and US Vice President Kamala Harris have generated a flood of shares and conversations.
In another report, by Microtime, a team of cybercriminals reportedly staged a Teams meeting posing as a company’s C-level staff. The impersonation was convincing enough that the target company wired $25 million to the criminals.
Citing these cloning and deepfake risks, OpenAI earlier this year decided against a wide release of its voice cloning tech, Voice Engine.
AI translators often struggle to accurately convey colloquialisms, analogies, intonation, and cultural nuances. Still, the cost savings are attractive enough that some consider the trade-off worth it. According to Markets and Markets, the market for natural language processing technologies, which includes translation, could be worth $35.1 billion by 2026.