How OPPO is using Azure AI Speech to bring innovative AI features to its phones

Azure speech to text enables developers to quickly and accurately transcribe audio to text in more than 100 languages and variants. It also supports custom models that improve accuracy for domain-specific terminology. At Microsoft Build we are announcing a new Fast Transcription API, available in preview in June, which lets developers create accurate transcripts of audio at 40 times real-time speed. For example, a 10-minute audio file can be transcribed in 15 seconds. With this simple synchronous REST API, scenarios that require quick generation of a transcript from audio become easy to implement. The Azure Speech Service also provides a language ID capability that identifies the spoken language from the audio itself, which lets developers simplify the experience for users who interact with audio in multiple languages.
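The arithmetic behind the 15-second figure can be sketched in a few lines of Python. This is only an illustration: the function name `estimated_processing_seconds` is hypothetical, and the 40x speed-up is taken as a fixed number from the text above, whereas real processing time will vary with the audio and the service load.

```python
def estimated_processing_seconds(audio_seconds: float, speedup: float = 40.0) -> float:
    """Estimate transcription time for audio processed `speedup` times faster than real time.

    The default of 40.0 is the figure quoted in the article, not a guarantee.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


if __name__ == "__main__":
    ten_minutes = 10 * 60  # 600 seconds of audio
    print(estimated_processing_seconds(ten_minutes))  # 15.0
```

The same helper gives a rough idea of how long a user would wait for a recording of any length, for instance 90 seconds for a one-hour meeting at the quoted speed.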

Azure text to speech enables developers to convert text into human-like synthesized speech. Neural TTS is a text to speech system that uses deep neural networks to make computer voices nearly indistinguishable from recordings of people. It provides natural prosody and clear articulation of words, which significantly reduces listening fatigue during interaction with AI systems. Azure AI text to speech offers more than 400 voices across more than 140 languages and locales. A single pre-built, realistic neural voice with multilingual support makes it easy to read content in a broad range of languages in the same voice. You can try the demo and hear the voices in the voice gallery.
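Text is typically handed to a text to speech service as SSML (Speech Synthesis Markup Language), with the voice selected by name. The sketch below only builds such a document with the Python standard library and does not call any service. The helper `build_ssml` is hypothetical, and the default voice name `en-US-JennyMultilingualNeural` is used as an example of a multilingual neural voice, not as the voice OPPO uses.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(
    text: str,
    voice: str = "en-US-JennyMultilingualNeural",  # example voice name (assumption)
    locale: str = "en-US",
) -> str:
    """Wrap plain text in a minimal SSML document for a single voice."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(locale)}>'
        # escape() protects characters such as & and < inside the article text.
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Tom & Jerry read the morning news."))
```

Escaping matters for a read-aloud feature because article text routinely contains characters such as `&` that would otherwise make the SSML invalid.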

OPPO, a leading global technology brand known for its innovative smartphones and smart devices, will announce its new AI phone to pilot a new user experience based on these technologies. New features include fast transcription of audio recordings for notes and to-dos, and a read-aloud mode for articles so that users can use their smartphones without looking at the screen.

With the new fast transcription feature of Azure AI Speech Service, OPPO has been able to use the following architecture to create a smooth user experience for audio recording transcription:

Figure: Audio Recording Transcription

The “article reading” feature can be easily implemented based on Azure Text to Speech (TTS) Service with the pre-built multilingual neural voices:

Figure: Text to speech workflow

With its fast speech transcription and advanced speech synthesis capabilities, Azure AI Speech Service has greatly elevated the AI speech and text experience on OPPO AI phones.


This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.