What is Speech to text?
Speech to text (also called speech recognition) is the process of converting speech into its textual equivalent. It makes it possible to apply text processing solutions to audio input from different sources (phone conversations, voice messages, radio, etc.).
The OnMobile Speech Product Unit actively contributes to the progress of this exciting technology by developing multiple components and by launching innovative products built around them.
Where is it used?
Speech to text is a rapidly progressing technology.
It is evolving as fast as its usage is growing, and it is replacing manual processing in the following business areas:
OnMobile's technologies for Speech to text
Two core technologies are at the center of OnMobile's speech to text know-how:
teliScribe is OnMobile's core large-vocabulary speech decoder for the analysis of interactive conversational systems, voice-message-to-text conversion, and audio indexing applications.
It delivers state-of-the-art performance on audio documents and conversational speech in English and French today, with support for most other European languages planned. Please let us know if you need another language, whether or not it is already available in our teliSpeech catalog.
teliScribe is a high-performance speech recognition engine that can handle vocabularies of up to 300K words.
The only information you have to provide is the language you wish to use. teliScribe comes with generic statistical language models, so you don't need to write a grammar or supply a list of words as you do with teliSpeech.
teliScribe supports streaming audio and provides models for processing conversational telephony, webcasts, and different languages.
teliScribe also allows an application to improve recognition performance in a particular domain that has its own expressions or terminology, for instance voice messages in the support center of a mobile operator, audio files of sports news, or conversation recordings in the call center of a bank.
teliScribe's accuracy depends on the domain of speech, the size of the vocabulary, the proportion of proper names, and the type of expressions used. For instance, teliScribe can achieve very good accuracy when transcribing radio news speech, but delivers poorer results when transcribing an informal voice message.
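The text above does not say how accuracy is measured. A common metric for speech to text systems is the word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below is only an illustration of that general metric under this assumption; the function name word_error_rate and the sample sentences are hypothetical and are not part of teliScribe.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by the
    number of words in the reference (a generic WER sketch)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one word of a six-word reference was dropped.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

A lower value means a better transcript. Because the score is normalized by the reference length, it can be used to compare results across domains such as radio news and informal voice messages.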
There are three main application domains: