
Voice recognition, or speech recognition, is a machine's ability to receive and interpret dictation, or to understand and carry out spoken commands. The task of speech recognition is to convert speech into a sequence of words by means of a computer program. Because speech is the most natural form of human communication, the ultimate goal of speech recognition is to let people communicate with machines more naturally and efficiently. For many of the NLP components discussed in this book, voice recognition is regarded as the front end. In practice, speech systems usually use context-free grammars (CFGs) or statistical n-grams for language modelling, for the same reason that hidden Markov models (HMMs) are used for acoustic modelling. Although the technology initially dealt with applications that scanned audio data for occurrences of certain keywords, it has since become an effective approach to voice recognition across a wide range of applications. Voice recognition applications differ from other computer applications, and they open up a world of opportunities for developers, especially those who build interactive voice response (IVR) systems and other telephony applications. Vocabulary recognition is also affected by input quality. When a user calls a system, a bad cell phone connection or excessively compressed Internet audio can degrade it. Handling such cases is therefore very important when designing speech recognition applications.

This book reflects important research into voice recognition approaches. It focuses mainly on voice recognition and related tasks, such as voice improvement and modelling, and gives readers a comprehensive knowledge of modern approaches to speech recognition.