Computer Science, asked by rajputsr6990, 1 year ago

How is a minimum distance classifier used in speech-to-text conversion in a MATLAB system?

Answers

Answered by ruhu36

The current work presents a multilingual speech-to-text conversion system. Conversion is based on the information in the speech signal. Speech is the most natural and important form of communication for human beings. A Speech-To-Text (STT) system takes a human speech utterance as input and produces a string of words as output. The objective of this system is to extract, characterize and recognize the information in speech. The proposed system is implemented using the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction technique, with a Minimum Distance Classifier and a Support Vector Machine (SVM) as the speech classification methods. Speech utterances are pre-recorded and stored in a database, which is divided into two parts: training and testing. Samples from the training part are passed through the training phase, where their features are extracted. The features of each sample are combined into a feature vector, which is stored as a reference. A sample from the testing part is then given to the system and its features are extracted in the same way. The similarity between these features and each reference feature vector is computed, and the word with the maximum similarity is given as output. With a minimum distance classifier, maximum similarity means the smallest distance between the test feature vector and a reference vector. The system is developed in the MATLAB (R2010a) environment.
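The classification step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's MATLAB implementation: it is written in Python rather than MATLAB, it assumes Euclidean distance and one mean reference vector per word (the source does not say which distance or reference scheme was used), and the three-element feature vectors are made-up stand-ins for real MFCC features. The names `train`, `classify`, and `training_set` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(training_set):
    """Build one reference vector per word label (assumed: the class mean)."""
    return {label: mean_vector(vectors) for label, vectors in training_set.items()}


def classify(references, features):
    """Return the label whose reference vector is closest to `features`."""
    return min(references, key=lambda label: euclidean(references[label], features))


# Toy data: invented numbers standing in for MFCC feature vectors.
training_set = {
    "hello": [[1.0, 2.0, 3.0], [1.2, 1.8, 3.1]],
    "namaskar": [[4.0, 0.5, 1.0], [3.8, 0.7, 1.2]],
}

references = train(training_set)            # training phase
predicted = classify(references, [1.1, 2.1, 2.9])  # testing phase
print(predicted)
```

In a real system the toy vectors would be replaced by MFCC features extracted from each recorded utterance, and the same two phases (build references from the training part, then pick the nearest reference for each test sample) would apply.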

Introduction

Speech recognition is the procedure of extracting essential information from an input speech signal in order to make an accurate decision about the corresponding text. A speech signal conveys very rich information, such as speaker information and linguistic information, which has inspired many researchers to develop systems that process speech automatically, e.g. speech enhancement, speech synthesis, speech compression, speaker recognition, and speech recognition and verification. Speech recognition can be further classified as speaker dependent or speaker independent [1]. With the help of a speech recognition mechanism, a computer can follow human voice commands and understand human languages, i.e. it acts as a good interface for human-computer interaction. Most of today's speech recognition technologies are designed for the English language. As a result, illiterate people, rural communities and the educationally under-privileged are kept away from computer technology. If computer technologies could understand a user's native language, they would be much easier for these groups to use. Marathi is the native language of Maharashtra. In day-to-day speech, speakers often mix English words with their native language. The author has therefore designed a multilingual speech-to-text conversion system that focuses on Marathi, English, and mixed Marathi-English speech.
