Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/103
Full metadata record
DC Field | Value
dc.contributor.advisor | Mitra, Suman K.
dc.contributor.advisor | Joshi, Manjunath
dc.contributor.author | Manwani, Naresh
dc.date.accessioned | 2017-06-10T14:36:59Z
dc.date.available | 2017-06-10T14:36:59Z
dc.date.issued | 2006
dc.identifier.citation | Manwani, Naresh (2006). Gaussian mixture models for spoken language identification. Dhirubhai Ambani Institute of Information and Communication Technology, viii, 57 p. (Acc.No: T00066)
dc.identifier.uri | http://drsr.daiict.ac.in/handle/123456789/103
dc.description.abstract | Language Identification (LID) is the problem of identifying the language of any spoken utterance irrespective of the topic, the speaker, or the duration of the speech. Although a large amount of work has been done on automatic language identification, the accuracy and complexity of LID systems remain major challenges. Researchers have used different methods of extracting features from speech and different baseline systems for learning. To understand the role of these choices, a comparative study was conducted over a few algorithms. The results of this study were used to select an appropriate feature extraction method and baseline system for LID. Based on these results, we have used Gaussian Mixture Models (GMM), trained using the Expectation Maximization (EM) algorithm, as our baseline system. Mel Frequency Cepstral Coefficients (MFCC) and their delta and delta-delta cepstral coefficients are used as the speech features applied to the system. English and three Indian languages (Hindi, Gujarati and Telugu) are used to test performance. In this dissertation we have tried to improve the performance of GMM for LID. Two modified EM algorithms are used to overcome the limitations of the EM algorithm. The first approach is the Split and Merge EM algorithm. The second variation is Model Selection Based Self-Splitting Gaussian Mixture Learning. We have also prepared a speech database for the three Indian languages, namely Hindi, Gujarati and Telugu, which we have used in our experiments.
dc.publisher | Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject | Automatic speech recognition
dc.subject | Gaussian mixture models
dc.subject | Language identification
dc.subject | Speech perception
dc.subject | Speech recognition
dc.classification.ddc | 621.3994 MAN
dc.title | Gaussian mixture models for spoken language identification
dc.type | Dissertation
dc.degree | M. Tech
dc.student.id | 200411002
dc.accession.number | T00066
Appears in Collections: M Tech Dissertations

Files in This Item:
File: 200411002.pdf (Restricted Access)
Size: 452.15 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.