dc.contributor.advisor: Patil, Hemant A.
dc.contributor.author: Patel, Chirag R.
dc.date.accessioned: 2017-06-10T14:39:48Z
dc.date.available: 2017-06-10T14:39:48Z
dc.date.issued: 2012
dc.identifier.citation: Patel, Chirag R. (2012). Person identification from their hum with inter-session variability compensation. Dhirubhai Ambani Institute of Information and Communication Technology, xiii, 70 p. (Acc.No: T00353)
dc.identifier.uri: http://drsr.daiict.ac.in/handle/123456789/390
dc.description.abstract: This thesis discusses the design of a system that recognizes a person from their hum, with emphasis on the inter-session variability of the recognition system. No standard database is available for studying inter-session variability in humming-based person recognition, so a humming database of 50 subjects was collected in two training and six testing sessions. MFCC (Mel Frequency Cepstral Coefficients) is the state-of-the-art feature set in speech and speaker recognition. In this thesis, another cepstral feature, VTMFCC (Variable length Teager energy based MFCC), which captures vocal source information, is used along with MFCC. Two modulation-based features, AM-FM and Q-features, are also introduced. The performance of all four features in a multi-session environment is evaluated using a discriminatively trained polynomial classifier, which uses out-of-class information while creating each person-specific model. Inter-session variability degrades the performance of person recognition systems because of differences between training and test sessions. This variability can be classified as intrinsic or extrinsic according to its source. Variability due to the speaker’s health, aging, emotional state, etc. is called intrinsic inter-session variability. Variability due to environmental conditions, noise, and changes in microphone and acoustic channel is called extrinsic inter-session variability. Inter-session variability degrades the performance of all four features, i.e., MFCC, VTMFCC, AM-FM and Q-features. The difference in % EER (Equal Error Rate) between a particular test session and the base test session is used as the measure of inter-session variability. The base test session is the test session collected together with the training session.
In this thesis, two new approaches are proposed for compensating inter-session variability: feature-level fusion and model-level fusion. Both reduce the degradation in performance of the person recognition system caused by inter-session variability and make the system more robust.
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject: Speaker
dc.subject: Emotion Recognition System
dc.subject: Speaker Recognition
dc.subject: Automatic Speech Recognition System
dc.subject: Multimodal Integration
dc.subject: Audio-visual recognition
dc.subject: Person identification
dc.classification.ddc: 006.454 PAT
dc.title: Person identification from their hum with inter-session variability compensation
dc.type: Dissertation
dc.degree: M. Tech
dc.student.id: 201011018
dc.accession.number: T00353
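
The abstract defines the inter-session variability measure as the difference in % EER between a given test session and the base test session. The sketch below is one possible way to compute such a measure and is not taken from the thesis: the function names, the threshold sweep used to estimate the EER, and the convention that higher scores mean a better match are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER (in percent) from two lists of match scores.

    Assumption: a higher score means "more likely the claimed person".
    The EER is approximated at the observed score where the false
    acceptance and false rejection rates are closest to each other.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: impostor trials scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return 100.0 * eer


def session_variability(eer_by_session, base_session):
    """Difference in % EER of each test session relative to the base session."""
    base = eer_by_session[base_session]
    return {name: eer - base for name, eer in eer_by_session.items()}
```

With hypothetical values such as `session_variability({"base": 5.0, "s2": 8.5}, "base")`, the entry for `"s2"` is the number of percentage points by which that session's EER exceeds the base session's; the numbers here are made up and are not results from the thesis.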