    Person recognition from their hum

    View/Open
    200911036.pdf (3.056Mb)
    Date
    2011
    Author
    Madhavi, Maulik C.
    Abstract
    In this thesis, the design of a person recognition system based on a person's hum is presented. Because a hum is a nasalized sound and the LP (Linear Prediction) model does not characterize nasal sounds sufficiently, our approach uses Mel filterbank-based cepstral features for the person recognition task. The first task consisted of data collection and corpus design for humming. For this purpose, hums of old Hindi songs from around 170 subjects are used. Feature extraction schemes were then developed. The Mel filterbank follows human auditory perception, so MFCC was used as the state-of-the-art feature set. The filterbank structure was then modified to compute Gaussian Mel scale-based MFCC (GMFCC) and Inverse Mel scale-based MFCC (IMFCC) feature sets. This thesis mainly explores two feature sets. The first, MFCC-VTMP, captures phase information via MFCC by applying the VTEO (Variable length Teager Energy Operator) in the time domain. The second, VTMFCC (Variable length Teager Energy Operator based MFCC), captures vocal-source information. The proposed MFCC-VTMP feature set has two characteristics: it captures phase information, and it uses the properties of the VTEO. The VTEO is an extension of the TEO (Teager Energy Operator) and is a nonlinear energy tracking operator. Feature sets such as VTMFCC capture vocal-source information, which reflects the excitation mechanism in the speech (hum) production process. This information is found to be complementary to the vocal tract information, so score-level fusion of different source and system features improves person recognition performance.
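    As a rough illustration of the nonlinear energy tracking operator mentioned in the abstract, the sketch below assumes the commonly cited discrete form x[n]^2 - x[n - i] * x[n + i], where a lag of i = 1 gives the classic TEO. The abstract does not state this formula, so the definition, the function name `vteo`, and the test tone are assumptions made for illustration and not the thesis's implementation.

```python
import math


def vteo(x, lag=1):
    """Variable length Teager Energy Operator (illustrative sketch).

    Assumed definition (not given in the abstract):
        psi[n] = x[n]^2 - x[n - lag] * x[n + lag]
    Only samples with both neighbours available are returned,
    so the output has len(x) - 2 * lag values. lag=1 is the classic TEO.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [x[n] * x[n] - x[n - lag] * x[n + lag]
            for n in range(lag, len(x) - lag)]


# Hypothetical test tone A*cos(omega*n); amplitude and frequency are arbitrary.
A, omega = 2.0, 0.3
signal = [A * math.cos(omega * n) for n in range(50)]

teo = vteo(signal, lag=1)    # classic TEO
vteo3 = vteo(signal, lag=3)  # same operator with a longer lag
```

    For a pure tone, this operator returns the constant A^2 * sin^2(lag * omega) at every sample, so the output depends on both amplitude and frequency. That is the sense in which it tracks energy.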
    URI
    http://drsr.daiict.ac.in/handle/123456789/338
    Collections
    • M Tech Dissertations [820]

    Related items

    Showing items related by title, author, creator and subject.

    • Person identification using face and speech 

      Parmar, Ajay (Dhirubhai Ambani Institute of Information and Communication Technology, 2012)
      In this thesis, we present a multimodal biometric system using face and speech features. Multimodal biometrics system uses two or more intrinsic physical or behaviour traits to provide better recognition rate than unimodal ...
    • Person identification from their hum with inter-session variability compensation 

      Patel, Chirag R. (Dhirubhai Ambani Institute of Information and Communication Technology, 2012)
      In this thesis, design of person recognition system from their hum is discussed. The emphasis is given to the inter-session variability of the recognition system. Standard database is not available for the inter-session ...
    • Speaker recognition over VoIP network 

      Goswami, Parth A. (Dhirubhai Ambani Institute of Information and Communication Technology, 2011)
      This thesis deals with the Automatic Speaker Recognition (ASR) system over narrowband Voice over Internet Protocol (VoIP) networks. There are several artifacts of VoIP network such as speech codec, packet loss and packet ...

    Resource Centre copyright © 2006-2017 
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     

