Please use this identifier to cite or link to this item:
http://drsr.daiict.ac.in//handle/123456789/1198
Title: | Features for Speech Emotion Recognition |
Authors: | Patil, Hemant A.; Uthiraa, S. |
Keywords: | Speech Emotion Recognition; Constant Q Pitch Coefficients; Constant Q Harmonic Coefficients; Linear Frequency Residual Cepstral Coefficients; Modified Group Delay Cepstral Coefficients; Whisper; GMM; CNN; ResNet; TDNN |
Issue Date: | 2023 |
Publisher: | Dhirubhai Ambani Institute of Information and Communication Technology |
Citation: | Uthiraa, S. (2023). Features for Speech Emotion Recognition. Dhirubhai Ambani Institute of Information and Communication Technology. xiii, 109 p. (Acc. # T01139). |
Abstract: | The easiest and most natural way of communication is through speech; the emotional aspect of speech leads to effective interpersonal communication. As technological advancements continue to proliferate, the dependence of humans on machines is also increasing, thereby making it imperative to establish efficient methods for Speech Emotion Recognition (SER) to ensure effective human-machine interaction. This thesis focuses on understanding the acoustic characteristics of various emotions and their dependence on the culture and language used. It then proposes new feature sets, namely, Constant Q Pitch Coefficients (CQPC) and Constant Q Harmonic Coefficients (CQHC) derived from the Constant Q Transform, which capture high-resolution pitch and harmonic information, respectively. Further, this thesis focuses on less explored excitation source-based features and proposes a novel Linear Frequency Residual Cepstral Coefficients (LFRCC) feature set for the same. Phase-based features, namely Modified Group Delay Cepstral Coefficients (MGDCC), are proposed to capture vocal tract and vocal fold information for emotion classification. The recently developed Automatic Speech Recognition (ASR) model, Whisper, is used to analyze cross-database SER. This thesis also extends the LFRCC idea to the infant cry classification problem. Lastly, a local API is developed for SER. |
URI: | http://drsr.daiict.ac.in//handle/123456789/1198 |
Appears in Collections: | M Tech Dissertations |
Files in This Item:
File | Size | Format |
---|---|---|
202111065.pdf | 12.67 MB | Adobe PDF |