Analysis of nonlinearity in speech production mechanism for speaker verification: phase-based approach

Agrawal, Purvi

Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/575

Title:	Analysis of nonlinearity in speech production mechanism for speaker verification: phase-based approach
Authors:	Patil, Hemant A.; Agrawal, Purvi
Keywords:	Analysis of nonlinearity; Linear systems; Linear models; Statistics; Automatic speech recognition
Issue Date:	2015
Publisher:	Dhirubhai Ambani Institute of Information and Communication Technology
Citation:	Agrawal, Purvi (2015). Analysis of nonlinearity in speech production mechanism for speaker verification: phase-based approach. Dhirubhai Ambani Institute of Information and Communication Technology, xv, 67 p. (Acc.No: T00538)
Abstract:	Many real-world signal processing problems can be described using linear models and realized as analog or digital filters, time-invariant filters, or finite or infinite impulse response (FIR or IIR) filters. In the recent past, a nonlinear operator called the Teager Energy Operator (TEO) has been introduced and investigated; it operates on a small window in the time domain, which makes it well suited to local time analysis of signals. This thesis aims to explore the nonlinear nature of the speech production mechanism of a speaker. There has been significant advancement in exploring source- and system-based features for speaker recognition, which are attributed to the characteristics of the excitation source and to the size and shape of the vocal tract. In this work, TEO phase features are derived first from the fullband speech signal and then from subband speech signals (because the TEO is a monocomponent operator). In addition, a feature set is derived from the residual phase extracted from a nonlinear filter designed using the Volterra-Wiener (VW) series, exploiting higher-order linear as well as nonlinear relationships hidden in the sequence of speech samples. Experiments have been performed on score-level fusion of the proposed feature sets with state-of-the-art MFCC features for a text-independent Speaker Verification (SV) task, based on a Gaussian Mixture Model-Universal Background Model (GMM-UBM) system. The performance of each feature set is evaluated, and a comparative study of the features is presented. The results provide an evaluation of the nature of the speech production mechanism and provide features to improve the performance of the SV system.
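The abstract's remark that the TEO uses a small temporal window can be illustrated with the commonly cited discrete form of the operator, which needs only three consecutive samples: psi[x(n)] = x(n)^2 - x(n-1) x(n+1). The sketch below is not taken from the thesis; the function name `teager_energy` and the test tone are assumptions chosen for illustration, and the thesis's phase-feature extraction is not reproduced here.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy, psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Hypothetical helper for illustration. The output is two samples
    shorter than the input because the operator needs one neighbour
    on each side of every sample.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# A single-component (monocomponent) test tone: A * cos(omega * n + phi).
amplitude, omega, phi = 2.0, 0.3, 0.5
n = np.arange(200)
tone = amplitude * np.cos(omega * n + phi)

psi = teager_energy(tone)

# For a pure sinusoid the operator output is the constant A^2 * sin^2(omega).
expected = (amplitude * np.sin(omega)) ** 2
print(np.allclose(psi, expected))
```

For a single sinusoid the output is constant and depends only on amplitude and frequency. A sum of several sinusoids adds cross terms to the output, which is consistent with the abstract's motivation for applying the operator to subband signals.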
URI:	http://drsr.daiict.ac.in/handle/123456789/575
Appears in Collections:	M Tech Dissertations

Files in This Item:

File	Description	Size	Format
201311045.pdf (Restricted Access)		2.42 MB	Adobe PDF

