Phonetic segmentation: unsupervised approach

Vachhani, Bhavikkumar Bhagvanbhai

Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/454

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Patil, Hemant A.
dc.contributor.author	Vachhani, Bhavikkumar Bhagvanbhai
dc.date.accessioned	2017-06-10T14:40:57Z
dc.date.available	2017-06-10T14:40:57Z
dc.date.issued	2013
dc.identifier.citation	Vachhani, Bhavikkumar Bhagvanbhai (2013). Phonetic segmentation : unsupervised approach. Dhirubhai Ambani Institute of Information and Communication Technology, xv, 89 p. (Acc.No: T00417)
dc.identifier.uri	http://drsr.daiict.ac.in/handle/123456789/454
dc.description.abstract	Phonetic segmentation has potential applications in Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) synthesis systems. In this thesis, we propose the use of different spectral features, viz., Mel Frequency Cepstral Coefficients (MFCC), Cochlear Filter Cepstral Coefficients (CFCC), and Perceptual Linear Prediction Cepstral Coefficients (PLPCC), to compute the spectral transition measure (STM) for automatic phonetic boundary detection. We propose a new unsupervised algorithm that combines evidence from state-of-the-art MFCC and the proposed CFCC features to improve the accuracy of automatic phonetic boundary detection. Using the proposed fusion-based approach, we achieve 90 % accuracy (i.e., 8 % better than MFCC-based STM alone for a 20 ms tolerance interval) for automatic boundary detection on the entire TIMIT database. Using the proposed PLPCC-based STM approach, we achieve 85 % accuracy (i.e., 3 % better than state-of-the-art MFCC-based STM for a 20 ms tolerance interval) and a 15 % over-segmentation rate (i.e., 8 % less than MFCC-based STM) for automatic boundary detection of 2,34,925 phone boundaries corresponding to 630 speakers of the entire TIMIT database. The second part of the thesis focuses on developing various applications using automatically segmented and labeled boundaries.
dc.publisher	Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject	Signal processing
dc.subject	Automatic speech recognition
dc.subject	Speech processing systems
dc.subject	Speech synthesis
dc.classification.ddc	621.3819598 VAC
dc.title	Phonetic segmentation: unsupervised approach
dc.type	Dissertation
dc.degree	M. Tech
dc.student.id	201111042
dc.accession.number	T00417
Appears in Collections:	M Tech Dissertations

Files in This Item:

File	Description	Size	Format
201111042.pdf Restricted Access		2.29 MB	Adobe PDF
