Please use this identifier to cite or link to this item:
http://drsr.daiict.ac.in//handle/123456789/614
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Patil, Hemant A. | |
dc.contributor.author | Sharma, Manisha | |
dc.date.accessioned | 2017-06-10T14:44:34Z | |
dc.date.available | 2017-06-10T14:44:34Z | |
dc.date.issued | 2016 | |
dc.identifier.citation | Sharma, Manisha (2016). Automatic speech recognition using deep neural networks. Dhirubhai Ambani Institute of Information and Communication Technology, x, 57p. (Acc.No: T00577) | |
dc.identifier.uri | http://drsr.daiict.ac.in/handle/123456789/614 | |
dc.description.abstract | Automatic Speech Recognition (ASR) is an important field of research because of its widespread use in various fields such as military, health services, day-to-day activities, etc. The ASR task was earlier performed using the GMM-HMM model, where a set of Gaussian Mixture Models (GMMs) was used to statistically model the acoustics of speech signals and Hidden Markov Models (HMMs) were used to provide a framework for modelling the sequential structure of speech spectral vectors. Mel Frequency Cepstral Coefficients (MFCCs) or Mel-filterbank features were used for feature extraction. However, the GMM-HMM model found it quite challenging to model a wide range of speakers, speaking styles, accents, background noises, etc. Recently, Deep Neural Networks (DNNs) have provided an alternative to GMMs for generating acoustic models because the hybrid DNN-HMM system gives significant improvements over state-of-the-art GMM-HMM systems on ASR tasks. Different variants of DNNs have been proposed to date which further reduce the error rates. One such variant is the generalized maxout network, which uses a p-norm nonlinearity that takes the p-norm over groups of inputs. Deep networks have also been used to extract features from speech signals. One such method, using subband autoencoders (SBAE), has been proposed in this work. It is used in combination with the filterbank features for ASR tasks. The performance of the combined system was compared with that of a system trained on filterbank features only, and our SBAE features, with a p-norm maxout network at p equal to 2, were found to perform best. | |
dc.publisher | Dhirubhai Ambani Institute of Information and Communication Technology | |
dc.subject | Automatic Speech Recognition | |
dc.subject | Deep Neural Network | |
dc.subject | Speech Recognition | |
dc.classification.ddc | 006.454 SHA | |
dc.title | Automatic speech recognition using deep neural networks | |
dc.type | Dissertation | |
dc.degree | M. Tech | |
dc.student.id | 201411028 | |
dc.accession.number | T00577 | |
Appears in Collections: | M Tech Dissertations |
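
The abstract describes a generalized maxout variant that takes the p-norm over groups of inputs. As a rough illustration only, the sketch below computes y = (sum of |x_i|^p)^(1/p) for each consecutive group of activations, with p = 2 as the setting the abstract reports as performing best. The function name `pnorm_groups`, the group size, and the sample values are assumptions made for this example and are not taken from the dissertation.

```python
def pnorm_groups(inputs, group_size, p=2.0):
    """Reduce each consecutive group of `group_size` values to its p-norm.

    Hypothetical helper for illustration; not the dissertation's code.
    """
    if group_size <= 0 or len(inputs) % group_size != 0:
        raise ValueError("len(inputs) must be a positive multiple of group_size")
    outputs = []
    for start in range(0, len(inputs), group_size):
        group = inputs[start:start + group_size]
        # p-norm of the group: (sum_i |x_i|^p)^(1/p)
        outputs.append(sum(abs(v) ** p for v in group) ** (1.0 / p))
    return outputs


# Assumed sample activations: six hidden-unit outputs pooled in groups of two.
activations = [3.0, -4.0, 0.0, 5.0, 12.0, 0.0]
pooled = pnorm_groups(activations, group_size=2, p=2.0)
print(pooled)
```

With p = 2 each group is reduced to its Euclidean length, so the layer emits one value per group instead of one per unit.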
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
201411028.pdf | Restricted Access | 2.23 MB | Adobe PDF | |