Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/974
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Patil, Hemant A.
dc.contributor.author | Purohit, Mirali Virendrabhai
dc.date.accessioned | 2020-09-22T14:19:41Z
dc.date.available | 2023-02-17T14:19:41Z
dc.date.issued | 2020
dc.identifier.citation | Purohit, Mirali Virendrabhai (2020). Deep learning techniques for speech pathology applications. Dhirubhai Ambani Institute of Information and Communication Technology. xiv, 113 p. (Acc.No: T00892)
dc.identifier.uri | http://drsr.daiict.ac.in//handle/123456789/974
dc.description.abstract | Human-machine interaction has gained increasing attention due to its applications in industry and day-to-day life. In recent years, speech technologies have grown rapidly because of advances in machine learning and deep learning. Various deep learning architectures have shown state-of-the-art results in different areas, such as computer vision and the medical domain. Speech-based systems such as Intelligent Personal Assistants (IPAs), chatbots, and Text-To-Speech (TTS) have been developed with great success. However, these systems have certain limitations. Speech processing systems work well only on normal-mode speech and hence perform poorly on other kinds of speech, such as impaired speech, far-field speech, and shouted speech. This thesis contributes to the improvement of impaired speech. To address this problem, the work takes two major approaches: 1) classification, and 2) conversion. A new paradigm, namely weak speech supervision, is explored to overcome the data scarcity problem and is proposed for the classification task. In addition, the effectiveness of a residual network-based classifier over a traditional convolutional neural network-based model is shown for the multi-class classification of pathological speech. Furthermore, using Voice Conversion (VC)-based techniques, variants of generative adversarial networks are proposed to repair impaired speech and thereby improve the performance of Voice Assistants (VAs). The performance of these architectures is shown via objective and subjective evaluations. Inspired by the work done using the VC-based technique, this thesis also contributes to the voice conversion field. To that effect, a system, namely the adaptive generative adversarial network, is proposed and analyzed by comparing it with a recent state-of-the-art method for voice conversion.
dc.subject | Machine learning
dc.subject | Deep learning
dc.subject | Weak supervision
dc.subject | Generative adversarial network
dc.subject | Dysarthria
dc.subject | Whisper
dc.subject | Voice conversion
dc.classification.ddc | 006.454 PUR
dc.title | Deep learning techniques for speech pathology applications
dc.type | Dissertation
dc.degree | M. Tech
dc.student.id | 201811067
dc.accession.number | T00892
Appears in Collections: M Tech Dissertations

Files in This Item:
File | Description | Size | Format
201811067.pdf (Restricted Access) | | 5.41 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.