
dc.contributor.advisor: Patil, Hemant A.
dc.contributor.author: Purohit, Mirali Virendrabhai
dc.date.accessioned: 2020-09-22T14:19:41Z
dc.date.available: 2023-02-17T14:19:41Z
dc.date.issued: 2020
dc.identifier.citation: Purohit, Mirali Virendrabhai (2020). Deep learning techniques for speech pathology applications. Dhirubhai Ambani Institute of Information and Communication Technology. xiv, 113 p. (Acc.No: T00892)
dc.identifier.uri: http://drsr.daiict.ac.in//handle/123456789/974
dc.description.abstract: Human-machine interaction has gained increasing attention because of its applications in industry and day-to-day life. In recent years, speech technologies have grown rapidly owing to advances in machine learning and deep learning. Various deep learning architectures have shown state-of-the-art results in different areas, such as computer vision and the medical domain. Speech-based systems, e.g., Intelligent Personal Assistants (IPAs), chatbots, and Text-To-Speech (TTS), have been developed with great success. However, these systems have certain limitations. Speech processing systems work efficiently only on normal-mode speech and hence perform poorly on other kinds of speech, such as impaired speech, far-field speech, and shouted speech. This thesis contributes to the improvement of impaired speech. To address this problem, the work takes two major approaches: 1) classification, and 2) conversion. A new paradigm, namely, weak speech supervision, is explored to overcome the data scarcity problem and is proposed for the classification task. In addition, the effectiveness of a residual network-based classifier over a traditional convolutional neural network-based model is shown for the multi-class classification of pathological speech. Furthermore, using Voice Conversion (VC)-based techniques, variants of generative adversarial networks are proposed to repair impaired speech and thereby improve the performance of Voice Assistants (VAs). The performance of these architectures is shown via objective and subjective evaluations. Inspired by the work done using the VC-based technique, this thesis also contributes to the field of voice conversion. To that effect, a state-of-the-art system, namely, the adaptive generative adversarial network, is proposed and analyzed by comparing it with a recent state-of-the-art method for voice conversion.
dc.subject: Machine learning
dc.subject: Deep learning
dc.subject: Weak supervision
dc.subject: Generative adversarial network
dc.subject: Dysarthria
dc.subject: Whisper
dc.subject: Voice conversion
dc.classification.ddc: 006.454 PUR
dc.title: Deep learning techniques for speech pathology applications
dc.type: Dissertation
dc.degree: M. Tech
dc.student.id: 201811067
dc.accession.number: T00892

