Publication:
Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments

dc.contributor.affiliation: DA-IICT, Gandhinagar
dc.contributor.author: Gupta, Siddhant
dc.contributor.author: Patil, Ankur T.
dc.contributor.author: Purohit, Mirali
dc.contributor.author: Patel, Maitreya
dc.contributor.author: Guido, Rodrigo Capobianco
dc.contributor.author: Patil, Hemant
dc.contributor.researcher: Gupta, Siddhant (201911007)
dc.contributor.researcher: Patil, Ankur T. (201621008)
dc.contributor.researcher: Purohit, Mirali (201811067)
dc.contributor.researcher: Patel, Maitreya (201601160)
dc.date.accessioned: 2025-08-01T13:09:01Z
dc.date.issued: 01-07-2021
dc.description.abstract: Recently, we have witnessed Deep Learning methodologies gaining significant attention for severity-based classification of dysarthric speech. Detecting dysarthria and quantifying its severity are of paramount importance in various real-life applications, such as the assessment of patients' progression in treatment, which includes adequate planning of their therapy and the improvement of speech-based interactive systems in order to handle pathologically-affected voices automatically. Notably, current speech-powered tools often deal with short-duration speech segments and, consequently, are less efficient in dealing with impaired speech, even when using Convolutional Neural Networks (CNNs). Thus, detecting dysarthria severity-level from short speech segments might improve the performance and applicability of those systems. To achieve this goal, we propose a novel Residual Network (ResNet)-based technique which receives short-duration speech segments as input. Statistically meaningful objective analysis of our experiments, reported on the standard Universal Access corpus, exhibits average improvements of 21.35% in classification accuracy and 22.48% in F1-score over the baseline CNN. For additional comparisons, tests with Gaussian Mixture Models and Light CNNs were also performed. Overall, the proposed ResNet approach obtained a classification accuracy of 98.90% and an F1-score of 98.00%, confirming its efficacy and its practical applicability.
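The abstract's central idea is the ResNet residual (skip) connection: each block learns a residual F(x) and outputs F(x) + x, so the identity path preserves information even through deep stacks. The toy sketch below illustrates only that idea; the kernel sizes, the 1-D toy convolution, and the zero-weight demonstration are illustrative assumptions, not the architecture used in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def conv1d(x, w):
    """'Same'-padded 1-D convolution over a feature sequence (toy version)."""
    k = len(w)
    xp = np.pad(x, k // 2)
    return np.array([np.dot(xp[i:i + k], w) for i in range(len(x))])

def residual_block(x, w1, w2):
    """y = ReLU(conv(ReLU(conv(x))) + x): the identity shortcut lets the
    input (and its gradient) bypass the learned transformation."""
    out = relu(conv1d(x, w1))
    out = conv1d(out, w2)
    return relu(out + x)  # skip connection: add the input back

# With zero-initialised weights the block degenerates to ReLU(x): the
# shortcut alone carries the signal through, which is why very deep
# residual stacks remain trainable.
x = np.array([1.0, -2.0, 3.0, 0.5])
w = np.zeros(3)
print(residual_block(x, w, w))  # equals relu(x)
```

This is the property that distinguishes the ResNet from the plain-CNN baseline the paper compares against: depth can be added without the learned layers having to preserve the input themselves.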
dc.format.extent: 105-117
dc.identifier.citation: Siddhant Gupta, Ankur T. Patil, Mirali Purohit, Maitreya Patel, Hemant A. Patil, and Rodrigo Capobianco Guido, "Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments," Neural Networks, Elsevier, vol. 139, pp. 105-117, Jul. 2021. doi:10.1016/j.neunet.2021.02.008.
dc.identifier.doi: 10.1016/j.neunet.2021.02.008
dc.identifier.issn: 0893-6080
dc.identifier.scopus: 2-s2.0-85102061061
dc.identifier.uri: https://ir.daiict.ac.in/handle/dau.ir/1556
dc.identifier.wos: WOS:000652750100009
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartofseries: Vol. 139; No.
dc.source: Neural Networks
dc.source.uri: https://www.sciencedirect.com/science/article/pii/S0893608021000502?via%3Dihub
dc.title: Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments
dspace.entity.type: Publication
relation.isAuthorOfPublication: fdb7041b-280e-498b-b2ee-34f9bc351f4c
relation.isAuthorOfPublication.latestForDiscovery: fdb7041b-280e-498b-b2ee-34f9bc351f4c
