Publication: Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components
dc.contributor.affiliation | DA-IICT, Gandhinagar | |
dc.contributor.author | Gupta, Priyanka | |
dc.contributor.author | Chodingala, Piyush | |
dc.contributor.author | Patil, Hemant | |
dc.contributor.researcher | Gupta, Priyanka (201721001) | |
dc.contributor.researcher | Chodingala, Piyush (202015002) | |
dc.date.accessioned | 2025-08-01T13:09:02Z | |
dc.date.issued | 01-01-2023 | |
dc.description.abstract | Replay attacks in speech are becoming easier to mount with the advent of high-quality recording and playback devices. This makes replay attacks a major concern for the security of Automatic Speaker Verification (ASV) systems and voice assistants. In the past, auditory transform-based as well as Instantaneous Frequency (IF)-based features have been proposed for replay spoofed speech detection (SSD). In this context, IF has been estimated either as the derivative of the analytic phase via the Hilbert transform, or by using the high temporal resolution Teager Energy Operator (TEO)-based Energy Separation Algorithm (ESA). However, the excellent temporal resolution of ESA comes at the cost of not using relative phase information, and vice versa. To that effect, we propose novel Cochlear Filter Cepstral Coefficients-based Instantaneous Frequency using Quadrature Energy Separation Algorithm (CFCCIF-QESA) features, with excellent temporal resolution as well as relative phase information. CFCCIF-QESA is designed by exploiting the relative phase shift to estimate IF, without estimating phase explicitly from the signal. To motivate and validate the effectiveness of the proposed QESA approach for IF estimation, we have employed information-theoretic measures, such as Mutual Information (MI), Kullback–Leibler (KL) divergence, and Jensen–Shannon (JS) divergence. The proposed CFCCIF-QESA feature set is extensively evaluated on the standard, statistically meaningful ASVSpoof 2017 version 2.0 dataset. When evaluated on the ASVSpoof 2017 v2.0 dataset, CFCCIF-QESA achieves improved performance as compared to the CFCCIF-ESA and CQCC feature sets on GMM, CNN, and LCNN classifiers. Furthermore, in the case of cross-database evaluation using ASVSpoof 2017 v2.0 and VSDC, CFCCIF-QESA also performs relatively better than CFCCIF-ESA and CQCC on the GMM classifier. However, for the case of self-classification on the ASVSpoof 2019 PA data, CFCCIF-QESA only outperforms CFCCIF-ESA, whereas on the BTAS 2016 dataset, it performs relatively close to CFCCIF-ESA. Finally, results are presented for the case when the ASV system is not under attack. | |
dc.format.extent | 1-23 | |
dc.identifier.citation | Priyanka Gupta, Piyush Chodingala, and Hemant A. Patil, "Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components," Computer Speech & Language, Elsevier, ISSN: 0885-2308, vol. 77, Jan. 2023, article no. 101423, pp. 1-23, doi: 10.1016/j.csl.2022.101423. [Published Date: 16 Jun. 2022] | |
dc.identifier.doi | 10.1016/j.csl.2022.101423 | |
dc.identifier.issn | 0885-2308 | |
dc.identifier.scopus | 2-s2.0-85134568965 | |
dc.identifier.uri | https://ir.daiict.ac.in/handle/dau.ir/1564 | |
dc.identifier.wos | WOS:000828692000001 | |
dc.language.iso | en | |
dc.publisher | Elsevier | |
dc.relation.ispartofseries | Vol. 77; No. | |
dc.source | Computer Speech & Language | |
dc.source.uri | https://www.sciencedirect.com/science/article/pii/S0885230822000559?via%3Dihub | |
dc.title | Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components | |
dspace.entity.type | Publication | |
relation.isAuthorOfPublication | fdb7041b-280e-498b-b2ee-34f9bc351f4c | |
relation.isAuthorOfPublication.latestForDiscovery | fdb7041b-280e-498b-b2ee-34f9bc351f4c |
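
The abstract refers to estimating instantaneous frequency (IF) with the Teager Energy Operator (TEO)-based Energy Separation Algorithm (ESA). As a rough illustration only, the sketch below applies the textbook discrete TEO and the DESA-2 form of ESA to a single narrowband signal. It is not the authors' CFCCIF-QESA pipeline: the cochlear filterbank, the quadrature/in-phase processing, and the cepstral step are all omitted, and the function names `teager_energy` and `desa2_instantaneous_frequency` are hypothetical names chosen for this example.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output has two fewer samples than the input (no value at either edge).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_instantaneous_frequency(x, fs):
    """Estimate IF in Hz with DESA-2 (hypothetical helper, textbook form).

    Uses y(n) = x(n+1) - x(n-1) and
    omega(n) = 0.5 * arccos(1 - psi[y](n) / (2 * psi[x](n))),
    which is valid for frequencies below fs / 4.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)        # covers samples 1 .. N-2
    y = x[2:] - x[:-2]              # symmetric difference, samples 1 .. N-2
    psi_y = teager_energy(y)        # covers samples 2 .. N-3
    psi_x = psi_x[1:-1]             # trim so both cover samples 2 .. N-3

    # Guard against zero or negative energy, which can occur on real speech.
    eps = np.finfo(float).eps
    ratio = 1.0 - psi_y / (2.0 * np.maximum(psi_x, eps))

    # Clip to the arccos domain before converting rad/sample to Hz.
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))
    return omega * fs / (2.0 * np.pi)


if __name__ == "__main__":
    fs = 16000.0
    n = np.arange(400)
    tone = np.cos(2.0 * np.pi * 1000.0 * n / fs)  # synthetic 1 kHz tone
    print(desa2_instantaneous_frequency(tone, fs)[:5])
```

Because each estimate depends on only a few neighbouring samples, this kind of estimator has the high temporal resolution the abstract attributes to ESA. In a subband setting it would be applied to each filter output separately, which is an assumption about usage rather than something stated in this record.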