Publication:
Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components

dc.contributor.affiliation DA-IICT, Gandhinagar
dc.contributor.author Gupta, Priyanka
dc.contributor.author Chodingala, Piyush
dc.contributor.author Patil, Hemant
dc.contributor.researcher Gupta, Priyanka (201721001)
dc.contributor.researcher Chodingala, Piyush (202015002)
dc.date.accessioned 2025-08-01T13:09:02Z
dc.date.issued 01-01-2023
dc.description.abstract Replay attacks on speech are becoming easier to mount with the advent of high-quality recording and playback devices. This makes replay attacks a major concern for the security of Automatic Speaker Verification (ASV) systems and voice assistants. In the past, auditory transform-based as well as Instantaneous Frequency (IF)-based features have been proposed for replay spoofed speech detection (SSD). In this context, IF has been estimated either as the derivative of the analytic phase via the Hilbert transform, or by using the high temporal resolution Teager Energy Operator (TEO)-based Energy Separation Algorithm (ESA). However, the excellent temporal resolution of ESA comes at the cost of not using relative phase information, and vice versa. To that effect, we propose novel Cochlear Filter Cepstral Coefficients-based Instantaneous Frequency using Quadrature Energy Separation Algorithm (CFCCIF-QESA) features, which offer excellent temporal resolution as well as relative phase information. CFCCIF-QESA is designed by exploiting the relative phase shift to estimate IF, without estimating phase explicitly from the signal. To motivate and validate the effectiveness of the proposed QESA approach for IF estimation, we have employed information-theoretic measures, such as Mutual Information (MI), Kullback-Leibler (KL) divergence, and Jensen-Shannon (JS) divergence. The proposed CFCCIF-QESA feature set is extensively evaluated on the standard, statistically meaningful ASVSpoof 2017 version 2.0 dataset. When evaluated on the ASVSpoof 2017 v2.0 dataset, CFCCIF-QESA achieves improved performance as compared to the CFCCIF-ESA and CQCC feature sets on GMM, CNN, and LCNN classifiers. Furthermore, in the case of cross-database evaluation using ASVSpoof 2017 v2.0 and VSDC, CFCCIF-QESA also performs relatively better than CFCCIF-ESA and CQCC on the GMM classifier. However, for the case of self-classification on the ASVSpoof 2019 PA data, CFCCIF-QESA only outperforms CFCCIF-ESA, whereas on the BTAS 2016 dataset, it performs relatively close to CFCCIF-ESA. Finally, results are presented for the case when the ASV system is not under attack.
dc.format.extent 1-23
dc.identifier.citation Priyanka Gupta, Piyush Chodingala, and Hemant A. Patil, "Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components," Computer Speech & Language, Elsevier, ISSN: 0885-2308, vol. 77, Jan. 2023, article no. 101423, pp. 1-23, doi: 10.1016/j.csl.2022.101423. [Published Date: 16 Jun. 2022]
dc.identifier.doi 10.1016/j.csl.2022.101423
dc.identifier.issn 0885-2308
dc.identifier.scopus 2-s2.0-85134568965
dc.identifier.uri https://ir.daiict.ac.in/handle/dau.ir/1564
dc.identifier.wos WOS:000828692000001
dc.language.iso en
dc.publisher Elsevier
dc.relation.ispartofseries Vol. 77; No.
dc.source Computer Speech & Language
dc.source.uri https://www.sciencedirect.com/science/article/pii/S0885230822000559?via%3Dihub
dc.title Replay Spoof Detection Using Energy Separation Based Instantaneous Frequency Estimation From Quadrature and In-Phase Components
dspace.entity.type Publication
relation.isAuthorOfPublication fdb7041b-280e-498b-b2ee-34f9bc351f4c
relation.isAuthorOfPublication.latestForDiscovery fdb7041b-280e-498b-b2ee-34f9bc351f4c
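
Illustrative note on the abstract: the abstract refers to estimating IF with the Teager Energy Operator (TEO)-based Energy Separation Algorithm (ESA). The sketch below shows one classic discrete ESA variant (commonly known as DESA-2) applied to a single narrowband signal. It is not the QESA method proposed in the paper and omits the cochlear filterbank and cepstral steps; the function names `teager_energy` and `desa2_instantaneous_frequency` are hypothetical, and only NumPy is assumed.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output is two samples shorter than the input (the endpoints are dropped).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_instantaneous_frequency(x, fs):
    """Estimate instantaneous frequency in Hz with the DESA-2 formula.

    Assumes a narrowband (AM-FM) input whose frequency stays below fs / 4.
    No guard is included against psi_x being zero (for example, silence).
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)        # aligned with samples 1 .. N-2
    y = x[2:] - x[:-2]              # symmetric difference x(n+1) - x(n-1)
    psi_y = teager_energy(y)        # aligned with samples 2 .. N-3
    psi_x = psi_x[1:-1]             # trim to the same support as psi_y
    # For a pure cosine, psi_y / (2 * psi_x) equals 1 - cos(2 * omega).
    ratio = 1.0 - psi_y / (2.0 * psi_x)
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))  # radians per sample
    return omega * fs / (2.0 * np.pi)
```

For a pure tone, for example `np.cos(2 * np.pi * 1000 * np.arange(400) / 16000)` with `fs=16000`, the estimate stays at approximately 1000 Hz for every retained sample, using only a five-sample neighbourhood per estimate. This locality is the high temporal resolution the abstract attributes to ESA.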