Handcrafted Feature Design for Voice Liveness Detection and Countermeasures for Spoof Attacks

Khoria, Kuldeep

dc.contributor.advisor	Patil, Hemant A.
dc.contributor.author	Khoria, Kuldeep
dc.date.accessioned	2022-05-06T12:58:12Z
dc.date.available	2023-02-27T12:58:12Z
dc.date.issued	2021
dc.identifier.citation	Khoria, Kuldeep (2021). Handcrafted Feature Design for Voice Liveness Detection and Countermeasures for Spoof Attacks. Dhirubhai Ambani Institute of Information and Communication Technology. xiii, 99 p. (Acc.No: T00941)
dc.identifier.uri	http://drsr.daiict.ac.in//handle/123456789/1061
dc.description.abstract	Automatic Speaker Verification (ASV) systems are highly vulnerable to spoofing attacks. A spoofing attack occurs when an impostor tries to manipulate a biometric system and gain access to it by unfair means. ASV systems are vulnerable to several kinds of spoofing attacks, namely, Speech Synthesis (SS), Voice Conversion (VC), Impersonation, Twins, and Replay. A replay attack on a voice biometric system can be mounted by surreptitiously recording the genuine speech signal and then presenting it to the ASV system as if it were authentic. Among all the spoofing attacks, the replay attack is the simplest to execute (or mount) but hard to detect. In particular, a replay attack carried out with high-quality recording and playback devices is very hard to detect because the replayed signal is very similar to the genuine speaker's speech. Given this vulnerability of ASV systems to replay attacks, this thesis addresses the voice liveness detection (VLD) task, which verifies whether the speaker is live in front of the ASV system or the speaker’s voice is being replayed. In addition, this thesis attempts to develop effective countermeasures to protect these systems from Speech Synthesis (SS) and Voice Conversion (VC) spoofing attacks. Two novel feature sets are developed for the VLD task as countermeasures for replay attacks, namely, Constant-Q Transform (CQT) and Spectral Root Cepstral Coefficients (SRCC). The performance of the proposed feature sets is evaluated using the recently released POp noise COrpus (POCO). A Short-Time Fourier Transform (STFT)-based feature set is used as the baseline for comparison. Further, a novel feature set, namely, Cochlear Filter Cepstral Coefficient-Instantaneous Frequency using Energy Separation Algorithm (CFCCIF-ESA), is proposed for the detection of SS- and VC-based spoofing attacks.
The experiments to evaluate the performance of the CFCCIF-ESA feature set are performed on the ASVSpoof 2015 dataset. The results are further compared with the baseline Constant Q Cepstral Coefficients (CQCC), Linear Frequency Cepstral Coefficients (LFCC), and state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) feature sets.
dc.subject	Automatic Speaker Verification
dc.subject	Spoof attacks
dc.subject	Speech Synthesis (SS)
dc.subject	Voice Conversion (VC)
dc.subject	Spectral Root Cepstral Coefficients
dc.subject	POp noise COrpus (POCO)
dc.subject	Short-Time Fourier Transform (STFT)
dc.subject	Constant Q Cepstral Coefficients (CQCC)
dc.classification.ddc	006.454 KHO
dc.title	Handcrafted Feature Design for Voice Liveness Detection and Countermeasures for Spoof Attacks
dc.type	Dissertation
dc.degree	M. Tech
dc.student.id	201911014
dc.accession.number	T00941

Files in this item

Name: 201911014_Kuldeep_Khoria - hemant ...
Size: 4.983Mb
Format: PDF

This item appears in the following Collection(s)

M Tech Dissertations [923]