Replay spoof detection using handcrafted features

Tapkir, Prasad Anil

dc.contributor.advisor	Patil, Hemant A.
dc.contributor.author	Tapkir, Prasad Anil
dc.date.accessioned	2019-03-19T09:30:54Z
dc.date.available	2019-03-19T09:30:54Z
dc.date.issued	2018
dc.identifier.citation	Tapkir, Prasad Anil (2018). Replay Spoof Detection using Handcrafted Features. Dhirubhai Ambani Institute of Information and Communication Technology, xiii, 68 p. (Acc. No: T00715)
dc.identifier.uri	http://drsr.daiict.ac.in//handle/123456789/749
dc.description.abstract	In the past few years, demand for Automatic Speaker Verification (ASV) systems has grown markedly across numerous applications. The increased use of ASV systems for voice biometrics comes with the threat of spoofing attacks. ASV systems are vulnerable to five types of spoofing attacks, namely, impersonation, Voice Conversion (VC), Speech Synthesis (SS), twins, and replay. Among these, replay poses a greater threat to ASV systems than any other spoofing attack, as it requires neither specific expertise nor sophisticated equipment. Replay attacks require little effort and are the most accessible attacks. Replay speech can be modeled as a convolution of the genuine speech with the impulse responses of the microphone, multimedia speaker, recording environment, and playback environment. Replay attacks become harder to detect with high-quality intermediate devices and clean recording and playback environments. In this thesis, we propose three novel handcrafted cepstral feature sets for the replay spoof detection task, namely, Magnitude-based Spectral Root Cepstral Coefficients (MSRCC), Phase-based Spectral Root Cepstral Coefficients (PSRCC), and Empirical Mode Decomposition Cepstral Coefficients (EMDCC). In addition, we explore the significance of the Teager Energy Operator (TEO) phase feature for replay spoof detection. The EMDCC feature set replaces the filterbank structure with the Empirical Mode Decomposition (EMD) technique to obtain the subband signals. EMD yields more subbands for the replay speech signal than for the genuine speech signal. The MSRCC and PSRCC feature sets are extracted using the spectral root cepstrum of the speech signal. The spectral root cepstrum spreads the effect of the additional impulse responses in replay speech over the entire quefrency domain. The TEO phase feature set provides information that improves security when fused with magnitude-based features, such as Mel Frequency Cepstral Coefficients (MFCC). The experiments are performed on the ASVspoof 2017 challenge database, and all systems are implemented using a Gaussian Mixture Model (GMM) as the classifier. All the feature sets perform better than the ASVspoof 2017 challenge baseline Constant Q Cepstral Coefficients (CQCC) system.
dc.publisher	Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject	Speech synthesis
dc.subject	Voice conversion
dc.subject	Artificial intelligence
dc.classification.ddc	006.3 TAP
dc.title	Replay spoof detection using handcrafted features
dc.type	Dissertation
dc.degree	M. Tech
dc.student.id	201611027
dc.accession.number	T00715
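
The abstract's remark that the spectral root cepstrum spreads the effect of the replay impulse responses over the quefrency domain can be illustrated with a small numerical sketch. It is not the thesis's MSRCC or PSRCC pipeline, whose framing, windowing, and post-processing the abstract does not specify. The exponent gamma = 0.5, the random stand-in for a speech frame, and the four-tap impulse response are all assumptions chosen for illustration. The sketch shows that when replay speech is the genuine signal convolved with an impulse response, the root cepstrum of the replay signal equals the circular convolution of the two individual root cepstra, so the impulse response contributes at every quefrency rather than as an additive term.

```python
import numpy as np

def spectral_root_cepstrum(signal, gamma=0.5, n_fft=None):
    """Inverse DFT of the magnitude spectrum raised to the power gamma.

    gamma = 0.5 is an illustrative assumption, not a value from the thesis.
    """
    n_fft = n_fft or len(signal)
    magnitude = np.abs(np.fft.fft(signal, n_fft))
    # The magnitude spectrum of a real signal is even, so the result is real.
    return np.real(np.fft.ifft(magnitude ** gamma))

def circular_convolve(a, b):
    """Circular convolution of two equal-length real sequences via the DFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

rng = np.random.default_rng(0)
genuine = rng.standard_normal(256)                  # stand-in for a genuine speech frame
impulse_response = np.array([1.0, 0.6, 0.3, 0.1])   # hypothetical combined device/room response
replayed = np.convolve(genuine, impulse_response)   # replay model: linear convolution

# n_fft must cover the full convolved length so that Y[k] = X[k] * H[k].
n_fft = 512
c_genuine = spectral_root_cepstrum(genuine, 0.5, n_fft)
c_response = spectral_root_cepstrum(impulse_response, 0.5, n_fft)
c_replay = spectral_root_cepstrum(replayed, 0.5, n_fft)

# |Y|^gamma = |X|^gamma * |H|^gamma, so the root cepstra convolve.
spread_matches = np.allclose(c_replay, circular_convolve(c_genuine, c_response))
print("root cepstra convolve:", spread_matches)
```

With the log cepstrum, the same impulse response would instead appear as an additive term; the convolution relation above is one way to read the "spreading" described in the abstract.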

Files in this item

Name: 201611027_Prasad Anil Tapkir.pdf
Size: 2.775Mb
Format: PDF
Description: 201611027

This item appears in the following Collection(s)

M Tech Dissertations [923]