dc.contributor.advisor: Patil, Hemant A.
dc.contributor.author: Tapkir, Prasad Anil
dc.date.accessioned: 2019-03-19T09:30:54Z
dc.date.available: 2019-03-19T09:30:54Z
dc.date.issued: 2018
dc.identifier.citation: Tapkir, Prasad Anil (2018). Replay Spoof Detection using Handcrafted Features. Dhirubhai Ambani Institute of Information and Communication Technology, xiii, 68 p. (Acc. No: T00715)
dc.identifier.uri: http://drsr.daiict.ac.in//handle/123456789/749
dc.description.abstract: In the past few years, there has been noteworthy demand for Automatic Speaker Verification (ASV) systems in numerous applications. The increased use of ASV systems for voice biometrics comes with the threat of spoofing attacks. ASV systems are vulnerable to five types of spoofing attacks, namely, impersonation, Voice Conversion (VC), Speech Synthesis (SS), twins, and replay. Among these, replay poses a greater threat to ASV systems than any other spoofing attack, as it requires neither specific expertise nor sophisticated equipment. Replay attacks require little effort and are the most accessible attacks. Replay speech can be modeled as a convolution of the genuine speech with the impulse responses of the microphone, multimedia speaker, recording environment, and playback environment. Replay attacks become harder to detect when high-quality intermediate devices and clean recording and playback environments are used. In this thesis, we propose three novel handcrafted cepstral feature sets for the replay spoof detection task, namely, Magnitude-based Spectral Root Cepstral Coefficients (MSRCC), Phase-based Spectral Root Cepstral Coefficients (PSRCC), and Empirical Mode Decomposition Cepstral Coefficients (EMDCC). In addition, we explore the significance of the Teager Energy Operator (TEO) phase feature for replay spoof detection. The EMDCC feature set replaces the filterbank structure with the Empirical Mode Decomposition (EMD) technique to obtain the subband signals. EMD yields more subbands for a replay speech signal than for a genuine speech signal. The MSRCC and PSRCC feature sets are extracted using the spectral root cepstrum of the speech signal. The spectral root cepstrum spreads the effect of the additional impulse responses in replay speech over the entire quefrency domain. The TEO phase feature set provides high-security information when fused with magnitude-based features, such as Mel Frequency Cepstral Coefficients (MFCC). The experiments are performed on the ASVspoof 2017 challenge database, and all the systems are implemented using a Gaussian Mixture Model (GMM) as the classifier. All the feature sets perform better than the ASVspoof 2017 challenge baseline Constant Q Cepstral Coefficients (CQCC) system.
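The convolution model of replay speech and the spectral root cepstrum mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names `simulate_replay` and `spectral_root_cepstrum`, the toy impulse responses, the random stand-in for a speech frame, the root value `gamma=0.5`, and the FFT size. The cepstrum is computed in its common magnitude form (inverse DFT of the magnitude spectrum raised to the power gamma); the exact MSRCC and PSRCC pipelines may differ.

```python
import numpy as np


def simulate_replay(genuine, impulse_responses):
    """Convolve genuine speech with each impulse response in turn."""
    replayed = np.asarray(genuine, dtype=float)
    for h in impulse_responses:
        replayed = np.convolve(replayed, h)
    return replayed


def spectral_root_cepstrum(frame, gamma=0.5, n_fft=512):
    """Inverse DFT of the magnitude spectrum raised to the power gamma."""
    spectrum = np.fft.rfft(frame, n=n_fft)   # zero-pads the frame to n_fft
    rooted = np.abs(spectrum) ** gamma       # spectral root replaces the log
    return np.fft.irfft(rooted, n=n_fft)     # real, symmetric in quefrency


rng = np.random.default_rng(0)
genuine = rng.standard_normal(400)       # stand-in for one speech frame
h_device = np.array([1.0, 0.5, 0.25])    # hypothetical device response
h_room = np.array([1.0, 0.0, 0.3])       # hypothetical room response

replayed = simulate_replay(genuine, [h_device, h_room])
cep_genuine = spectral_root_cepstrum(genuine)
cep_replayed = spectral_root_cepstrum(replayed)
```

Comparing `cep_genuine` with `cep_replayed` gives a rough view of how the extra impulse responses alter the cepstrum; a real system would apply this frame by frame to recorded speech and feed the coefficients to a classifier such as a GMM.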
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject: Speech synthesis
dc.subject: Voice conversion
dc.subject: Artificial intelligence
dc.classification.ddc: 006.3 TAP
dc.title: Replay spoof detection using handcrafted features
dc.type: Dissertation
dc.degree: M. Tech
dc.student.id: 201611027
dc.accession.number: T00715

