Replay spoof detection using handcrafted features
Abstract
In the past few years, demand for Automatic Speaker Verification (ASV) systems
has grown noticeably across numerous applications. The increased use of ASV
systems for voice biometrics brings with it the threat of spoofing attacks.
ASV systems are vulnerable to five types of spoofing attacks, namely
impersonation, Voice Conversion (VC), Speech Synthesis (SS), twins, and replay.
Among these, replay poses a greater threat to ASV systems than any other
spoofing attack, as it requires neither specific expertise nor sophisticated
equipment. Replay attacks require little effort and are the most accessible
attacks.
The replay speech can be modeled as a convolution of the genuine speech with
the impulse responses of the microphone, multimedia speaker, recording
environment, and playback environment. Replay attacks become harder to detect
when the intermediate devices are of high quality and the recording and
playback environments are clean.
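
The convolution model above can be illustrated with a minimal sketch. The
function names, the toy exponentially decaying impulse responses, and the
random placeholder signal are assumptions made for illustration only; they are
not measured responses and are not taken from the thesis.

```python
import numpy as np

def simulate_replay(genuine, impulse_responses):
    """Convolve genuine speech with each impulse response in the replay chain."""
    replayed = np.asarray(genuine, dtype=float)
    for h in impulse_responses:
        replayed = np.convolve(replayed, h)
    return replayed

def toy_impulse_response(length, decay):
    """Hypothetical stand-in: exponentially decaying taps, not a measured response."""
    return decay ** np.arange(length)

rng = np.random.default_rng(0)
genuine = rng.standard_normal(16000)  # placeholder for 1 s of speech at 16 kHz

# One assumed impulse response per element of the replay chain.
chain = [
    toy_impulse_response(8, 0.5),   # recording microphone
    toy_impulse_response(32, 0.8),  # recording environment
    toy_impulse_response(8, 0.6),   # multimedia (playback) speaker
    toy_impulse_response(32, 0.7),  # playback environment
]
replayed = simulate_replay(genuine, chain)
```

Because convolution is commutative, the order of the four responses does not
change the result. If every response approaches a unit impulse, the replayed
signal approaches the genuine one, which is consistent with the observation
that high-quality devices and clean environments make detection harder.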
In this thesis, we propose three novel handcrafted cepstral feature sets
for the replay spoof detection task, namely Magnitude-based Spectral Root
Cepstral Coefficients (MSRCC), Phase-based Spectral Root Cepstral Coefficients
(PSRCC), and Empirical Mode Decomposition Cepstral Coefficients (EMDCC). In
addition, we explore the significance of the Teager Energy Operator (TEO) phase
feature for replay spoof detection.
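
For readers unfamiliar with the TEO, the sketch below shows the standard
discrete-time operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. It covers the
operator only: how the thesis derives a phase feature from the TEO output is
not stated in this abstract, so that step is left out. The function name and
the test tone are assumptions made for illustration.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # The first and last samples have no two-sided neighbours, so they are dropped.
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Illustrative input: a pure tone with assumed amplitude and digital frequency.
amplitude, omega = 2.0, 0.3
n = np.arange(200)
tone = amplitude * np.cos(omega * n + 0.1)
energy = teager_energy(tone)
```

For a pure tone the operator output is constant and equal to
amplitude^2 * sin(omega)^2, so it tracks both the amplitude and the frequency
of the signal using only three neighbouring samples.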
The EMDCC feature set replaces the filterbank structure with the Empirical Mode
Decomposition (EMD) technique to obtain the subband signals. EMD yields more
subbands for the replay speech signal than for the genuine speech signal. The
MSRCC and PSRCC feature sets are extracted using the spectral root cepstrum of
the speech signal. The spectral root cepstrum spreads the effect of the
additional impulse responses in replay speech over the entire quefrency
domain. The TEO phase feature set provides high-security information when
fused with magnitude-based features, such as Mel Frequency Cepstral
Coefficients (MFCC). The experiments are performed on the ASVspoof 2017
Challenge database, and all the systems are implemented using a Gaussian
Mixture Model (GMM) as the classifier. All the proposed feature sets perform
better than the ASVspoof 2017 Challenge baseline Constant Q Cepstral
Coefficients (CQCC) system.
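
As a rough illustration of the spectral root cepstrum mentioned above, the
sketch below raises the magnitude spectrum of one frame to a power gamma and
takes the inverse transform, in place of the logarithm used in the conventional
cepstrum. The value of gamma, the frame length, the FFT size, and the function
name are assumptions; the complete MSRCC and PSRCC pipelines of the thesis are
not reproduced here.

```python
import numpy as np

def root_cepstrum(frame, gamma=0.5, n_fft=512):
    """Magnitude-based spectral root cepstrum of one frame (illustrative)."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed, n=n_fft)
    # Root compression |X|^gamma replaces the log|X| of the conventional cepstrum.
    rooted = np.abs(spectrum) ** gamma
    # Inverse transform back to the quefrency domain.
    return np.fft.irfft(rooted, n=n_fft)

rng = np.random.default_rng(0)
frame = rng.standard_normal(400)  # placeholder for a 25 ms frame at 16 kHz
cepstrum = root_cepstrum(frame)
```

Since the rooted spectrum is real-valued, the resulting sequence is real and
symmetric about quefrency zero, so only the first half carries distinct
values.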