Analysis of voice biometric attacks: detection of synthetic vs natural speech
Abstract
Improvements in text-to-speech (TTS) synthesis also pose the problem of biometric attacks on speaker verification systems. In this context, it is necessary to analyse the false acceptance rate of these systems for impostors using artificial speech, and to incorporate features that make the systems robust to such attacks. The aim of this study is to understand the aspects that distinguish natural from synthetic speech and hence to extract appropriate features for telling them apart. The study focuses on those aspects that give human speech its naturalness and that present-day TTS systems fail to capture. Three such aspects, viz., Fourier transform phase, nonlinearity and speech prosody, are analysed. The performance of each feature is evaluated, and a comparative study of the features is presented. The results provide an evaluation of the naturalness of the synthetic speech used and yield features that improve robustness against biometric attacks in speaker verification systems.
Collections
- M Tech Dissertations [923]