Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/615
Title: Novel nonlinear prediction-based features for spoofed speech detection
Authors: Patil, Hemant A.
Bhavsar, Himanshu N.
Keywords: Automatic Speaker Verification Systems
Spoofed Speech
Linear Prediction
Gaussian Mixture Model
Issue Date: 2016
Publisher: Dhirubhai Ambani Institute of Information and Communication Technology
Citation: Bhavsar, Himanshu N. (2016). Novel nonlinear prediction-based features for spoofed speech detection. Dhirubhai Ambani Institute of Information and Communication Technology, ix, 52p. (Acc.No: T00578)
Abstract: Automatic Speaker Verification (ASV) systems are prone to various spoofing attacks. Spoofing is a technique in which a fake speech signal is presented to the ASV system to gain access to that system without the permission of an authorized person. There are four types of spoofing attack, namely, impersonation, replay, speech synthesis (SS), and voice conversion (VC). In an impersonation attack, the source speaker alters their voice (i.e., mimics the target speaker); a replay attack uses recorded speech of the target speaker; speech synthesis can generate spoofed speech from any arbitrary text; and VC converts the voice of a source speaker into that of the target speaker. SS and VC are more practical and pose a greater threat to ASV systems; hence, this thesis concentrates on SS and VC only. For the detection of spoofed speech, we develop various feature-based countermeasures. After analyzing various plots and histograms of these features, we observed that these countermeasures might work well for classifying whether the features come from natural or spoofed speech. For that purpose, we use a Gaussian Mixture Model (GMM)-based classifier. We build two GMMs, one for natural speech and one for spoofed speech. At verification time, when an unknown speech signal is given as input to the ASV system, features are first extracted, and we then compute a likelihood score from the two GMMs, which indicates how likely the features are to have come from each model. If this score is greater than some threshold value, the signal is classified as natural; otherwise, it is detected as spoofed speech. In this work, we propose a linear prediction-nonlinear prediction (LP-NLP)-based countermeasure for the detection of spoofed speech. For the experiments reported in this thesis, we used the ASVspoof 2015 challenge database and the Blizzard Challenge 2012 and 2014 databases. System performance is measured using the Detection Error Tradeoff (DET) curve and the Equal Error Rate (EER).
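The two-model scoring and the EER measurement described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes diagonal-covariance GMMs whose parameters are already trained, it assumes the two likelihoods are combined as an average per-frame log-likelihood ratio (the abstract only says a likelihood score is compared with a threshold), and it approximates the EER by sweeping the threshold over the observed scores. All function names and the (weights, means, variances) tuple layout are hypothetical.

```python
import math

def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # log-sum-exp over components, shifted by the maximum for numerical stability
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))

def llr_score(frames, natural_gmm, spoofed_gmm):
    """Assumed scoring rule: mean per-frame log-likelihood ratio, natural minus spoofed."""
    total = 0.0
    for frame in frames:
        total += gmm_log_likelihood(frame, *natural_gmm)
        total -= gmm_log_likelihood(frame, *spoofed_gmm)
    return total / len(frames)

def classify(frames, natural_gmm, spoofed_gmm, threshold=0.0):
    """Label an utterance 'natural' if its score exceeds the threshold, else 'spoofed'."""
    score = llr_score(frames, natural_gmm, spoofed_gmm)
    return "natural" if score > threshold else "spoofed"

def equal_error_rate(natural_scores, spoofed_scores):
    """Approximate EER: pick the observed threshold where the two error rates are closest."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(natural_scores) | set(spoofed_scores)):
        # natural utterances wrongly rejected (score not above threshold)
        frr = sum(s <= t for s in natural_scores) / len(natural_scores)
        # spoofed utterances wrongly accepted (score above threshold)
        far = sum(s > t for s in spoofed_scores) / len(spoofed_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer
```

Each model is passed as a (weights, means, variances) tuple, for example `([1.0], [[0.0]], [[1.0]])` for a single one-dimensional component. A DET curve could be obtained from the same threshold sweep by recording every (false acceptance, false rejection) pair instead of only the closest one.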
URI: http://drsr.daiict.ac.in/handle/123456789/615
Appears in Collections: M Tech Dissertations

Files in This Item:
File            Description         Size        Format
201411029.pdf   Restricted Access   794.95 kB   Adobe PDF

