Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge

Mohammad Adiban, Hossein Sameti, Saeedreza Shehnepoor

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)


Automatic Speaker Verification (ASV) authenticates individuals by analyzing their speech signals. Various synthetic approaches allow attackers to spoof ASV systems (ASVs), whether by imitating a voice or by reconstructing its features. Attackers target ASVs using four general techniques: impersonation, speech synthesis, voice conversion, and replay. Replay is considered a common and high-potential spoofing tool, since replay attacks are more accessible and require no technical knowledge on the adversary's part. In this study, we introduce a novel replay spoofing countermeasure for ASVs. We feed Constant Q Cepstral Coefficient (CQCC) features into an autoencoder to obtain more informative features and to take the noise information of spoofed utterances into account for discrimination. Finally, different configurations of the Siamese network are used for classification, for the first time in this context. Experiments on the ASVspoof 2019 challenge dataset, with Equal Error Rate (EER) and Tandem Detection Cost Function (t-DCF) as evaluation metrics, show that the proposed system improves on the baseline by 10.73% and 0.2344 in terms of EER and t-DCF, respectively.
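
For readers unfamiliar with the EER metric named in the abstract, the following is a minimal sketch of how an equal error rate can be approximated from countermeasure scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the simple threshold sweep are all assumptions made for illustration. The EER is the operating point at which the false acceptance rate (spoofed utterances accepted) equals the false rejection rate (bona fide utterances rejected).

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a decision threshold.

    Assumption: a higher score means the utterance is more likely bona fide,
    and an utterance is accepted when its score is >= the threshold.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: spoofed utterances scored at or above t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: bona fide utterances scored below t.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of FAR and FRR where they are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With perfectly separated scores the sketch returns 0.0. On real score files the two rates rarely coincide exactly at an observed threshold, so the value is an approximation; official challenge tooling may interpolate differently.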

Original language: English
Article number: 101105
Journal: Computer Speech and Language
Publication status: Published - Nov 2020

