Theses and Dissertations

Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/1

Search Results

Now showing 1 - 2 of 2
  • Item (Open Access)
    Voice Liveness Detection Using Modified Group Delay Functions
    (2021) Singh, Shrishti; Patil, Hemant A.
    An Automatic Speaker Verification (ASV) system is a key element of a voice biometric system. With recent advances in technology, spoofing attacks have also progressed drastically. In a spoofing attack, the attacker tries to gain access to the biometric system by impersonating the speech of a genuine speaker. A successful attack can have serious consequences, such as theft of information, fabrication of false credentials for use in further attacks, and unauthorized access. In response to these attacks, researchers have sought either to build more advanced ASV methods or to develop effective spoofing countermeasures, and efforts have been made to develop algorithms that detect spoofing attacks effectively. Recently, a new method known as Voice Liveness Detection (VLD) has been developed and has emerged as a successful technique for detecting spoofing attacks on ASV systems. A VLD system verifies whether the input speech signal comes from a live speaker or from a recording by detecting pop noise in the speech. Pop noise emerges as a burst and is captured by the microphone as a breathing sound, which loudspeakers reproduce poorly. This noise therefore acts as a cue for liveness in the input signal of an authentication system, and VLD uses it to distinguish between live and played-back speech. In this thesis, the significance of the phase information in the signal is explored. The dynamics of the phase variation in the speech signal are represented using the group delay function, and the feature sets developed using the group delay function are used for VLD. The group delay function has limitations for nonminimum-phase signals, and those limitations are overcome by the Modified Group Delay (MGD) function. Two different approaches to developing the modified group delay function are proposed, namely, Spectral Root (SR) and Linear Prediction (LP) smoothing-based group delay functions. The performance of the proposed methods is evaluated on the POCO (POp noise COrpus) dataset. Moreover, Spectral Root Cepstral Coefficients (SRCC) have also been explored for the VLD task. In addition, the thesis proposes a voice privacy system based on signal processing techniques as a countermeasure to protect ASV systems from various attacks.
  • Item (Open Access)
    Handcrafted Feature Design for Voice Liveness Detection and Countermeasures for Spoof Attacks
    (2021) Khoria, Kuldeep; Patil, Hemant A.
    Automatic Speaker Verification (ASV) systems are highly vulnerable to spoofing attacks. A spoofing attack occurs when an impostor tries to manipulate the biometric system and gain access to it by unfair means. ASV systems are vulnerable to several kinds of spoofing attacks, namely, Speech Synthesis (SS), Voice Conversion (VC), Impersonation, Twins, and Replay. A replay attack on a voice biometric system can be mounted by surreptitiously recording the genuine speech signal and then presenting it to the ASV system as if it were authentic. Among all the spoofing attacks, the replay attack is the simplest to execute (or mount) but hard to detect. In particular, a replay attack carried out with a high-quality recording and playback device is very hard to detect because the replayed speech is very similar to that of the genuine speaker. Given this vulnerability of ASV systems to replay attacks, this thesis addresses the voice liveness detection (VLD) task, which verifies whether the speaker is live in front of the ASV system or the speaker’s voice is being replayed. In addition, this thesis attempts to develop effective countermeasures to protect these systems from Speech Synthesis (SS) and Voice Conversion (VC) spoofing attacks. Two novel feature sets are developed for the VLD task as countermeasures against replay attacks, namely, Constant-Q Transform (CQT) and Spectral Root Cepstral Coefficients (SRCC). The performance of the proposed feature sets is evaluated using the recently released POp noise COrpus (POCO). A Short-Time Fourier Transform (STFT)-based feature set is used as the baseline for comparison. Further, a novel feature set, namely, Cochlear Filter Cepstral Coefficient-Instantaneous Frequency using Energy Separation Algorithm (CFCCIF-ESA), is proposed for the detection of SS- and VC-based spoofing attacks. The experiments to evaluate the performance of the CFCCIF-ESA feature set are performed on the ASVspoof 2015 dataset. The results are further compared with those of the baseline Constant Q Cepstral Coefficients (CQCC), Linear Frequency Cepstral Coefficients (LFCC), and state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) feature sets.
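
The first abstract above represents phase dynamics with the group delay function. As a rough illustration only, the sketch below computes the plain (unmodified) group delay of a single frame using the standard identity τ(ω) = (X_R·Y_R + X_I·Y_I) / |X(ω)|², where X is the spectrum of x[n] and Y is the spectrum of n·x[n]. The function name `group_delay`, the FFT size, and the `eps` guard are assumptions made for this example. It does not reproduce the SR or LP smoothing-based modified group delay functions proposed in the thesis.

```python
import numpy as np

def group_delay(x, n_fft=512, eps=1e-12):
    """Plain group delay of one frame, in samples (hypothetical helper).

    Uses tau(w) = (X_R*Y_R + X_I*Y_I) / |X(w)|^2, with X = FFT{x[n]}
    and Y = FFT{n * x[n]}, which avoids explicit phase unwrapping.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)        # spectrum of the frame
    Y = np.fft.rfft(n * x, n_fft)    # spectrum of the time-weighted frame
    power = X.real ** 2 + X.imag ** 2
    # eps (an assumption of this sketch) guards against division by zero
    # near spectral nulls, where the plain group delay becomes spiky.
    return (X.real * Y.real + X.imag * Y.imag) / (power + eps)

# Sanity check: an impulse delayed by 5 samples has a constant
# group delay of 5 samples at every frequency bin.
frame = np.zeros(64)
frame[5] = 1.0
tau = group_delay(frame)
```

The denominator |X(ω)|² approaches zero near spectral nulls, which makes the plain group delay spiky for nonminimum-phase signals. That is the limitation the abstract says the Modified Group Delay (MGD) function is meant to overcome.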