Representation learning for speech technology applications using deep learning methods

Files

201411038.pdf (4.78 MB)

Date

2016

Authors

Soni, Meetkumar H.

Publisher

Dhirubhai Ambani Institute of Information and Communication Technology

Abstract

In the context of speech, deep learning architectures such as the autoencoder and the Convolutional Neural Network (CNN) are being used in the field of Automatic Speech Recognition (ASR). However, they are seldom used as representation learning architectures to extract features; rather, they are used to enhance the performance of handcrafted features or as a strong back-end for performance improvement. In particular, autoencoder features, which are proven to have excellent reconstruction ability for the speech spectrum, are still unexplored in many speech technology applications. The lack of use of autoencoder features motivated the authors to explore their usefulness in various speech technology applications, such as non-intrusive objective quality assessment of noise-suppressed speech, detection of spoofed speech as a front-end of Automatic Speaker Verification (ASV) systems, and ASR systems as primary acoustic features. Moreover, the reasons for the unpopularity of autoencoder features motivated the authors to overcome their limitations by modifying the architecture of the autoencoder and developing a new one, namely, the subband autoencoder (SBAE). The proposed SBAE architecture is inspired by the Human Auditory System (HAS) and extracts more interpretable features from the speech spectrum than autoencoder features in a nonlinear, unsupervised manner.

The performance of autoencoder features and SBAE features is compared with the state-of-the-art handcrafted features used in each particular application. Results of experiments on quality assessment of noise-suppressed speech suggest that autoencoder features and SBAE features perform significantly better and more robustly than state-of-the-art Mel filterbank energies (FBE), with SBAE features giving the best performance. For the spoof detection task, SBAE features gave better overall performance than state-of-the-art Mel-Frequency Cepstral Coefficients (MFCC). However, the best performance was achieved using score-level fusion of both features. Autoencoder features performed poorly in the spoof detection task. In ASR experiments, FBE performed better than autoencoder features and SBAE features. However, when system-level combination was done, SBAE features improved the performance of FBE significantly, which suggests the complementary nature of the information captured by both features. System combination of SBAE features and FBE gave better performance than system combination of FBE and autoencoder features. For the task of quality assessment of synthesized speech, SBAE features performed significantly better than FBE. The results of these experiments suggest that the proposed SBAE architecture improves over the traditional autoencoder in terms of the usefulness of the extracted features. The nature of the features extracted by SBAE was complementary to that of MFCC or FBE due to the nonlinear processing involved in their extraction.

Keywords

Learning Method, Artificial Neural Network, Subband Autoencoder, Natural and Spoofed Speech

Citation

Soni, Meetkumar H. (2016). Representation learning for speech technology applications using deep learning methods. Dhirubhai Ambani Institute of Information and Communication Technology, xiii, 97p. (Acc.No: T00587)

URI

http://ir.daiict.ac.in/handle/123456789/624

Collections

M Tech Dissertations
