Vowel landmark detection for speech recognition

Files

201211049.pdf (1.94 MB)

Date

2014

Authors

Undhad, Ankur G.

Publisher

Dhirubhai Ambani Institute of Information and Communication Technology

Abstract

Landmarks are time instants in a speech utterance that mark important events such as vowels, glides and consonants. This thesis proposes a novel Vowel Landmark Detection (VLD) algorithm that locates vowel landmarks and hence the nucleus of each vowel. VLD has potential applications in Automatic Speech Recognition (ASR) and Automatic Phonetic Segmentation (APS). The proposed method uses speech source information to detect vowel landmarks, which are points of high sonority. The excitation peaks in the Hilbert envelope (HE) of the Teager energy profile of the zero frequency filtered (ZFF) speech signal can be interpreted as a perceptually significant feature that contributes to loudness. The performance of the proposed VLD method is compared with that of an existing loudness-based method, and results are reported on the TIMIT and NTIMIT corpora. The proposed algorithm achieves a detection rate of 85.48 % on TIMIT and 83.97 % on NTIMIT, which is 5.06 % and 7.51 % higher, respectively, than the existing loudness-based method. In addition, this thesis proposes using the VLD algorithm for two low-resource Indian languages, Gujarati and Marathi. Results are reported on speech recorded in three modes (read, spontaneous and lecture), with manual phonetic transcription by transcribers serving as ground truth for both languages. For Gujarati, the proposed algorithm achieves detection rates of 78.92 %, 76.40 % and 73.89 % in lecture, spontaneous and read mode, respectively, which is 8.79 %, 7.23 % and 7.17 % higher than the loudness-based method. For Marathi, it achieves detection rates of 76.93 %, 75.16 % and 73.93 % in lecture, spontaneous and read mode, respectively, which is 7.52 %, 7.43 % and 7.82 % higher than the loudness-based method. The proposed algorithm is also shown to be robust against signal degradation such as white noise.
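The processing chain named in the abstract (ZFF, then Teager energy, then Hilbert envelope, then peak picking) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the 10 ms trend-removal window, the three trend-removal passes, the 50 ms smoothing, the 100 ms minimum gap, the 10 % relative threshold, all function names and the synthetic test signal are assumptions chosen only for illustration.

```python
import numpy as np

def zero_frequency_filter(x, fs, win_ms=10.0, passes=3):
    """ZFF sketch: difference, integrate four times, then remove the trend."""
    y = np.diff(np.asarray(x, dtype=np.float64), prepend=x[0])
    for _ in range(4):                      # two cascaded 0 Hz resonators
        y = np.cumsum(y)
    half = int(round(win_ms * 1e-3 * fs / 2))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):                 # subtract the local mean (assumed window)
        y = y[half:-half] - np.convolve(y, kernel, mode="valid")
    return np.pad(y, passes * half)         # zero the unreliable edges

def teager_energy(y):
    """Discrete Teager energy: y[n]^2 - y[n-1] * y[n+1]."""
    psi = np.zeros_like(y)
    psi[1:-1] = y[1:-1] ** 2 - y[:-2] * y[2:]
    return psi

def hilbert_envelope(x):
    """Magnitude of the analytic signal, built with the FFT."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def pick_landmarks(evidence, fs, smooth_ms=50.0, min_gap_ms=100.0, rel_threshold=0.1):
    """Smooth the evidence and keep well-separated local maxima (assumed rule)."""
    w = max(1, int(smooth_ms * 1e-3 * fs))
    contour = np.convolve(evidence, np.ones(w) / w, mode="same")
    is_peak = (contour[1:-1] > contour[:-2]) & (contour[1:-1] >= contour[2:])
    cand = np.flatnonzero(is_peak) + 1
    cand = cand[contour[cand] >= rel_threshold * contour.max()]
    min_gap = int(min_gap_ms * 1e-3 * fs)
    kept = []
    for i in cand[np.argsort(contour[cand])[::-1]]:   # strongest first
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(int(i))
    return sorted(kept), contour

def detect_vowel_landmarks(x, fs):
    zff = zero_frequency_filter(x, fs)
    return pick_landmarks(hilbert_envelope(teager_energy(zff)), fs)

# Synthetic check (hypothetical data): a 120 Hz harmonic burst between 0.2 s
# and 0.4 s stands in for a vowel, on top of low-level white noise.
fs = 8000
t = np.arange(int(0.6 * fs)) / fs
x = 1e-3 * np.random.default_rng(0).standard_normal(t.size)
burst = (t >= 0.2) & (t < 0.4)
x[burst] += np.hanning(burst.sum()) * sum(
    np.sin(2 * np.pi * 120 * k * t[burst]) / k for k in (1, 2, 3))
landmarks, contour = detect_vowel_landmarks(x, fs)
print([round(i / fs, 3) for i in landmarks])   # landmark times in seconds
```

On this toy signal the strongest landmark should fall inside the burst. Real speech would need the thresholds and window lengths tuned, and the thesis may use different values and a different peak-picking rule.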
The second part of the thesis addresses recognition of the detected vowel landmarks. A formant-based technique is used to recognize the detected vowels, and results are reported on the phonetically transcribed TIMIT corpus. The recognition rate is 32.16 % on the correctly detected vowels (i.e., out of 78374 vowels, 66994 are detected correctly, and of those 21545 are recognized). The proposed method is very fast and requires no training.
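The counts quoted above can be checked with a few lines of arithmetic. The variable names are illustrative, and the end-to-end figure (recognized vowels over all vowels) is derived here from the quoted counts; it is not stated in the abstract.

```python
total_vowels = 78374   # vowels in the transcribed TIMIT data
detected = 66994       # vowels detected correctly
recognized = 21545     # detected vowels that were also recognized

detection_rate = 100 * detected / total_vowels      # 85.48 %
recognition_rate = 100 * recognized / detected      # 32.16 %
end_to_end_rate = 100 * recognized / total_vowels   # 27.49 % (derived)

print(f"{detection_rate:.2f} {recognition_rate:.2f} {end_to_end_rate:.2f}")
# prints "85.48 32.16 27.49"
```

The first ratio reproduces the 85.48 % TIMIT detection rate reported earlier in the abstract, so the two parts of the thesis are consistent on that figure.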

Keywords

Speech recognition, Speech recognition landmark, Vowel landmark detection

Citation

Undhad, Ankur G. (2014). Vowel landmark detection for speech recognition. Dhirubhai Ambani Institute of Information and Communication Technology, xviii, 89 p. (Acc.No: T00477)

URI

http://ir.daiict.ac.in/handle/123456789/514

Collections

M Tech Dissertations
