Features for Speech Emotion Recognition

Files

202111065.pdf (12.37 MB)

Date

2023

Authors

Uthiraa, S.

Publisher

Dhirubhai Ambani Institute of Information and Communication Technology

Abstract

Speech is the easiest, most effective, and most natural way to communicate, and its emotional aspect makes interpersonal communication effective. As technological advancements continue to proliferate, humans increasingly depend on machines, making it imperative to establish efficient methods for Speech Emotion Recognition (SER) to ensure effective human-machine interaction. This thesis focuses on understanding the acoustic characteristics of various emotions and their dependence on the culture and language used. It then proposes new feature sets derived from the Constant Q Transform, namely Constant Q Pitch Coefficients (CQPC) and Constant Q Harmonic Coefficients (CQHC), which capture high-resolution pitch and harmonic information, respectively. Further, this thesis focuses on less-explored excitation source-based features and proposes a novel Linear Frequency Residual Cepstral Coefficients (LFRCC) feature set for this purpose. Phase-based features, namely Modified Group Delay Cepstral Coefficients (MGDCC), are proposed to capture vocal tract and vocal fold information for emotion classification. The recently developed Automatic Speech Recognition (ASR) model, Whisper, is used to analyze cross-database SER. This thesis also extends the LFRCC idea to the infant cry classification problem. Lastly, a local API is developed for SER.

Keywords

Speech Emotion Recognition, Constant Q Pitch Coefficients, Constant Q Harmonic Coefficients, Linear Frequency Residual Cepstral Coefficients, Modified Group Delay Cepstral Coefficients, Whisper, GMM, CNN, ResNet, TDNN

Citation

Uthiraa, S. (2023). Features for Speech Emotion Recognition. Dhirubhai Ambani Institute of Information and Communication Technology. xiii, 109 p. (Acc. # T01139).

URI

http://ir.daiict.ac.in/handle/123456789/1198

Collections

M Tech Dissertations
