Publication: CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry
Abstract
Infant cry classification is an important research area that involves distinguishing between normal and pathological cries. Traditional feature sets, such as the Short-Time Fourier Transform (STFT) and Mel Frequency Cepstral Coefficients (MFCC), have shown limitations due to poor spectral resolution caused by quasi-periodic sampling in high pitch-source harmonics. To address this, we propose using Constant-Q Cepstral Coefficients (CQCC), which leverage geometrically spaced frequency bins to better represent the fundamental frequency (F0) and its harmonics, for infant cry classification. Two datasets, Baby Chilanto and In-House DA-IICT, were employed to evaluate the proposed feature set. We compared CQCC against state-of-the-art feature sets, such as MFCC and Linear Frequency Cepstral Coefficients (LFCC), using Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) classifiers with 10-fold cross-validation. The CQCC-GMM architecture achieved relatively better accuracy of 99.8% on the Baby Chilanto dataset and 98.24% on the In-House DA-IICT dataset. This work demonstrates the effectiveness of CQCC's form-invariance over traditional STFT-based spectrograms. Additionally, it explores parameter tuning and the impact of feature vector dimensions. The study also presents cross-database and combined-dataset scenarios, yielding an overall performance improvement of 1.59%. CQCC's robustness was also evaluated under various signal degradation conditions, including additive babble noise at different Signal-to-Noise Ratios (SNRs). Performance was further compared with other feature sets using statistical measures, including F1-score and J-statistic, as well as latency analysis for practical deployment. Lastly, the CQCC results were compared with existing studies on the Baby Chilanto dataset.
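The geometric bin spacing mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard constant-Q definitions (center frequencies f_k = f_min * 2^(k/B) and Q = 1 / (2^(1/B) - 1)); the function names and all numeric values (f_min, bins per octave, sample rate, FFT size) are hypothetical choices for illustration and are not the configuration reported in the paper.

```python
import numpy as np


def cqt_center_frequencies(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies: f_k = f_min * 2**(k / B)."""
    k = np.arange(n_bins)
    return f_min * 2.0 ** (k / bins_per_octave)


def q_factor(bins_per_octave):
    """Constant ratio of center frequency to bandwidth for B bins per octave."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def stft_center_frequencies(sample_rate, n_fft):
    """Linearly spaced STFT bin frequencies, shown for comparison."""
    return np.arange(n_fft // 2 + 1) * sample_rate / n_fft


if __name__ == "__main__":
    # Hypothetical settings, chosen only to make the contrast visible.
    cqt_freqs = cqt_center_frequencies(f_min=100.0, bins_per_octave=12, n_bins=48)
    stft_freqs = stft_center_frequencies(sample_rate=16000, n_fft=512)

    # CQT: the ratio between neighbouring bins is constant.
    print("CQT bin ratio:", cqt_freqs[1] / cqt_freqs[0])
    # STFT: the difference between neighbouring bins is constant.
    print("STFT bin spacing (Hz):", stft_freqs[1] - stft_freqs[0])
    print("Q factor:", q_factor(12))
```

With geometric spacing, the low-frequency region has narrower bins than a fixed-resolution STFT with the same number of bins, which is the property the abstract relates to the representation of F0 and its harmonics.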