CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry

Patil, Hemant; Kachhi, Aastha; Patil, Ankur T

dc.contributor.affiliation	DA-IICT, Gandhinagar
dc.contributor.author	Patil, Hemant
dc.contributor.author	Kachhi, Aastha
dc.contributor.author	Patil, Ankur T
dc.contributor.researcher	Patil, Ankur T (201621008)
dc.contributor.researcher	Kachhi, Aastha
dc.date.accessioned	2025-08-01T13:09:02Z
dc.date.issued	27-10-2023
dc.description.abstract	Infant cry classification is an important area of research that involves distinguishing between normal and pathological cries. Traditional feature sets, such as the Short-Time Fourier Transform (STFT) and Mel Frequency Cepstral Coefficients (MFCC), have shown limitations due to poor spectral resolution caused by quasi-periodic sampling in high pitch-source harmonics. To address this, we propose Constant-Q Cepstral Coefficients (CQCC) for infant cry classification; they leverage geometrically spaced frequency bins for an improved representation of the fundamental frequency (F0) and its harmonics. Two datasets, Baby Chilanto and In-House DA-IICT, were employed to evaluate the proposed feature set. We compared CQCC against state-of-the-art feature sets, such as MFCC and Linear Frequency Cepstral Coefficients (LFCC), using Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) classifiers with 10-fold cross-validation. The CQCC-GMM architecture achieved a relatively better accuracy of 99.8% on the Baby Chilanto dataset and 98.24% on the In-House DA-IICT dataset. This work demonstrates the effectiveness of CQCC's form-invariance over traditional STFT-based spectrograms. Additionally, it explores parameter tuning and the impact of feature vector dimensions. The study presents cross-database and combined-dataset scenarios, yielding an overall performance improvement of 1.59%. CQCC's robustness was also evaluated under various signal degradation conditions, including additive babble noise at different Signal-to-Noise Ratios (SNRs). The performance was further compared with that of other feature sets using statistical measures, including F1-score and J-statistics, and latency analysis for practical deployment. Lastly, the CQCC results were compared with existing studies on the Baby Chilanto dataset.
dc.format.extent	4713 - 4726
dc.identifier.citation	Patil, Hemant A., Aastha Kachhi, and Ankur T. Patil, "CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry," IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE, ISSN: 2329-9304, pp. 1-14, 27 Oct. 2023, doi: 10.1109/TASLP.2023.3325971
dc.identifier.doi	10.1109/TASLP.2023.3325971
dc.identifier.issn	2329-9304
dc.identifier.scopus	2-s2.0-85181556679
dc.identifier.uri	https://ir.daiict.ac.in/handle/dau.ir/1561
dc.identifier.wos	WOS:001346763000003
dc.language.iso	en
dc.publisher	IEEE
dc.relation.ispartofseries	Vol. 32; No.
dc.source	IEEE/ACM Transactions on Audio, Speech, and Language Processing
dc.source.uri	https://ieeexplore.ieee.org/document/10298803/
dc.title	CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry
dspace.entity.type	Publication
relation.isAuthorOfPublication	fdb7041b-280e-498b-b2ee-34f9bc351f4c
relation.isAuthorOfPublication.latestForDiscovery	fdb7041b-280e-498b-b2ee-34f9bc351f4c
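
Note on the abstract: the "geometrically spaced frequency bins" underlying the constant-Q transform can be illustrated with a minimal sketch. This is not the authors' implementation; the function names and the values of f_min, bins_per_octave, and n_bins are hypothetical choices for illustration and are not taken from the paper. Center frequencies follow f_k = f_min * 2^(k/B), so neighbouring bins have a fixed ratio and the quality factor Q = f_k / (f_{k+1} - f_k) is the same for every bin.

```python
def cq_center_frequencies(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies: f_k = f_min * 2**(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Constant quality factor implied by B bins per octave."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


# Hypothetical settings, chosen only for illustration (not the paper's).
freqs = cq_center_frequencies(f_min=32.7, bins_per_octave=12, n_bins=25)

# Ratio between neighbouring bins is constant (geometric spacing).
ratios = [freqs[k + 1] / freqs[k] for k in range(len(freqs) - 1)]

# Center frequency divided by bin spacing is constant (constant Q).
local_q = [freqs[k] / (freqs[k + 1] - freqs[k]) for k in range(len(freqs) - 1)]

for k in (0, 12, 24):
    print(f"bin {k:2d}: {freqs[k]:8.2f} Hz")
print(f"Q for 12 bins per octave: {q_factor(12):.3f}")
```

Because the bin spacing grows with frequency, low-frequency bins are narrow and high-frequency bins are wide, in contrast to the uniform bin spacing of an STFT.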

Collections

Journal Article
