Publication:
CONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction

dc.contributor.affiliationDA-IICT, Gandhinagar
dc.contributor.authorShah, Dhwanil
dc.contributor.authorShah, Krish
dc.contributor.authorJagani, Manan
dc.contributor.authorShah, Agam
dc.contributor.authorChaudhury, Bhaskar
dc.contributor.researcherShah, Dhwanil (201901450)
dc.contributor.researcherShah, Krish (201901465)
dc.contributor.researcherJagani, Manan (201901295)
dc.date.accessioned2025-08-01T13:09:36Z
dc.date.issued18-03-2024
dc.description.abstractThe COVID-19 Numerical Claims Open Research Dataset (CONCORD) is a comprehensive, open-source dataset of numerical claims extracted from academic papers on COVID-19 research. To extract numerical claims, a weak-supervision based model is employed, chosen for its white-box, explainable nature and its advantages over transformer-based models in terms of computational and manual annotation costs. Labelling functions are used to programmatically generate labels, incorporating techniques like pattern matching, external knowledge bases, phrase matching, and third-party models. An aggregator function reconciles overlapping or contradictory labels. The weak-supervision model is evaluated against established baselines and transformer-based models, achieving a weighted F1-score of 0.932 and a micro F1-score of 0.930 in extracting numerical claims. While the weak-supervision model outperforms the baseline models, transformer-based models achieve comparable results. CONCORD, comprising around 200,000 numerical claims extracted from over 57,000 COVID-19 research articles, serves as a valuable tool for knowledge discovery and for understanding the chronological developments in various research areas associated with COVID-19. In conclusion, CONCORD, alongside the weak-supervision methodology, offers researchers a valuable resource for advancing COVID-19 research while highlighting the significant potential of weak-supervision models within the broader biomedical domain.
dc.identifier.citationDhwanil Shah, Krish Shah, Manan Jagani, Agam Shah, and Bhaskar Chaudhury, "CONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction," Research Square, ISSN: 2693-5015, 18 Mar. 2024, doi: 10.21203/rs.3.rs-4076902/v1. [Preprint]
dc.identifier.doi10.21203/rs.3.rs-4076902/v1
dc.identifier.issn2693-5015
dc.identifier.scopus2-s2.0-85204292352
dc.identifier.urihttps://ir.daiict.ac.in/handle/dau.ir/2066
dc.identifier.wosWOS:001314794600001
dc.language.isoen
dc.publisherResearch Square
dc.sourceResearch Square
dc.source.urihttps://www.researchsquare.com/article/rs-4076902/v1
dc.titleCONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction
dspace.entity.typePublication
relation.isAuthorOfPublicationd0ffe8b6-980b-4a74-bb54-7408522e6da7
relation.isAuthorOfPublication.latestForDiscoveryd0ffe8b6-980b-4a74-bb54-7408522e6da7
