Publication: CONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction
dc.contributor.affiliation | DA-IICT, Gandhinagar | |
dc.contributor.author | Shah, Dhwanil | |
dc.contributor.author | Shah, Krish | |
dc.contributor.author | Jagani, Manan | |
dc.contributor.author | Shah, Agam | |
dc.contributor.author | Chaudhury, Bhaskar | |
dc.contributor.researcher | Shah, Dhwanil (201901450) | |
dc.contributor.researcher | Shah, Krish (201901465) | |
dc.contributor.researcher | Jagani, Manan (201901295) | |
dc.date.accessioned | 2025-08-01T13:09:36Z | |
dc.date.issued | 18-03-2024 | |
dc.description.abstract | The COVID-19 Numerical Claims Open Research Dataset (CONCORD) is a comprehensive, open-source dataset of numerical claims extracted from academic papers on COVID-19 research. To extract numerical claims, a weak-supervision-based model is employed, leveraging its white-box, explainable nature and its advantages over transformer-based models in terms of computational and manual annotation costs. Labelling functions are used to programmatically generate labels, incorporating techniques like pattern matching, external knowledge bases, phrase matching, and third-party models. An aggregator function reconciles overlapping or contradictory labels. The weak-supervision model is evaluated against established baselines and transformer-based models, achieving a weighted F1-score of 0.932 and a micro F1-score of 0.930 in extracting numerical claims. While the weak-supervision model showcases superior performance compared to baseline models, it is observed that transformer-based models achieve comparable results. CONCORD, comprising around 200,000 numerical claims extracted from over 57,000 COVID-19 research articles, serves as a valuable tool for knowledge discovery and understanding the chronological developments in various research areas associated with COVID-19. In conclusion, CONCORD, alongside the weak-supervision methodology, offers researchers a valuable resource, enhancing advancements in COVID-19 research while highlighting the significant potential of weak-supervision models within the broader biomedical domain. | |
dc.identifier.citation | Dhwanil Shah, Krish Shah, Manan Jagani, Agam Shah, and Bhaskar Chaudhury, "CONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction," Research Square, ISSN: 2693-5015, 18 Mar. 2024, doi: 10.21203/rs.3.rs-4076902/v1. [Preprint] | |
dc.identifier.doi | 10.21203/rs.3.rs-4076902/v1 | |
dc.identifier.issn | 2693-5015 | |
dc.identifier.scopus | 2-s2.0-85204292352 | |
dc.identifier.uri | https://ir.daiict.ac.in/handle/dau.ir/2066 | |
dc.identifier.wos | WOS:001314794600001 | |
dc.language.iso | en | |
dc.publisher | Research Square | |
dc.source | Research Square | |
dc.source.uri | https://www.researchsquare.com/article/rs-4076902/v1 | |
dc.title | CONCORD: Enhancing COVID-19 Research with Weak-Supervision based Numerical Claim Extraction | |
dspace.entity.type | Publication | |
relation.isAuthorOfPublication | d0ffe8b6-980b-4a74-bb54-7408522e6da7 | |
relation.isAuthorOfPublication.latestForDiscovery | d0ffe8b6-980b-4a74-bb54-7408522e6da7 |