Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model

Naseem, U; Lee, B; Khushi, M; Kim, J; Dunn, AG

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/27036

Title:	Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model
Authors:	Naseem, U; Lee, B; Khushi, M; Kim, J; Dunn, AG
Issue Date:	26-May-2022
Publisher:	Association for Computational Linguistics
Citation:	Naseem, U. et al. (2022) 'Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model', Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, Virtual, 26 May, pp. 22 - 31. doi: 10.18653/v1/2022.nlppower-1.3.
Abstract:	A user-generated text on social media enables health workers to keep track of information, identify possible outbreaks, forecast disease trends, monitor emergency cases, and ascertain disease awareness and response to official health correspondence. This exchange of health information on social media has been regarded as an attempt to enhance public health surveillance (PHS). Despite its potential, the technology is still in its early stages and is not ready for widespread application. Advancements in pretrained language models (PLMs) have facilitated the development of several domain-specific PLMs and a variety of downstream applications. However, there are no PLMs for social media tasks involving PHS. We present and release PHS-BERT, a transformer-based PLM, to identify tasks related to public health surveillance on social media. We compared and benchmarked the performance of PHS-BERT on 25 datasets from different social media platforms related to 7 different PHS tasks. Compared with existing PLMs that are mainly evaluated on limited tasks, PHS-BERT achieved state-of-the-art performance on all 25 tested datasets, showing that our PLM is robust and generalizable in the common PHS tasks. By making PHS-BERT available, we aim to facilitate the community to reduce the computational cost and introduce new baselines for future works across various PHS-related tasks.
Description:	Video: https://aclanthology.org/2022.nlppower-1.3.mp4
URI:	https://bura.brunel.ac.uk/handle/2438/27036
DOI:	https://doi.org/10.18653/v1/2022.nlppower-1.3
ISBN:	978-1-955917-47-6
Other Identifiers:	ORCID iD: Matloob Khushi https://orcid.org/0000-0001-7792-2327
Appears in Collections:	Dept of Computer Science Research Papers

Files in This Item:

File	Description	Size	Format
FullText.pdf	Copyright © 2022 Association for Computational Linguistics. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).	416.49 kB	Adobe PDF

This item is licensed under a Creative Commons License