CFP ProfNER shared task:
Identification of professions & occupations in Health-related Social Media (SMM4H at NAACL)
CFP SMM4H-SPANISH: ProfNER Shared Task (SMM4H - NAACL 2021)
https://temu.bsc.es/smm4h-spanish/
We are organizing the first shared task focused specifically on named entity recognition of professions & occupations in Spanish social media. In particular, we focus on Twitter data related to COVID-19 and lockdowns.
ProfNER is part of the Social Media Mining for Health Applications (#SMM4H) Shared Task 2021.
The ProfNER sub-tracks:
- Tweet binary classification: Participants must determine whether or not a tweet contains a mention of an occupation.
- NER offset detection and classification: Participants must find the beginning and end of each occupation mention and classify it into the corresponding category.
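To make the two sub-tracks concrete, here is a minimal sketch of the kind of output each one asks for. It is an illustration under stated assumptions, not the official data format or a baseline: the `Mention` structure, the placeholder category name `OCCUPATION`, the toy lexicon, and the example tweet are all hypothetical. See the ProfNER web for the actual categories and submission format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    start: int  # character offset where the mention begins (inclusive)
    end: int    # character offset where the mention ends (exclusive)
    label: str  # category name; "OCCUPATION" is a placeholder, not an official label


def find_mentions(text, lexicon, label="OCCUPATION"):
    """Toy gazetteer matcher: locate every occurrence of each lexicon term."""
    lowered = text.lower()
    mentions = []
    for term in lexicon:
        needle = term.lower()
        start = lowered.find(needle)
        while start != -1:
            mentions.append(Mention(start, start + len(needle), label))
            start = lowered.find(needle, start + 1)
    return sorted(mentions, key=lambda m: m.start)


def contains_occupation(mentions):
    """Binary classification view: 1 if at least one mention was found, else 0."""
    return int(len(mentions) > 0)


# Hypothetical tweet and lexicon, invented for this illustration.
tweet = "Gracias a los enfermeros y a los médicos por su trabajo"
mentions = find_mentions(tweet, ["enfermeros", "médicos"])

for m in mentions:
    # NER view: begin offset, end offset, category, and the covered text.
    print(m.start, m.end, m.label, tweet[m.start:m.end])

print(contains_occupation(mentions))
```

The sketch shows how the two sub-tracks relate: the binary label of the first can be derived from the spans of the second. A competitive system would replace the lexicon lookup with a trained model.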
Key information:
- ProfNER web: https://temu.bsc.es/smm4h-spanish/
- Datasets: https://doi.org/10.5281/zenodo.4309356
- Registration: https://forms.gle/1qs3rdNLDxAph88n6
Task motivation
Some workers are at the forefront of the battle against the COVID-19 pandemic. Detecting vulnerable occupations is critical to prepare preventive measures related to exposure to the virus as well as indirect mental health issues due to fear of infection, confinement, etc.
NLP systems benefit from recent language technologies such as transformers and transfer learning, as well as from the vast volume of real-time data produced on social media.
Following previous high-impact shared tasks that attracted a considerable number of participants ([Cantemist], [CodiEsp], [Meddocan]), we are organizing the ProfNER track. It promotes the development of profession & occupation text mining resources for Spanish social media, given the special relevance of professions in the definition of at-risk groups.
Systems capable of automatically processing social media texts are of interest to the medical user community, researchers, the pharmaceutical industry as well as patients. The detection of profession & occupation information is relevant for general NLP, occupational data mining, etc.
Competing systems have the potential to generalize to similar use cases in other content types, such as medical reports, and in other languages.
Important dates
- Dec 15: Training & Development set release
- Feb 15: Validation set submission due [Required]
- Mar 1: Test set & background set release
- Mar 4: Test set predictions due
- Mar 15: System descriptions due
- Apr 1: Acceptance notification
- Apr 12: Camera-ready system descriptions
- June 6–11: NAACL 2021 conference
Publications and workshop
Each participating team will have the opportunity to submit a system description, which will be published as part of the shared task proceedings.
The 6th SMM4H Workshop is co-located with NAACL 2021. More details are available at https://healthlanguageprocessing.org/smm4h-2021/
Track Organizers
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Antonio Miranda-Escalada, Barcelona Supercomputing Center, Spain
- Eulàlia Farré, Barcelona Supercomputing Center, Spain
- Salvador Lima, Barcelona Supercomputing Center, Spain
SMM4H Organizers
- Graciela Gonzalez-Hernandez, University of Pennsylvania, USA
- Davy Weissenbacher, University of Pennsylvania, USA
- Ari Z. Klein, University of Pennsylvania, USA
- Karen O’Connor, University of Pennsylvania, USA
- Abeed Sarker, Emory University, USA
- Elena Tutubalina, Kazan Federal University, Russia
- Zulfat Miftahutdinov, Kazan Federal University, Russia
- Ilsear Alimova, Kazan Federal University, Russia
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Juan Banda, Georgia State University, USA
For additional information, visit the ProfNER website: https://temu.bsc.es/smm4h-spanish