CFP MESINESP2 track: Medical Semantic Indexing (BioASQ – CLEF 2021)
Plan TL Award for MESINESP2 winners
There is a pressing need to improve information access, retrieval, classification and semantic annotation, as well as integration across multiple document types, in particular for health-related content such as literature, clinical trials and medicinal patents.
This is especially true for multilingual content from heterogeneous sources (cross-genre). For instance, many of the initially reported COVID-19 case reports were published in a variety of languages, and a considerable fraction were non-English publications.
Given the significant practical impact of advanced semantic indexing technologies in health, and the direct involvement and interest in the results of international and national healthcare organizations (BIREME/WHO, ISCIII/Spain), we are organizing the MESINESP2 shared task in collaboration with the well-established BioASQ (CLEF 2021) initiative.
A variety of complementary strategies have been explored so far for semantic indexing of health content, including (extreme) multi-label classification, multilingual X-BERT, transformers, graph matching, text similarity, string matching/term indexing, named entity recognition and machine translation components.
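As a rough illustration of the simplest of these strategies, string matching/term indexing, the following minimal sketch assigns a code to a document whenever the corresponding term occurs in its text. It is only a sketch under stated assumptions: the vocabulary, its codes and the function names are hypothetical placeholders, not real DeCS entries or part of any MESINESP2 baseline.

```python
import re
import unicodedata

# Hypothetical toy vocabulary (code -> Spanish term).
# The codes are illustrative placeholders, NOT real DeCS identifiers.
TOY_VOCAB = {
    "TOY-1": "diabetes mellitus",
    "TOY-2": "hipertensión",
    "TOY-3": "neumonía",
}


def normalize(text):
    """Lowercase, strip accents and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", no_accents).strip()


def index_document(text, vocab):
    """Return the sorted codes whose term occurs as whole words in the text."""
    norm_text = normalize(text)
    hits = []
    for code, term in vocab.items():
        pattern = r"\b" + re.escape(normalize(term)) + r"\b"
        if re.search(pattern, norm_text):
            hits.append(code)
    return sorted(hits)


if __name__ == "__main__":
    abstract = "Paciente con Diabetes  Mellitus e hipertension arterial."
    print(index_document(abstract, TOY_VOCAB))
```

Exact matching of this kind misses synonyms, inflected forms and concepts that are only implied by the text, which is one reason it is usually combined with the learned approaches listed above.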
Inspired by the settings of past BioASQ tracks and our BioCreative corpora (CHEMPROT, BC4CHEMD/CHEMDNER), which have been used as benchmark datasets for popular models such as BioBERT, we propose the following three MESINESP2 subtracks:
- MESINESP-L – Scientific Literature (sub-track 1): This track will require automatic indexing with DeCS terms (similar to MeSH) of abstracts from two widely used Spanish-language databases (IBECS and LILACS).
- MESINESP-T – Clinical trials (sub-track 2): This track will require automatic indexing with DeCS terms of clinical trials from REEC (Registro Español de Estudios Clínicos).
- MESINESP-P – Patents (sub-track 3): This track will require automatic indexing with DeCS terms of the content of Spanish patents extracted from Google Patents.
- MESINESP2 web, info & detailed description: https://temu.bsc.es/mesinesp2
- Registration for MESINESP2: go to http://clef2021-labs-registration.dei.unipd.it and register for Task 3 – Task MESINESP: Medical Semantic Indexing In Spanish, which is part of the workshop “BioASQ - Large-scale biomedical semantic indexing and question answering”
- Datasets: https://zenodo.org/record/4634129#.YFu0MZ1KiUl
We foresee that the systems resulting from MESINESP2 will prove directly useful for a variety of use case scenarios beyond literature indexing, including competitive intelligence, prior art searches, complex search queries for systematic reviews, evidence-based medicine and decision making, as well as database curation and the elaboration of clinical practice guidelines. Moreover, the document selection criteria of MESINESP2 took into account additional scenarios for future tasks on semantic indexing of medical records.
- March 17: Training set and guidelines release
- March 17: First development set release
- April 15: Test and Background set release
- April 30: BioASQ9 Lab @CLEF 2021 Registration Deadline
- April 30: End of the evaluation period
- May 28: Submission of Participant Papers at CLEF 2021
- July 2: Camera-ready paper submission
- September 21-24: CLEF 2021 Conference
Publications and workshop
The MESINESP2 track results will be presented at the BioASQ workshop held at CLEF 2021 (http://clef2021.clef-initiative.eu). Participating teams will be invited to present their systems and results, and to submit their system description papers for publication in the CLEF 2021 Working Notes proceedings.
There will be awards for the top-scoring teams, sponsored by the Spanish Plan for the Advancement of Language Technology (Plan TL) and the Barcelona Supercomputing Center (BSC).
Main Track organizers
- Martin Krallinger, Barcelona Supercomputing Center (BSC), Spain.
- Luis Gascó, Barcelona Supercomputing Center (BSC), Spain.
- Anastasios Nentidis, National Center for Scientific Research Demokritos, Greece.
- Elena Primo-Peña, Biblioteca Nacional de Ciencias de la Salud. Instituto de Salud Carlos III, Spain.
- Cristina Bojo Canales, Biblioteca Nacional de Ciencias de la Salud. Instituto de Salud Carlos III, Spain.
- George Paliouras, National Center for Scientific Research Demokritos, Greece.
- Anastasia Krithara, National Center for Scientific Research Demokritos, Greece.
- Renato Murasaki, BIREME – Organización Panamericana de la Salud (WHO), Brazil.
- David Camacho, Applied Intelligence and Data Analysis Research Group, Universidad Politécnica de Madrid (Spain)
- Oscar Corcho, Ontology Engineering Group, Universidad Politécnica de Madrid (Spain)
- Parminder Bhatia, Amazon Health AI (USA)
- Irena Spasic, School of Computer Science & Informatics, co-Director of the Data Innovation Research Institute, Cardiff University (UK)
- Jose Luis Redondo García, Amazon Alexa, Amazon (UK)
- Carlos Badenes-Olmedo, Ontology Engineering Group, Universidad Politécnica de Madrid (Spain)
- Xavier Tannier, Sorbonne Université and LIMICS (France)
- Tristan Naumann, Microsoft Research (USA)
- Allan Hanbury, Technical University of Vienna (Austria)