Call for Participation ClinSpEn @ Biomedical WMT Shared Task (WMT/EMNLP 2022)

Event start date
07-12-2022 09:00
Event end date
08-12-2022 17:00



Automatic Translation of Clinical cases, ontologies & medical entities:  Spanish - English


ClinSpEn is part of the Biomedical WMT 2022 shared task. It aims to promote the development and evaluation of machine translation systems adapted to the medical domain through three sub-tracks: clinical cases, medical controlled vocabularies/ontologies, and clinical terms and entities extracted from medical content.

Key information




Machine translation applied to the clinical domain is a challenging task due to the complexity of medical language and the heavy use of health-related technical terms and medical expressions. For this reason, there is a large community of specialized medical translators who are able to deal with medical narratives, terminologies, and ambiguous abbreviations and acronyms.

Given the relevance, impact and diversity of health-related content, as well as the rapidly growing number of publications, electronic health records (EHRs), clinical trials, informed consent documents and medical terminologies, there is a pressing need for more robust medical machine translation resources, together with independent quality evaluation scenarios.

Recent advances in machine translation technologies, together with the use of other NLP components, are showing promising results. Domain adaptation of MT approaches can therefore have a significant impact on unlocking key information from medical content.

The ClinSpEn sub-task of Biomedical WMT proposes three sub-tracks, each associated with a highly relevant medical machine translation application scenario:

  • ClinSpEn-CC (Clinical Cases): translation of clinical case documents from English to Spanish, a type of document relevant for processing both medical literature and clinical records.
  • ClinSpEn-CT (Clinical Terms): translation of clinical terms and entity mentions from Spanish to English. The terms were extracted directly from medical literature and clinical records, with a particular focus on diseases, symptoms, findings, procedures and professions.
  • ClinSpEn-OC (Ontology Concepts): translation of clinical controlled vocabularies and ontology concepts from English to Spanish. Ontologies and structured vocabularies represent a key resource for semantic interoperability, entity linking, biomedical knowledge bases and precision medicine, and thus there is a pressing need to generate multilingual biomedical ontologies for a range of clinical applications.

A reasonably sized sample set for each data type has been released. Participants may use it to test their existing systems or try out new ones.

In addition to the test set, manually translated by professional medical translators, participants will also have access to a larger background collection for each of the three sub-tracks, which may serve as an additional resource and support the assessment of the scalability and robustness of machine translation technology.


[All deadlines are in AoE (Anywhere on Earth)]



Participants must register using the official BioWMT Registration Form, which is available at

Additionally, we’ve created a registration form specific to the ClinSpEn sub-tracks, which will be used to keep participants updated. Register at:

Publications and WMT workshop

Teams participating in the ClinSpEn sub-track of Biomedical WMT will be invited to contribute a system description paper for the WMT 2022 Working Notes proceedings. More information on the paper’s specifications, formatting guidelines and review process at:

If you are interested in Machine Translation, the biomedical domain or other language combinations, remember to check out the Biomedical WMT site and the rest of this year’s sub-tracks and language pairs:

ClinSpEn Organizers


Biomedical WMT Organizers