[scikit-learn] CFP: DISTEMIST (BioASQ/CLEF2022) shared task on detection & normalization of disease mentions

Martin Krallinger krallinger.martin at gmail.com
Sat Apr 23 05:54:12 EDT 2022

(Apologies for cross-posting)

Call for Participation DISTEMIST Shared Task (CLEF 2022)

 Detection and normalization of disease mentions


DISTEMIST is the first track focusing specifically on the automatic
detection of disease mentions and their normalization (Snomed CT) in
Spanish clinical case reports. The DISTEMIST data has already been used to
develop disease taggers that were applied to a variety of medical records.

Key information:


   Web: https://temu.bsc.es/distemist/

   Data: https://doi.org/10.5281/zenodo.6408476

   Annotation guidelines: https://doi.org/10.5281/zenodo.6458078

   DISTEMIST gazetteer: https://doi.org/10.5281/zenodo.6458114

   Registration: https://temu.bsc.es/distemist/registration/

Systems able to detect and normalize disease mentions from medical content
are crucial for a diversity of applications such as semantic indexing for
improved retrieval/classification, clinical coding, drug repurposing,
relation extraction (disease-symptom, disease-drug/treatment,
disease-gene/mutation), etc. It was estimated that around 20% of PubMed
queries are related to diseases, disorders, and anomalies, stressing the
importance for different users (researchers, clinicians, Pharma,
biologists, healthcare practitioners, etc.) to extract this key information.
Disease mention recognition tools are also relevant to process other kinds
of content like social media (e.g. SMM4H/COLING2022 track - SocialDisNER).

Disease mention detection systems have been implemented and used to process
a variety of content types, including scientific publications, clinical
records, clinical trials, patient forums and social media. They have become
a component of many practically relevant applications, such as:


   health data analytics software and study of disease trajectories

   disease outbreak monitoring/surveillance and epidemiology tools

   extraction of disease phenotype or comorbidities

   drug discovery, repurposing and off label indications

   occupational health studies

   clinical coding of diagnoses

The DISTEMIST organizers will release multilingual resources to foster the
development of multilingual tools and of systems not only for Spanish but
also for content in English and other Romance languages (French, Portuguese,
Italian, Catalan and Romanian): DISTEMIST-English, DISTEMIST-Italian,
DISTEMIST-French, DISTEMIST-Portuguese, DISTEMIST-Catalan and
DISTEMIST-Romanian.

We foresee that participation in the DISTEMIST track will contribute to
generating resources that will improve the exploitation of clinical
unstructured data and thus unlock valuable health information, assist data
curation and facilitate quality evaluation and interpretability of disease
mention detection systems.

Inspired by previous initiatives (n2c2, BioCreative) and shared tasks
(CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the DISTEMIST shared
task as part of the BioASQ 2022 evaluation initiative (co-located with CLEF
2022), with the following two sub-tracks:


   DISTEMIST-entities: automatic detection of mentions of diseases.


   DISTEMIST-linking: finding mentions of diseases and normalizing them to
   their Snomed CT concept identifiers.
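
For readers new to the task, the relation between the two sub-tracks can be illustrated with a minimal dictionary-lookup sketch in Python. This is only a toy baseline under stated assumptions, not the organizers' method or the official task format: the gazetteer entries and the codes below are made-up placeholders (they are not taken from the DISTEMIST gazetteer and are not real Snomed CT identifiers), and the names `Mention`, `TOY_GAZETTEER` and `detect_and_link` are hypothetical. The character spans correspond to what DISTEMIST-entities asks for, and the attached code corresponds to DISTEMIST-linking.

```python
import re
from typing import Dict, List, NamedTuple


class Mention(NamedTuple):
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention
    text: str   # surface form found in the document
    code: str   # concept identifier assigned by the lookup


# Placeholder entries for illustration only: neither the terms nor the
# codes come from the DISTEMIST gazetteer or from Snomed CT.
TOY_GAZETTEER: Dict[str, str] = {
    "diabetes mellitus": "PLACEHOLDER-001",
    "diabetes": "PLACEHOLDER-002",
    "neumonía": "PLACEHOLDER-003",
}


def detect_and_link(text: str, gazetteer: Dict[str, str]) -> List[Mention]:
    """Find gazetteer terms in text and attach their codes."""
    # Longer terms come first so the alternation prefers the longest match
    # (e.g. "diabetes mellitus" over "diabetes").
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        Mention(m.start(), m.end(), m.group(0), gazetteer[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for mention in detect_and_link(
        "Paciente con diabetes mellitus y neumonía.", TOY_GAZETTEER
    ):
        print(mention)
```

Exact string lookup of this kind misses inflected, abbreviated or misspelled mentions, so it is at most a starting point for comparison with learned taggers.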



Important dates

   DISTEMIST-linking 2nd Training Set Release: April 23rd, 2022

   Test Set Release (DISTEMIST-entities and linking): May 10th, 2022

   Participant Test Prediction Due (DISTEMIST-entities and linking): May
   15th, 2022 ("Anywhere on Earth")

   Working papers submission: May 27th, 2022

   Notification of acceptance (peer-reviews): June 13th, 2022

   Camera-ready system descriptions: July 1st, 2022

   BioASQ @ CLEF 2022: September 2022

Publications and BioASQ/CLEF2022 workshop

Teams participating in DISTEMIST will be invited to contribute a system
description paper for the CLEF 2022 Working Notes proceedings (published on
CEUR-WS) and a short presentation of their approach at the CLEF 2022 workshop.

Main Organizers


   Martin Krallinger, Barcelona Supercomputing Center, Spain

   Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain

   Luis Gascó, Barcelona Supercomputing Center, Spain

   Anastasios Nentidis, National Center for Scientific Research Demokritos, Greece

   Salvador Lima, Barcelona Supercomputing Center, Spain

   Antonio Miranda-Escalada, Barcelona Supercomputing Center, Spain

Martin Krallinger, Dr.
Head of Biological Text Mining Unit
Barcelona Supercomputing Center (BSC-CNS)