[scikit-learn] CFP: DISTEMIST (BioASQ/CLEF2022) shared task on detection & normalization of disease mentions
Martin Krallinger
krallinger.martin at gmail.com
Sat Apr 23 05:54:12 EDT 2022
(Apologies for cross-posting)
Call for Participation DISTEMIST Shared Task (CLEF 2022)
Detection and normalization of disease mentions
https://temu.bsc.es/distemist/
DISTEMIST is the first track focusing specifically on the automatic
detection of disease mentions and their normalization (to SNOMED CT) in
Spanish clinical case reports. The DISTEMIST data has already been used to
develop disease taggers, which were applied to a variety of medical records.
Key information:
- Web: https://temu.bsc.es/distemist/
- Data: https://doi.org/10.5281/zenodo.6408476
- Annotation guidelines: https://doi.org/10.5281/zenodo.6458078
- DISTEMIST gazetteer: https://doi.org/10.5281/zenodo.6458114
- Registration: https://temu.bsc.es/distemist/registration/
Motivation
Systems able to detect and normalize disease mentions from medical content
are crucial for a diversity of applications such as semantic indexing for
improved retrieval/classification, clinical coding, drug-repurposing,
relation extraction (disease-symptom, disease-drug/treatment,
disease-gene/mutation), etc. It was estimated that around 20% of PubMed
queries are related to diseases, disorders, and anomalies, stressing the
importance for different users (researchers, clinicians, Pharma,
biologists, healthcare practitioners, etc.) to extract this key information.
Disease mention recognition tools are also relevant to process other kinds
of content like social media (e.g. SMM4H/COLING2022 track - SocialDisNER).
Disease mention detection systems have been implemented and used to process
a variety of content types, including scientific publications, clinical
records, clinical trials, patient forums and social media. They have become
a component of many practically relevant applications, such as:
- health data analytics software and study of disease trajectories
- disease outbreak monitoring/surveillance and epidemiology tools
- extraction of disease phenotype or comorbidities
- drug discovery, repurposing and off label indications
- occupational health studies
- pharmacogenomics
- clinical coding of diagnosis
The DISTEMIST organizers will release multilingual resources to foster the
development of multilingual tools and systems not only for Spanish but also
for content in English and other Romance languages (French, Portuguese,
Italian, Catalan and Romanian): DISTEMIST-English, DISTEMIST-Italian,
DISTEMIST-French, DISTEMIST-Portuguese, DISTEMIST-Catalan and
DISTEMIST-Romanian.
We foresee that participation in the DISTEMIST track will contribute to
generating resources that improve the exploitation of unstructured clinical
data and thus unlock valuable health information, assist data curation, and
facilitate quality evaluation and interpretability of disease mention
detection systems.
Inspired by previous initiatives (n2c2, BioCreative) and shared tasks
(CANTEMIST, PharmaCoNER, or CodiEsp), we are launching the DISTEMIST shared
task as part of the BioASQ 2022 evaluation initiative (co-located with CLEF
2022), with the following two sub-tracks:
- DISTEMIST-entities: automatic detection of mentions of diseases.
- DISTEMIST-linking: finding mentions of diseases and normalizing them to
  their SNOMED CT concept identifiers.
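To make the difference between the two sub-tracks concrete, here is a minimal sketch in Python, assuming a plain dictionary lookup. It is not the organizers' baseline and does not follow the official submission format. The gazetteer entries, the placeholder codes (CODE_0001, ...), and the names Mention and tag_diseases are hypothetical and chosen only for illustration; a real system would map mentions to actual SNOMED CT identifiers. The character spans alone correspond to what DISTEMIST-entities asks for, and the attached concept code is the extra step required by DISTEMIST-linking.

```python
import re
from typing import NamedTuple


class Mention(NamedTuple):
    start: int  # character offset where the mention begins
    end: int    # character offset one past the last character
    text: str   # surface form as it appears in the document
    code: str   # concept identifier the mention is normalized to


# Hypothetical gazetteer: lowercase surface form -> placeholder concept code.
# These are NOT real SNOMED CT identifiers.
GAZETTEER = {
    "diabetes mellitus": "CODE_0001",
    "diabetes": "CODE_0002",
    "neumonía": "CODE_0003",
}


def tag_diseases(text: str, gazetteer: dict[str, str]) -> list[Mention]:
    """Find gazetteer terms in text and attach their concept codes."""
    # Longer terms come first in the alternation, so "diabetes mellitus"
    # is preferred over the shorter "diabetes" at the same position.
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        Mention(m.start(), m.end(), m.group(0), gazetteer[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    report = "Paciente con Diabetes mellitus y neumonía."
    for mention in tag_diseases(report, GAZETTEER):
        # Entities sub-track: (start, end, text); linking adds the code.
        print(mention.start, mention.end, mention.text, mention.code)
```

Exact dictionary matching of this kind misses abbreviations, misspellings and paraphrased mentions, which is one reason annotated training data is provided for the task.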
Schedule
- DISTEMIST-linking 2nd Training Set Release: April 23rd, 2022
- Test Set Release (DISTEMIST-entities and linking): May 10th, 2022
- Participant Test Prediction Due (DISTEMIST-entities and linking): May
  15th, 2022 ("Anywhere on Earth")
- Working papers submission: May 27th, 2022
- Notification of acceptance (peer-reviews): June 13th, 2022
- Camera-ready system descriptions: July 1st, 2022
- BioASQ @ CLEF 2022: September 2022
Publications and BioASQ/CLEF2022 workshop
Teams participating in DISTEMIST will be invited to contribute a system
description paper for the CLEF 2022 Working Notes proceedings (published on
CEUR-WS) and a short presentation of their approach at the CLEF 2022
workshop.
Main Organizers
- Martin Krallinger, Barcelona Supercomputing Center, Spain
- Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
- Luis Gascó, Barcelona Supercomputing Center, Spain
- Anastasios Nentidis, National Center for Scientific Research Demokritos,
  Greece
- Salvador Lima, Barcelona Supercomputing Center, Spain
- Antonio Miranda-Escalada, Barcelona Supercomputing Center, Spain
--
=======================================
Martin Krallinger, Dr.
Head of Biological Text Mining Unit
Barcelona Supercomputing Center (BSC-CNS)
=======================================