<div dir="ltr"><div dir="ltr">
<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-size:11pt;font-family:Arial;color:black">IberLEF/SEPLN: CFP MEDDOCAN
track & task prize: named entity recognition and sensitive personal
information identification</span></b> <b><span style="font-size:11pt;font-family:Arial;color:black"><span></span></span></b></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-size:11pt;font-family:Arial;color:black"><span> </span></span></b></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-size:11pt;font-family:Arial;color:black"><span> </span></span></b></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-size:11pt;font-family:Arial;color:black">***</span></b><span style="font-size:11pt;font-family:Arial;color:black"> <b>CFP MEDDOCAN track </b>***</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-size:11pt;font-family:Arial;color:black">First Medical Document
Anonymization </span></b><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><span style="font-size:10pt;font-family:Times"><span><u><span style="font-size:11pt;font-family:Arial;color:black"><a href="http://temu.bsc.es/meddocan">http://temu.bsc.es/meddocan</a></span></u></span><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><b><span style="font-family:Arial;color:black">SEAD – Plan TL Sponsoring Track Awards</span></b><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="text-align:center;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria" align="center"><span style="font-family:Arial;color:black">Sub-tracks: 1,000€, 500€ and 200€ (first,
second, third team)</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Task description</span></b></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria">
</p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span>Scikit-learn has been used successfully
for Named Entity Recognition and Classification tasks in the past, showing that
it is especially competitive for finding mentions of entities in running text. <span></span></span></p>





<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><br><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)"></span></b><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Clinical
records with protected health information (PHI) cannot be directly shared as
is, due to privacy constraints, making it particularly cumbersome to carry out
NLP research in the medical domain. A necessary precondition for accessing
clinical records outside of hospitals is their de-identification, i.e., the
exhaustive removal (or replacement) of all mentioned PHI phrases.<span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">The
practical relevance of anonymization or de-identification of clinical texts
motivated the proposal of two shared tasks, the 2006 and 2014 de-identification
tracks, organized under the umbrella of the i2b2 (</span><span style="font-size:10pt;font-family:Times"><span><u><span style="font-size:11pt;font-family:Arial;color:rgb(17,85,204)"><a href="http://i2b2.org">i2b2.org</a></span></u></span></span><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">) community evaluation
effort. The i2b2 effort has deeply influenced the clinical NLP community
worldwide, but it focused on documents in English and on the characteristics
of US healthcare data providers.</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">As
part of the IberLEF 2019 (</span><span style="font-size:10pt;font-family:Times"><span><u><span style="font-size:11pt;font-family:Arial;color:rgb(17,85,204)"><a href="https://sites.google.com/view/iberlef-2019">https://sites.google.com/view/iberlef-2019</a></span></u></span></span><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">) initiative, we
announce  <u>the first community challenge task specifically devoted to
the anonymization of medical documents in Spanish</u>, called the MEDDOCAN
(Medical Document Anonymization) track.</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">To
support these tasks, we have prepared a synthetic corpus of 1,000
clinical case studies. The cases were selected manually by a practicing
physician and augmented with PHI from discharge summaries and
medical genetics clinical records.</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">The
MEDDOCAN task will be structured into <b>two sub-tracks</b>:</span><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<ul style="margin-bottom:0cm" type="disc"><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">NER
     offset and entity type classification <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Sensitive
     span detection. <span></span></span></li></ul>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Publications</span></b><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times">Teams
will be invited to submit a system description paper for the workshop proceedings,
as in previous <i>IberEval</i> events. <span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times">We plan to <b>invite
selected works</b> for full publication in a <b>Q1 Journal – Special Issue
devoted to MEDDOCAN</b>. Invitations to the special issue will take into account
multiple aspects, such as performance, novelty of the system,
availability of the underlying system (software or web service), and
the workshop presentation.<span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Important Dates</span></b><span style="font-size:10pt;font-family:Times"> <span></span></span></p>

<ul style="margin-bottom:0cm" type="disc"><li class="MsoNormal" style="color:blue;vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:black">March 18, 2019: Sample set and Evaluation script released.</span><span style="font-size:11pt;font-family:Arial"><span></span></span></li><li class="MsoNormal" style="color:blue;vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial;color:black">March 20, 2019: Training set released</span><span style="font-size:11pt;font-family:Arial">.
     <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">April
     4, 2019: Development set released. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">April
     29, 2019: Test set released (includes background set). <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">May
     17, 2019: End of evaluation period (system submissions). <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">May
     20, 2019: Results posted and test set with gold standard (GS) annotations released. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">May
     31, 2019: Working notes paper submission. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">June
     14, 2019: Notification of acceptance (peer review). <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">June
     28, 2019: Camera-ready paper submission. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">September
     24, 2019: IberLEF 2019 Workshop, Bilbao, Spain<span></span></span></li></ul>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Task organizers</span></b><span style="font-size:10pt;font-family:Times"> <span></span></span></p>

<ul style="margin-bottom:0cm" type="disc"><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Aitor
     Gonzalez-Agirre, Barcelona Supercomputing Center. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Ander
     Intxaurrondo, Barcelona Supercomputing Center. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Jose
     Antonio Lopez-Martin, Hospital 12 de Octubre. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Montserrat
     Marimon, Barcelona Supercomputing Center. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Felipe
     Soares, Barcelona Supercomputing Center. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Marta
     Villegas, Barcelona Supercomputing Center. <span></span></span></li><li class="MsoNormal" style="color:rgb(34,34,34);vertical-align:baseline;margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:11pt;font-family:Arial">Martin
     Krallinger, Barcelona Supercomputing Center. <span></span></span></li></ul>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Times"> <span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><b><span style="font-size:11pt;font-family:Arial;color:rgb(34,34,34)">Scientific committee </span></b><span style="font-size:10pt;font-family:Times"><span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Tahoma">• Hercules Dalianis, DSV/Stockholm
University, Sweden<br>
• Christoph Dieterich, Klaus-Tschira-Institute for Computational Cardiology,
University Hospital Heidelberg, Germany<br>
• Jelena Jacimovic, University of Belgrade, Serbia<br>
• Bradley Malin, Vanderbilt University Medical Center, USA<br>
• Øystein Nytrø, Norwegian University of Science and Technology, Norway<br>
• Patrick Ruch, SIB Text Mining, HES-SO & Swiss Institute of
Bioinformatics, Switzerland<br>
• Angus Roberts, King’s College London, UK<br>
• Arturo Romero Gutiérrez, Ministerio de Sanidad, Servicios Sociales e
Igualdad, Spain <br>
• Ozlem Uzuner, George Mason University, USA<br>
• Alfonso Valencia, Barcelona Supercomputing Center, Spain<span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Tahoma"><span> </span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span style="font-size:10pt;font-family:Tahoma"><br>
============================<br>
Martin Krallinger, Dr.<br>
--------------------------------------------------------------------<br>
Head of Biological Text Mining Unit<br>
Structural Biology and BioComputing Programme<br>
Spanish National Cancer Research Centre (CNIO)<br>
--------------------------------------------------------------------<br>
Oficina Técnica General (OTG) del Plan TL en el <br>
área de Biomedicina de la Secretaria de Estado de <br>
Telecomunicaciones y para la Sociedad de la <br>
Información<br>
============================<span></span></span></p>

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Cambria"><span> </span></p>





</div></div>