lxml.etree.XMLSyntaxError: Namespace prefix * for * on * is not defined, line *, column *
I got the following error when I process the following xml input. Is there a way to let lxml ignore the error?

lxml.etree.XMLSyntaxError: Namespace prefix xsi for nil on ED_INST_TYPE is not defined, line 1, column 254

<row><APPLICATION_ID>9347073</APPLICATION_ID><ACTIVITY>DP2</ACTIVITY><ADMINISTERING_IC>NS</ADMINISTERING_IC><APPLICATION_TYPE>1</APPLICATION_TYPE><ARRA_FUNDED>N</ARRA_FUNDED><AWARD_NOTICE_DATE>09/13/2017</AWARD_NOTICE_DATE><BUDGET_START>09/30/2017</BUDGET_START><BUDGET_END>06/30/2022</BUDGET_END><CFDA_CODE>853</CFDA_CODE><CORE_PROJECT_NUM>DP2NS106663</CORE_PROJECT_NUM><ED_INST_TYPE xsi:nil='true'/><FOA_NUMBER>RFA-RM-16-004</FOA_NUMBER><FULL_PROJECT_NUM>1DP2NS106663-01</FULL_PROJECT_NUM></row>

--
Regards,
Peng
Peng Yu wrote on 26.01.2018 at 00:25:
I got the following error when I process the following xml input. Is there a way to let lxml ignore the error?
lxml.etree.XMLSyntaxError: Namespace prefix xsi for nil on ED_INST_TYPE is not defined, line 1, column 254
<row><APPLICATION_ID>9347073</APPLICATION_ID><ACTIVITY>DP2</ACTIVITY><ADMINISTERING_IC>NS</ADMINISTERING_IC><APPLICATION_TYPE>1</APPLICATION_TYPE><ARRA_FUNDED>N</ARRA_FUNDED><AWARD_NOTICE_DATE>09/13/2017</AWARD_NOTICE_DATE><BUDGET_START>09/30/2017</BUDGET_START><BUDGET_END>06/30/2022</BUDGET_END><CFDA_CODE>853</CFDA_CODE><CORE_PROJECT_NUM>DP2NS106663</CORE_PROJECT_NUM><ED_INST_TYPE xsi:nil='true'/><FOA_NUMBER>RFA-RM-16-004</FOA_NUMBER><FULL_PROJECT_NUM>1DP2NS106663-01</FULL_PROJECT_NUM></row>
If you only need to parse this once, are really stuck with this input, and have no way to get the XML file corrected, you might get away with passing the recover=True option to the parser. It then tries to recover from errors and keep going, at the risk of losing data along the way.

However, I highly recommend contacting the source of the file instead and making them fix it. Ignoring errors means that you cannot be sure which errors were actually hidden and what data you received or lost. Parser errors exist precisely so that you can reject the file and blame the source, and then comfortably request a correct input file from them.

Stefan
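For illustration, here is a minimal sketch of the recover=True approach described above, using a shortened copy of the input from the question. The helper names parse_strict and parse_lenient and the constant BROKEN_XML are made up for this example; only etree.XMLParser(recover=True), etree.fromstring and parser.error_log come from lxml. It also reads the parser's error log after the lenient parse, which is one way to at least see what was skipped.

```python
from lxml import etree

# Shortened copy of the input from the question: the "xsi" prefix is used
# on ED_INST_TYPE but never declared anywhere in the document.
BROKEN_XML = (
    b"<row><APPLICATION_ID>9347073</APPLICATION_ID>"
    b"<ED_INST_TYPE xsi:nil='true'/>"
    b"<FOA_NUMBER>RFA-RM-16-004</FOA_NUMBER></row>"
)


def parse_strict(data):
    """Parse with the default parser; raises XMLSyntaxError on this input."""
    return etree.fromstring(data)


def parse_lenient(data):
    """Parse with a recovering parser.

    Returns the root element together with the log entries that the
    parser recorded while it kept going (hypothetical helper).
    """
    parser = etree.XMLParser(recover=True)
    root = etree.fromstring(data, parser)
    return root, list(parser.error_log)


if __name__ == "__main__":
    try:
        parse_strict(BROKEN_XML)
    except etree.XMLSyntaxError as exc:
        print("strict parse rejected the input:", exc)

    root, errors = parse_lenient(BROKEN_XML)
    for entry in errors:
        # Each log entry carries the position and message of a problem.
        print("recovered from:", entry.line, entry.column, entry.message)
    print("FOA_NUMBER:", root.findtext("FOA_NUMBER"))
```

How the recovered tree represents the xsi:nil attribute is up to the underlying parser, so it is worth inspecting the result before relying on it. If the file itself can be changed, declaring the prefix on the root element (xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance") makes the input well-formed and the strict parser accepts it.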
participants (2)
- Peng Yu
- Stefan Behnel