
I’m trying to extend PyErrorLog. Since I’m using XMLParser(recover=True), I want to change all of the reported levels to WARNING and log XML syntax errors via:

    etree.use_global_python_log(XMLErrorLog(logger=logging.getLogger(__name__).getChild('XMLParser')))

When I try to add level_map as an instance variable in my class’s __init__() method, I get an error message saying that it’s not a writable attribute.
When I add a level_map as a class variable, it doesn’t complain, but it doesn’t appear to use it in the mapping. With this mapping, everything is still tagged as CRITICAL or ERROR.

    class XMLErrorLog(etree.PyErrorLog):
        level_map = {
            etree.ErrorLevels.WARNING: logging.WARNING,
            etree.ErrorLevels.ERROR: logging.WARNING,
            etree.ErrorLevels.FATAL: logging.WARNING,
        }

I then tried to modify the level of the _LogEntry passed to receive() before calling the log() method, but that does not appear to be possible either.
I’ve finally managed to get something to work by using:

    class LogEntry(object):
        level = 1

And in my XMLErrorLog class:

    def receive(self, log_entry):
        logrepr = "[ %s:%d:%d:%s:%s:%s: %s ]" % (
            '', log_entry.line, log_entry.column, "Warning",
            log_entry.domain_name, log_entry.type_name, log_entry.message)
        self.log(LogEntry(), logrepr)

But it seemed from the documentation that providing the level_map in my class should have been enough. Am I missing something, or is the documentation incorrect?
— Steve Majewski

Hi,
> When I try to add level_map as an instance variable in my class’s __init__() method, I get an error message saying that it’s not a writable attribute.
> When I add a level_map as a class variable, it doesn’t complain, but it doesn’t appear to use it in the mapping. With this mapping, everything is still tagged as CRITICAL or ERROR.
>
>     class XMLErrorLog(etree.PyErrorLog):
>         level_map = {
>             etree.ErrorLevels.WARNING: logging.WARNING,
>             etree.ErrorLevels.ERROR: logging.WARNING,
>             etree.ErrorLevels.FATAL: logging.WARNING,
>         }
PyErrorLog's level_map attribute is indeed read-only:

    # https://github.com/lxml/lxml/blob/582b598fd7aa49fecd64fea2ad88e969832f2beb/s...
    cdef class PyErrorLog(_BaseErrorLog):
        # ...
        cdef readonly dict level_map

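The read-only behaviour can be checked from plain Python. This is only a small sketch: the variable names are made up for illustration, and it assumes that rebinding a `cdef readonly` attribute surfaces as an `AttributeError`, which matches the "not a writable attribute" message reported above.

```python
import logging
from lxml import etree

error_log = etree.PyErrorLog()

# The attribute can be read; it holds the mapping from lxml error levels
# to logging levels that PyErrorLog consults when it emits a record.
default_map = dict(error_log.level_map)

# Rebinding the attribute is rejected because it is declared read-only.
assignment_failed = False
try:
    error_log.level_map = {etree.ErrorLevels.FATAL: logging.WARNING}
except AttributeError:
    assignment_failed = True

print(assignment_failed, default_map)
```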
But you can update the level_map dict in place from a subclass, because the dict itself is mutable:

    >>> class XMLErrorLog(etree.PyErrorLog):
    ...     def __init__(self, *args, **kwargs):
    ...         # Do not pass self again here; super() already binds it.
    ...         super(XMLErrorLog, self).__init__(*args, **kwargs)
    ...         self.level_map.update({
    ...             etree.ErrorLevels.WARNING: logging.WARNING,
    ...             etree.ErrorLevels.ERROR: logging.WARNING,
    ...             etree.ErrorLevels.FATAL: logging.WARNING,
    ...         })
    ...
    >>> etree.use_global_python_log(XMLErrorLog(
    ...     logger=logging.getLogger(__name__).getChild('XMLParser')))
    >>> etree.fromstring('<root><x></root>')
    WARNING:__main__.XMLParser:<string>:1:17:FATAL:PARSER:ERR_TAG_NAME_MISMATCH: Opening and ending tag mismatch: x line 1 and root
    WARNING:__main__.XMLParser:<string>:1:17:FATAL:PARSER:ERR_TAG_NOT_FINISHED: Premature end of data in tag root line 1
    <Element root at 0x7fa592896560>

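For anyone who wants to try this outside an interactive session, here is a self-contained sketch of the same idea. The handler class, the logger name "xml_demo" and the class name WarningOnlyErrorLog are hypothetical names chosen for the example; the parser is created with recover=True as in the original question.

```python
import logging
from lxml import etree


class ListHandler(logging.Handler):
    """Hypothetical helper: collects log records in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


class WarningOnlyErrorLog(etree.PyErrorLog):
    """Reports every libxml2 error level as logging.WARNING."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # level_map cannot be rebound, but the dict can be mutated in place.
        self.level_map.update({
            etree.ErrorLevels.WARNING: logging.WARNING,
            etree.ErrorLevels.ERROR: logging.WARNING,
            etree.ErrorLevels.FATAL: logging.WARNING,
        })


logger = logging.getLogger("xml_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

etree.use_global_python_log(WarningOnlyErrorLog(logger=logger))

# recover=True lets the parser return a tree despite the mismatched tag.
parser = etree.XMLParser(recover=True)
root = etree.fromstring("<root><x></root>", parser)

levels = {record.levelname for record in handler.records}
print(root.tag, levels)
```

Collecting the records in a list rather than printing them makes it easy to check that nothing was logged above WARNING.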
Holger

Thanks: I didn’t think of trying update.
What I have now is working great:

    class XMLErrorLog(etree.PyErrorLog):
        new_map = {
            etree.ErrorLevels.WARNING: logging.WARNING,
            etree.ErrorLevels.ERROR: logging.WARNING,
            etree.ErrorLevels.FATAL: logging.WARNING,
        }

        def __init__(self, *args, **kwargs):
            etree.PyErrorLog.__init__(self, *args, **kwargs)
            self.level_map.update(self.new_map)

        def receive(self, log_entry):
            logrepr = "%s:%d:%d:%s%s.%s:[%s]" % (
                log_entry.filename, log_entry.line, log_entry.column, "",
                log_entry.domain_name, log_entry.type_name, log_entry.message)
            self.log(log_entry, logrepr)

etree.use_global_python_log(XMLErrorLog(logger=logging.getLogger(__name__).getChild('XMLParser')))

    WARNING <string>:2:511:PARSER.ERR_NAME_REQUIRED:[xmlParseEntityRef: no name]
    DEBUG Writing to file /usr/local/projects/Archivespace/OAI/tmp/oai:jmu%2F%2Frepositories%2F4%2Fresources%2F569.oai_ead.xml
    WARNING Recoverable XMLParser error on: oai:jmu//repositories/4/resources/569

The last line is produced by checking whether the parser's error_log is non-empty. That check is deferred until after the parse so that I can extract the identifier from the header:

    if client.XMLParser.error_log:
        logging.getLogger(__name__).getChild('XMLParser').warning(
            'Recoverable XMLParser error on: %s', header.identifier())

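The deferred check can be sketched on its own as follows. The function name parse_lenient, the sample payloads and the identifier "oai:example:1" are hypothetical and not taken from the harvester; a fresh parser is created per call so that its error_log only reflects the payload just parsed.

```python
import logging
from lxml import etree


def parse_lenient(payload, identifier, logger):
    """Parse possibly malformed XML and warn once if recovery was needed."""
    parser = etree.XMLParser(recover=True)
    root = etree.fromstring(payload, parser)
    errors = list(parser.error_log)
    if errors:
        # Logged after the parse, once the record identifier is known.
        logger.warning("Recoverable XMLParser error on: %s", identifier)
    return root, errors


demo_logger = logging.getLogger("harvest_demo")

# An unescaped ampersand, the kind of defect mentioned in this thread.
bad_root, bad_errors = parse_lenient(
    b"<record><title>Smith & Sons</title></record>",
    "oai:example:1", demo_logger)

good_root, good_errors = parse_lenient(
    b"<record><title>Smith &amp; Sons</title></record>",
    "oai:example:2", demo_logger)

for entry in bad_errors:
    print(entry.line, entry.column, entry.type_name, entry.message)
```

Returning the error entries alongside the tree keeps the details available, for example to include them in a report to the feed maintainers.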
[ Trying to harvest an OAI feed, where some of the metadata payloads are bad XML. Mostly unescaped ampersands. I don’t want one bad file to halt harvesting, but I still want to log and track errors so I can notify feed maintainers upstream. ]
— Steve Majewski
On Apr 17, 2019, at 9:02 AM, Holger Joukl Holger.Joukl@LBBW.de wrote:
> [...]