Interpretation of AssertionError

Hi!

lxml seems to raise AssertionErrors quite often. How should they be interpreted?

My understanding is that AssertionErrors in general should be interpreted as "this thing that shouldn't happen just happened", i.e. basically a bug rather than an error that's part of the interface. Since I get them as often as I do, I'm wondering if lxml uses a different interpretation: that AssertionErrors are part of lxml's interface.

For example, here's a code snippet that raises an AssertionError (lxml 4.5.2, macOS 10.15.6):

    from lxml.etree import XMLParser, TreeBuilder

    parser = XMLParser(target=TreeBuilder())
    parser.feed('</a>')

And this is the exception stack:

    Traceback (most recent call last):
      File "src/lxml/parsertarget.pxi", line 158, in lxml.etree._TargetParserContext._handleParseResult
      File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
      File "<string>", line 1
    lxml.etree.XMLSyntaxError: StartTag: invalid element name, line 1, column 2

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "src/lxml/parser.pxi", line 1256, in lxml.etree._FeedParser.feed
      File "src/lxml/parser.pxi", line 1384, in lxml.etree._FeedParser.feed
      File "src/lxml/parsertarget.pxi", line 167, in lxml.etree._TargetParserContext._handleParseResult
      File "src/lxml/saxparser.pxi", line 811, in lxml.etree.TreeBuilder.close
    AssertionError: missing toplevel element

Should I interpret this AssertionError as a bug in lxml?

Thank you!

Per

Hi,

Per wrote on 27.07.20 at 13:01:

It's a bit more complex than that. lxml inherits much of its Python interface from xml.etree.ElementTree, including a couple of places where it raises AssertionErrors. You hit one of them.

I agree that a different, more specific exception would often be nicer, such as an XMLSyntaxError in this case, but changing that would introduce inconsistencies with ElementTree. ElementTree also used assertions for user-side programming errors, which your example kind of is, just as an internal usage issue: the parser closes the target in an inconsistent state. I personally think that this should be OK; calling .close() on the target should always be possible and leave things cleaned up, but that's not how the original interface was designed.

I pushed a change that IMHO goes in the right direction:

https://github.com/lxml/lxml/commit/1b993ad7c11d23b623ce2cd79b02e732a3a8fcf1

Your example seems a little contrived. Why would you ever pass a plain "TreeBuilder" to the parser rather than a subclass, in which case you'd also have control over the .close() method? Would you have a more real-world use case where the assertions hurt you?

Stefan

Participants (2): Per, Stefan Behnel