Undocumented or unwanted change in keyword parameter?
Hello,

I don't want to report a bug (yet), as I don't know what this should be reported as, but it's definitely a problem.

My info on the system where I reproduced it:

    Python           : sys.version_info(major=3, minor=4, micro=3, releaselevel='final', serial=0)
    lxml.etree       : (3, 5, 0, 0)
    libxml used      : (2, 9, 1)
    libxml compiled  : (2, 9, 1)
    libxslt used     : (1, 1, 28)
    libxslt compiled : (1, 1, 28)

In lxml==3.2.1 under Python 2.7 I instantiated it successfully like this (note the underscore):

    XMLParser([...], XMLSchema_schema=None, [...])

In lxml==3.5.0 under Python 3.4 this started failing with the error:

    Traceback (most recent call last):
      File "/path/to/my/script.py", line 7, in <module>
        XMLSchema_schema=xmlSchema)
      File "src/lxml/parser.pxi", line 1437, in lxml.etree.XMLParser.__init__ (src/lxml/lxml.etree.c:120522)
    TypeError: __init__() got an unexpected keyword argument 'XMLSchema_schema'

The code looks like this:

    from lxml import etree

    xmlSchema = etree.XMLSchema(file='/path/to/some/schema.xsd')
    parser = etree.XMLParser(
        remove_blank_text=True,
        attribute_defaults=True,
        XMLSchema_schema=xmlSchema)

It can be fixed by just using "schema" instead of "XMLSchema_schema" as the keyword argument:

    XMLParser([...], schema=xmlSchema, [...])

---

I had a look at the sources of both versions. The affected source line in file src/lxml/lxml.etree.pyx:1437 looks the same in both versions:

    def __init__([...], XMLSchema schema=None, [...]):

so I guess somewhere between 3.2.1 and 3.5.0 the type hint stopped getting baked into the keyword argument?

My question would be: Is this a bug in the documentation, which fails to identify the new keyword parameter as "schema", or is this an unwanted change of a parameter that should actually remain "XMLSchema_schema" and accidentally turned into "schema"?

cheers
Oliver
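For readers who want to try the workaround described above, here is a minimal sketch. It assumes lxml is installed; the tiny inline schema and the helper names build_parser and old_keyword_rejected are made up for illustration and are not part of the original report.

```python
try:
    from lxml import etree
except ImportError:  # lxml is a third-party package and may not be installed
    etree = None

# Hypothetical one-element schema, used only to have something to validate against.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="xs:string"/>
</xs:schema>"""


def build_parser():
    """Create a validating parser using the 'schema' keyword."""
    xml_schema = etree.XMLSchema(etree.fromstring(XSD))
    return etree.XMLParser(
        remove_blank_text=True,
        attribute_defaults=True,
        schema=xml_schema,
    )


def old_keyword_rejected():
    """Return True if 'XMLSchema_schema' raises TypeError, as in the report."""
    try:
        etree.XMLParser(XMLSchema_schema=None)
    except TypeError:
        return True
    return False


if etree is not None:
    # Parse a document that is valid against the hypothetical schema.
    root = etree.fromstring(b"<root>hello</root>", build_parser())
    print(root.tag, old_keyword_rejected())
```

The sketch only exercises the two spellings of the keyword; it says nothing about how older lxml releases behaved.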
Hi,
In lxml==3.2.1 under Python 2.7 I instantiated it successfully like this XMLParser([...], XMLSchema_schema=None, [...]) (note the underscore)
Unfortunately I can't try this with 3.2.1 (the .tgz package on lxml.de seems broken), but the only reason I can imagine why this should ever have worked is that arbitrary keyword arguments were accepted back then. The source code doesn't support this theory, though, so maybe it was a Cython quirk at the time? Are you sure your report is accurate?
In lxml==3.5.0 under Python 3.4 this started failing with the error:
Traceback (most recent call last):
  File "/path/to/my/script.py", line 7, in <module>
    XMLSchema_schema=xmlSchema)
  File "src/lxml/parser.pxi", line 1437, in lxml.etree.XMLParser.__init__ (src/lxml/lxml.etree.c:120522)
TypeError: __init__() got an unexpected keyword argument 'XMLSchema_schema'
Which is sane and expected imho so my take on it is this: Not a bug, works as intended. If it ever worked with keyword argument 'XMLSchema_schema' then this was a bug and it is now fixed.
Holger
Landesbank Baden-Wuerttemberg Anstalt des oeffentlichen Rechts Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz HRA 12704 Amtsgericht Stuttgart
Hello Holger,
thanks to your questions I got to the root of the problem. You are correct
in assuming that XMLSchema_schema very likely never worked. So this is
purely a documentation problem. If you look at the API documentation at
http://lxml.de/api/lxml.etree.XMLParser-class.html you will see this signature:
XMLParser(self, encoding=None, attribute_defaults=False,
dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False,
recover=False, XMLSchema schema=None, remove_blank_text=False,
resolve_entities=True, remove_comments=False, remove_pis=False,
strip_cdata=True, collect_ids=True, target=None, compact=True)
schema is the only parameter that contains what I guess is a Cython
type hint, which IMO does not belong in the API documentation. This is not
only confusing for humans who don't know about Cython type hints; it
also causes the stub auto-generation in the PyCharm IDE to produce
the wrong keyword argument "XMLSchema_schema", which, in combination
with the API doc, can lead one to believe that the keyword argument is in
fact "XMLSchema_schema". But this is basically a PyCharm bug ...
Cheers
Oliver
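To make the reading of the documented signature concrete, here is a small sketch that pulls the keyword names out of a Cython-style signature string. In a declaration such as "XMLSchema schema=None" the last token before "=" is the parameter name and anything in front of it is the type. The helper keyword_names is hypothetical, and the naive comma split assumes that no default value contains a comma, which holds for the signature quoted above.

```python
# Signature as quoted from the API documentation in this thread.
DOC_SIGNATURE = (
    "XMLParser(self, encoding=None, attribute_defaults=False, "
    "dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, "
    "recover=False, XMLSchema schema=None, remove_blank_text=False, "
    "resolve_entities=True, remove_comments=False, remove_pis=False, "
    "strip_cdata=True, collect_ids=True, target=None, compact=True)"
)


def keyword_names(signature):
    """Return the parameter names found in a Cython-style signature string."""
    inner = signature[signature.index("(") + 1 : signature.rindex(")")]
    names = []
    for part in inner.split(","):
        # Drop the default value, keeping e.g. "XMLSchema schema".
        declaration = part.split("=", 1)[0].strip()
        # For a typed parameter ("Type name") the name is the last token.
        names.append(declaration.split()[-1])
    return names


names = keyword_names(DOC_SIGNATURE)
print(names)
```

Under this reading the keyword is "schema"; joining the type and the name with an underscore, as the generated stub apparently did, yields a name the parser does not accept.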
On Thu, 17 Mar 2016 at 09:32 Holger Joukl
XMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, XMLSchema schema=None, remove_blank_text=False, resolve_entities=True, remove_comments=False, remove_pis=False, strip_cdata=True, collect_ids=True, target=None, compact=True)
schema is the only parameter that is containing what I guess is a Cython type hint, which IMO does not belong in the API documentation.
I respectfully disagree. I think the API docs should provide whatever information it can give. The notation resembles C function signatures here so I think it's reasonably intuitive.
Maybe API docs with Python 3 type annotation syntax instead may be nicer for a Python API, regardless if it's implemented in Cython.
Of course, one could also argue that it's actually a Cython API, which happens to be importable & usable in Python ;-)
Holger
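As an illustration of the Python 3 annotation style mentioned above, here is a sketch using stand-in classes. These are not lxml's real classes and only three of the parameters are shown. The point is that an annotation carries the type information while the keyword name stays plain "schema".

```python
import inspect
from typing import Optional


class XMLSchema:
    """Hypothetical stand-in for lxml.etree.XMLSchema."""


class XMLParser:
    """Hypothetical stand-in with an annotated signature, not lxml's class."""

    def __init__(
        self,
        encoding: Optional[str] = None,
        schema: Optional[XMLSchema] = None,
        remove_blank_text: bool = False,
    ):
        self.encoding = encoding
        self.schema = schema
        self.remove_blank_text = remove_blank_text


# Introspection reports the name and the type separately.
params = inspect.signature(XMLParser.__init__).parameters
schema_param = params["schema"]
print(schema_param.name, schema_param.annotation)

parser = XMLParser(schema=XMLSchema())
```

Tools that read such a signature get the name and the type as separate fields, so there is no ambiguity about which token is the keyword.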
Holger Joukl wrote on 17.03.2016 at 17:23:
XMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, XMLSchema schema=None, remove_blank_text=False, resolve_entities=True, remove_comments=False, remove_pis=False, strip_cdata=True, collect_ids=True, target=None, compact=True)
schema is the only parameter that is containing what I guess is a Cython type hint, which IMO does not belong in the API documentation.
I respectfully disagree. I think the API docs should provide whatever information it can give. The notation resembles C function signatures here so I think it's reasonably intuitive.
Maybe API docs with Python 3 type annotation syntax instead may be nicer for a Python API, regardless if it's implemented in Cython.
If it helps ...
https://github.com/lxml/lxml/commit/477e721dc083ed512182d9a2339bb64ee9f37937
Stefan
Hi Holger and Stefan,
thanks for the explanations and the change. I am looking forward to seeing
what PyCharm will make of that :)
cheers
Oliver
On Thu, 17 Mar 2016 at 17:43 Stefan Behnel
participants (3)
- Holger Joukl
- Oliver Bestwalter
- Stefan Behnel