[lxml-dev] greetings, and another bug...

Itamar told me I'd get best results by joining this list.
First off lxml.etree is great. I'm relatively new to both Python and XML, and this is the only way to code XML stuff.
I have found a few bugs, the first set of which Itamar may have already forwarded along. Today I found another. A valid XSD construct fails to validate. (Not the document to be validated, but the schema doc itself). If the minInclusive/maxInclusive facets are removed, the problem goes away. xmllint running against the same libxml2 shlibs has no problem with this.
This has been consistent across 1.1.2, 1.2.1, and 1.3.beta.
import lxml.etree as ET
import sys
trivial_schema = """<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="typePercentage">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Percentage" type="typePercentage"/>
</xsd:schema>
"""
schematree = ET.XML(trivial_schema)
validator = ET.XMLSchema(schematree)
trivial_document = """<?xml version="1.0" encoding="UTF-8"?>
<Percentage>99.99999999999999999999</Percentage>
"""
doctree = ET.XML(trivial_document)
validator.assertValid(doctree)
print "Okay."
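Since the report hinges on whether the schema itself or the instance document is rejected, a small helper can make that distinction explicit. This is only a sketch for current Python 3 releases of lxml, not the script from the report: the helper name `diagnose` and its return strings are made up for illustration. It assumes that `etree.XMLSchemaParseError` is raised when the schema cannot be compiled and that `validate()` returns False for an invalid document.

```python
from lxml import etree

# The schema from the report, passed as bytes because lxml on Python 3
# refuses str input that carries an encoding declaration.
SCHEMA = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="typePercentage">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Percentage" type="typePercentage"/>
</xsd:schema>
"""


def diagnose(schema_bytes, doc_bytes):
    """Hypothetical helper: report which stage rejects the input."""
    try:
        # Compiling the schema is the step that fails in the report.
        validator = etree.XMLSchema(etree.XML(schema_bytes))
    except etree.XMLSchemaParseError as exc:
        return "schema rejected: %s" % exc
    # validate() returns a bool instead of raising like assertValid().
    if validator.validate(etree.XML(doc_bytes)):
        return "ok"
    return "document rejected: %s" % validator.error_log


if __name__ == "__main__":
    print(diagnose(SCHEMA, b"<Percentage>50</Percentage>"))
    print(diagnose(SCHEMA, b"<Percentage>150</Percentage>"))
```

A "schema rejected" result would correspond to the failure described above, where the facets make the schema document itself fail; a "document rejected" result would point at the instance document instead.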

Hi,
Jim Rees wrote:
Itamar told me I'd get best results by joining this list.
Definitely the best place for it.
First off lxml.etree is great. I'm relatively new to both Python and XML, and this is the only way to code XML stuff.
Not sure what you mean by "the only way", but I guess you were just rephrasing the obvious "the best way". ;)
I have found a few bugs, the first set of which Itamar may have already forwarded along.
I don't think he did. I would like to see them reported on the list so that we can see what to do about them.
Today I found another. A valid XSD construct fails to validate. (Not the document to be validated, but the schema doc itself). If the minInclusive/maxInclusive facets are removed, the problem goes away. xmllint running against the same libxml2 shlibs has no problem with this.
This has been consistent across 1.1.2, 1.2.1, and 1.3.beta.
import lxml.etree as ET
import sys
trivial_schema = """<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="typePercentage">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Percentage" type="typePercentage"/>
</xsd:schema>
"""
schematree = ET.XML(trivial_schema)
validator = ET.XMLSchema(schematree)
trivial_document = """<?xml version="1.0" encoding="UTF-8"?>
<Percentage>99.99999999999999999999</Percentage>
"""
doctree = ET.XML(trivial_document)
validator.assertValid(doctree)
print "Okay."
Okay, I tested this and I can't see any problems with the current trunk nor with 1.2. I'm using libxml2 2.6.27 here, what's the version reported by lxml on your side?
Regards, Stefan
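For anyone answering the version question, lxml exposes its own version and the libxml2 versions as module-level tuples. The snippet below is a minimal sketch for a current Python 3 install; the `fmt` helper and the dictionary labels are illustrative only.

```python
from lxml import etree


def fmt(version_tuple):
    """Join a version tuple such as (2, 6, 27) into '2.6.27'."""
    return ".".join(str(part) for part in version_tuple)


versions = {
    "lxml": fmt(etree.LXML_VERSION),
    # The libxml2 actually loaded at runtime.
    "libxml2 (running)": fmt(etree.LIBXML_VERSION),
    # The libxml2 that lxml was built against; a mismatch with the
    # running version can matter when comparing against xmllint.
    "libxml2 (compiled against)": fmt(etree.LIBXML_COMPILED_VERSION),
}

for name, version in versions.items():
    print("%-28s %s" % (name, version))
```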

On Apr 13, 2007, at 12:05 PM, Stefan Behnel wrote:
I have found a few bugs, the first set of which Itamar may have already forwarded along.
I don't think he did. I would like to see them reported on the list so that we can see what to do about them.
Here's my original bug script for the first set of bugs.
It reproduces against libxml version at least up to 2.6.20, and lxml version at least up to 1.3.beta. The issues here are what seem to be improper caching of successful validation results, and a minor one regarding inconsistent empty element representations.

Hi,
thanks for the reports. A quick shot on the easy one:
Jim Rees wrote:
emptynode = ET.Element("Empty")
emptynode2 = ET.Element("Empty")
emptynode2.text = ''
print "An empty node with unset text outputs as", ET.tostring(emptynode)
print "That string parses back in with text set to", str(ET.fromstring(ET.tostring(emptynode)).text)
print
print "An empty node with text set to the empty string outputs as", ET.tostring(emptynode2)
print "That string parses back in with text set to", str(ET.fromstring(ET.tostring(emptynode2)).text)
print "... and re-outputs as", ET.tostring(ET.fromstring(ET.tostring(emptynode2)))
On my side, this writes:
An empty node with unset text outputs as <Empty/>
I like that.
That string parses back in with text set to None
Nice.
An empty node with text set to the empty string outputs as <Empty></Empty>
Cool.
That string parses back in with text set to None
Not really a bug as XML does not distinguish between <Empty/> and <Empty></Empty>, so technically, this is ok.
... and re-outputs as <Empty/>
As expected.
I'm pretty far from calling this a bug. I'd rather see it as a nice feature of lxml that it tries to map the empty Python string to something meaningful. I believe that if you want to make an element's text empty, you're best off setting it to None. So if you'd rather pass the empty string, there's likely a reason for it.
Stefan
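The round trip discussed in this message can be sketched for a current Python 3 install of lxml as follows. The variable names are illustrative, and the serialized form of the empty-string case is printed rather than asserted, since the thread only reports it for the versions in use at the time.

```python
from lxml import etree

unset = etree.Element("Empty")         # text left as None
empty_string = etree.Element("Empty")
empty_string.text = ""                 # text explicitly set to ''

out_unset = etree.tostring(unset)
out_empty = etree.tostring(empty_string)

# Parsing the serialized forms back: XML itself does not distinguish
# <Empty/> from <Empty></Empty>, so both come back with text == None.
reparsed_unset = etree.fromstring(out_unset)
reparsed_empty = etree.fromstring(out_empty)

print("unset text serializes as ", out_unset)
print("empty string serializes as", out_empty)
print("reparsed text values:     ", reparsed_unset.text, reparsed_empty.text)
print("re-serialized:            ", etree.tostring(reparsed_empty))
```

If the distinction has to survive a round trip, it would need to be carried some other way (an attribute, for instance), because the parsed tree has no text node in either case.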
participants (2)
- Jim Rees
- Stefan Behnel