Partly erratic wrong behaviour, Python 3, lxml
Jussi Piitulainen
jpiitula at ling.helsinki.fi
Thu Mar 4 16:47:49 EST 2010
This is the full data file on which my regress/Tribug exhibits the
behaviour that I find incomprehensible, as described in the first post
in this thread. The comment at the beginning of the file below was
written before I commented out some records in the data, so the actual
numbers are no longer ten expected and thirty sometimes observed. The
wrong number is still always the correct number tripled (5 and 15, I
think).
---regress/tridata.py follows---
# Exercise lxml.etree.parse(body).xpath(title)
# which I think should always return a list of
# ten elements but sometimes returns thirty,
# with each of the ten in triplicate. And this
# seems impossible to me. Yet I see it happening.
body = b'''<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2010-03-02T09:38:47Z</responseDate>
<request verb="ListRecords" from="2004-01-01T00:00:00Z" until="2004-12-31T23:59:59Z" metadataPrefix="oai_dc">http://localhost/pmh/que</request>
<ListRecords>
<record>
<header><!-- x --><!-- -->
<identifier>jrc32003R0055-pl.xml/2/0</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Rozporządzenie</dc:title>
</oai_dc:dc>
</metadata>
</record>
<!-- <record>
<header>
<identifier>jrc32003R0055-pl.xml/2/1</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Komisji</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/2/2</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>(WE)</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/2/3</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>nr</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/2/4</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>55/2003</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/3/0</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>z</dc:title>
</oai_dc:dc>
</metadata>
</record> -->
<record>
<header>
<identifier>jrc32003R0055-pl.xml/3/1</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>dnia</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/3/2</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>13</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/3/3</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>stycznia</dc:title>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
<identifier>jrc32003R0055-pl.xml/3/4</identifier>
<datestamp>2004-08-15T19:45:00Z</datestamp>
<setSpec>pl</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>2003</dc:title>
</oai_dc:dc>
</metadata>
</record>
</ListRecords>
</OAI-PMH>
'''
title = '//*[name()="record"]//*[name()="dc:title"]'
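
To judge whether an observed result is tripled, it may help to have a
baseline count that does not go through lxml's XPath at all. The sketch
below is not part of regress/Tribug. It uses only the standard library's
xml.etree.ElementTree, and the names SAMPLE and count_titles are
hypothetical, introduced here for illustration. SAMPLE is a reduced
stand-in for the data above: two live records and one commented-out
record, with the dc namespace URI written exactly as the data declares it.

```python
import xml.etree.ElementTree as ET

# Hypothetical reduced sample (not the real data): two live records and
# one record hidden inside an XML comment, as in the file above.
SAMPLE = b'''<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
<ListRecords>
<record>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1">
<dc:title>dnia</dc:title>
</oai_dc:dc>
</metadata>
</record>
<!-- <record><metadata><title>ignored</title></metadata></record> -->
<record>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1">
<dc:title>2003</dc:title>
</oai_dc:dc>
</metadata>
</record>
</ListRecords>
</OAI-PMH>
'''

# Namespace URIs copied from the data; the dc URI has no trailing slash there.
OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1"


def count_titles(xml_bytes):
    """Count dc:title elements under record elements, skipping comments."""
    root = ET.fromstring(xml_bytes)  # the default parser drops comments
    count = 0
    for record in root.iter("{%s}record" % OAI):
        count += sum(1 for _ in record.iter("{%s}title" % DC))
    return count


expected = count_titles(SAMPLE)
print(expected)
```

Applied to the body above instead of SAMPLE, the same function gives an
independent baseline to compare with the length of the list that lxml
returns. A result three times the baseline would match the tripling
described in this thread.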