[pypy-issue] [issue1357] lxml not returning the same proxy object.

Simon Sapin tracker at bugs.pypy.org
Tue Jan 1 17:37:21 CET 2013


New submission from Simon Sapin <simon.sapin at kozea.fr>:

(Reported here because I’ve never seen this bug on CPython.)
WeasyPrint uses lxml with some code that, much simplified, looks like this:

    style_by_element = {}
    for element in tree.iter():
        parent_style = style_by_element[element.getparent()]
        style_by_element[element] = compute_style(element, parent_style)

This code relies on a guarantee documented in lxml:

http://lxml.de/element_classes.html#element-initialization

> There is one important guarantee regarding Element proxies. Once a proxy has
> been instantiated, it will keep alive as long as there is a Python reference to
> it, and any access to the XML element in the tree will return this very instance.

However, this is not always the case with lxml on PyPy 2.0.0-beta1.
Semi-randomly, the code above will raise a KeyError. Apparently,
element.getparent() sometimes does not return the same Python proxy object that
was previously yielded by tree.iter() for the parent node.
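
To illustrate why a second proxy object leads to a KeyError, here is a minimal sketch that does not use lxml at all. `Node` and `Proxy` are hypothetical stand-ins for the underlying XML node and its Python wrapper. The sketch assumes that Element proxies hash and compare by identity, which is the default for Python objects.

```python
# Hypothetical stand-ins, not lxml's real classes: Node plays the role of
# the underlying XML node, Proxy the Python-level Element wrapper.
class Node(object):
    def __init__(self, parent=None):
        self.parent = parent


class Proxy(object):
    # No __eq__/__hash__ overrides, so dict lookups use object identity.
    def __init__(self, node):
        self.node = node


root = Node()
child = Node(parent=root)

# Documented behaviour: one proxy per node, reused on every access.
root_proxy = Proxy(root)
style_by_element = {root_proxy: "root style"}
same_proxy = root_proxy  # what getparent() is expected to hand back
assert style_by_element[same_proxy] == "root style"

# Reported behaviour: a second proxy appears for the same node.
fresh_proxy = Proxy(child.parent)
assert fresh_proxy.node is root_proxy.node  # same underlying node
assert fresh_proxy is not root_proxy        # but a different Python object

try:
    style_by_element[fresh_proxy]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Both proxies wrap the same node, yet only the original one is a key in the dict, so the lookup through the fresh proxy fails.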

Very small documents (as in the test suite) will render without issues. Bigger
documents will reliably fail, but not always on the same element.
At the point of failure, `len(style_by_element)` ranges from a few dozen to the
low hundreds.

Unfortunately I’m not sure how to make a smaller test case than "try the cffi
branch of WeasyPrint": https://github.com/Kozea/WeasyPrint/tree/cffi
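
A possible starting point for a smaller test case, given as a sketch only: walk the tree, keep every yielded proxy alive, and report the first element whose `getparent()` is an object that was not yielded before. `find_proxy_mismatch` is a hypothetical helper. `FakeElement` is a stand-in that mimics only `iter()` and `getparent()`, so the sketch runs without lxml; its `buggy` flag simulates the reported behaviour.

```python
def find_proxy_mismatch(root):
    """Return the first element whose getparent() is not a proxy already
    yielded by root.iter(), or None if proxy identities are stable."""
    seen = {}  # id(proxy) -> proxy; storing the proxy keeps it alive
    for element in root.iter():
        parent = element.getparent()
        if parent is not None and id(parent) not in seen:
            return element
        seen[id(element)] = element
    return None


class FakeElement(object):
    """Stand-in with lxml-like iter()/getparent(); not an lxml class."""

    def __init__(self, parent=None, buggy=False):
        self._parent = parent
        self._buggy = buggy
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def getparent(self):
        if self._buggy and self._parent is not None:
            # Simulate the report: a new object standing for the same parent.
            return FakeElement()
        return self._parent

    def iter(self):
        # Document order: the element itself, then its descendants.
        yield self
        for child in self.children:
            for descendant in child.iter():
                yield descendant


good_root = FakeElement()
good_child = FakeElement(good_root)
good_result = find_proxy_mismatch(good_root)

bad_root = FakeElement()
bad_child = FakeElement(bad_root, buggy=True)
bad_result = find_proxy_mismatch(bad_root)
```

With a real document, the root passed in would be something like `lxml.etree.parse(path).getroot()`. Whether this loop is enough to trigger the failure on PyPy has not been checked here.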

----------
messages: 5118
nosy: SimonSapin, pypy-issue
priority: bug
status: unread
title: lxml not returning the same proxy object.

________________________________________
PyPy bug tracker <tracker at bugs.pypy.org>
<https://bugs.pypy.org/issue1357>
________________________________________

