[lxml-dev] segmentation fault getting attrib on element
Hi,

When doing this:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    root.attrib

Segmentation fault
But if:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    print root.attrib.keys()

['Id', 'schemaLocation']
It works when accessing the keys / values / items of attrib.

It's not harmful, but I wanted to report it in case it's hiding something. Or is it normal?

Config:
--------
Python 2.3.5
libxml2-2.6.16-3
libxml2-python-2.6.16-5_17.rhfc3.at
libxslt-1.1.11-1
libxslt-python-1.1.11-1
libxslt-devel-1.1.11-1

on FC3.

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66
I'm using lxml in a self-made program on Linux (Ubuntu). It works very well and I'm happy to use it. (I love the ElementTree way and the power of libxml2: speed, XPath and XSLT transformations. Many thanks for your great work!)

I'd like to release a version for win32, but I'm not able to build lxml on that platform on my own. I've seen that some of you have already done that. Is there a way to get this win32 version of lxml somewhere?

Another stupid question: is there another way to get the parent of an element without doing this:

    parentElement = element.xpath("..")[0]

I'm pretty sure there should be a better way ;-), but I agree it's an ElementTree-specific question ...
manatlan wrote:
Another stupid question: is there another way to get the parent of an element without doing this:

    parentElement = element.xpath("..")[0]

I'm pretty sure there should be a better way ;-), but I agree it's an ElementTree-specific question ...
AFAIK elementtree doesn't keep parent pointers. http://effbot.org/zone/element.htm seems to agree with me and suggests:

    The element structure has no parent pointers. If you need to keep track of child/parent relations, you can either structure your program to work on parents rather than the children, or use a separate data structure to map from child elements to their parents.

    On Python 2.4, the following one-liner creates a child/parent map for an entire tree:

        parent_map = dict((c, p) for p in tree.getiterator() for c in p)

Philipp
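For readers on a current Python, the child/parent map quoted above can be sketched with the standard library's xml.etree.ElementTree, where iter() takes the place of the older getiterator(). This is only an illustration: the sample document and the build_parent_map name are made up here and are not part of the thread.

```python
import xml.etree.ElementTree as ET


def build_parent_map(root):
    """Map every child element to its parent (elements hold no parent pointers)."""
    # root.iter() walks the whole tree; iterating an element yields its direct children.
    return {child: parent for parent in root.iter() for child in parent}


# Hypothetical sample document, used only for illustration.
root = ET.fromstring("<a><b><c/></b><d/></a>")
parent_map = build_parent_map(root)

c = root.find("b/c")
print(parent_map[c].tag)   # tag of the parent of <c>
print(root in parent_map)  # the root has no parent, so it is not a key
```

The map is a snapshot: if the tree is modified afterwards, it has to be rebuilt.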
manatlan wrote:
I'm using lxml in a self-made program on Linux (Ubuntu). It works very well and I'm happy to use it. (I love the ElementTree way and the power of libxml2: speed, XPath and XSLT transformations. Many thanks for your great work!)

I'd like to release a version for win32, but I'm not able to build lxml on that platform on my own. I've seen that some of you have already done that. Is there a way to get this win32 version of lxml somewhere?
Someone did compile a win32 version and I'll send it to you in a bit. We don't have anything in place to make sure win32 versions get built for each release, though -- we're still trying to find volunteers for this.
Another stupid question: is there another way to get the parent of an element without doing this:

    parentElement = element.xpath("..")[0]

I'm pretty sure there should be a better way ;-), but I agree it's an ElementTree-specific question ...
Actually, elementtree itself to my knowledge doesn't support this. Your xpath trick is pretty clear. It wouldn't be hard to add a parent method to lxml, but I'm worried about breaking compatibility. I need to think about this, and perhaps discuss it with Fredrik Lundh -- it's his ElementTree design decision, unless I missed something obvious.

Regards,

Martijn
Julien Anguenot wrote:
Hi,

When doing this:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    root.attrib

Segmentation fault

But if:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    print root.attrib.keys()

['Id', 'schemaLocation']

It works when accessing the keys / values / items of attrib.

It's not harmful, but I wanted to report it in case it's hiding something.

Or is it normal?
Segmentation faults in Python code should *never* be normal. That's one thing about the default Python bindings that lxml is trying to fix.

You're very right in reporting this, thanks! I've tried to reproduce this just now but I think I'm still missing something -- I can print out attributes just fine (or str() or repr()). There must be something with that particular data that is triggering this.

Could you try to reduce this to a small test case (unit test would be great!) that reproduces this problem? I need to be able to reproduce the problem on my system.

Regards,

Martijn
On May 9, 2005, at 2:44 PM, Martijn Faassen wrote:
Julien Anguenot wrote:
Hi,

When doing this:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    root.attrib

Segmentation fault

But if:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    print root.attrib.keys()

['Id', 'schemaLocation']

It works when accessing the keys / values / items of attrib. It's not harmful, but I wanted to report it in case it's hiding something. Or is it normal?
Segmentation faults in Python code should *never* be normal. That's one thing about the default Python bindings that lxml is trying to fix.
You're very right in reporting this, thanks! I've tried to reproduce this just now but I think I'm still missing something -- I can print out attributes just fine (or str() or repr()). There must be something with that particular data that is triggering this. Could you try to reduce this to a small test case (unit test would be great!) that reproduces this problem? I need to be able to reproduce the problem on my system.
Perhaps those of us reporting errors should check some small test cases into the lxml area? I don't know squat about unit testing, so perhaps something lightweight: input doc, output doc, .py with a main() function, and a README.

--Paul
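A lightweight test case of the kind suggested here might look like the sketch below: one function that parses an input document and touches root.attrib the way the report does, plus a main() that runs it over a list of inputs. Everything in it is hypothetical (the names check_attrib, main and SAMPLE, and the inline document standing in for the "input doc"). It falls back to the standard library's ElementTree when lxml is not installed, so it only exercises lxml where lxml is available.

```python
import io

try:
    from lxml import etree  # the library under test, if installed
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical inline document standing in for the "input doc".
SAMPLE = b'<root Id="1" name="x"/>'


def check_attrib(source):
    """Parse `source` (a path or file object) and exercise root.attrib."""
    root = etree.parse(source).getroot()
    attrib = root.attrib
    repr(attrib)  # typing `root.attrib` at the interactive prompt calls repr() on it
    return sorted(attrib.keys())


def main(sources):
    # Print the attribute names found on the root element of each input.
    for source in sources:
        print(check_attrib(source))


if __name__ == "__main__":
    main([io.BytesIO(SAMPLE)])
```

A real report would replace SAMPLE with the smallest document that still triggers the problem and describe the environment in the README.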
Martijn Faassen wrote:
Julien Anguenot wrote:
Hi,

When doing this:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    root.attrib

Segmentation fault

But if:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    print root.attrib.keys()

['Id', 'schemaLocation']

It works when accessing the keys / values / items of attrib.

It's not harmful, but I wanted to report it in case it's hiding something.

Or is it normal?
Segmentation faults in Python code should *never* be normal. That's one thing about the default Python bindings that lxml is trying to fix.
right.
You're very right in reporting this, thanks! I've tried to reproduce this just now but I think I'm still missing something -- I can print out attributes just fine (or str() or repr()). There must be something with that particular data that is triggering this. Could you try to reduce this to a small test case (unit test would be great!) that reproduces this problem? I need to be able to reproduce the problem on my system.
I wrote a test, including the XML file, that reproduces the problem (within test_etree.py, in ETreeTestCase). Can you provide me with an svn account to check this in? (Or should I send you a patch and the test file?)

J.
Regards,
Martijn
http://codespeak.net/svn/lxml/testcase/anguenot/segfault/

Here is the example of the segmentation fault. You can check:

http://codespeak.net/svn/lxml/testcase/anguenot/segfault/README.txt

for my environment.

Hope you can reproduce this on your boxes...

J.

Julien Anguenot wrote:
Hi,

When doing this:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    root.attrib

Segmentation fault

But if:

    import lxml.etree
    doc = lxml.etree.parse('./publication.xpdl')
    root = doc.getroot()
    print root.attrib.keys()

['Id', 'schemaLocation']

It works when accessing the keys / values / items of attrib.

It's not harmful, but I wanted to report it in case it's hiding something.

Or is it normal?

Config:
--------
Python 2.3.5
libxml2-2.6.16-3
libxml2-python-2.6.16-5_17.rhfc3.at
libxslt-1.1.11-1
libxslt-python-1.1.11-1
libxslt-devel-1.1.11-1

on FC3.

J.
lxml-dev mailing list
lxml-dev@codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Julien Anguenot wrote:
http://codespeak.net/svn/lxml/testcase/anguenot/segfault/
There is the example of the segmentation fault.
You can check :
http://codespeak.net/svn/lxml/testcase/anguenot/segfault/README.txt
for my environment.
Hope you can reproduce this on your boxes...
Thanks for the report! As Marc-Antoine Parent also said, the test case is in fact a lot simpler. The problem occurred for all namespaced attributes. keys(), values() and items() were also not performing correctly, and I even found the XXX in the code where it said this was still to be done. :)

I've added a bunch of tests and fixed the issues on the trunk. Thanks Julien and Marc-Antoine!

Regards,

Martijn
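To illustrate what a namespaced attribute looks like through the ElementTree API, here is a minimal sketch using the standard library's xml.etree.ElementTree, whose interface lxml.etree follows. The document is hypothetical. It is loosely modelled on the 'Id' and 'schemaLocation' keys from the report, on the assumption that schemaLocation there was the usual xsi:schemaLocation attribute. Namespaced attributes are keyed in the "{namespace-uri}localname" form.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical document with one plain and one namespaced attribute.
doc = (
    '<Package Id="publication" '
    'xmlns:xsi="%s" '
    'xsi:schemaLocation="http://example.org/ns example.xsd"/>' % XSI
)
root = ET.fromstring(doc)

# The namespaced attribute is addressed by its qualified name.
qualified = "{%s}schemaLocation" % XSI

print(sorted(root.attrib.keys()))  # plain and qualified attribute names
print(root.get(qualified))         # value of the namespaced attribute
print(repr(root.attrib))           # what the interactive prompt would display
```

Note that the xmlns:xsi declaration itself does not appear among the attributes; only Id and the qualified schemaLocation do.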
participants (5)
- Julien Anguenot
- manatlan
- Martijn Faassen
- Paul Everitt
- Philipp von Weitershausen