[lxml-dev] lxml 1.1 problems with python 2.3
Hi there, On Python 2.3 this segfaults with lxml 1.1 (it works with lxml 1.0):
    from lxml import etree
    etree.parse('sfadfdfd')

On Python 2.4 we get an error as we should (as the file sfadfdfd doesn't exist).

Additionally, the tests don't work anymore under Python 2.3. For lxml 1.1 some dependencies on Python 2.4's doctest module exist that don't work on Python 2.3, probably because we dropped the custom doctest that I added initially. For lxml 1.0 this is less bad, but there are still some dependencies on 'sorted()' and such in the tests.

I don't think we actually ever explicitly dropped support for Python 2.3. Perhaps we should for a particular version of lxml, but it'd be nice if we could track down this bug. It might indicate something wrong in Python 2.4 that just doesn't show up right away, I don't know.

Regards,
Martijn
Martijn Faassen wrote:
Hi there,
On Python 2.3 this segfaults with lxml 1.1 (it works with lxml 1.0):
    from lxml import etree
    etree.parse('sfadfdfd')
On Python 2.4 we get an error as we should (as the file sfadfdfd doesn't exist).
Additionally, the tests don't work anymore under Python 2.3. For lxml 1.1 some dependencies on Python 2.4's doctest module exist that don't work on Python 2.3, probably because we dropped the custom doctest that I added initially. For lxml 1.0 this is less bad, but there are still some dependencies on 'sorted()' and such in the tests.
I don't think we actually ever explicitly dropped support for Python 2.3. Perhaps we should for a particular version of lxml, but it'd be nice if we could track down this bug. It might indicate something wrong in Python 2.4 that just doesn't show up right away, I don't know.
Maybe we should return the custom doctest and wire it in via a conditional import, e.g.::

    try:
        import doctest
    except ImportError:  # Python < 2.4
        from lxml.bbb import doctest

Tres.
Tres Seaver wrote:
Maybe we should return the custom doctest and wire it in via a conditional import, e.g.::
    try:
        import doctest
    except ImportError:  # Python < 2.4
        from lxml.bbb import doctest
The problem is not that it can't be imported in 2.3, it's rather that we seem to be using some features in the tests that were not yet available in 2.3's doctest module. That's harder to test... Stefan
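Since the standard module imports fine on 2.3, a plain try/except ImportError never reaches the fallback. One way to sketch Stefan's point is to probe for the features the tests rely on instead. The names in REQUIRED and the helpers missing_features and pick_doctest are assumptions for illustration only; the thread does not say which doctest features are actually missing, and the code is written for a current Python.

```python
import doctest as stdlib_doctest

# Hypothetical list: attributes the test suite is assumed to need.
REQUIRED = ("ELLIPSIS", "NORMALIZE_WHITESPACE", "DocFileSuite")


def missing_features(module, required=REQUIRED):
    """Return the names in `required` that `module` does not provide."""
    return [name for name in required if not hasattr(module, name)]


def pick_doctest(fallback=None, required=REQUIRED):
    """Use the stdlib doctest if it has every required feature, else the fallback."""
    missing = missing_features(stdlib_doctest, required)
    if not missing:
        return stdlib_doctest
    if fallback is None:
        raise ImportError("stdlib doctest lacks: %s" % ", ".join(missing))
    return fallback
```

A test module would then call pick_doctest with the bundled copy as the fallback, so the decision depends on what the module offers rather than on whether it can be imported.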
Stefan Behnel wrote:
The problem is not that it can't be imported in 2.3, it's rather that we seem to be using some features in the tests that were not yet available in 2.3's doctest module. That's harder to test...
I'm attaching a patch which "fixes" this, assuming that we put the non-standard-for-Python-2.3 doctest.py into a new 'lxml.bbb' package. It still segfaults, but the tests do import from the bbb module.

Tres.
Hi Tres, Tres Seaver wrote:
I'm attaching a patch which "fixes" this, assuming that we put the non-standard-for-Python-2.3 doctest.py into a new 'lxml.bbb' package. It still segfaults, but the tests do import from the bbb module.
Thanks for the patch. However, that's the kind of thing the "common_imports" module is there for. So here's a somewhat simpler patch that should do the same thing. Stefan
Stefan Behnel wrote:
Thanks for the patch. However, that's the kind of thing the "common_imports" module is there for. So here's a somewhat simpler patch that should do the same thing.
Did you forget to check in local_doctest.py? I can't make the trunk's tests work on Python 2.3.

I tried copying over Python 2.4's doctest.py into src/lxml/local_doctests.py. Things then fail with what looks like a new, unrelated issue:

    Traceback (most recent call last):
      File "test.py", line 591, in ?
        exitcode = main(sys.argv)
      File "test.py", line 554, in main
        test_cases = get_test_cases(test_files, cfg, tracer=tracer)
      File "test.py", line 254, in get_test_cases
        module = import_module(file, cfg, tracer=tracer)
      File "test.py", line 197, in import_module
        mod = __import__(modname)
      File "/home/faassen/working/lxml/lxml-trunk/src/lxml/tests/test_objectify.py", line 16, in ?
        from lxml import objectify
    ImportError: /home/faassen/working/lxml/lxml-trunk/src/lxml/objectify.so: undefined symbol: previousElement

Everything works fine in Python 2.4 though, so this is rather mysterious.

Regards,
Martijn
Hi, Martijn Faassen wrote:
Did you forget to check in local_doctest.py?
No, it's right in src/, revision 35023. It's a copy of the one you put into revision 8449. Maybe it's just not found in the PYTHONPATH? (Well, test.py should do that for us, right?)
Things then fail with what looks like a new, unrelated issue:
    Traceback (most recent call last):
      File "test.py", line 591, in ?
        exitcode = main(sys.argv)
      File "test.py", line 554, in main
        test_cases = get_test_cases(test_files, cfg, tracer=tracer)
      File "test.py", line 254, in get_test_cases
        module = import_module(file, cfg, tracer=tracer)
      File "test.py", line 197, in import_module
        mod = __import__(modname)
      File "/home/faassen/working/lxml/lxml-trunk/src/lxml/tests/test_objectify.py", line 16, in ?
        from lxml import objectify
    ImportError: /home/faassen/working/lxml/lxml-trunk/src/lxml/objectify.so: undefined symbol: previousElement
That's rather bizarre, previousElement is definitely a public function (i.e. defined in etree.so). I have no idea how that could be missing. Stefan
Hey Stefan, Stefan Behnel wrote:
Martijn Faassen wrote:
Did you forget to check in local_doctest.py?
No, it's right in src/, revision 35023. It's a copy of the one you put into revision 8449.
Maybe it's just not found in the PYTHONPATH? (Well, test.py should do that for us, right?)
Stupid of me not to see it earlier, but that's because it's trying to import from lxml.local_doctest and you added it as local_doctest. I fixed the import not to import from the lxml namespace anymore and checked it in. That still leaves the next error.
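The mismatch between lxml.local_doctest and a top-level local_doctest is the kind of thing a small lookup helper can make explicit. The following is only a sketch for a current Python: import_first is a hypothetical helper, and falling back to the standard doctest as the last candidate is an assumption, not what was checked in.

```python
import importlib


def import_first(candidates):
    """Import and return the first module in `candidates` that can be imported."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError(
        "none of the candidates could be imported:\n" + "\n".join(errors))


# The first two names come from the thread; the stdlib module as a last
# resort is an assumption for illustration.
doctest = import_first(["lxml.local_doctest", "local_doctest", "doctest"])
```

When every candidate fails, the collected messages show each name that was tried, which would have pointed straight at the wrong package prefix.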
That's rather bizarre, previousElement is definitely a public function (i.e. defined in etree.so). I have no idea how that could be missing.
It's consistently missing though in Python 2.3. Perhaps it accidentally gets turned off together with thread support? I did try to test this theory yesterday though on Python 2.4 by explicitly disabling tests, and that didn't help. Regards, Martijn
Hi Martijn, Martijn Faassen wrote:
Stupid of me not to see it earlier, but that's because it's trying to import from lxml.local_doctest and you added it as local_doctest.
Ah, stupid me then. :)
It's consistently missing though in Python 2.3. Perhaps it accidentally gets turned off together with thread support? I did try to test this theory yesterday though on Python 2.4 by explicitly disabling tests, and that didn't help.
Ok, then, first thing to check: does "previousElement" turn up as a static function in the generated src/lxml/etree.h?

Could you check what the preprocessor sees in objectify.c (gcc -E)? On my side (Py 2.5), it sees the following:

-----------------------
...
static xmlNode (*((*nextElement)(xmlNode (*))));
static xmlNode (*((*previousElement)(xmlNode (*))));
...
{"nextElement", &nextElement},
{"previousElement", &previousElement},
...
__pyx_v_next = nextElement;
...
__pyx_v_next = previousElement;
...
-----------------------

I'm showing both functions here, as both are used in objectify, but only the second seems to be missing according to your report. If this looks the same on your side, I'm really out of ideas.

Stefan
Stefan Behnel wrote: [snip]
It's consistently missing though in Python 2.3. Perhaps it accidentally gets turned off together with thread support? I did try to test this theory yesterday though on Python 2.4 by explicitly disabling tests, and that didn't help.
Ok, then, first thing to check: does "previousElement" turn up as a static function in the generated src/lxml/etree.h?
The only references to previousElement (and nextElement) in etree.h are here:

    extern DL_IMPORT(xmlNode) (*(nextElement(xmlNode (*))));
    extern DL_IMPORT(xmlNode) (*(previousElement(xmlNode (*))));
Could you check what the preprocessor sees in objectify.c (gcc -E)?
Hm, I wasn't previously familiar with gcc -E. I tried running it against objectify.c but got a lot of missing includes for Python and libxml2 (which is odd as these things are in /usr/include). I'm not quite sure how you generate your output, but here's my reference to previousElement when I do gcc -E:

    extern DL_IMPORT(xmlNode) (*(nextElement(xmlNode (*))));
    extern DL_IMPORT(xmlNode) (*(previousElement(xmlNode (*))));
    ...
    __pyx_v_next = nextElement;
    ...
    __pyx_v_next = previousElement;
    ...

Hm, is it possible I'm using the wrong version of Pyrex? I have lxml's version installed for Python 2.4 but I guess I don't have that one for Python 2.3...

Us having to maintain our own version of Pyrex rather sucks.

I just installed lxml's version of Pyrex, and now the tests start.

We still get some failures, though. Most of them are because 'assertFalse' doesn't appear to exist. I added this to HelperTestCase and made those errors go away. There's also the use of operator.itemgetter, which was only introduced in Python 2.4. I hacked up a simplistic implementation too.

Now we're down to one failure in Python 2.3:

    ======================================================================
    FAIL: test_findall (lxml.tests.test_objectify.ObjectifyTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/faassen/working/lxml/src/lxml/tests/test_objectify.py", line 218, in test_findall
        root.getchildren()[:2])
      File "/usr/lib/python2.3/unittest.py", line 302, in failUnlessEqual
        raise self.failureException, \
    AssertionError: [<Element b at b787f0cc>, ''] != [<Element b at b787f0cc>, '']

You'd think that this *should* be equal and thus succeed. Possibly some rich comparison feature that doesn't exist yet in Python 2.3?

Back to you, Stefan. :)

Regards,
Martijn
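The two compatibility shims mentioned above could look roughly like this. It is a sketch under assumptions, not the code that was checked in: the fallback itemgetter handles a single index only, and the assertFalse body is a guess at what a minimal version might do.

```python
import unittest

try:
    from operator import itemgetter  # available from Python 2.4 on
except ImportError:
    def itemgetter(index):
        """Simplistic stand-in: supports a single index or key only."""
        def getter(obj):
            return obj[index]
        return getter


class HelperTestCase(unittest.TestCase):
    # Only define the helper if the running unittest lacks it.
    if not hasattr(unittest.TestCase, "assertFalse"):
        def assertFalse(self, value, msg=None):
            if value:
                raise self.failureException(
                    msg or "%r is not false" % (value,))
```

On any interpreter that already ships both features, neither fallback is used, so the shims stay inert there.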
Hi Martijn, Martijn Faassen wrote:
Hm, I wasn't previously familiar with gcc -E. I tried running it against objectify.c but got a lot of missing includes for Python and libxml2
You can use the same command line that distutils uses to compile the module, except for the "-c xxx.so" part.
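For the missing-include problem, the Python header directory can be looked up from the interpreter instead of being guessed. This is a rough sketch for a current Python: preprocess_command is a hypothetical helper, and the libxml2 include path shown is an assumed location (xml2-config --cflags reports the real one on a given system).

```python
import sysconfig


def preprocess_command(source, extra_cflags=()):
    """Build a `gcc -E` command line with the Python include directory added."""
    python_include = sysconfig.get_paths()["include"]
    return ["gcc", "-E", "-I" + python_include] + list(extra_cflags) + [source]


# Hypothetical libxml2 location; `xml2-config --cflags` reports the real one.
cmd = preprocess_command("src/lxml/objectify.c", ["-I/usr/include/libxml2"])
print(" ".join(cmd))
```

The printed line can be pasted into a shell; the helper itself never runs the compiler.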
Us having to maintain our own version of Pyrex rather sucks.
Sure, but it's currently not that easy to push things upstream back into Pyrex. Maybe Greg manages to get some work done over Christmas.
I just installed lxml's version of Pyrex, and now the tests start.
Ah, finally. :)
Now we're down to one failure in Python 2.3:
    ======================================================================
    FAIL: test_findall (lxml.tests.test_objectify.ObjectifyTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/faassen/working/lxml/src/lxml/tests/test_objectify.py", line 218, in test_findall
        root.getchildren()[:2])
      File "/usr/lib/python2.3/unittest.py", line 302, in failUnlessEqual
        raise self.failureException, \
    AssertionError: [<Element b at b787f0cc>, ''] != [<Element b at b787f0cc>, '']
You'd think that this *should* be equal and thus succeed. Possibly some rich comparison feature that doesn't exist yet in Python 2.3?
Or maybe just works differently. That was a bad test case anyway, as equality of objectified elements is not really well defined in general. It can be type specific, which might be the problem here already. I changed that to an identity test, so it should work now. Stefan
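To illustrate why an identity test is less ambiguous than an equality test here, consider a toy class with value-based equality. Node and same_objects are hypothetical names used only for this sketch; they do not reproduce how objectify elements compare.

```python
class Node(object):
    """Toy element whose equality is type specific: it compares by text value."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Node) and self.text == other.text

    def __ne__(self, other):
        return not self.__eq__(other)

    __hash__ = object.__hash__


def same_objects(found, expected):
    """True if both sequences hold the very same objects, in the same order."""
    found, expected = list(found), list(expected)
    return len(found) == len(expected) and all(
        a is b for a, b in zip(found, expected))


b1, b2 = Node("x"), Node("x")
print([b1] == [b2])              # True: equal by value, although distinct objects
print(same_objects([b1], [b2]))  # False: not the same objects
print(same_objects([b1], [b1]))  # True
```

An equality assertion depends on whatever each element type defines as "equal", while the identity check only asks whether findall returned the same objects as getchildren.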
Hi Martijn, Martijn Faassen wrote:
On Python 2.3 this segfaults with lxml 1.1 (it works with lxml 1.0):
    from lxml import etree
    etree.parse('sfadfdfd')
That could be the same problem that Holger found when testing with Python 2.3 on Solaris. Maybe it's a bug in the thread handling of Python 2.3, which would make it a problem limited to lxml 1.1. Could you post a valgrind trace for this?
Additionally, the tests don't work anymore under Python 2.3. For lxml 1.1 some dependencies on Python 2.4's doctest module exist that don't work on Python 2.3, probably because we dropped the custom doctest that I added initially.
Most likely, yes.
For lxml 1.0 this is less bad, but there are still some dependencies on 'sorted()' and such in the tests.
Ah, sure, must have been me who added them. Maybe I should just install a 32bit Python 2.3 on my machine, to do my own tests before releasing...
I don't think we actually ever explicitly dropped support for Python 2.3. Perhaps we should for a particular version of lxml
Hmm, it would definitely be easiest to do that for lxml 1.1. ;)
, but it'd be nice if we could track down this bug. It might indicate something wrong in Python 2.4 that just doesn't show up right away, I don't know.
I don't think so. At least in Holger's stack traces, the segfault occurred in pthreads when releasing the Python thread context, so it can't really be a problem in lxml itself. One way to work around this kind of problem would be to not release the thread context under 2.3. That should be simple to do, if we know the right places where we have to do this. Stefan
Hi all,
I don't think we actually ever explicitly dropped support for Python 2.3. Perhaps we should for a particular version of lxml
Hmm, it would definitely be easiest to do that for lxml 1.1. ;)
, but it'd be nice if we could track down this bug. It might indicate something wrong in Python 2.4 that just doesn't show up right away, I don't know.
I don't think so. At least in Holger's stack traces, the segfault occurred in pthreads when releasing the Python thread context, so it can't really be a problem in lxml itself.
Just my -1 for dropping python2.3 support. When used in an enterprise production context you will just not switch tested versions quickly unless there is some real need to (new functionality you desperately want, or a dependency of a module you need) or you have time to spare (doesn't happen that often). It's especially time-consuming when you depend on C- or C++-extensions. As for the dependency of some module, it might well be the case that someone would rather choose not to use lxml than upgrade python.

One way to work around this kind of problem would be to not release the thread context under 2.3. That should be simple to do, if we know the right places where we have to do this.
Stefan

Better imho, so 2.3 users can still depend on lxml.

Regards,
Holger
Hi, Holger Joukl wrote:
I don't think we actually ever explicitly dropped support for Python 2.3. Perhaps we should for a particular version of lxml
Hmm, it would definitely be easiest to do that for lxml 1.1. ;)
Just my -1 for dropping python2.3 support.
;) I just said it would be the *easiest* solution. I agree that it's worth keeping 2.3 compatibility as long as we can. There were no major changes in the C-API since that version that would prevent us from doing so.
One way to work around this kind of problem would be to not release the thread context under 2.3. That should be simple to do, if we know the right places where we have to do this.
Better imho, so 2.3 users can still depend on lxml.
I'll try to come up with a fix then. Maybe it's enough to somehow disable the thread context calls to make lxml run single-threaded under 2.3. I'll have to rely on someone else to test it, though. Stefan
Hi Holger, Martijn, Stefan Behnel wrote:
Holger Joukl wrote:
One way to work around this kind of problem would be to not release the thread context under 2.3. That should be simple to do, if we know the right places where we have to do this.
Better imho, so 2.3 users can still depend on lxml.
I'll try to come up with a fix then. Maybe it's enough to somehow disable the thread context calls to make lxml run single-threaded under 2.3. I'll have to rely on someone else to test it, though.
Ok, I committed a simple patch to the trunk that skips releasing and re-acquiring the thread contexts under Python 2.3. I tried switching it on under 2.5 and didn't find any problems in the tests, so please check if it works on your side with 2.3, too.

If this works as expected, this would also give us a straightforward way to compile lxml without threading by passing an option (--without-threading) to setup.py and switching on the code section below via a compiler define.

Stefan

Index: src/lxml/etree_defs.h
===================================================================
--- src/lxml/etree_defs.h (Revision 35078)
+++ src/lxml/etree_defs.h (Arbeitskopie)
@@ -16,6 +16,20 @@
 #endif
 #endif
 
+/* Threading can crash under Python 2.3 */
+#if PY_VERSION_HEX < 0x02040000
+#ifndef WITHOUT_THREADING
+  #define WITHOUT_THREADING
+#endif
+#endif
+
+#ifdef WITHOUT_THREADING
+  #define PyEval_SaveThread() (NULL)
+  #define PyEval_RestoreThread(state)
+  #define PyGILState_Ensure() (PyGILState_UNLOCKED)
+  #define PyGILState_Release(state)
+#endif
+
 /* libxml2 version specific setup */
 #include "libxml/xmlversion.h"
 #if LIBXML_VERSION < 20621
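The setup.py side of the proposed --without-threading option might be sketched as follows. This is an assumption about how the flag could be wired up, not lxml's actual build script: threading_macros is a hypothetical helper, and only the macro name WITHOUT_THREADING and the version threshold are taken from the patch.

```python
import sys

# Same threshold as the patch: Python versions before 2.4.
OLD_PYTHON = sys.hexversion < 0x02040000


def threading_macros(argv, force=OLD_PYTHON):
    """Strip --without-threading from argv; return (define_macros, remaining argv)."""
    remaining = [arg for arg in argv if arg != "--without-threading"]
    requested = len(remaining) != len(argv)
    # A (name, None) pair defines the macro without a value.
    macros = [("WITHOUT_THREADING", None)] if (requested or force) else []
    return macros, remaining
```

The returned list has the shape that the define_macros argument of a distutils or setuptools Extension expects, and the remaining arguments can be written back to sys.argv before calling setup() so the custom flag is not rejected as unknown.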
participants (4)
- Holger Joukl
- Martijn Faassen
- Stefan Behnel
- Tres Seaver