Which XML libraries? (was: Re: PyPy 1.4 released)

Hi, what XML libraries are people using with PyPy? What is working well? cu,

On Sun, Nov 28, 2010 at 9:48 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:

On Sun, Nov 28, 2010 at 11:58 AM, René Dudfield <renesd@gmail.com> wrote:
PyExpat works, although it's slow (ctypes-based implementation). I know genshi has some trouble with it; someone is debugging that now. Besides that, I don't think there are any other working XML libraries (unless someone wrote a pure-Python one).

Cheers,
fijal

Amaury Forgeot d'Arc, 28.11.2010 11:44:
Hmm, reasonable?

$ ./bin/pypy -m timeit -s 'import xml.etree.ElementTree as ET' \
    'ET.parse("ot.xml")'
10 loops, best of 3: 1.27 sec per loop

$ python2.7 -m timeit -s 'import xml.etree.ElementTree as ET' \
    'ET.parse("ot.xml")'
10 loops, best of 3: 486 msec per loop

$ python2.7 -m timeit -s 'import xml.etree.cElementTree as ET' \
    'ET.parse("ot.xml")'
10 loops, best of 3: 33.7 msec per loop

Stefan

On Mon, Nov 29, 2010 at 14:40, Stefan Behnel <stefan_ml@behnel.de> wrote:
Is any JITting expected to trigger with so few iterations? Or does RPython remove the need for that?

I tried increasing the loop count, but I couldn't, because of two different bugs somewhere (in PyPy, I guess). I tried to ensure that at least 1000 iterations were timed, but timeit doesn't work for more than 852 iterations on the attached example (found on my HD):

$ pypy-trunk/pypy/translator/goal/pypy-c -m timeit -n 853 -s 'import xml.etree.ElementTree as ET' 'ET.parse("extensionNames.xml")'
ImportError: No module named linecache

Now, even though linecache is only imported locally, linecache.py does exist (located in the same path as timeit.py, i.e. lib-python/2.5.2/). Furthermore, the same thing works fine inside the interpreter, suggesting that the -m option might be part of the bug:

import timeit
a = timeit.Timer('ET.parse("extensionNames.xml")', 'import xml.etree.ElementTree as ET')
a.timeit(1000)

However, a bigger timing count still doesn't work:

  line 161, in timeit
  File "<timeit-src>", line 6, in inner
  File "/Users/pgiarrusso/Documents/Research/Sorgenti/PyPy/pypy-trunk/lib_pypy/xml/etree/ElementTree.py", line 862, in parse
  File "/Users/pgiarrusso/Documents/Research/Sorgenti/PyPy/pypy-trunk/lib_pypy/xml/etree/ElementTree.py", line 579, in parse
IOError: [Errno 24] Too many open files: 'extensionNames.xml'

Inspection of the pypy process confirms a leak of file handles to the XML files. Whether it is the GC not being invoked, a missing destructor, or simply code that should release its file handles explicitly, I don't know. Is there a way to trigger an explicit GC to work around such issues?

Warning: all of this is with a 32-bit PyPy 1.4 on Mac OS X.

Bye
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/
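(A minimal sketch of such a work-around, not a fix for the underlying leak: forcing a collection between parses should let the GC finalize, and thus close, the leaked file objects. The file name is the one from the example above; the collection interval is an arbitrary choice.)

    import gc
    import xml.etree.ElementTree as ET

    for i in range(1000):
        ET.parse("extensionNames.xml")
        if i % 100 == 99:
            gc.collect()  # finalize unreachable file objects so their handles get closed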

Hi all, thanks to the tips, I measured on Mac OS X a 17% slowdown vs. Python 2.5 (32-bit), after manually taking the best times. Measuring on the command line would give a 57% slowdown instead, because of the lack of warmup.

As a matter of fact, however, pyexpat is not involved here for PyPy: in v1.4 it is still implemented through ctypes (in lib_pypy/pyexpat.py), not in RPython in pypy/rlib/. Python 2.7 may well be faster than 2.5, which might explain some of the extra difference with Stefan's results.

It looks like the two bugs should be easy to fix:
- a file handle leak in the tested XML module, indeed;
- an IOError while opening a module being converted into a "file not found" style ImportError - at least in Java, file not found is a specific exception which can be distinguished from generic I/O errors.

On Mon, Nov 29, 2010 at 22:29, Piotr Skamruk <piotr.skamruk@gmail.com> wrote:
Simpler would be to set ulimit -n to 65536 (probably in /etc/security/limits.conf).
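(For reference, a sketch of what such an entry could look like with pam_limits on Linux; the "*" domain and the value are just illustrative, not a recommendation:)

    *   soft   nofile   65536
    *   hard   nofile   65536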
Thanks, I needed both this and the GC tips: for a test run of 10^4 iterations I can't call the GC inside the timed region and still get meaningful results. [I'm on Mac OS X, though, so ulimit -S -n 10240 is the best one can do; anything higher gives "Invalid argument", i.e. EINVAL.]

Additionally, I just discovered that the ImportError on "import linecache" looks file-handle-related as well, because changing the ulimit changes the iteration count that triggers the error, so it's likely an effect of the same bug. Still, the original error message should be preserved, and that should be easy to fix.

In these conditions, my best results after warming up are:

0.358 ms  PyPy-JIT-32bit (see below for JIT logs)
0.305 ms  CPython-2.5-32bit
0.269 ms  CPython-2.6-64bit
0.553 ms  PyPy-64bit-noJIT, rev 79307, 21 Nov 2010

which means a 17% slowdown on comparable setups, rather than a 2x slowdown; measuring with timeit on the command line, instead, would give a 57% slowdown. All this is on a very small input file, the one I attached before, for a total of 1000 iterations, on a Core 2 Duo 2.6 GHz. I don't report the average because:
a) it is difficult to get something significant anyway (I don't want to code confidence intervals, and automated tools wouldn't call the GC appropriately);
b) I expect the deviation to be due more to unrelated load on my laptop (around 12-18% CPU) than to the actual spread of the runtime.

I set PYPYLOG='jit-summary:-' before the PyPy-JIT run and got this - I hope somebody can check from it whether the JIT is working successfully:

[f2dd1fbaa1c2] {jit-summary
Tracing:      25      0.163456
Backend:      23      0.017392
Running asm:  191214
Blackhole:    2012
TOTAL:        502.543032
ops:          68338
recorded ops: 32764
calls:        1759
guards:       18005
opt ops:      2757
opt guards:   696
forcings:     111
abort: trace too long: 2
abort: compiling:      0
abort: vable escape:   0
nvirtuals:    6693
nvholes:      1059
nvreused:     3979
Total # of loops:   18
Total # of bridges: 6
Freed # of loops:   0
Freed # of bridges: 0
[f2dd1fc141a8] jit-summary}

Best regards.
-- Paolo Giarrusso - Ph.D. Student http://www.informatik.uni-marburg.de/~pgiarrusso/
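(A minimal sketch of this kind of manual measurement - warm up first, time batches with the GC invoked outside the timed region, and report the best per-iteration time. The batch and repeat counts and the file name are arbitrary illustrative choices, not the exact script used here.)

    import gc
    import time
    import xml.etree.ElementTree as ET

    def batch(n=100):
        t0 = time.time()
        for _ in range(n):
            ET.parse("extensionNames.xml")
        return (time.time() - t0) / n  # seconds per iteration

    for _ in range(3):
        batch()                        # warm-up, gives the JIT a chance to compile
    times = []
    for _ in range(10):
        gc.collect()                   # collect outside the timed region, releasing file handles
        times.append(batch())
    print("best: %.3f ms" % (min(times) * 1000))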

2010/11/30 Paolo Giarrusso <p.giarrusso@gmail.com>
Did you compile PyPy yourself? If the expat development files are present, the translation should build the pyexpat module:

Python 2.5.2 (79656, Nov 29 2010, 21:05:28)
[PyPy 1.4.0] on linux2
-- Amaury Forgeot d'Arc
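(A quick way to check which pyexpat a given binary actually ships - a sketch resting on the assumption that the ctypes fallback, living in lib_pypy/pyexpat.py, carries a __file__ attribute, while a module built into the translated interpreter normally does not:)

    import pyexpat
    # Assumption: built-in (translated) modules lack __file__; the lib_pypy ctypes fallback has one.
    print(getattr(pyexpat, "__file__", "built into the interpreter"))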

On Tue, Nov 30, 2010 at 08:13, Maciej Fijalkowski <fijall@gmail.com> wrote:
My apologies: I self-compiled PyPy, and indeed I get the output you describe. So I guess the ctypes implementation I came across in lib_pypy/pyexpat.py is probably a fallback for when only the library, but not the headers, is present.

Anyway, this does not affect the benchmarks above. Stefan, I still don't get why you complained that pyexpat is slow by showing benchmarks for another module - I guess I don't fully understand your email, but it asks "reasonable?" right after Amaury talks about pyexpat. I'll try to benchmark pyexpat itself soon; a pointer to a reasonable way to call it would make that easier, since I have limited time and mental energy to devote to this, and figuring out a non-stupid way to use it might be non-trivial without learning the library first.

Best regards
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/
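(For what it's worth, a minimal way to drive pyexpat directly through the standard xml.parsers.expat interface - roughly what cElementTree does under the hood. This assumes PyPy's pyexpat exposes the same API as CPython's; the handlers here deliberately do nothing, and the file name is just the test file from earlier.)

    import xml.parsers.expat

    def start_element(name, attrs):
        pass  # called for each opening tag; attrs is a dict of attributes

    def end_element(name):
        pass  # called for each closing tag

    def char_data(data):
        pass  # called for text content

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    f = open("extensionNames.xml", "rb")
    try:
        parser.ParseFile(f)
    finally:
        f.close()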

Paolo Giarrusso, 01.12.2010 00:34:
Well, in CPython, I can see little to no reason why anyone would go as low-level as pyexpat when you can use cElementTree. So IMHO the level to compare is what people would normally use, rather than what people could potentially use if they were beaten hard enough.

Stefan
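(Concretely, "what people would normally use" on CPython 2.x is typically spelled with the classic fallback import, so the C accelerator is picked up where available while plain ElementTree - which is what PyPy provides - still works:)

    try:
        import xml.etree.cElementTree as ET  # C accelerator on CPython
    except ImportError:
        import xml.etree.ElementTree as ET   # pure-Python fallback (e.g. on PyPy)

    tree = ET.parse("ot.xml")
    root = tree.getroot()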

On Wed, Dec 1, 2010 at 9:48 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Hey. Sure, makes sense :-) I think one of the reasons for some of the slowdown is that calls from C are not JITted if they don't contain loops themselves. Obviously this doesn't explain the whole thing, because looking at the numbers there is something really wrong going on.

Participants (6):
- Amaury Forgeot d'Arc
- Maciej Fijalkowski
- Paolo Giarrusso
- Piotr Skamruk
- René Dudfield
- Stefan Behnel