[pypy-dev] which xml libraries? was (Re: PyPy 1.4 released)
p.giarrusso at gmail.com
Mon Nov 29 21:54:24 CET 2010
On Mon, Nov 29, 2010 at 14:40, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Amaury Forgeot d'Arc, 28.11.2010 11:44:
>> 2010/11/28 Maciej Fijalkowski
>>> On Sun, Nov 28, 2010 at 11:58 AM, René Dudfield wrote:
>>>> what xml libraries are people using with pypy? What is working well?
>>> PyExpat works, although it's slow (ctypes-based implementation). I
>>> know genshi has some troubles with it, someone is debugging now.
>>> Besides I don't think there are any working (unless someone wrote a
>>> pure-python one)
>> PyExpat is now a built-in module, implemented in RPython,
>> and should have reasonable performance.
> Hmm, reasonable?
> $ ./bin/pypy -m timeit -s 'import xml.etree.ElementTree as ET' \
> 10 loops, best of 3: 1.27 sec per loop
> $ python2.7 -m timeit -s 'import xml.etree.ElementTree as ET' \
> 10 loops, best of 3: 486 msec per loop
> $ python2.7 -m timeit -s 'import xml.etree.cElementTree as ET' \
> 10 loops, best of 3: 33.7 msec per loop
Is any JITting expected to trigger with so few iterations? Or does
RPython remove the need for that? I tried increasing the loop count,
but I couldn't, because of two different bugs somewhere (in PyPy I
I tried ensuring that at least 1000 iterations were displayed, but
timeit doesn't work for more than 852 iterations on the attached
example (found on my HD):
$ pypy-trunk/pypy/translator/goal/pypy-c -m timeit -n 853 -s 'import
xml.etree.ElementTree as ET' 'ET.parse("extensionNames.xml")'
ImportError: No module named linecache
Now, even if linecache is imported locally, linecache.py exists
(located in the same path as timeit.py, i.e. lib-python/2.5.2/).
Furthermore, it works fine on the Python interpreter, suggesting that
the -m option might be part of the bug:
xml.etree.ElementTree as ET')
However, a bigger timing count doesn't work:
Traceback (most recent call last):
File "<console>", line 1, in <module>
line 161, in timeit
File "<timeit-src>", line 6, in inner
line 862, in parse
line 579, in parse
IOError: [Errno 24] Too many open files: 'extensionNames.xml'
Inspection of the pypy process confirms a leak of file handles to the
XML files. Whether it is GC not being invoked, a missing destructor,
or simply because the code should release file handles, I dunno. Is
there a way to trigger explicit GC to work around such issues?
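For reference, here is a minimal sketch of two possible workarounds. It is not a confirmed fix for the leak above. The first idea is to open the file explicitly and hand the file object to ET.parse, so the handle is closed deterministically instead of waiting for a collector to finalize it. The second is to call gc.collect() every so often. The helper names (parse_closing, parse_many, make_sample), the collect_every parameter and the sample XML content are all made up for illustration; the real extensionNames.xml is not reproduced here.

```python
from __future__ import with_statement  # needed on Python 2.5, harmless later

import gc
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_closing(path):
    """Parse an XML file and close its handle before returning."""
    # The with block closes the file as soon as parsing finishes,
    # independently of when the garbage collector runs.
    with open(path, "rb") as source:
        return ET.parse(source)


def parse_many(path, count, collect_every=100):
    """Parse the same file `count` times and return the last root tag."""
    tag = None
    for i in range(count):
        tag = parse_closing(path).getroot().tag
        if collect_every and (i + 1) % collect_every == 0:
            # Explicit collection request; this is the "trigger GC by hand"
            # option, useful when some other code leaves handles unclosed.
            gc.collect()
    return tag


def make_sample():
    """Write a small stand-in XML file (hypothetical content) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.write(fd, "<extensions><name>demo</name></extensions>".encode("ascii"))
    os.close(fd)
    return path
```

Calling parse_many(make_sample(), 1000) should then get past the iteration count that failed above, under the assumption that the leaked handles are the ones ET.parse opens when it is given a file name.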
Warning: all this is with a 32bit PyPy-1.4 on Mac OS X.
Paolo Giarrusso - Ph.D. Student
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 365 bytes
Desc: not available