[pypy-issue] Issue #1839: xml.etree.ElementTree relatively slow (pypy/pypy)

Franklin Lee issues-reply at bitbucket.org
Sat Aug 9 08:24:58 CEST 2014


New issue 1839: xml.etree.ElementTree relatively slow
https://bitbucket.org/pypy/pypy/issue/1839/xmletreeelementtree-relatively-slow

Franklin Lee:

I have a 46MB XML file. I parsed it with xml.etree.ElementTree under both CPython and PyPy, on Python 2 and Python 3.

Here's the code.
```
#!python

import sys
import os
from xml.etree import ElementTree as ET
from datetime import datetime

fpath = os.path.join(os.getenv('appdata'), 'cbloader', 'combined.dnd40')
assert os.path.exists(fpath)  # fail early if the input file is missing

start = datetime.now()
tree = ET.parse(fpath)
root = tree.getroot()
end = datetime.now()

print (end - start)
```

Rough times (not rigorous):
- CPython 2: 25-30 seconds
- CPython 3: 4-5 seconds
- PyPy2: 11 seconds
- PyPy3: 11 seconds
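
Since the times above come from single runs, a slightly more careful measurement could repeat the parse and report the best run. PyPy's JIT needs some warm-up, so a single run may overstate its steady-state cost. This is only a sketch: `time_parse` and the `sample` document are made-up stand-ins, not part of the original script, and the real file's bytes would replace `sample`.

```python
import io
import timeit
from xml.etree import ElementTree as ET


def time_parse(source_bytes, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` parses."""
    timings = []
    for _ in range(repeats):
        start = timeit.default_timer()
        # Parse from memory so disk I/O does not dominate the measurement.
        ET.parse(io.BytesIO(source_bytes))
        timings.append(timeit.default_timer() - start)
    return min(timings)


# Hypothetical stand-in document; swap in the bytes of the real file.
sample = b"<root>" + b"<item><name>x</name></item>" * 1000 + b"</root>"
best = time_parse(sample)
print("best of 5: %.4f s" % best)
```

Taking the minimum rather than the mean is a design choice that filters out one-off noise; it does not say anything about first-run latency, which is what the numbers above measure.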

Some stats on the XML file (I don't think I have the right to distribute it):
- 46.8MB
- 541228 nodes
- Height of 3
- Root has 38125 children
- Biggest text node: 12590
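
For anyone who wants to compare against a file of their own, stats like these could be gathered roughly as follows. This is a sketch under assumptions: `tree_stats` is a hypothetical helper, height is counted in edges from the root, only `.text` (not `.tail`) is measured, and text length is taken in characters. The figures above may have been computed with different definitions.

```python
from xml.etree import ElementTree as ET


def tree_stats(root):
    """Collect node count, height, root fan-out and longest text length."""
    nodes = 0
    height = 0
    longest_text = 0
    # Iterative depth-first walk; the root sits at depth 0.
    stack = [(root, 0)]
    while stack:
        elem, depth = stack.pop()
        nodes += 1
        height = max(height, depth)
        # elem.text is None for elements without leading text.
        longest_text = max(longest_text, len(elem.text or ""))
        stack.extend((child, depth + 1) for child in elem)
    return {
        "nodes": nodes,
        "height": height,
        "root_children": len(root),
        "longest_text": longest_text,
    }


# Tiny made-up document, used only to exercise the helper.
doc = ET.fromstring("<a><b><c>hello</c></b><b>hi</b></a>")
stats = tree_stats(doc)
print(stats)
```

The walk uses an explicit stack instead of recursion, so it would not hit the recursion limit on a deep tree, although a height of 3 makes that a non-issue here.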



