[pypy-issue] [issue718] string.split('\x00') takes 300% longer under pypy1.5 w/jit

Robert Collins pypy-dev-issue at codespeak.net
Tue May 10 09:00:03 CEST 2011

Robert Collins <robertc at robertcollins.net> added the comment:

ronny asked on IRC how to get sample data - it's the first journal from an
lmirror (https://launchpad.net/lmirror) mirror set of the Ubuntu archive.

If you have an archive mirror (400GB :P) - then:
$cd to the mirror root
$lmirror init ./ubuntu
$ls -l .lmirror/metadata/ubuntu/journals
should show something like

-rw-r--r-- 1 robertc robertc       19 2011-05-08 15:23 0
-rw-r--r-- 1 robertc robertc 72885391 2011-05-08 15:25 1

The 1 file is what is being parsed, so to reproduce:
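import time

# read the whole journal into memory before timing anything
bytestring = file('1','rb').read()
def _foo():
    return bytestring.split('\x00')
# time only the split itself, not the file read
start = time.time()
tokens = _foo()
elapsed = time.time() - start
print("split took %.3f s" % elapsed)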

should show the behaviour
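
For anyone without a 400GB archive mirror to hand, here is a minimal sketch of the same measurement on synthetic data. It is only a stand-in: make_journal, time_split, the token contents and the token count are all made up for illustration, and nothing here confirms that synthetic input triggers the same slowdown as the real journal file.

```python
import time


def make_journal(n_tokens=100000, token=b"pool/main/a/example_1.0.deb"):
    """Build a NUL-separated byte string (hypothetical stand-in for journal '1')."""
    return b"\x00".join([token] * n_tokens)


def time_split(data, repeats=5):
    """Split data on NUL bytes `repeats` times; return (best_seconds, tokens)."""
    best = None
    tokens = None
    for _ in range(repeats):
        start = time.time()
        tokens = data.split(b"\x00")
        elapsed = time.time() - start
        # keep the fastest run to reduce noise from other processes
        if best is None or elapsed < best:
            best = elapsed
    return best, tokens


if __name__ == "__main__":
    data = make_journal()
    best, tokens = time_split(data)
    print("%d tokens, best of 5: %.6f s" % (len(tokens), best))
```

Running the same script under CPython and under pypy and comparing the two timings gives a rough comparison; raising n_tokens brings the input closer to the ~70MB size of the real journal.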

status: unread -> chatting

PyPy development tracker <pypy-dev-issue at codespeak.net>
