[pypy-issue] [issue729] Pypy strangely slow when reading files

Da_Blitz tracker at bugs.pypy.org
Wed Jun 1 19:17:23 CEST 2011

Da_Blitz <pypy at pocketnix.org> added the comment:

looking deeper, and without looking at python's src, python appears to allocate
a buffer for the file that is 10x the specified buffer size. setting the buffer
size appears to limit only the length of the returned item, not the length of
the actual os level read call

pypy differs in that it limits the read syscall to the specified buffer length
and i assume just returns that instead of performing actual buffering

could be different, please flame me if so. this is only inferred from the syscalls

one quick way to see this is with strace -e read to dump the read syscalls and
their args and look at the requested read size

simple test program:

#!/usr/bin/env python
"""Uncomment the print line to confirm the length of the returned data (without

run with "strace -e read <interpreter> <file>"
where <interpreter> is either python or pypy and <file> is a textfile from
allfiles.zip or similar large (> 20kB) file

length of the requested read is the third field
"""
import sys
import os

# open at the os level, then wrap the descriptor in a file object
# with a 1000 byte buffer
fd = os.open(sys.argv[1], os.O_RDONLY)
f = os.fdopen(fd, "r", 1000)

for line in f:
    # print(len(line))
    pass
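
if strace is not available, a rough sketch of the same check from inside the
interpreter might look like the following. this is not part of the original
report: `LoggingRaw` and `requested_read_sizes` are made-up names, and it goes
through `io.BufferedReader`, which is not necessarily the code path the builtin
file object takes, so treat the sizes it records as illustrative only.

```python
import io


class LoggingRaw(io.RawIOBase):
    """Hypothetical helper: serves bytes and records each requested read size."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)
        self.requested = []

    def readable(self):
        return True

    def readinto(self, b):
        # len(b) is how many bytes the buffering layer asked for in this call
        self.requested.append(len(b))
        return self._inner.readinto(b)


def requested_read_sizes(data, buffer_size):
    """Iterate over lines and return (requested sizes, number of lines)."""
    raw = LoggingRaw(data)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    lines = 0
    for _line in reader:
        lines += 1
    return raw.requested, lines


if __name__ == "__main__":
    sample = (b"x" * 50 + b"\n") * 400  # roughly 20kB of short lines
    sizes, count = requested_read_sizes(sample, 1000)
    print(count, sorted(set(sizes)))
```

running it under both interpreters and comparing the printed request sizes
gives a rough, interpreter-level counterpart to the third argument that strace
shows for each read call.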

PyPy bug tracker <tracker at bugs.pypy.org>
