[pypy-issue] Issue #2240: file.__iter__() for pipes slower under PyPy 4.0.1 than CPython 2.7.10 (pypy/pypy)

Richard Barrell issues-reply at bitbucket.org
Tue Feb 16 08:11:10 EST 2016


New issue 2240: file.__iter__() for pipes slower under PyPy 4.0.1 than CPython 2.7.10
https://bitbucket.org/pypy/pypy/issues/2240/file__iter__-for-pipes-slower-under-pypy

Richard Barrell:

When running `for line in f: …`, where `f` is a pipe such as the stdout of a subprocess.Popen object, CPython 2.7.10 reads 10 KiB of data with each read(2) syscall, but PyPy 4.0.1 reads only 1 byte per read(2) syscall. As a result, PyPy 4.0.1 ends up about 40-50 times slower than CPython 2.7.10 in this use case.

I resorted to implementing a thing like file.__iter__() manually in pure Python to get around this in my program. :(
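For readers who want to try the same workaround: lines.py is not attached to this report, so the following is only a rough sketch of what a manual replacement for file.__iter__() might look like, not the actual to_lines() from that script. The function name is borrowed from the transcript below; the `chunk_size` parameter and its 10 KiB default are assumptions chosen to mirror the read size observed under CPython.

```python
import os
import subprocess
import sys


def to_lines(fileobj, chunk_size=10 * 1024):
    """Yield lines (newline included) from fileobj, reading in large chunks.

    Hypothetical sketch: reads straight from the file descriptor with
    os.read(), so it must not be mixed with buffered reads on fileobj.
    """
    fd = fileobj.fileno()
    pending = b""
    while True:
        chunk = os.read(fd, chunk_size)  # one read(2) per chunk, not per byte
        if not chunk:  # empty result means EOF: the writer closed the pipe
            break
        pending += chunk
        lines = pending.split(b"\n")
        pending = lines.pop()  # last piece is an incomplete line (maybe empty)
        for line in lines:
            yield line + b"\n"
    if pending:  # final line without a trailing newline
        yield pending


if __name__ == "__main__":
    # Illustrative usage only: count the lines printed by a small child process.
    child = subprocess.Popen(
        [sys.executable, "-c", "for i in range(5): print(i)"],
        stdout=subprocess.PIPE,
    )
    count = sum(1 for _ in to_lines(child.stdout))
    child.wait()
    print("there are", count, "lines")
```

Splitting each chunk on newlines and carrying the trailing partial line over to the next read keeps the number of syscalls proportional to the amount of data rather than to the number of bytes.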

```
$ # testing with CPython 2.7.10 first:
$ python --version
Python 2.7.10
$ python lines.py generate
created textfile.txt
$ sha256sum lines.py
61d35907ce030172fdfe8c06c48b8ad475b571e6f1c345195381aa7195e173ed  lines.py
$ time python lines.py file_iter
('there are', 1048576, 'lines')
took 124.87ms with (file.__iter__)

real	0m0.139s
user	0m0.131s
sys	0m0.022s
$ time python lines.py to_lines
('there are', 1048576, 'lines')
took 433.11ms with to_lines()

real	0m0.451s
user	0m0.444s
sys	0m0.021s
$ # now testing with pypy 4.0.1:
$ pypy --version
Python 2.7.10 (5f8302b8bf9f53056e40426f10c72151564e5b19, Jan 15 2016, 18:28:10)
[PyPy 4.0.1 with GCC 4.9.3]
$ pypy lines.py generate
created textfile.txt
$ sha256sum lines.py
61d35907ce030172fdfe8c06c48b8ad475b571e6f1c345195381aa7195e173ed  lines.py
$ time pypy lines.py file_iter
('there are', 1048576, 'lines')
took 5583.92ms with (file.__iter__)

real	0m5.741s
user	0m1.603s
sys	0m4.229s
$ time pypy lines.py to_lines
('there are', 1048576, 'lines')
took 94.72ms with to_lines()

real	0m0.242s
user	0m0.207s
sys	0m0.049s
```
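As a quick sanity check on the "40-50 times" figure, the ratios can be worked out from the numbers in the transcript. This is just arithmetic on the values printed above; the variable names are made up for illustration.

```python
# Timings in milliseconds, copied from the "took ...ms" lines in the transcript.
timings_ms = {
    ("cpython", "file_iter"): 124.87,
    ("cpython", "to_lines"): 433.11,
    ("pypy", "file_iter"): 5583.92,
    ("pypy", "to_lines"): 94.72,
}

# How much slower PyPy is than CPython when using file.__iter__().
file_iter_slowdown = (
    timings_ms[("pypy", "file_iter")] / timings_ms[("cpython", "file_iter")]
)

# How much faster PyPy is than CPython with the pure-Python to_lines().
to_lines_speedup = (
    timings_ms[("cpython", "to_lines")] / timings_ms[("pypy", "to_lines")]
)

# Share of PyPy's wall-clock time spent in the kernel for file_iter
# (sys 4.229s out of real 5.741s in the `time` output).
sys_share = 4.229 / 5.741

print("file.__iter__: PyPy is %.1fx slower than CPython" % file_iter_slowdown)
print("to_lines():    PyPy is %.1fx faster than CPython" % to_lines_speedup)
print("PyPy file_iter: %.0f%% of real time is sys time" % (sys_share * 100))
```

The large share of sys time in the PyPy file_iter run is consistent with the cost being dominated by per-byte read(2) calls rather than by interpreter overhead.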




