[pypy-issue] [issue1234] Poor seek set performance (compared to cpython 2.7)

Pigmej tracker at bugs.pypy.org
Mon Aug 13 10:37:52 CEST 2012


New submission from Pigmej <jedrzej.nowak at codernity.com>:

I have attached a script. The script itself does nothing useful, but it demonstrates the 
real problem.

Please run it under strace with both PyPy and CPython 2.7 and compare the results. At the end 
of each trace you should see the following output:

--- cpython 2.7 ---
write(3, "00000000000000000000000000000000"..., 4096) = 4096
write(3, "25252525252525252525252525252525"..., 4096) = 4096
write(3, "45454545464646464646464646464646"..., 4096) = 4096
write(3, "66666666666666666666666666666666"..., 4096) = 4096
write(3, "86868686868686868787878787878787"..., 2616) = 2616
read(3, "", 4096)                       = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "00000000000000000000000000000000"..., 4096) = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, 4096, SEEK_SET)                = 4096
lseek(3, -4006, SEEK_CUR)               = 90
fsync(3)                                = 0
---

--- pypy trunk ---
write(3, "00000000000000000000000000000000"..., 8200) = 8200
write(3, "46464646464646464646464646464646"..., 8200) = 8200
write(3, "87878787878787878787878787878787"..., 2600) = 2600
read(3, "", 8192)                       = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 10, SEEK_SET)                  = 10
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 20, SEEK_SET)                  = 20
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 30, SEEK_SET)                  = 30
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 40, SEEK_SET)                  = 40
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 50, SEEK_SET)                  = 50
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 60, SEEK_SET)                  = 60
read(3, "00000000000000000000000000000000"..., 8192) = 8192
lseek(3, 70, SEEK_SET)                  = 70
read(3, "00000000000000000000000000000011"..., 8192) = 8192
lseek(3, 80, SEEK_SET)                  = 80
read(3, "00000000000000000000111111111111"..., 8192) = 8192
lseek(3, 90, SEEK_SET)                  = 90
fsync(3)                                = 0
---


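As you can see, CPython optimizes these seeks and reads: after the first read() it issues only 
lseek() calls and no further read() calls. PyPy, in contrast, executes the original Python 
script 1:1, issuing an lseek() followed by a full 8192-byte read() for every seek/read pair.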
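
The attached script is not reproduced in this message. The following is only a minimal 
sketch of an access pattern that would plausibly produce traces of this shape, inferred 
from the strace output above. The function names (write_test_file, seek_read_pattern), 
the loop bounds and the file contents are assumptions, not the original script.

```python
import os
import tempfile


def write_test_file(path):
    # Assumed contents: "0".."9" repeated 100 times each (100 bytes per digit),
    # then "10".."99" repeated 100 times each (200 bytes per number).
    # That gives 19000 bytes, the total written in both traces.
    with open(path, "wb") as f:
        for i in range(100):
            f.write(str(i).encode("ascii") * 100)


def seek_read_pattern(path, step=10, count=9):
    # Repeated absolute seek followed by a short read, all within the first
    # few dozen bytes of the file, so every request falls inside the data
    # fetched by the very first buffered read.
    chunks = []
    with open(path, "r+b") as f:
        for i in range(count):
            f.seek(i * step)             # absolute seek (SEEK_SET)
            chunks.append(f.read(step))  # short read after each seek
        f.flush()
        os.fsync(f.fileno())
    return chunks


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_test_file(path)
        print(seek_read_pattern(path))
    finally:
        os.remove(path)
```

Running such a script under strace with each interpreter should make it possible to 
compare how many read() calls are issued for the same sequence of seeks. The exact 
syscalls will depend on the interpreter version and its buffering implementation.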

----------
files: pypy_vs_py27_seek.py
messages: 4651
nosy: pigmej, pypy-issue
priority: performance bug
status: unread
title: Poor seek set performance (compared to cpython 2.7)

________________________________________
PyPy bug tracker <tracker at bugs.pypy.org>
<https://bugs.pypy.org/issue1234>
________________________________________

