[New-bugs-announce] [issue6629] seek doesn't properly handle file buffer, leads to silent data corruption
report at bugs.python.org
Mon Aug 3 04:00:41 CEST 2009
New submission from Karoly Lorentey <karoly at lorentey.hu>:
The new io.BufferedRandom implementation in Python 3.1 has a broken seek
that seems not to properly handle the case when the target of the seek
lies inside the contents of the file buffer. It leaves the file object
in a confused state, such that the next write operation applies after
the end of the buffer(!) instead of the specified target.
I could reproduce the following symptoms on both Debian Lenny and Mac OS
X Leopard. I downloaded the Python 3.1 tarball from python.org, and
built it by hand using './configure && make'.
Python 3.1 (r31:73572, Aug 3 2009, 02:32:10)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> open("test", "wb").write(b"A" * 10000)
10000
>>> file = open("test", "rb+")
>>> file.read(10) # Reads 4096 bytes into file buffer
b'AAAAAAAAAA'
>>> file.seek(0)
0
>>> file.write(b"B" * 10000) # This should overwrite the whole file
10000
>>> file.tell()
14096 # Hmm, 0 + 10000 == 14096?
>>> file.close()
>>> d = open("test", "rb").read()
>>> len(d)
14096 # ?!
>>> d[0:10] # The file should now consist of 10000 Bs...
b'AAAAAAAAAA'
>>> d[4090:4100] # ... but the Bs only start after a buffer's worth of As
b'AAAAAABBBB'
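
For checking other builds, the same steps can be condensed into a standalone script. This is only a sketch: the function name seek_write_is_correct and the use of a temporary file are illustrative assumptions, not part of the original session. It returns True when the write after seek(0) lands at offset 0, which is the expected behaviour.

```python
import os
import tempfile


def seek_write_is_correct(size=10000):
    """Return True if a write after read() + seek(0) starts at offset 0.

    Hypothetical helper for illustration; it mirrors the interactive
    session but uses a temporary file instead of "test".
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"A" * size)
        with open(path, "rb+") as f:
            f.read(10)            # pulls a buffer's worth of data into memory
            f.seek(0)             # seek target lies inside the buffered data
            f.write(b"B" * size)  # should overwrite the whole file
        with open(path, "rb") as f:
            data = f.read()
        # On an affected build the file grows and still starts with As.
        return data == b"B" * size
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("ok" if seek_write_is_correct() else "corrupted")
```
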
This bug has caused me some subtle, silent data corruption that
went undetected for quite a while. Hurray for backups!

The above code works fine in Python 3.0, and its Python 2.5 port also
produces correct results.

A workaround for 3.1 is to call flush() on the file object before every
seek().
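
A minimal sketch of that workaround, assuming it is wrapped in a small helper: flush_then_seek and demo are hypothetical names introduced here for illustration, not part of the io module. The helper only calls the file object's own flush() and seek() methods.

```python
import os
import tempfile


def flush_then_seek(f, offset, whence=os.SEEK_SET):
    """Flush the file object, then seek; returns the new position.

    Hypothetical helper: it just applies the "flush before every seek"
    workaround in one place.
    """
    f.flush()
    return f.seek(offset, whence)


def demo():
    """Repeat the failing sequence, but seek through the helper."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"A" * 10000)
        with open(path, "rb+") as f:
            f.read(10)
            flush_then_seek(f, 0)     # instead of a bare f.seek(0)
            f.write(b"B" * 10000)
        with open(path, "rb") as f:
            return f.read() == b"B" * 10000
    finally:
        os.remove(path)
```

Routing every seek through one helper keeps the workaround easy to remove once a fixed release is in use.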
title: seek doesn't properly handle file buffer, leads to silent data corruption
versions: Python 3.1
Python tracker <report at bugs.python.org>