Possible read()/readline() bug?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Thu Oct 23 00:24:58 EDT 2008
On Wed, 22 Oct 2008 16:59:45 -0400, Terry Reedy wrote:
> Mike Kent wrote:
>> Before I file a bug report against Python 2.5.2, I want to run this by
>> the newsgroup to make sure I'm not [missing something].
>
> Good idea ;-). What you are missing is a rereading of the fine manual
> to see what you missed the first time. I recommend this *whenever* you
> are having a vexing problem.
With respect, Terry, I think what you have missed is the reason why the OP
thinks this is a bug. He's not surprised that buffering is going on:
"This indicates some sort of buffering and caching is going on."
but he thinks that the buffering should be discarded when you seek:
"It seems pretty clear to me that this is wrong. If there is any
caching going on, it should clearly be discarded if I do a seek. Note
that it's not just readline() that's returning me the wrong, cached
data, as I've also tried this with read(), and I get the same
results. It's not acceptable that I have to close and reopen the file
before every read when I'm doing random record access."
I think Mike has a point: if a cache is out of sync with the actual data,
then the cache needs to be thrown away. A bad cache is worse than no
cache at all.
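
To make the "re-read after seek" case concrete, here is a minimal sketch, not Mike's actual test. It assumes Python 3 (the thread is about 2.5.2) and uses a hypothetical helper name and a temporary file. A second file handle stands in for the other process. Opening the reader with buffering=0 means there is no Python-level read buffer to go stale, so the read after seek(0) goes back to the operating system:

```python
import os
import tempfile


def reread_after_external_write():
    """Return (first_read, second_read) around a write made through another handle."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as writer:
            writer.write(b"old record\n")

        # buffering=0 gives a raw file object: no Python-level read buffer,
        # so every read() asks the operating system for the bytes.
        with open(path, "rb", buffering=0) as reader:
            first = reader.read()

            # Stand-in for the other process (the echo in Mike's test).
            with open(path, "wb") as writer:
                writer.write(b"new record\n")

            reader.seek(0)
            second = reader.read()
        return first, second
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(reread_after_external_write())
```

Unbuffered reads cost a system call each time, so this is a way to rule the user-space buffer in or out, not necessarily what you would want for heavy random record access.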
Surely dealing with files that are being actively changed by other
processes is hard. I'm not sure that the solution is anything other than
"well, don't do that then". How do other programming languages and Unix
tools behave? (Windows generally only allows a single process to read or
write to a file at once.)
Additionally, I wonder whether what Mike is seeing is some side-effect of
file-system caching. Perhaps the bytes written to the file by echo are
only written to disk when the file is closed? I don't know, I'm just
hypothesizing.
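
One related effect that can be checked directly is buffering on the writer's side, which is a separate thing from file-system caching. The sketch below (again Python 3, with hypothetical helper names and a temporary file) shows that bytes written through a buffered Python file object are not visible to another reader until flush() or close(). It says nothing about what echo or the file system did in Mike's case:

```python
import os
import tempfile


def read_all(path):
    """Read the whole file through a fresh, unbuffered handle."""
    with open(path, "rb", buffering=0) as f:
        return f.read()


def seen_before_and_after_flush():
    """Return what a second reader sees before and after the writer flushes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as writer:  # default open() is buffered
            writer.write(b"hello\n")      # small write stays in Python's buffer
            before = read_all(path)
            writer.flush()                # hand the bytes to the operating system
            after = read_all(path)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(seen_before_and_after_flush())
```

If the writing program flushes or closes the file, as a shell echo does when it finishes, this particular delay should not apply.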
--
Steven