Possible read()/readline() bug?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Oct 23 00:24:58 EDT 2008


On Wed, 22 Oct 2008 16:59:45 -0400, Terry Reedy wrote:

> Mike Kent wrote:
>> Before I file a bug report against Python 2.5.2, I want to run this by
>> the newsgroup to make sure I'm not [missing something].
> 
> Good idea ;-).  What you are missing is a rereading of the fine manual
> to see what you missed the first time.  I recommend this *whenever* you
> are having a vexing problem.

With respect, Terry, I think what you have missed is the reason why the OP 
thinks this is a bug. He's not surprised that buffering is going on:

"This indicates some sort of buffering and caching is going on."

but he thinks that the buffering should be discarded when you seek:

"It seems pretty clear to me that this is wrong.  If there is any
caching going on, it should clearly be discarded if I do a seek.  Note
that it's not just readline() that's returning me the wrong, cached
data, as I've also tried this with read(), and I get the same
results.  It's not acceptable that I have to close and reopen the file
before every read when I'm doing random record access."


I think Mike has a point: if a cache is out of sync with the actual data, 
then the cache needs to be thrown away. A bad cache is worse than no 
cache at all.
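
For anyone who wants to experiment, here is a minimal sketch of one way to 
sidestep the stale-buffer question for random record access: open the file 
unbuffered, so that every read after a seek goes to the operating system 
rather than to a buffer held by the file object. This is not Mike's code, 
and it is written for a current Python rather than 2.5.2. The names 
read_record and demo_reread, the record contents and the temporary file are 
all made up for illustration.

```python
import os
import tempfile


def read_record(f, offset, size):
    """Seek to `offset` and read `size` bytes from an unbuffered binary file."""
    f.seek(offset)
    return f.read(size)


def demo_reread():
    """Read a record, let a second handle overwrite it, then read it again."""
    # Hypothetical scratch file standing in for the OP's data file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as w:
            w.write(b"old record\n")

        # buffering=0 is only allowed in binary mode; it returns a raw file
        # object that keeps no read buffer of its own.
        with open(path, "rb", buffering=0) as reader:
            before = read_record(reader, 0, 11)

            # A second, independent handle plays the part of the other
            # process that changes the file underneath the reader.
            with open(path, "r+b", buffering=0) as writer:
                writer.write(b"new record\n")

            after = read_record(reader, 0, 11)
        return before, after
    finally:
        os.remove(path)
```

The price is one system call per read, which may or may not matter for the 
OP's access pattern. It also says nothing about whether the buffered 
behaviour he saw in 2.5.2 counts as a bug.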

Admittedly, dealing with files that are being actively changed by other 
processes is hard, and I'm not sure that the solution is anything other 
than "well, don't do that then". How do other programming languages and 
Unix tools behave? (Windows, as I understand it, will often refuse to let a 
second process open a file that another process already has open, 
depending on how it was opened.)

Additionally, I wonder whether what Mike is seeing is some side-effect of 
file-system caching. Perhaps the bytes written to the file by echo are 
only written to disk when the file is closed? I don't know, I'm just 
hypothesizing.
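
A related effect that is easy to demonstrate is buffering on the writer's 
side: data handed to a buffered file object may not reach the operating 
system until the buffer is flushed or the file is closed, so a reader can 
miss it in the meantime. The sketch below shows only that user-space 
effect. It is not evidence about what echo or the file system did in 
Mike's case, and the function name and payload are invented for the 
example.

```python
import os
import tempfile


def sizes_around_flush(payload=b"appended line\n"):
    """Return the on-disk file size before and after flushing a small write."""
    # Hypothetical scratch file; mkstemp creates it empty.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Default buffering: a small write sits in the file object's buffer.
        with open(path, "wb") as writer:
            writer.write(payload)
            size_before = os.path.getsize(path)  # what another reader could see
            writer.flush()                       # hand the bytes to the OS
            size_after = os.path.getsize(path)
        return size_before, size_after
    finally:
        os.remove(path)
```

If the writing side controls its own flushing, calling flush() after each 
record (or opening the file unbuffered) removes this particular source of 
surprise.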


-- 
Steven


