[Tutor] Read-ahead for large fixed-width binary files?

Kent Johnson kent37 at tds.net
Sun Nov 18 13:57:16 CET 2007


Marc Tompkins wrote:
> On Nov 17, 2007 8:14 PM, Kent Johnson <kent37 at tds.net> wrote:
>     Have you tried specifying a buffer size in the open() call?
> 
> Yes. 
> I compared:
> -  no buffer size specified
> -  any of a wide range of positive numbers (from 1 to 4M)
> -  -1
> and saw no noticeable difference - as opposed to adding the StringIO 
> buffering, which kicked things up by a notch of six or so.
> 
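
For readers following along, here is a minimal sketch of the kind of in-memory read-ahead being described. Marc's actual code is not shown in this message, so everything here is an assumption: the function name iter_records, the record_size and records_per_chunk parameters, and the use of io.BytesIO (the Python 3 counterpart of StringIO for binary data) are all hypothetical.

```python
import io


def iter_records(path, record_size, records_per_chunk=4096):
    """Yield fixed-width records, pulling many records per disk read.

    Hypothetical helper: one large read() per chunk, then the individual
    records are served from an in-memory buffer.
    """
    chunk_size = record_size * records_per_chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # one big read from the file
            if not chunk:
                break  # end of file
            buf = io.BytesIO(chunk)  # in-memory read-ahead buffer
            while True:
                record = buf.read(record_size)
                if len(record) < record_size:
                    break  # chunk exhausted; a trailing partial record is dropped
                yield record
```

Usage would look like "for rec in iter_records('data.bin', 128): ...", where the file name and the 128-byte record size are made up. Whether this helps, and by how much, depends on the platform and the record size, so it is worth timing on the real data.
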
> By the way, this is really obscurely documented.  

It is documented in the official docs for the open() function, which 
can't really be called obscure:
http://docs.python.org/lib/built-in-funcs.html#l2h-54

though I admit that this section of the docs itself is not as well-known 
as it should be. (I strongly recommend that all learning Pythonistas 
read sections 2.1 and 3 of the Library Reference.)

> It took me a lot of
> Googling to find even one mention of it - in Programming Python by Mark 
> Lutz - and I was very excited... until I tested it and found that it did 
> nothing for me.  Bummer.  Then I re-read the passage:
> 
>     /Buffer size/
> 
>         The open call also takes an optional third buffer size argument,
>         which lets you control stdio buffering for the file -- the way
>         that data is queued up before being transferred to boost
>         performance. If passed, means file operations are unbuffered

Should read, "If 0 is passed"
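
To make the meaning of the third argument concrete: per the docs for open(), 0 means unbuffered, 1 means line buffered, any larger positive value asks for a buffer of roughly that size, and a negative value means the system default. The sketch below just opens a scratch file each way. It is written for Python 3, where the difference is visible in the type of object returned, which is an assumption on my part about what you are running. In Python 2 the same calls are valid, but the buffering is delegated to C stdio and you always get a plain file object back.

```python
import os
import tempfile

# Create a small scratch file to open in different buffering modes.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(b"\x00" * 1024)

    # 0: unbuffered (binary mode only in Python 3); a raw file object.
    with open(path, "rb", 0) as f:
        unbuffered_type = type(f).__name__

    # A larger positive value: request a buffer of about that many bytes.
    with open(path, "rb", 64 * 1024) as f:
        buffered_type = type(f).__name__

    # Negative: use the default buffer size.
    with open(path, "rb", -1) as f:
        default_type = type(f).__name__
finally:
    os.remove(path)

print(unbuffered_type, buffered_type, default_type)
```

On Python 3 the unbuffered open returns a FileIO object and the other two return a BufferedReader.
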

> I've only tested on Windows XP; is XP one of those that don't provide 
> sevbuf?  (Actually, I think that's a typo - I think it should be 
> "setvbuf" - but it exists in both the 2001 and 2006 editions of the 
> book.) 

Yes, it is correctly spelled in the official docs:
http://docs.python.org/lib/built-in-funcs.html#foot1196

Kent

