[Tutor] Read-ahead for large fixed-width binary files?

Kent Johnson kent37 at tds.net
Sun Nov 18 14:15:56 CET 2007


Marc Tompkins wrote:
> On Nov 17, 2007 8:20 PM, Kent Johnson <kent37 at tds.net> wrote:
>     use plain slicing to return the individual records instead of StringIO.
> 
> I hope I'm not being obtuse, but could you clarify that? 

I think it will simplify the looping. A sketch, probably needs work:

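from __future__ import with_statement  # only needed on Python 2.5

def by_record(path, recLen):
    # Yield one fixed-width record at a time from the file at path.
    with open(path, 'rb') as inFile:
        inFile.read(recLen)  # throw away the header record
        while True:
            # Read many records per call to keep the number of reads small.
            buf = inFile.read(recLen * 4096)
            if not buf:
                return  # end of file
            # Slice the buffer into individual records.
            for ix in range(0, len(buf), recLen):
                yield buf[ix:ix + recLen]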

> I'm not sure I see how this makes my
> life better than using StringIO (especially since I'm actually using 
> cStringIO, with a "just-in-case" fallback in the import section, and it 
> seems to be pretty fast.)

This version seems simpler and more readable to me.
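
To show how the calling loop might look, here is a minimal, self-contained sketch that builds a small throwaway file and reads it back record by record. The chunk_records parameter, the make_sample helper, and the 8-byte record layout are made up for illustration and are not part of the original code. It is written with bytes literals, so it assumes a reasonably recent Python.

```python
import os
import tempfile

def by_record(path, recLen, chunk_records=4096):
    """Yield fixed-width records from a binary file, skipping the header record.

    chunk_records is an assumed extra parameter: how many records to read per call.
    """
    with open(path, 'rb') as inFile:
        inFile.read(recLen)  # throw away the header record
        while True:
            buf = inFile.read(recLen * chunk_records)
            if not buf:
                return
            # Plain slicing hands back one record at a time.
            for ix in range(0, len(buf), recLen):
                yield buf[ix:ix + recLen]

def make_sample(recLen=8, count=5):
    # Hypothetical test data: one header record followed by `count` data records.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(b'H' * recLen)
        for i in range(count):
            f.write(str(i).encode('ascii').ljust(recLen, b'.'))
    return path

path = make_sample()
try:
    # A small chunk size forces several reads, to exercise the outer loop.
    records = list(by_record(path, 8, chunk_records=2))
finally:
    os.remove(path)
```

The caller only ever sees a flat stream of records; the chunked reads stay hidden inside the generator, which is the simplification I had in mind.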

Kent

