[Tutor] Read-ahead for large fixed-width binary files?
Kent Johnson
kent37 at tds.net
Sun Nov 18 14:15:56 CET 2007
Marc Tompkins wrote:
> On Nov 17, 2007 8:20 PM, Kent Johnson <kent37 at tds.net> wrote:
> use plain slicing to return the individual records instead of StringIO.
>
> I hope I'm not being obtuse, but could you clarify that?
I think it will simplify the looping. A sketch, probably needs work:
def by_record(path, recLen):
    with open(path, 'rb') as inFile:
        inFile.read(recLen)  # throw away the header record
        while True:
            # read a block of up to 4096 records at a time
            buf = inFile.read(recLen*4096)
            if not buf:
                return
            # slice the block into individual fixed-width records
            for ix in range(0, len(buf), recLen):
                yield buf[ix:ix+recLen]
> I'm not sure I see how this makes my
> life better than using StringIO (especially since I'm actually using
> cStringIO, with a "just-in-case" fallback in the import section, and it
> seems to be pretty fast.)
This version seems simpler and more readable to me.
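
For anyone who wants to try the slicing approach, here is a small self-contained sketch, written for a current Python rather than taken from the thread. The names rec_len, chunk_records and demo() are only illustrative, and the sample file contents are made up. It assumes the file length is an exact multiple of the record length; if it is not, the last record yielded will simply be short.

```python
import os
import tempfile


def by_record(path, rec_len, chunk_records=4096):
    """Yield successive rec_len-byte records, skipping the header record."""
    with open(path, 'rb') as in_file:
        in_file.read(rec_len)  # throw away the header record
        while True:
            # Read many records in one call, then slice them apart in memory.
            buf = in_file.read(rec_len * chunk_records)
            if not buf:
                return
            for ix in range(0, len(buf), rec_len):
                yield buf[ix:ix + rec_len]


def demo():
    # Build a tiny sample file: one header record plus three data records.
    rec_len = 8
    records = [b'HEADER__', b'rec00001', b'rec00002', b'rec00003']
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'wb') as out:
            out.write(b''.join(records))
        # chunk_records=2 forces more than one read() call on this small file.
        return list(by_record(path, rec_len, chunk_records=2))
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(demo())
```

The caller just loops over by_record(...) and never sees the block boundaries, which is the simplification I had in mind.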
Kent