Reading variable length records...
BrianQ at ActiveState.com
Thu Sep 13 00:36:16 CEST 2001
> I'm trying to read records from a 2 GB datafile, but my brain has
> stopped working, so I was wondering if someone has already
> solved this problem. The records are variable length and are
> separated by a five character delimiter. I was trying to use
> file.read(n) with a blocksize of ~1Mb, but got a serious
> brainfart when trying to think of how to handle the case where
> only part of the delimiter was read in the current block.
Here is some pseudo-code to get you started:
    data = ''
    records = []
    while True:
        readData = datafile.read(size)
        if not readData:
            break  # End of file
        data += readData
        partialRecords = data.split('12345')
        records += partialRecords[:-1] # Last record is incomplete
        data = partialRecords[-1]      # Carry the partial record over
    if data:
        # Hmmm, there is still data left over, probably bad
        pass
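The basic idea is that you use split to collect as many complete
records as possible and keep the left-over partial record for the
next round. A delimiter that is cut in half at a block boundary
simply stays in that left-over data and is completed by the next
read. Let me know if you need clarification.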
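
For a 2 GB file you may not want to hold every record in one list.
Here is a rough sketch of the same split-and-carry-over idea written
as a generator. The name iter_records, its parameters, and the choice
to yield trailing data as a final record are my own assumptions for
illustration, not part of the pseudo-code above; it also assumes the
file is opened so that read() returns the same type (str or bytes) as
the delimiter.

```python
import io

def iter_records(datafile, delimiter, blocksize=1024 * 1024):
    """Yield delimiter-separated records, reading one block at a time."""
    data = delimiter[:0]  # empty str or bytes, matching the delimiter's type
    while True:
        block = datafile.read(blocksize)
        if not block:
            break  # end of file
        data += block
        parts = data.split(delimiter)
        # Everything except the last piece is a complete record.
        for record in parts[:-1]:
            yield record
        # The last piece may be a partial record or end in a partial delimiter.
        data = parts[-1]
    if data:
        # Left-over data with no closing delimiter (assumption: still yield it).
        yield data

# Tiny blocksize so the delimiter gets split across block boundaries.
sample = io.BytesIO(b"alpha12345beta12345gamma12345")
print(list(iter_records(sample, b"12345", blocksize=7)))
# prints [b'alpha', b'beta', b'gamma']
```

With a real file you would call it as something like
iter_records(open(path, 'rb'), b'12345') and process each record as
it arrives, so only about one block plus one partial record is in
memory at a time.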