Browsing text ; Python the right tool?

Jeff Shannon jeff at
Wed Jan 26 14:17:42 EST 2005

John Machin wrote:

> Jeff Shannon wrote:
>> [...]  If each record is CRLF terminated, then
>>you can get one record at a time simply by iterating over the file
>>("for line in open('myfile.dat'): ...").  You can have a dictionary
>>of classes or factory functions, one for each record type, keyed off
>>of the 2-character identifier.  Each class/factory would know the 
>>layout of that record type,
> This is plausible only under the condition that Santa Claus is paying
> you $X per class/factory or per line of code, or you are so speed-crazy
> that you are machine-generating C code for the factories.

I think that's overly pessimistic.  I *was* presuming a case where the 
number of record types was fairly small, and the definitions of those 
records reasonably constant.  For ~10 or fewer types whose spec 
doesn't change, hand-coding the conversion would probably be quicker 
and/or more straightforward than writing a spec-parser as you suggest.
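
For illustration, a minimal sketch of the hand-coded approach might look 
something like the following.  The record types ("HD", "DT") and their 
field layouts are invented for the example, since the O.P. never posted 
the actual spec; only the idea of a dictionary keyed on the 2-character 
identifier comes from the discussion above.

```python
# Hypothetical record layouts; the real ones would come from the O.P.'s spec.

def parse_header(line):
    # Assumed layout: "HD" + 8-char date + up to 20-char source name
    return {"type": "HD", "date": line[2:10], "source": line[10:30].rstrip()}

def parse_detail(line):
    # Assumed layout: "DT" + 6-char account + 10-digit amount
    return {"type": "DT", "account": line[2:8], "amount": int(line[8:18])}

# One factory per record type, keyed off the 2-character identifier.
PARSERS = {
    "HD": parse_header,
    "DT": parse_detail,
}

def parse_record(line):
    """Strip the line terminator and dispatch on the record identifier."""
    line = line.rstrip("\r\n")
    try:
        parser = PARSERS[line[:2]]
    except KeyError:
        raise ValueError("unknown record type: %r" % line[:2])
    return parser(line)

def parse_lines(lines):
    """Parse any iterable of lines (e.g. an open file), skipping blank ones."""
    return [parse_record(line) for line in lines if line.strip()]
```

With only a handful of types, adding one means writing one small function 
and one dictionary entry, which is why this seems manageable to me.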

If, on the other hand, there are many record types, and/or those 
record types are subject to changes in specification, then yes, it'd 
be better to parse the specs from some sort of data file.
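
A rough sketch of that spec-driven alternative, under the assumption that 
the layouts live in a simple whitespace-separated text file, could be 
something like this.  The spec format, the field names and the function 
names (load_spec, parse_with_spec) are all made up for the example; they 
are not anything John or the O.P. described.

```python
import io

# Hypothetical spec format: record type, field name, start and end offsets.
# In practice this text would be read from a separate data file.
SPEC_TEXT = """\
# type  field    start  end
HD      date     2      10
HD      source   10     30
DT      account  2      8
DT      amount   8      18
"""

def load_spec(fileobj):
    """Build {record type: [(field name, start, end), ...]} from spec lines."""
    spec = {}
    for raw in fileobj:
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blank lines and comments
        rectype, name, start, end = raw.split()
        spec.setdefault(rectype, []).append((name, int(start), int(end)))
    return spec

def parse_with_spec(line, spec):
    """Slice one fixed-width record according to the loaded spec."""
    line = line.rstrip("\r\n")
    rectype = line[:2]
    if rectype not in spec:
        raise ValueError("unknown record type: %r" % rectype)
    record = {"type": rectype}
    for name, start, end in spec[rectype]:
        record[name] = line[start:end].rstrip()  # values are left as strings
    return record

SPEC = load_spec(io.StringIO(SPEC_TEXT))
```

Here a change to a layout means editing the spec text rather than the 
code, at the cost of every field coming back as an untyped string unless 
the spec format is extended to carry type information.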

The O.P. didn't mention anything either way about how dynamic the 
record specs are, nor the number of record types expected.  I suspect 
that we're both assuming a case similar to our own personal 
experiences, which are different enough to lead to different preferred 
solutions. ;)

Jeff Shannon
Credit International
