methods to parse big foreign ascii-file?

Emile van Sebille emile at fenx.com
Wed Jun 12 14:36:40 EDT 2002


holger krekel
> tomorrow i want to convince an old skool C-programmer to
> use python. We are going to make a one-day project :-)
>
> It involves reading 50MB+ files of a strange ascii-format
> containing names, floats and flags all around. These items
> span multiple lines and should be grouped in lists or objects.
>
> How would you go about it? Options that come to my mind:
>
> - using the re module to parse line for line and
>   try to get it all right by hand.
>
> - come up with a grammar and let a parser do
>   the hard work (which one?)
>
> - using some module i don't yet know?
>
> thanks for pointers or comments,
>

I would just write it by hand.  I'm assuming some kind of ascii data dump
with markers of some type indicating the start/end of each record and
field.  I'd read in the whole file, break out the next record, and parse
that into fields/objects/whatever.  I wouldn't try to use re or a parser:
it always makes me feel a step removed from the problem, because I first
have to restate it in terms of a tool that I need to learn.  You know the
saying... a programmer says "... I know, I'll use regular expressions for
this problem" -- now he has two problems...  ;-)  I imagine that those who
face this daily are comfortable with the other available tools, but by
the time I'd come up to speed I'd have moved on to another project...
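
A minimal sketch of that hand-written approach, under loud assumptions: the
real file format is not described in this thread, so the BEGIN/END record
markers, the "name value ..." line layout, and the helper names convert and
parse_records are all made up for illustration.  It only shows the shape of
"read it all in, break out records, parse fields":

```python
# Hypothetical format (an assumption, not the poster's real data):
# a record starts at a "BEGIN" line and ends at an "END" line; every
# line in between is "name value [value ...]".

def convert(token):
    """Turn a token into a float when possible, otherwise keep the string."""
    try:
        return float(token)
    except ValueError:
        return token

def parse_records(text):
    """Split the whole dump into records and return a list of dicts."""
    records = []
    current = None                        # None means "not inside a record"
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        if parts[0] == "BEGIN":
            current = {}                  # start of a new record
        elif parts[0] == "END":
            if current is not None:
                records.append(current)   # record is complete
            current = None
        elif current is not None:
            # first token is the field name, the rest are its values
            current[parts[0]] = [convert(p) for p in parts[1:]]
    return records

sample = """\
BEGIN
name widget
size 1.5 2.25
flags A B
END
BEGIN
name gadget
size 3.0
END
"""

records = parse_records(sample)
```

For a real file the text would come from something like
open(path).read(); a 50MB file should normally fit in memory, though
iterating over the file object line by line would work the same way if it
did not.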

HTH,


--

Emile van Sebille
emile at fenx.com
