Parsing by Line Data

Mitja nun at meyl.com
Fri Jun 18 08:31:42 EDT 2004


python1 <python1 at spamless.net>
(news:casjot020q7 at enews3.newsguy.com) wrote:
> Having slight trouble conceptualizing a way to write this script. The
> problem is that I have a bunch of lines in a file, for example:
>
> 01A\n
> 02B\n
> 01A\n
> 02B\n
> 02C\n
> 01A\n
> 02B\n
> .
> .
> .
>
> The lines beginning with '01' are the 'header' records, whereas the
> lines beginning with '02' are detail. There can be several detail
> lines to a header.
>
> I'm looking for a way to put the '01' and subsequent '02' line data
> into one list, and breaking into another list when the next '01'
> record is found.

I'd probably do something like
records = ('\n'+open('foo.data').read()).split('\n01')

You can later do
structured=[record.split('\n') for record in records]
to get a list of lists. Note that the split strips the leading '01' from
every record, and records[0] is an empty string when the file starts with a
header; there may be other flaws, but I guess the concept is clear.
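
To make the idea concrete, here is a minimal sketch of the same split-based
approach that puts the '01' prefix back and drops the empty leading chunk.
The function name split_records and the header parameter are my own
illustrative choices, and the sample string stands in for the contents of
foo.data.

```python
def split_records(text, header="01"):
    """Split text into records, each starting at a line that begins with header."""
    # Prepend a newline so a header on the very first line is matched too.
    chunks = ("\n" + text).split("\n" + header)
    # chunks[0] holds whatever came before the first header (usually empty),
    # so skip it. split() consumed the header prefix, so add it back.
    return [(header + chunk).splitlines() for chunk in chunks[1:]]


sample = "01A\n02B\n01A\n02B\n02C\n01A\n02B\n"
print(split_records(sample))
# [['01A', '02B'], ['01A', '02B', '02C'], ['01A', '02B']]
```

This reads the whole file into memory at once, which is fine for small
files but probably not what you want for very large ones.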

> How would you do this? I'm used to using 'readlines()' to pull the
> file data line by line, but in this case, determining the break-point
> will need to be done by reading the '01' from the line ahead. Would you
> need to read the whole file into a string and use a regex to break
> where a '\n01' is found?
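
For what it's worth, reading line by line should not need any lookahead:
instead of closing a record when the next '01' is seen, you can start a new
list whenever the current line begins with '01' and append every other line
to the most recent list. A rough sketch, where group_records and its header
parameter are hypothetical names of mine and a plain list of lines stands in
for the open file:

```python
def group_records(lines, header="01"):
    """Group lines into lists, starting a new list at each header line."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(header):
            records.append([line])    # a header opens a new record
        elif records:
            records[-1].append(line)  # a detail line joins the current record
        # detail lines that appear before any header are ignored
    return records


lines = ["01A\n", "02B\n", "01A\n", "02B\n", "02C\n", "01A\n", "02B\n"]
print(group_records(lines))
# [['01A', '02B'], ['01A', '02B', '02C'], ['01A', '02B']]
```

Since a file object iterates over its lines, passing open('foo.data')
directly in place of the list should work the same way without loading the
whole file first.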




