Parsing by Line Data

Eddie Corns eddie at
Thu Jun 17 19:43:56 CEST 2004

python1 <python1 at> writes:

>Having slight trouble conceptualizing a way to write this script. The 
>problem is that I have a bunch of lines in a file, for example:


>The lines beginning with '01' are the 'header' records, whereas the 
>lines beginning with '02' are detail. There can be several detail lines 
>to a header.

>I'm looking for a way to put the '01' and subsequent '02' line data into 
>one list, and breaking into another list when the next '01' record is found.

>How would you do this? I'm used to using 'readlines()' to pull the file 
>data line by line, but in this case, determining the break-point will 
>need to be done by reading the '01' from the line ahead. Would you need 
>to read the whole file into a string and use a regex to break where a 
>'\n01' is found?

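def gen_records(src):
    rec = []
    for line in src:
        if line.startswith('01'):
            # A new header closes the previous record, if there is one.
            if rec: yield rec
            rec = [line]
        else:
            # Detail ('02') lines belong to the current record.
            rec.append(line)
    # The last record has no following header, so yield it here.
    if rec: yield rec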

inf = open('input-file')
for record in gen_records(inf):
    do_something_to_list(record)
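
To try the generator idea without a real input file, here is a minimal self-contained sketch. The sample lines and their field contents are invented for illustration, and io.StringIO stands in for the open file; only the '01'/'02' prefixes come from the question.

```python
import io

def gen_records(src):
    """Yield lists of lines, each list starting at an '01' header line."""
    rec = []
    for line in src:
        if line.startswith('01'):
            # New header: emit the record collected so far, start a new one.
            if rec:
                yield rec
            rec = [line]
        else:
            # Detail line: attach it to the current record.
            rec.append(line)
    # The last record has no following header, so emit it here.
    if rec:
        yield rec

# Hypothetical sample data; io.StringIO stands in for the real file.
sample = io.StringIO(
    "01 header A\n"
    "02 detail A1\n"
    "02 detail A2\n"
    "01 header B\n"
    "02 detail B1\n"
)

records = [[line.rstrip("\n") for line in rec] for rec in gen_records(sample)]
for rec in records:
    print(rec)
# prints:
# ['01 header A', '02 detail A1', '02 detail A2']
# ['01 header B', '02 detail B1']
```

Because the generator only ever looks at the current line, there is no need to read ahead or to load the whole file into a string and split it with a regex. A new '01' line simply closes whatever record was being built.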

