Python's CSV reader

Andrew McLean spam-trap-095 at at-andros.demon.co.uk
Thu Aug 4 14:36:40 CEST 2005


In article <1123130181.767413.109340 at z14g2000cwz.googlegroups.com>,
Stephan <usenet.filter at gmail.com> writes
>I'm fairly new to python and am working on parsing some delimited text
>files.  I noticed that there's a nice CSV reading/writing module
>included in the libraries.
>
>My data files however, are odd in that they are composed of lines with
>alternating formats. (Essentially the rows are a header record and a
>corresponding detail record on the next line.  Each line type has a
>different number of fields.)
>
>Can the CSV module be coerced to read two line formats at once or am I
>better off using read and split?
>
>Thanks for your insight,
>Stephan
>

The csv module should be suitable. The reader just takes each line,
parses it, then returns a list of strings. It doesn't matter if
different lines have different numbers of fields.
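
As a quick check of that claim, here is a minimal sketch in Python 3. The
sample data and the names "sample" and "rows" are made up for illustration;
they are not taken from Stephan's files.

```python
import csv
import io

# Made-up sample data: a 2-field "header" line followed by a
# 4-field "detail" line.
sample = io.StringIO("H001,Smith\nwidget,3,9.99,blue\n")

# Each row comes back as a list of strings, whatever its length.
rows = list(csv.reader(sample))

for row in rows:
    print(len(row), row)
```

The reader does not compare one row's field count with the next, so the two
line formats can share a single reader.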

To get an idea of what I mean, try something like the following
(untested):

import csv

# newline='' lets the csv module handle line endings itself
reader = csv.reader(open(filename, newline=''))

while True:

        # Read next "header" line, if there isn't one then exit the loop.
        # The reader signals end of file by raising StopIteration; it
        # does not return an empty value.
        try:
                header = next(reader)
        except StopIteration:
                break

        # Assume that there is a "detail" line if the preceding
        # "header" line exists
        detail = next(reader)

        # Print the parsed data
        print('-' * 40)
        print("Header (%d fields): %s" % (len(header), header))
        print("Detail (%d fields): %s" % (len(detail), detail))

You could wrap this up into a class which returns (header, detail) pairs
and does better error handling, but the above code should illustrate the
basics.
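
One possible shape for that wrapper, as a rough Python 3 sketch rather than
a finished implementation: it uses a generator function instead of a class,
and the name read_pairs, the choice of ValueError and the sample data are
all my own assumptions. It skips blank lines before a header and complains
if a header has no detail line after it.

```python
import csv
import io


def read_pairs(lines):
    """Yield (header, detail) tuples from an iterable of CSV lines.

    read_pairs is a hypothetical helper name, not part of the csv module.
    """
    reader = csv.reader(lines)
    for header in reader:
        # csv.reader returns [] for a blank line; skip those between records.
        if not header:
            continue
        try:
            detail = next(reader)
        except StopIteration:
            # The input ended right after a header line.
            raise ValueError(
                "header on line %d has no detail line" % reader.line_num
            ) from None
        yield header, detail


# Made-up sample data with two header/detail pairs.
sample = io.StringIO("H001,Smith\nwidget,3,9.99\nH002,Jones\ngadget,1,4.50\n")
pairs = list(read_pairs(sample))
for header, detail in pairs:
    print(header, detail)
```

With a real file you would pass the open file object (opened with
newline='') in place of the StringIO object.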

-- 
Andrew McLean


