readlines() and "binary" files

Tue Sep 24 16:31:20 EDT 2002

Hi,

I have Excel data with occasional multi-line fields,
which, when dumped to CSV, turn into embedded CRs
within a line, whereas the records/lines themselves
are delimited by the CR+LF pair (this is MS-land).
What I'd like to do is read those files and split every
line apart on the semicolon field separator. But it
seems that whether or not the file is opened as text,
(x)readlines() still treats the lone CR as a line
delimiter, so not all my lines end up with the same
number of fields, as they should. Is there a way to handle
this, or is readlines() just not meant to work with anything
but proper text files?

    f = open('testdata.csv', 'rb')
    for line in f.xreadlines():
        fields = line.split(';')
        print len(fields)  # should always be the same value
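
For illustration, here is a rough sketch of the behaviour I am after,
bypassing readlines() altogether. It assumes the whole file fits in
memory and that no field contains a quoted semicolon or a CR+LF pair.
The name split_records and the sample bytes are made up for this
example and are not taken from any library.

```python
def split_records(data, record_sep=b'\r\n', field_sep=b';'):
    """Split raw CSV bytes into records on CR+LF only, then into fields.

    A lone CR inside a field is left untouched, because only the
    two-byte CR+LF sequence is treated as a record delimiter.
    """
    records = data.split(record_sep)
    # A trailing CR+LF leaves one empty string at the end; drop it.
    if records and records[-1] == b'':
        records.pop()
    return [record.split(field_sep) for record in records]


# Hypothetical sample: the second record has a lone CR inside its
# middle field, the way a multi-line Excel cell might be exported.
sample = b'a;b;c\r\nd;multi\rline;f\r\n'
rows = split_records(sample)
```

With a real file, the data would presumably come from something like
open('testdata.csv', 'rb').read(), with binary mode so that the CR+LF
pairs reach the function untranslated.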