[Tutor] Malformed CSV

Jan Eden lists at janeden.org
Fri Dec 2 16:26:33 CET 2005


Kent Johnson wrote on 02.12.2005:

>I'm not entirely sure how you want to interpret the data above. One
>possibility is to just change the double "" to single " before
>processing with csv. For example:
>
># data is the raw data from the whole file
>data = '''""hotel,hamburg"","1","0","0"
>""hotel,billig, in berlin tegel"","1","0","0"
>""hotel+wien"","1","0","0"
>""hotel+nurnberg"","1","0","0"
>""hotel+london"","1","0","0"
>""hotel" "budapest" "billig"","1","0","0"'''
>
>data = data.replace('""', '"')
>data = data.splitlines()
>
>import csv
>
>for line in csv.reader(data):
>    print(line)
>
>Output is 
>['hotel,hamburg', '1', '0', '0']
>['hotel,billig, in berlin tegel', '1', '0', '0']
>['hotel+wien', '1', '0', '0']
>['hotel+nurnberg', '1', '0', '0']
>['hotel+london', '1', '0', '0']
>['hotel "budapest" "billig"', '1', '0', '0']
>
>which looks pretty reasonable except for the last line, and I don't
>really know what you would consider correct there.
>
Exactly, the last line is the problem. With correct (Excel-style) quoting, it would look like this:

"""hotel"" ""budapest"" ""billig""","1","0","0"

i.e. each quote within a field would be doubled, and the output would be

['"hotel" "budapest" "billig"', '1', '0', '0']

i.e. the quoting of the original search string

"hotel" "budapest" "billig"

would be preserved (and this is important). I guess I need to notify the engineer responsible for the CSV output and have the quoting corrected.
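
For reference, here is a minimal sketch of how the correctly quoted line should behave, assuming Python 3 and the default dialect of the standard csv module. The names line, row and buf are only illustrative. It parses the line and then writes the row back out with every field quoted, which should reproduce the doubled-quote form:

```python
import csv
import io

# The correctly quoted (Excel-style) line from above.
line = '"""hotel"" ""budapest"" ""billig""","1","0","0"'

# With the default dialect, "" inside a quoted field is read as one literal quote.
row = next(csv.reader([line]))
print(row)

# Writing the row back with every field quoted doubles the embedded quotes again.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
print(buf.getvalue().rstrip("\n"))
```

So once the export doubles embedded quotes, no replace() workaround should be needed before handing the data to csv.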

Thanks,

Jan
-- 
Any sufficiently advanced technology is indistinguishable from a Perl script. - Programming Perl

