[Csv] Re: [Python-Dev] csv module TODO list

Andrew McNamara andrewm at object-craft.com.au
Wed Jan 5 12:14:02 CET 2005


>The CSV format is often used for exchanging large data files, not just for
>spreadsheet output.
>
>My experience: files with over a million rows are not uncommon. FWIW, no
>Unicode.

That matches my experience too, but I suspect we both live in
English-speaking countries. Elsewhere in the world, the ratios could be
reversed.

There has also been some suggestion that the native string type in Python
will become Unicode at some point in the future.

>My (jaundiced, but based on experience) viewpoint on newlines inside
>quoted strings:
>
>Prob (spreadsheet file with newlines inside data fields) = 0.001
>
>Prob (some programmer has not quoted their quotes properly) = 0.999
>
>Hence I suggest an option to specify this as a bug.

I agree. What makes this extra exciting at the moment is that the csv
module will happily sit there slurping the whole file into memory while
it looks for the closing quote to match a stray opening quote (of course,
I only noticed this when trying to read a multi-gigabyte file).
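
To make the failure mode concrete, here is a minimal sketch, not a
proposed API. It first shows the plain reader folding every line after
an unclosed quote into a single field, then wraps the input in a counting
iterator that gives up once a single record spans too many physical
lines. The names guarded_rows, max_lines_per_record and RunawayQuoteError
are made up for illustration; only csv.reader and csv.Error come from the
module itself.

```python
import csv
import io


class RunawayQuoteError(csv.Error):
    """Hypothetical error: one record spans more physical lines than allowed."""


def guarded_rows(lines, max_lines_per_record=1, **fmtparams):
    """Yield rows from csv.reader, but give up early on a runaway quoted field.

    guarded_rows and max_lines_per_record are illustrative names, not part
    of the csv module.
    """
    pulled = 0   # physical lines consumed for the record in progress
    lineno = 0   # physical lines consumed overall

    def counting():
        nonlocal pulled, lineno
        for line in lines:
            pulled += 1
            lineno += 1
            if pulled > max_lines_per_record:
                raise RunawayQuoteError(
                    "record still open at line %d after more than %d line(s); "
                    "probably an unbalanced quote"
                    % (lineno, max_lines_per_record))
            yield line

    for row in csv.reader(counting(), **fmtparams):
        pulled = 0   # record finished; start counting afresh
        yield row


# A stray quote on the second line, never closed.
BAD = 'a,b\n1,"oops\n2,x\n3,y\n'

# The plain reader swallows everything after the quote into one field.
plain = list(csv.reader(io.StringIO(BAD)))

# The guarded reader stops as soon as the record spills onto a second line.
try:
    list(guarded_rows(io.StringIO(BAD)))
    guarded_failed = False
except RunawayQuoteError:
    guarded_failed = True
```

The check lives in the line iterator rather than after each row because
the reader does not hand back the runaway row until it has consumed the
rest of the input, which is exactly the behaviour to avoid. Callers who
do expect embedded newlines can raise max_lines_per_record instead of
turning the guard off.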

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

