[Python-ideas] csv dialect enhancement
Chris Angelico
rosuav at gmail.com
Sun Jan 13 21:55:21 CET 2013
On Sat, Jan 12, 2013 at 4:16 AM, rurpy at yahoo.com <rurpy at yahoo.com> wrote:
> There is a common dialect of CSV, often used in database
> applications [*1], that distinguishes between an empty
> (quoted) string,
>
> e.g., the second field in "abc","",3
>
> and an empty field,
>
> e.g., the second field in "abc",,3
>
> This distinction is needed to specify or tell the
> difference between 0-length strings and NULLs, when sending
> csv data to or receiving it from a database application.
Ugh, this is exactly the sort of thing that my boss didn't believe
happened. He thinks that CSV is the same the world over, except for a
few really old or arcane programs that can be completely ignored. Took
a lot of arguing before we agreed to disagree on that one...
As an explicitly-requestable dialect, looks good.
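For anyone following along, here is a rough sketch of the distinction being discussed. The first part shows what the stock csv reader does today with its default settings: both sample rows come back with an empty string in the second field. The second part, split_with_nulls, is a purely hypothetical helper (not part of the csv module, and not the proposed implementation) that shows what a NULL-aware reading could look like. It assumes a single line with no embedded newlines and that the caller has already stripped the line ending.

```python
import csv
import io

sample = '"abc","",3\n"abc",,3\n'

# Stock reader, default dialect: the quoted empty string and the
# empty field are indistinguishable, both are returned as ''.
rows = list(csv.reader(io.StringIO(sample)))


def split_with_nulls(line, delimiter=',', quotechar='"'):
    """Hypothetical helper, not part of the csv module.

    Splits one line into fields. An empty unquoted field becomes None,
    a quoted empty string stays ''. Doubled quote characters inside a
    quoted field are treated as one literal quote character.
    """
    fields = []
    buf = []
    quoted = False      # did the current field contain a quoted section?
    in_quotes = False   # are we currently inside quotes?

    def emit():
        value = ''.join(buf)
        # Only a field that is both empty and never quoted maps to None.
        fields.append(value if (value or quoted) else None)

    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quotechar and line[i + 1:i + 2] == quotechar:
                buf.append(quotechar)   # escaped quote
                i += 1
            elif ch == quotechar:
                in_quotes = False       # closing quote
            else:
                buf.append(ch)
        elif ch == quotechar:
            in_quotes = quoted = True
        elif ch == delimiter:
            emit()
            buf, quoted = [], False
        else:
            buf.append(ch)
        i += 1
    emit()
    return fields
```

With this sketch, split_with_nulls('"abc",,3') gives None for the second field, while split_with_nulls('"abc","",3') gives an empty string there. Everything still comes back as text otherwise; type conversion is a separate question.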
> Sniffer:
> Will set "nulls" to True when both adjacent delimiters and
> quoted empty strings are seen in the input text.
> (Perhaps this behaviour needs to be optional for backward
> compatibility reasons?)
Yes, make it optional. I think interpreting ,,,, as empty strings is
the more common case, since CSV is often used in contexts that have no
concept of NULL (mainly spreadsheets); that ought, then, to be the
default, with a single option to switch on NULL recognition.
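To make the opt-in idea concrete, here is a minimal sketch of the detection rule quoted above: report "nulls" only when the sample contains both an empty unquoted field and a quoted empty string, and only when the caller asks for it. The function name sniff_nulls and the detect_nulls flag are assumptions made up for illustration; nothing like this exists in csv.Sniffer. It also ignores quoted fields that span several lines.

```python
def sniff_nulls(text, delimiter=',', quotechar='"', detect_nulls=False):
    """Hypothetical heuristic, not part of csv.Sniffer.

    Returns True only if detect_nulls is set and the sample contains
    both an empty unquoted field and a quoted empty string.
    """
    if not detect_nulls:
        # Default: keep the existing behaviour, never report nulls.
        return False

    saw_bare_empty = False
    saw_quoted_empty = False

    for line in text.splitlines():
        if not line:
            continue  # skip blank lines rather than count them as a field
        # A trailing delimiter acts as a sentinel that closes the last field.
        data = line + delimiter
        length = 0          # characters collected for the current field
        quoted = False      # did the current field contain quotes?
        in_quotes = False
        i = 0
        while i < len(data):
            ch = data[i]
            if in_quotes:
                if ch == quotechar and data[i + 1:i + 2] == quotechar:
                    length += 1     # escaped quote counts as content
                    i += 1
                elif ch == quotechar:
                    in_quotes = False
                else:
                    length += 1
            elif ch == quotechar:
                in_quotes = quoted = True
            elif ch == delimiter:
                if length == 0:
                    if quoted:
                        saw_quoted_empty = True
                    else:
                        saw_bare_empty = True
                length, quoted = 0, False
            else:
                length += 1
            i += 1

    return saw_bare_empty and saw_quoted_empty
```

A file that only ever has ,, or only ever has "" gives no evidence that the two are meant differently, so the sketch returns False for those even with detect_nulls=True.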
So, +1 on the whole idea.
ChrisA