[issue23041] csv needs more quoting rules

Samwyse report at bugs.python.org
Fri Dec 12 18:27:41 CET 2014


Samwyse added the comment:

David:  That's not a problem for me.

Sorry I can't provide real patches, but I'm not in a position to compile (much less test) the C implementation of _csv.  I've looked at the code online and below are the changes that I think need to be made.  My use cases don't require special handling when reading empty fields, so the only changes I've made are to the code for writers.  I did verify that the reader code mostly only checks for QUOTE_NOTNULL when parsing.  This means that completely empty fields will continue to load as zero-length strings, not None.  I won't stand in the way of anyone wanting to "fix" that for these new rules.
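For reference, the existing reader behaviour mentioned above can be checked with a short script. This is only an illustration using the stock csv module with its default dialect; the sample input string is made up for the example.

```python
import csv
import io

# Sample line with an unquoted empty field and a quoted empty field.
data = 'a,,"",b\r\n'

# With the default dialect, both kinds of empty field come back as
# zero-length strings; neither is returned as None.
rows = list(csv.reader(io.StringIO(data)))
print(rows)
```

Both the bare empty field and the quoted empty field are read back as '', so a reader cannot tell them apart without further changes.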



typedef enum {
    QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE,
    QUOTE_STRINGS, QUOTE_NOTNULL
} QuoteStyle;



static StyleDesc quote_styles[] = {
    { QUOTE_MINIMAL,    "QUOTE_MINIMAL" },
    { QUOTE_ALL,        "QUOTE_ALL" },
    { QUOTE_NONNUMERIC, "QUOTE_NONNUMERIC" },
    { QUOTE_NONE,       "QUOTE_NONE" },
    { QUOTE_STRINGS,    "QUOTE_STRINGS" },
    { QUOTE_NOTNULL,    "QUOTE_NOTNULL" },
    { 0 }
};



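        /* Decide whether the field must be quoted regardless of content. */
        switch (dialect->quoting) {
        case QUOTE_NONNUMERIC:
            quoted = !PyNumber_Check(field);
            break;
        case QUOTE_ALL:
            quoted = 1;
            break;
        case QUOTE_STRINGS:
            /* Python 3 spelling; the Python 2 equivalent is PyString_Check. */
            quoted = PyUnicode_Check(field);
            break;
        case QUOTE_NOTNULL:
            /* None is written as an empty, unquoted field. */
            quoted = field != Py_None;
            break;
        default:
            /* QUOTE_MINIMAL and QUOTE_NONE: no forced quoting. */
            quoted = 0;
            break;
        }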



"        csv.QUOTE_MINIMAL means only when required, for example, when a\n"
"            field contains either the quotechar or the delimiter\n"
"        csv.QUOTE_ALL means that quotes are always placed around fields.\n"
"        csv.QUOTE_NONNUMERIC means that quotes are always placed around\n"
"            fields which do not parse as integers or floating point\n"
"            numbers.\n"
"        csv.QUOTE_STRINGS means that quotes are always placed around\n"
"            fields which are strings.  Note that the Python value None\n"
"            is not a string.\n"
"        csv.QUOTE_NOTNULL means that quotes are only placed around fields\n"
"            that are not the Python value None.\n"
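To make the two proposed rules concrete, here is a rough pure-Python sketch of the writer-side decision. It is not the _csv implementation: the constants are hypothetical stand-ins numbered in the enum order above, forced_quote and render_row are names invented for this example, and escapechar handling and the QUOTE_NONE error cases are left out.

```python
import numbers

# Hypothetical stand-ins for the quoting constants (enum order from the patch).
(QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC,
 QUOTE_NONE, QUOTE_STRINGS, QUOTE_NOTNULL) = range(6)


def forced_quote(field, quoting):
    """Return True if the rule quotes this field regardless of its content."""
    if quoting == QUOTE_NONNUMERIC:
        return not isinstance(field, numbers.Number)
    if quoting == QUOTE_ALL:
        return True
    if quoting == QUOTE_STRINGS:
        return isinstance(field, str)
    if quoting == QUOTE_NOTNULL:
        return field is not None
    return False


def render_row(row, quoting, delimiter=",", quotechar='"'):
    """Format one row as text; None is written as an empty field."""
    out = []
    for field in row:
        text = "" if field is None else str(field)
        needs_quotes = forced_quote(field, quoting)
        # Content that would break parsing still forces quotes (simplified).
        if quoting != QUOTE_NONE and any(
                ch in text for ch in (delimiter, quotechar, "\n", "\r")):
            needs_quotes = True
        if needs_quotes:
            text = quotechar + text.replace(quotechar, quotechar * 2) + quotechar
        out.append(text)
    return delimiter.join(out)


row = ["abc", 42, None, ""]
print(render_row(row, QUOTE_STRINGS))   # "abc",42,,""
print(render_row(row, QUOTE_NOTNULL))   # "abc","42",,""
```

Under either rule None and the empty string produce different output, which is the distinction the existing rules cannot express.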

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue23041>
_______________________________________

