[Csv] CSV interface question

Dave Cole djc at object-craft.com.au
Thu Jan 30 00:15:23 CET 2003


Cliff> You've lost me, I'm afraid.  What I'm saying is that:

Cliff> csvreader = reader(file("test_data/sfsample.csv", 'r'),
Cliff>                    dialect='excel')

Cliff> isn't as flexible as

Cliff> csvreader = reader(file("test_data/sfsample.csv", 'r'),
Cliff>                    dialect=excel)

Cliff> where excel is either a pre-defined dictionary/class or a
Cliff> user-created dictionary/class.

Skip> Yes, but my string just indexes into a mapping to get to the
Skip> real dict which stores the parameter settings, as I indicated in
Skip> an earlier post:
Skip> 
Skip>     I was thinking of dialects as dicts.  You'd have
Skip> 
Skip>         excel_dialect = { "quotechar": '"',
Skip>                           "delimiter": ',',
Skip>                           "linetermintor": '\r\n',
Skip>                           ...
Skip>                         }

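Note the spelling error in "linetermintor".  A dictionary accepts any
key, so a typo like this is easy to make and easy to miss, which is
why user-constructed dictionaries are not a good idea.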
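
To make the problem concrete, here is a minimal sketch of the kind of
check a dict-based design would have to carry around.  The names
KNOWN_PARAMS and check_dialect are hypothetical and only for
illustration; they are not part of any API proposed in this thread.

```python
# Hypothetical helper: these names are illustrative, not a proposed API.
KNOWN_PARAMS = {"quotechar", "delimiter", "lineterminator"}

def check_dialect(params):
    """Raise KeyError if the dict contains a parameter name we do not know."""
    unknown = sorted(set(params) - KNOWN_PARAMS)
    if unknown:
        raise KeyError("unknown dialect parameter(s): " + ", ".join(unknown))
    return params

# The misspelled key from the quoted example.
excel_dialect = {"quotechar": '"', "delimiter": ",", "linetermintor": "\r\n"}

try:
    check_dialect(excel_dialect)
    error = None
except KeyError as exc:
    error = exc.args[0]
```

Without a check along these lines, the misspelled key simply sits in
the dictionary alongside the valid ones.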

Whenever I find myself using dictionaries for storing values as
opposed to indexing data I can't escape the feeling that my past as a
Perl programmer is coming back to haunt me.  At least with Perl there
is some syntactic sugar to make this type of thing less ugly:

excel_dialect = { quotechar => '"',
                  delimiter => ',',
                  lineterminator => '\r\n' }

In the absence of that sugar I would prefer something like the
following:

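class excel:
    quotechar = '"'
    delimiter = ','
    lineterminator = '\r\n'

# tab-delimited variant of excel
class exceltsv:
    quotechar = '"'
    delimiter = '\t'
    lineterminator = '\r\n'

settings = {}
for dialect in (excel, exceltsv):
    settings[dialect.__name__] = dialect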

Maybe we could include a name attribute which allowed us to use
'excel-tsv' as a dialect identifier.
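
A rough sketch of how such a name attribute could work, assuming the
registry falls back to the class name when no name is given.  The
exceltsv class and its use of inheritance are assumptions made for the
example.

```python
class excel:
    name = 'excel'
    quotechar = '"'
    delimiter = ','
    lineterminator = '\r\n'

class exceltsv(excel):
    # Assumed tab-separated variant; only the differences are restated.
    name = 'excel-tsv'
    delimiter = '\t'

settings = {}
for dialect in (excel, exceltsv):
    # Prefer the explicit name attribute, fall back to the class name.
    settings[getattr(dialect, 'name', dialect.__name__)] = dialect
```

One consequence: a reader that copies every public class attribute
into keyword arguments would have to skip name, since it is not a
parser parameter.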

Skip>     with a corresponding mapping as you suggested:
Skip> 
Skip>         settings = { 'excel': excel_dialect,
Skip>                      'excel-tsv': excel_tabs_dialect, }
Skip> 
Skip>     then in the factory functions do something like:
Skip> 
Skip>         def reader(fileobj, dialect="excel", **kwds):
Skip>             kwargs = copy.copy(settings[dialect])
Skip>             kwargs.update(kwds)
Skip>             # possible sanity check on kwargs here ...
Skip>             return _csv.reader(fileobj, **kwargs)

With the class technique this would become:

def reader(fileobj, dialect=excel, **kwds):
    kwargs = {}
    for key, value in dialect.__dict__.iteritems():
        if not key.startswith('_'):
            kwargs[key] = value
    kwargs.update(kwds)
    return _csv.reader(fileobj, **kwargs)
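
One caveat with reading a class's __dict__ is that it only holds the
attributes defined directly on that class, so a dialect that subclasses
another would lose the inherited settings.  A possible variation,
sketched here with a hypothetical helper called dialect_kwargs, walks
dir() instead so that inherited attributes are picked up as well.

```python
def dialect_kwargs(dialect, **kwds):
    """Collect public attributes of a dialect class, including inherited ones."""
    kwargs = {}
    for key in dir(dialect):
        if not key.startswith('_'):
            kwargs[key] = getattr(dialect, key)
    kwargs.update(kwds)  # explicit keyword arguments win
    return kwargs

class excel:
    quotechar = '"'
    delimiter = ','
    lineterminator = '\r\n'

class exceltsv(excel):
    # Hypothetical subclass that only overrides the delimiter.
    delimiter = '\t'

tsv_kwargs = dialect_kwargs(exceltsv)
overridden = dialect_kwargs(excel, quotechar="'")
```

Here tsv_kwargs carries quotechar and lineterminator from excel even
though exceltsv itself defines only delimiter.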

Skip> Did that not make it out?  I also think it's cleaner if we have
Skip> a data file which is loaded at import time to define the various
Skip> dialects.  That way we aren't mixing too much data into our
Skip> code.  It also opens up the opportunity for users to later
Skip> specify their own dialect data files.  Where I indicated
Skip> "possible sanity check" above would be a call to a validation
Skip> function on the settings.

Hmmm...  It is hard and messy to define classes on the fly from a data
file, so then we are back to some kind of dialect object.

class dialect:
    def __init__(self, quotechar='"', delimiter=',', lineterminator='\r\n'):
        self.quotechar = quotechar
        self.delimiter = delimiter
        self.lineterminator = lineterminator

settings = { 'excel': dialect(),
             'excel-tsv': dialect(delimiter='\t') }

def add_dialect(name, dialect):
    settings[name] = dialect

def reader(fileobj, args='excel', **kwds):
    kwargs = {}
    if not isinstance(args, dialect):
        # args is a name, so look up the registered dialect object
        args = settings[args]
    kwargs.update(args.__dict__)
    kwargs.update(kwds)
    return _csv.reader(fileobj, **kwargs)

This would then allow you to extend the settings dictionary on the
fly, or simply pass your own dialect object.

>>> import csv
>>> my_dialect = csv.dialect(lineterminator = '\f')
>>> rdr = csv.reader(file('blah.csv'), my_dialect)
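
Putting the pieces together, a self-contained sketch of the
dialect-object approach might look like the following.  It assumes the
standard library's csv.reader as a stand-in for _csv.reader, and it
names the second parameter of add_dialect dialect_obj so that it does
not shadow the class.  Both choices, and the 'pipes' dialect, are
illustrative assumptions.

```python
import csv
import io

class dialect:
    def __init__(self, quotechar='"', delimiter=',', lineterminator='\r\n'):
        self.quotechar = quotechar
        self.delimiter = delimiter
        self.lineterminator = lineterminator

settings = {'excel': dialect(),
            'excel-tsv': dialect(delimiter='\t')}

def add_dialect(name, dialect_obj):
    settings[name] = dialect_obj

def reader(fileobj, args='excel', **kwds):
    # Accept either a registered name or a dialect instance.
    if not isinstance(args, dialect):
        args = settings[args]
    kwargs = dict(args.__dict__)
    kwargs.update(kwds)  # explicit keyword arguments win
    # Assumption: csv.reader stands in for the _csv.reader of this thread.
    return csv.reader(fileobj, **kwargs)

# Extend the registry on the fly, then look the dialect up by name.
add_dialect('pipes', dialect(delimiter='|'))
rows_by_name = list(reader(io.StringIO('a|b\r\nc|d\r\n'), 'pipes'))

# Or pass a dialect object directly.
rows_by_object = list(reader(io.StringIO('a\tb\r\n'), dialect(delimiter='\t')))

# Keyword arguments override individual settings of a named dialect.
rows_override = list(reader(io.StringIO('a;b\r\n'), 'excel', delimiter=';'))
```

All three calls go through the same reader function; only the way the
dialect is supplied differs.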

- Dave

-- 
http://www.object-craft.com.au



