PEP 305 - CSV File API

Ian Bicking ianb at colorstudy.com
Sun Feb 2 05:01:42 EST 2003


On Sun, 2003-02-02 at 03:25, Dave Cole wrote:
> How kosher would it be to do something like this?
> 
> >>> class MyRow:
> ...     def __str__(self):
> ...         # something
> ... 
> >>> csvreader = csv.reader(file("some.csv"), rowclass=MyRow)
> >>> for row in csvreader:
> ...     print row


Then MyRow has to understand that particular file format.  You could
just as well do [MyRow(row) for row in csv.reader(file("some.csv"))] (or
some iterable equivalent).
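
To make the "iterable equivalent" concrete, here is a minimal sketch in
current Python. The MyRow body, the wrapped_rows helper and the sample
data are all made up for illustration; the point is only that the
wrapping can live outside csv.reader.

```python
import csv
import io


class MyRow:
    """Hypothetical row wrapper; it only knows how to print itself."""

    def __init__(self, row):
        self.row = row

    def __str__(self):
        return " | ".join(self.row)


def wrapped_rows(fileobj, rowclass):
    """Lazily wrap each parsed row, leaving csv.reader untouched."""
    return (rowclass(row) for row in csv.reader(fileobj))


# Made-up sample data standing in for some.csv.
sample = io.StringIO("a,b\n1,2\n")
printed = [str(row) for row in wrapped_rows(sample, MyRow)]
```

Because wrapped_rows returns a generator expression, rows are wrapped
one at a time as the caller iterates, just as with the plain reader.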

More interesting would be something like:

reader = csv.columnreader(file("some.csv"))

with columnreader being like:

class columnreader(object):

    def __init__(self, *args, **kw):
        self.reader = reader(*args, **kw)
        self.iter = iter(self.reader)
        class Row(object):

            def __init__(self, row):
                self.row = row

            def __getitem__(self, item):
                try:
                    # If they are trying integer access:
                    return self.row[item]
                except TypeError:
                    # else name-based access:
                    return self.row[self.headers[item]]

        headers = self.iter.next()
        self.headers = {}
        for i, header in enumerate(headers):
            self.headers[header] = i
            # Attribute access (row.name) for headers that are valid
            # identifiers; the default argument binds i per column:
            setattr(Row, header, property(lambda self, i=i: self.row[i]))
        # Row.__getitem__ looks names up through the class, so share
        # the name-to-index mapping with it:
        Row.headers = self.headers
        self.rowClass = Row

    def __iter__(self):
        return self

    def next(self):
        return self.rowClass(self.iter.next())
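
A rough rendering of the same idea for Python 3, as a sketch rather
than a proposal for the module. The names ColumnReader, Row and
row_class and the sample data are assumptions for illustration, and it
assumes no header is called "row" or "headers", since those would
collide with the attributes Row uses itself.

```python
import csv
import io


class ColumnReader:
    """Wrap csv.reader; the first line supplies the column names."""

    def __init__(self, *args, **kw):
        self._rows = csv.reader(*args, **kw)
        names = next(self._rows)
        index = {name: i for i, name in enumerate(names)}

        class Row:
            # Name-to-index mapping shared by every row of this reader.
            headers = index

            def __init__(self, row):
                self.row = row

            def __getitem__(self, item):
                try:
                    # Integer (or slice) access:
                    return self.row[item]
                except TypeError:
                    # Name-based access:
                    return self.row[self.headers[item]]

        for name, i in index.items():
            # The default argument binds i separately for each column.
            setattr(Row, name, property(lambda self, i=i: self.row[i]))
        self.row_class = Row

    def __iter__(self):
        return self

    def __next__(self):
        return self.row_class(next(self._rows))


# Made-up sample data.
data = io.StringIO("name,qty\nspam,3\neggs,12\n")
rows = list(ColumnReader(data))
```

Each row then supports rows[0][0], rows[0]["qty"] and rows[1].name,
with all values still being strings as csv.reader delivers them.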



This isn't necessarily hugely fast.  For more ideas on that, I believe
db_row might be of interest: it does this for SQL results, in a manner
that is supposedly good for both speed and memory
(http://opensource.theopalgroup.com/).  It's a similar problem: you get
back lists (if you want to be fast) in which every item has identical
attributes, but those attributes aren't known until you start getting
the objects, and you'd probably want to trade flexibility for speed
anyway, since the objects are generally intermediates.

Anyway, I believe they do it with a C extension.
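
This is not how db_row works (its internals aren't shown here), but the
same trade of flexibility for speed can be sketched with
collections.namedtuple from today's standard library: the row class is
built once from the header line, and each row is then a plain tuple
with named fields. The named_rows helper and the sample data are
assumptions for illustration.

```python
import csv
import io
from collections import namedtuple


def named_rows(fileobj, **kw):
    """Yield each data row as a namedtuple keyed by the header line."""
    reader = csv.reader(fileobj, **kw)
    headers = next(reader)
    # rename=True swaps invalid or duplicate field names for _0, _1, ...
    Row = namedtuple("Row", headers, rename=True)
    for fields in reader:
        # _make raises TypeError if a row has the wrong number of fields.
        yield Row._make(fields)


# Made-up sample data.
data = io.StringIO("name,qty\nspam,3\n")
first = next(named_rows(data))
```

The result allows both first.name and first[1], but rows are immutable
and every row must have exactly as many fields as the header.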


-- 
Ian Bicking           Colorstudy Web Development
ianb at colorstudy.com   http://www.colorstudy.com
PGP: gpg --keyserver pgp.mit.edu --recv-keys 0x9B9E28B7
4869 N Talman Ave, Chicago, IL 60625 / (773) 275-7241





