[Tutor] question about descriptors

Peter Otten __peter__ at web.de
Sat Nov 7 09:03:44 EST 2015


Albert-Jan Roskam wrote:

> Hi,
> First, before I forget, emails from hotmail/yahoo etc appear to end up in
> the spam folder these days, so apologies in advance if I do not appear to
> follow up to your replies. Ok, now to my question. I want to create a
> class with read-only attribute access to the columns of a .csv file. E.g.
> when a file has a column named 'a', that column should be returned as list
> by using instance.a. At first I thought I could do this with the builtin
> 'property' class, but I am not sure how. I now tried to use descriptors
> (__get__ and __set__), which are also used by 'property' (See also:
> https://docs.python.org/2/howto/descriptor.html).
> 
> In the "if __name__ == '__main__'" section, [a] is supposed to be a
> shorthand for, i.e. equivalent to, [b]. But it's not. I suspect it has to
> do with the way attributes are looked up. So once an attribute has been
> found in self.__dict__ aka "the usual place", the search stops, and
> __get__ is never called. But I may be wrong. I find the __getattribute__,
> __getattr__ and __get__ distinction quite confusing. What is the best
> approach to do this? Ideally, the column values should only be retrieved
> when they are actually requested (the .csv could be big). Thanks in
> advance!
> 
> 
> 
> import csv
> from cStringIO import StringIO
> 
> 
> class AttrAccess(object):
> 
> 
>     def __init__(self, fileObj):
>         self.__reader = csv.reader(fileObj, delimiter=";")
>         self.__header = self.__reader.next()
>         #[setattr(self, name, self.__get_column(name)) for name in
>         #[self.header]
>         self.a = range(10)
> 
> 
>     @property
>     def header(self):
>         return self.__header
>         
>     def __get_column(self, name):
>         # generator expression might be better here.
>         return [record[self.header.index(name)]
>                 for record in self.__reader]
>         
>     def __get__(self, obj, objtype=type):
>         print "__get__ called"
>         return self.__get_column(obj)
>         #return getattr(self, obj)
>         
>     def __set__(self, obj, val):
>         raise AttributeError("Can't set attribute")
>         
> if __name__ == "__main__":
>     f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
>     instance = AttrAccess(f)
>     # [a] does not call __get__. Looks, and finds, in self.__dict__?
>     print instance.a
>     # [b] this is supposed to be equivalent to [a]
>     print instance.__get__("a")
>     instance.a = 42  # should throw AttributeError!

I think the basic misunderstandings are that

(1) the __get__() method has to be implemented by the descriptor class, and
(2) the descriptor instances should be attributes of the class that is
supposed to invoke __get__(). E.g.:

class C(object):
   x = descriptor()

c = C()

c.x # invokes C.__dict__["x"].__get__(c, C) under the hood.

As a consequence you need one class per set of attributes; instantiating the
same AttrAccess class for csv files with differing layouts won't work.
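
To make the lookup rules concrete, here is a small sketch. The names Descriptor, C, x and y are made up for illustration and are not part of the code above. It shows that __get__ only fires when the descriptor is found on the class, that a descriptor stored on the instance is returned as a plain object, and that a descriptor defining __set__ takes precedence over an entry in the instance __dict__:

```python
class Descriptor(object):
    """Hypothetical data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        if obj is None:       # accessed on the class itself, e.g. C.x
            return self
        return "from __get__"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class C(object):
    x = Descriptor()          # class attribute: the descriptor protocol applies

    def __init__(self):
        # Stored in the instance __dict__: looked up as a plain object,
        # so its __get__ is never called.
        self.y = Descriptor()


c = C()
class_level = c.x             # "from __get__"
instance_level = c.y          # the Descriptor instance itself

# A data descriptor on the class wins even over an entry in c.__dict__.
c.__dict__["x"] = "shadow"
still_descriptor = c.x        # "from __get__"

try:
    c.x = 42
except AttributeError as exc:
    error = str(exc)          # "read-only"
```

This is why [a] in the original code never reaches __get__: the list is found in the instance __dict__, and AttrAccess is not an attribute of any class.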

Here's how to do it all by yourself:

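import csv
from cStringIO import StringIO


class ReadColumn(object):
    """Read-only descriptor returning one field of the owner's row."""
    def __init__(self, index):
        self._index = index
    def __get__(self, obj, type=None):
        return obj._row[self._index]
    def __set__(self, obj, value):
        raise AttributeError("oops")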


def first_row(instream):
    reader = csv.reader(instream, delimiter=";")

    class Row(object):
        def __init__(self, row):
            self._row = row

    for i, header in enumerate(next(reader)):
        setattr(Row, header, ReadColumn(i))

    return Row(next(reader))


f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
row = first_row(f)
print row.a  # prints 1
row.a = 42   # raises AttributeError("oops")

Instead of a custom descriptor you can of course use the built-in property:

    for i, header in enumerate(next(reader)):
        setattr(Row, header, property(lambda self, i=i: self._row[i]))
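
The i=i default argument in that lambda matters. A sketch of what probably happens without it, using two throwaway classes (Late and Bound are hypothetical names) and a hard-coded header instead of a csv reader:

```python
class Late(object):
    def __init__(self, row):
        self._row = row


class Bound(object):
    def __init__(self, row):
        self._row = row


for i, header in enumerate(["a", "b", "c"]):
    # Plain closure: i is looked up when the property is read,
    # i.e. after the loop has finished and i is 2.
    setattr(Late, header, property(lambda self: self._row[i]))
    # Default argument: the current value of i is captured right here.
    setattr(Bound, header, property(lambda self, i=i: self._row[i]))

late = Late(["1", "2", "3"])
bound = Bound(["1", "2", "3"])
late_values = (late.a, late.b, late.c)      # ('3', '3', '3')
bound_values = (bound.a, bound.b, bound.c)  # ('1', '2', '3')

# A property without a setter rejects assignment.
try:
    bound.a = 42
    read_only = False
except AttributeError:
    read_only = True
```

Without the default argument every column would return the last field of the row.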

In many cases you don't care about the specifics of the row class and can
use collections.namedtuple:


import collections
import itertools


def rows(instream):
    reader = csv.reader(instream, delimiter=";")
    # The header line supplies the field names of the row class.
    Row = collections.namedtuple("Row", next(reader))
    return itertools.imap(Row._make, reader)


f = StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
row = next(rows(f))
print row.a  # prints 1
row.a = 42   # raises AttributeError, namedtuple fields are read-only
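
If you want whole columns rather than rows, as in the original question, the same namedtuple idea might be turned sideways. This is only a sketch under a few assumptions: it is written for Python 3 (io.StringIO instead of cStringIO), the function name columns() is made up, the whole file is read eagerly rather than on demand, and the file must contain at least one data row:

```python
import collections
import csv
import io


def columns(instream):
    reader = csv.reader(instream, delimiter=";")
    Columns = collections.namedtuple("Columns", next(reader))
    # zip(*reader) transposes the remaining rows into columns.
    # This consumes the whole file at once.
    return Columns(*zip(*reader))


f = io.StringIO("a;b;c\n1;2;3\n4;5;6\n7;8;9\n")
cols = columns(f)
first_column = cols.a         # ('1', '4', '7')

try:
    cols.a = 42
    read_only = False
except AttributeError:
    read_only = True
```

The columns come back as tuples of strings, not lists, and nothing is converted to numbers.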



