Using __*attr__ with __*item__

Bengt Richter bokr at oz.net
Thu Aug 8 07:19:21 CEST 2002


On Thu, 08 Aug 2002 02:45:13 GMT, <me at home.com> wrote:

>     I will be providing a data structure-like interface to some raw
>data, so using __getattr__ and __setattr__ are just what I need to use. 
>The problem comes when I need to operate as if I have an array of
>elements in this data.
>
>     Each class that is instantiated has an instance of a data block
>that it operates on via __getattr__ and __setattr__.  This works well
>because the attribute name helps to determine where the data is
>located, and how it is to be accessed.  When I have an array of related
>data in the block, I would like to use __getitem__ and __setitem__. 

(Warning. I haven't tested any of this ;-)

Do you want to write source like this?

   item_value = raw_data_container.data_block_name[index_within_data_block]

If so, have __getattr__ return an instance of a class for datablocks (IndexingClass
below) that holds the required references to the appropriate raw data and defines
__getitem__ to do the indexed selection from that.

I am assuming you want to operate on your raw data in place. You should still be
able to update the buffer reference in an instance and carry on. Just don't keep a
stale instance of IndexingClass around (which you normally wouldn't).

__getattr__ could also return named unindexed values like

   item_value = raw_data_container.item_name

You'd just get an error if you tried to index it (unless it was a string or something
that supports indexing).
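
To make the shape of that concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the LAYOUT table, the Block and Container names, and the idea that the raw data is a bytearray of little-endian 16-bit values. Your real lookup rules will differ.

```python
import struct

# Hypothetical layout: attribute name -> (byte offset, struct format, element count).
# A count of None marks a scalar; an integer marks an array of that many elements.
LAYOUT = {
    "version": (0, "<H", None),
    "samples": (2, "<H", 4),
}


class Block:
    """View of one array field inside the shared buffer (hypothetical name)."""

    def __init__(self, buf, offset, fmt, count):
        self.buf, self.offset, self.fmt, self.count = buf, offset, fmt, count

    def _pos(self, index):
        # Translate an element index into a byte position, rejecting bad indexes.
        if not 0 <= index < self.count:
            raise IndexError(index)
        return self.offset + index * struct.calcsize(self.fmt)

    def __getitem__(self, index):
        return struct.unpack_from(self.fmt, self.buf, self._pos(index))[0]

    def __setitem__(self, index, value):
        struct.pack_into(self.fmt, self.buf, self._pos(index), value)


class Container:
    def __init__(self, buf):
        # Store through __dict__ so a __setattr__ hook, if added, is not triggered.
        self.__dict__["_buf"] = buf

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _buf itself never lands here.
        try:
            offset, fmt, count = LAYOUT[name]
        except KeyError:
            raise AttributeError(name)
        if count is None:
            return struct.unpack_from(fmt, self._buf, offset)[0]
        return Block(self._buf, offset, fmt, count)


raw = bytearray(10)
c = Container(raw)
c.samples[2] = 513   # __getattr__ finds the block, then Block.__setitem__ writes in place
```

The Block instance carries the offset that the attribute name resolved to, which is exactly the information __setitem__ was missing in the original design.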

>The problem is that __*item__ does not get the attribute name passed to
>it, just the index range and potential values.  Because of this, I have
>no way to know how to access the buffer.  The class is defined
>something like:
>
>
>class foo:
 class Foo:
>    def __init__ ():
     def __init__(self, datasource=None_or_some_default):
>        self.data = get the data buffer
Normally you don't want to initialize from a global source, and you have to put self
in the arg list before you can use it. If this data buffer is to be accessed first by
an element name, and then optionally by index within the element, you can start with
         self.__dict__['_data'] = datasource # just a reference to the raw stuff
         # (stored via __dict__ because the __setattr__ below rejects '_data')
then
    foo_instance = Foo(my_raw_data)
will make it so
    foo_instance._data
is the raw data. You may want to hide it a little as self._data so you can avoid it
easily when you process __getattr__
>
>    def __setattr__ (self, attr, value):
         if attr == '_data': raise AttributeError('Raw data may not be accessed directly')
>        load 'value' into self.data per 'attr'
         # maybe something like

         where_att, kind_of_att = find_att_in_raw_data(self.__dict__['_data'], attr)
         if kind_of_att == SCALAR_KIND:
             # pretend global raw data access routines exist ;-) (You could put them in the class).
             set_scalar_from_raw(self.__dict__['_data'], where_att, value)
         else:
             raise AttributeError('%s must be accessed item-wise by index' % attr)

         # Note that this should be considered to operate on a single named item.
         # You probably need to look in the raw data to see that it's not indexable,
         # since indexable stuff gets set by first going through __getattr__ to find
         # the indexable thing and then calling __setitem__ on what's found.

>
>    def __getattr__ (self, attr):
         if attr == '_data': raise AttributeError('Raw data may not be accessed directly')
>        read value from self.data per 'attr' and return it
         Here you want to look at the raw data per attr and decide if what you find
         is indexable. If it is, then you don't return an item; you make a class instance
         initialized with a reference to the part of the raw data you found with attr,
         and you return that. So when you write
     foo_instance.attr
         you will get that IndexingClass instance. Its class will have __[gs]etitem__ defined, so when
         you write
     foo_instance.attr[an_index]
         its __getitem__ will be called, and when you write
     foo_instance.attr[an_index] = new_value
         its __setitem__ will be called. Note that here .attr still caused __getattr__ to be called,
         not __setattr__, because we're not targeting the attribute itself, but something indexed within.

         So, you need something like
     def __getattr__ (self, attr):
         where_att, kind_of_att = find_att_in_raw_data(self.__dict__['_data'], attr)
         if kind_of_att == SCALAR_KIND:
             return get_scalar_from_raw(self.__dict__['_data'], where_att)
         else:
             return IndexingClass(self.__dict__['_data'], where_att)

         # ( the .__dict__['_data'] is to avoid a recursive __getattr__ call )
># end class foo

And then don't put the following in class Foo. Make it the separate IndexingClass, e.g.,

 class IndexingClass:
     def __init__(self, raw_data_ref, where_att):
         self.raw_data_ref = raw_data_ref
         self.where_att = where_att    # place where data was located for particular attr
>
>    def __setitem__ (self, item, values):
>        ### Can't determine which of the possible arrays are being
>        ### referenced
         # now you can (item is the index within the block)
         set_indexed_item_in_raw_data(self.raw_data_ref, self.where_att, item, values)

>
>    def __getitem__ (self, item):
>        ### Can't determine which of the possible arrays are being
>        ### referenced
         # now you can (item is the index within the block)
         return get_indexed_item_in_raw_data(self.raw_data_ref, self.where_att, item)

     # if you implement this, and __getitem__ raises IndexError past the last item,
     # you should be able to write
     # for item in foo_instance.attr: do_something(item)
     # __len__ will also be needed if you want __getitem__ to support slices like
     # foo_instance.attr[3:-4]
     def __len__(self):
         return get_max_valid_index_plus_one(self.raw_data_ref, self.where_att)
>
># end class IndexingClass
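
As a rough illustration of how __len__, iteration and slices fit together, here is a sketch that assumes the raw data is a plain list and the proxy covers a contiguous run of it. The Indexed name and its start/count parameters are made up for the example.

```python
class Indexed:
    """Hypothetical proxy over items[start:start + count] of a shared list."""

    def __init__(self, items, start, count):
        self.items, self.start, self.count = items, start, count

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() clips the slice and resolves negatives against len(self).
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            # Raising IndexError is what ends a for loop over this object.
            raise IndexError("index out of range")
        return self.items[self.start + index]


shared = [10, 11, 12, 13, 14, 15, 16, 17]
view = Indexed(shared, 2, 5)   # covers 12, 13, 14, 15, 16

total = 0
for item in view:              # iterates through __getitem__ until IndexError
    total += item
```

Negative indexes and slices such as view[1:-1] both depend on __len__ here, while the for loop only depends on __getitem__ raising IndexError at the end.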
>
>
>     If I move the __*item__ routines into their own package, then all
>visibility to self.data is lost.
      Not if you do it as above.
>
>     I have a work around, but it bothers me because it looks too much
>like a bad hack.  The solution I will be trying is to move the
>__*item__ routines into their own class.  Then define EACH element of
>the array as a unique attribute.  Once this is done, instantiate the
Sorry, I don't want to dig through it ;-) No more time.

>array class with these elements and pass them BACK INTO the structure
>class.  Something like this:
>
>class foo:
>    def __init__ (self, other_params):
>        self.data = get the data buffer
>
>    def __setattr__ (self, attr, value):
>        load 'value' into self.data per 'attr'
>
>    def __getattr__ (self, attr):
>        read value from self.data per 'attr' and return it
>
>    def set_array (self, array_attribute, the_array):
>        self.__dict__[array_attribute] = the_array
>
># end class foo
>
>
>class bar:
>    def __init__ (self, array_element_list):
>        self.the_array = array_element_list
>
>    def __setitem__ (self, item, values):
>        for index in range (item.start, item.stop):
>            # this should trap to the proper __setattr__ in class foo
>            self.the_array [index] = values [index - item.start]
>
>    def __getitem__ (self, item):
>        the_answer = []
>        for index in range (item.start, item.stop):
>            # this should trap to the proper __getattr__ in class foo
>            the_answer.append (self.the_array [index])
>
># end class bar
>
>FOO = foo()
>the_array = bar([FOO.array_element_1, FOO.array_element_2, ...])
>FOO.set_array ("array_name", the_array)
>
>     Because of how I am creating FOO, I can hide the instantiation of
>bar, but it still bothers me.
>
>     Any thoughts as to a better way to do this?
>
First, use capitals on class names and lower case for instances, or you'll
confuse a lot of people ;-)

Regards,
Bengt Richter


