Returning a list/dict as "read-only" from a method ?

Bengt Richter bokr at oz.net
Thu Dec 26 13:19:01 EST 2002


On 26 Dec 2002 05:31:07 -0800, aztech1200 at yahoo.com (Az Tech) wrote:

>Hello group,
>
>Platform: ActivePython 2.2.1 on Windows 98 - but nothing OS-specific 
>in the code.
>
>Sample code is given at the end of the post. The problem description 
>is just below.
>
>I've looked at the Python docs but couldn't get a solution. Please
> suggest some approach to a solution.
>
>TIA
>Aztech.
>
>
>This question is about how to return a list/dictionary from a method
>(of a class) in such a way that it stays "read-only" to the caller -
>or any other technique that is equivalent in terms of effect. Read on
>for the gory details (slightly long post, sorry):
The first question is whether the client really needs a dictionary per se.
I.e., do they need to be able to iterate through it, and stuff like
that, or do they just need to look up single values for various keys?

If the latter, you could give your class a lookup method something
like
    def lookup(self, key, default=None):
        return self.d1.get(key, default) # or self.d1[key] for exceptions
or
    def lookup(self, key): return self.d1[key] # for KeyError exceptions

You could also take that last line and use __getitem__ instead of lookup
and then your client could write
    c1 = C1('dummy')
    print c1['key1']
instead of
    print c1.lookup('key1')
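
Putting those two pieces together, a minimal sketch might look like the following. It is written in Python 3 syntax (an assumption; the post itself uses Python 2.2), and the sample keys and the 'n/a' default are only illustrative.

```python
# Python 3 sketch; C1 and d1 follow the original post, the rest is illustrative.
class C1:
    def __init__(self, filename):
        # Simulate reading data from the file and populating the dict.
        self.d1 = {'key1': 'value1', 'key2': 'value2'}

    def lookup(self, key, default=None):
        # Returns default instead of raising for a missing key.
        return self.d1.get(key, default)

    def __getitem__(self, key):
        # Raises KeyError for a missing key, like a plain dict.
        return self.d1[key]


c1 = C1('dummy')
print(c1['key1'])
print(c1.lookup('missing', 'n/a'))
try:
    c1['key1'] = 'changed'   # no __setitem__ is defined, so this is refused
except TypeError as exc:
    print('refused:', exc)
```

Note that c1.d1 is still directly reachable, so this only guards against accidental modification through the indexing syntax.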

Since your class wouldn't have a __setitem__ method, c1['key1'] = anything would
raise an exception. But you also wouldn't be able to write for k,v in c1.items(): ...
and the like. If you make your class a subclass of dict, you can. E.g.,

====< aztech1200_yahoo.py >=======================================
class C2(dict): # inherit the dict from base class dict
    def __init__(self, filename):
        # simulate reading data from the file and populating the dict.
        # self.d1 = {}
        # self.d1['key1'] = 'value1'
        # self.d1['key2'] = 'value2'
        self._setval('key1', 'value1')  # self['key1'] = would be refused
        self._setval('key2', 'value2')
        print 'in C2.__init__(), id(self) = ', id(self)

    def _setval(self, key, value): # just for a nicer spelling
        dict.__setitem__(self, key, value)
    def __setitem__(self, key, value):
        print '\n---->> refusing to set %s = %s\n' % (`key`, `value`) # or raise an exception

    def get_dict(self):
        return self     # but this is a noop since the caller already has a reference

def main():
    c2 = C2('dummy')
    main_dict = c2.get_dict()
    print 'in main(), before modifying main_dict, id(main_dict) = ', id(main_dict)
    print 'and main_dict = ', main_dict
    main_dict['key2'] = 'a different value'
    main_dict = c2.get_dict()
    print 'in main(), after modifying main_dict, id(main_dict) = ', id(main_dict)
    print 'and main_dict = ', main_dict
    
if __name__=='__main__':
    main()
==================================================================
Which gives

[ 9:32] C:\pywk\clp>aztech1200_yahoo.py
in C2.__init__(), id(self) =  8236384
in main(), before modifying main_dict, id(main_dict) =  8236384
and main_dict =  {'key2': 'value2', 'key1': 'value1'}

---->> refusing to set 'key2' = 'a different value'

in main(), after modifying main_dict, id(main_dict) =  8236384
and main_dict =  {'key2': 'value2', 'key1': 'value1'}

Or interactively, we can do

 >>> import aztech1200_yahoo as az
 >>> c2 = az.C2('dummy')
 in C2.__init__(), id(self) =  8228720
 >>> c2
 {'key2': 'value2', 'key1': 'value1'}
 >>> c2['key1']=123

 ---->> refusing to set 'key1' = 123

 >>> for k,v in c2.items(): print k,v
 ...
 key2 value2
 key1 value1

et cetera
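
One caveat with overriding only __setitem__: the other mutating methods of dict (update, pop, clear and so on) do not go through it, so they would still change the contents. A hedged sketch of a stricter variant follows, in Python 3 syntax; ReadOnlyDict and _refuse are hypothetical names, not something from the original code.

```python
class ReadOnlyDict(dict):
    """Hypothetical dict subclass that refuses every in-place mutation."""

    def _refuse(self, *args, **kwargs):
        raise TypeError('this dictionary is read-only')

    # Route every mutating method to the refusing stub.
    __setitem__ = __delitem__ = _refuse
    update = setdefault = pop = popitem = clear = _refuse
    __ior__ = _refuse  # the |= operator (Python 3.9 and later)


# The dict constructor fills the instance without calling the overrides.
ro = ReadOnlyDict({'key1': 'value1', 'key2': 'value2'})

for attempt in (lambda: ro.update(key1='x'),
                lambda: ro.pop('key1'),
                lambda: ro.clear()):
    try:
        attempt()
    except TypeError as exc:
        print('refused:', exc)

print(sorted(ro.items()))
```

As with C2, a determined caller can still reach the base class directly, e.g. dict.__setitem__(ro, 'key1', 'x'), so this only protects against accidental misuse.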

>
>I have a class - call it C1, whose purpose in life is to read data
>from a binary file, and maintain some of the data read, as member
>variables of the class. It should provide some methods that allow
>clients to get the values of that data. Some of the data is structured
>in nature (i.e. some of the individual data elements logically belong
>together, as in a C struct), so I was planning to return such data  -
>from one method, as a list - and from another method, as a dictionary.
Is referring to the struct fields by integer indexes instead of names
satisfactory? Why not define a class with the same field names as your
struct (and use __slots__ to minimize space)? The fields could be made
read-only by overriding __setattr__, e.g.,

 >>> class MyStruct(object):
 ...     __slots__ = ['field1','field2']
 ...     def __init__(self, f1,f2):
 ...         object.__setattr__(self, 'field1', f1)
 ...         object.__setattr__(self, 'field2', f2)
 ...     def __setattr__(self, a, v): raise AttributeError, 'MyStruct is read only'
 ...
 >>> ms = MyStruct('foo', 123)
 >>> ms.field1
 'foo'
 >>> ms.field2
 123
 >>> ms.field2=456
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 6, in __setattr__
 AttributeError: MyStruct is read only

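But notice that the override can be bypassed by calling a base-class
__setattr__ directly (dict inherits its __setattr__ from object):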
 >>> dict.__setattr__(ms,'field2','weaseled around it')
 >>> ms.field2
 'weaseled around it'
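
For struct-like records, a later standard-library option is collections.namedtuple, which was added to Python well after this post (so it would not have been available on 2.2). A minimal sketch in Python 3 syntax, reusing the field names from MyStruct above:

```python
from collections import namedtuple

# Field names follow the MyStruct example above.
MyStruct = namedtuple('MyStruct', ['field1', 'field2'])

ms = MyStruct('foo', 123)
print(ms.field1, ms.field2)   # access by name
print(ms[0], ms[1])           # or by integer index

try:
    ms.field2 = 456           # fields cannot be rebound
except AttributeError as exc:
    print('refused:', exc)

# _replace builds a new record; ms itself is unchanged.
ms2 = ms._replace(field2=456)
print(ms, ms2)
```

This gives both the named access of a class and the immutability of a tuple, with the same caveat as any tuple about mutable objects stored inside it.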


BTW, a tuple can't be modified, so you could build a tuple and pass
references to that if you want to give clients a sequence.
However, note that if the objects the tuple refers to are themselves modifiable,
clients can still modify them through the tuple's references, since
doing that doesn't modify the tuple itself.
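
A short illustration of that caveat, in Python 3 syntax; the names record and frozen and the sample values are made up for the example:

```python
record = ('header', [1, 2, 3])    # a tuple holding a mutable list

try:
    record[0] = 'changed'         # rebinding a slot of the tuple is refused
except TypeError as exc:
    print('refused:', exc)

record[1].append(4)               # but the list inside can still be changed
print(record)

# Converting the nested list to a tuple closes that particular hole.
frozen = (record[0], tuple(record[1]))
print(frozen)
```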

>I made use of the id() function to identify when two different
>variables actually refer to the same piece of memory. As a test, I
>wrote a dummy class and 2 methods as above (one of which returned a
>list and the other a dictionary); when I printed id(<my_dict_or_list>)
>from within the method, and also from the caller of the method
>(invoked via a created object of that class, of course), I found that
>id() returns the same value in both places. I think this means that
>any caller of my object's method, can modify my member variable (list
>or dictionary) that I am returning via the method. I don't want to let
>this happen, obviously. (If I allowed it, then, each time the method
"Allowed" has degrees of enforcement. You would probably need OS help
to prevent access by a knowledgeable worker-arounder, but I assume you
mean accidental misuse. And for that you have options.

>was called, instead of simply returning the list/dictionary member
>variable originally created, I would have to re-read that data from
>the file, and return the re-read data instead of my original copy in
>my member variable. (I would not be able to simply return my original
>copy of the data, since the first caller (as also any subsequent
>callers) would have a reference to the very same member variable (as
>shown by the calls to id(), and hence could potentially have modified
>it, thereby changing my member variable from its original value read
>from the file.) This re-reading would be highly inefficient due to
>repeated file I/O to read the same data multiple times - as opposed to
>reading it once from the file, saving it in a member variable, and
>then simply returning that variable whenever the method was called
>from the 2nd time onwards.) So what I did next was to write a fragment

>of code inside each method (just after the code to read data from the
>file), to make a copy of the list/dictionary, and return that copy
>instead. Now, even if my caller modifies my return value, it doesn't
>matter, as they are only modifying a copy, and my original is intact.

>I do this create-and-return-a-copy each time the method is called.
>What occurred to me next, is that to perform the memory-to-memory copy
>- in the method - each time the method is called - is still
>inefficient - particularly if the list/dictionary's size is more than
>a few (or few hundred) bytes (though of course, its much faster than
>re-reading the data each time from the file). Is there any way that I
>can avoid the need to re-read data from the file, as well as the need
>to do a memory-to-memory copy, each time the method is called, and
>yet, prevent a caller from modifying my class's list/dictionary ?
>
Lots of ways, depending on your exact requirements. Class C2 above
was a subclass of dict. You could still define whatever methods
you want to return MyStruct instances or tuples or whatever, while
letting the main interfaces be inherited from dict. Or you could
do as you did first and make a class that contains a dict instance
(possibly subclassed for read-only and passed to clients like you
did). Or you could define methods on your class that just do
specific things, without allowing direct access to all the dict
methods of the real dict.
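
For the "class that contains a dict instance" route, Python 3.3 and later offer types.MappingProxyType, a read-only view over an existing dict that involves neither re-reading nor copying. It did not exist when this was written, so take the following as a sketch under that assumption; the _d1 and _view attribute names are hypothetical.

```python
from types import MappingProxyType


class C1:
    def __init__(self, filename):
        # Simulate reading data from the file and populating the dict.
        self._d1 = {'key1': 'value1', 'key2': 'value2'}
        # One proxy, created once; it wraps the dict without copying it.
        self._view = MappingProxyType(self._d1)

    def get_dict(self):
        return self._view


c1 = C1('dummy')
main_dict = c1.get_dict()
print(main_dict['key1'])
for k, v in sorted(main_dict.items()):
    print(k, v)
try:
    main_dict['key2'] = 'a different value'
except TypeError as exc:
    print('refused:', exc)
```

The caller gets lookups, iteration, keys(), items() and so on, while any change the class itself makes to _d1 later shows through the same view.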


>I planned to do the memory-to-memory copy of the list/dictionary in a
>hand-coded fashion - by iterating through the list/dictionary's items.
>I am aware that there may be better ways to do this - like 'deepcopy'
>or some such, but haven't looked into them yet. Anyway, I don't think
>any other way of copying would help solve my problem, since what I am
>aiming at is to avoid the copy in the first place - except for a
>one-time copy, if needed.
See if any of the above helps.

>
>Sample code:
>
>class C1:
>    def __init__(self, filename):
>        # simulate reading data from the file and populating the dict.
>        self.d1 = {}
>        self.d1['key1'] = 'value1'
>        self.d1['key2'] = 'value2'
>        print 'in C1.__init__(), id(self.d1) = ', id(self.d1)
>
>    def get_dict(self):
>        return self.d1
>
>def main():
>    c1 = C1('dummy')
>    main_dict = c1.get_dict()
>    print 'in main(), before modifying main_dict, id(main_dict) = ', id(main_dict)
>    print 'and main_dict = ', main_dict
>    main_dict['key2'] = 'a different value'
>    main_dict = c1.get_dict()
>    print 'in main(), after modifying main_dict, id(main_dict) = ', id(main_dict)
>    print 'and main_dict = ', main_dict
>   
>main()
>
>
>>>> in C1.__init__(), id(self.d1) =  24213904
>in main(), before modifying, id(main_dict) =  24213904
>and main_dict =  {'key2': 'value2', 'key1': 'value1'}
>in main(), after modifying, id(main_dict) =  24213904
>and main_dict =  {'key2': 'a different value', 'key1': 'value1'}
>>>>     
>
>Hope I've made the problem clear. If not, let me know.
>
Yes, you have made your intent pretty clear, but the requirements
of the clients are not as clear to me (e.g., do they need to be able to
use all of a dict's features except modifying it, or should they
be limited to less?), so it's hard for me to suggest alternative
ways of meeting them other than shotgunning as above.

Regards,
Bengt Richter


