[IronPython] Array Access Problem

Thu May 26 10:56:05 CEST 2005

On May 10, 2005, at 10:57 AM, Jim Hugunin wrote:

>> Bob Ippolito wrote:
>>
>>> (1) Don't have mutable value types, use a reference type that points
>>> to a value type (some kind of proxy)
>>>
>
> I don't think that this is possible to do in a consistent way and my
> suspicion is that doing this half-way would be more confusing than not
> doing it at all.  Let's walk through the original example:
>
>
>>>> apt = Array.CreateInstance(Point, 1)
>>>>
> This creates a true CLI array of Point structs
>
>
>>>> pt = Point(1,2)
>>>>
> Today this makes a new Point struct and returns the boxed version of
> that struct.  We could instead return a new instance of an imaginary
> new type, ValueProxy<Point>.  This new instance is a standard
> reference type that holds a point as its data.  This proxy will need
> to forward all field, property and method accesses to the contained
> Point struct.
>
>
>>>> apt[0] = pt
>>>>
> What do we do here?  We need to copy the data in pt into apt[0].  This
> is what it means to have an array of structs.  No matter what we do
> with proxies or wrappers there's no way out of this copy.  We could
> add some kind of pointer to the ValueProxy<Point> keeping track of the
> fact that there's a copy of this variable now held in apt[0].  This
> would need to be an arbitrarily large list of pointers.  This list
> would also be easy to break with CLI code that directly modified apt
> or other containers holding on to the value types.
>
>
>>>> pt.X = 0
>>>>
> The only way this can modify apt[0] is if we keep the full list of
> references in ValueProxy.  See above for why keeping that full list
> still wouldn't always work.
>
>
>>>> apt[0].X = 0
>>>>
> This example would work using the ValueProxy that pointed to apt[0];
> however, when apt[0] is assigned to a variable the situation becomes
> as bad as it is for pt.
>
>
>>>> for pt in apt:
>>>>   pt.X = 0
>>>>
> The for loop uses an Enumerator to loop through the points in apt.
> Without constructing a custom enumerator for arrays there's no way to
> get anything but copy semantics here.  While we could build a custom
> enumerator for arrays this wouldn't solve the general case of value
> types being returned from methods.
>
> When I played with this example in C#, I discovered something
> interesting:
>
> Point[] pa = new Point[3];
> foreach (Point p in pa) {
>     p.X = 10;
> }
>
> The code above generates an error from the C# compiler:
> "Cannot modify members of 'p' because it is a 'foreach iteration
> variable'"
>
> The C# compiler is treating these iteration variables as
> semi-immutable in order to minimize the confusion that can come from
> the copy semantics of value types.  This seems like a promising
> idea...

Actually the idea I had was different: leave boxed type handling
as-is, but have the __getitem__ of the Point[] instance return
"ValueProxy" instances, which would give you similar semantics to C#
as long as you don't keep the proxy around for a long time.  Of course,
you could deviate from standard Python a little bit and have an
optional extension to the __getitem__ protocol that would recognize
that the __getitem__ is really just to find a "pointer" so that it
can set an attribute somewhere.  __getitemforsetattr__ or something...

I only really had that idea because it would fix the reported bug.
You're probably right that a half-way implementation would be
confusing; however, I think it might still be less confusing than the
current state.
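
A rough sketch of the __getitem__ idea, in plain Python rather than
against real CLI types.  Everything here is an assumption made for
illustration: Point is an ordinary class standing in for a mutable
struct, StructArray imitates the copy-in/copy-out behaviour of a
Point[] array, and ElementProxy / ProxyingStructArray are hypothetical
names, not anything IronPython provides.

```python
import copy


class Point(object):
    """Plain-Python stand-in for a mutable CLI struct (hypothetical)."""

    def __init__(self, x=0, y=0):
        self.X = x
        self.Y = y


class StructArray(object):
    """Imitates an array of structs: values are copied in and out."""

    def __init__(self, length):
        self._slots = [Point() for _ in range(length)]

    def __setitem__(self, index, value):
        # Storing always copies, as with a real array of structs.
        self._slots[index] = copy.copy(value)

    def __getitem__(self, index):
        # Reading hands back a copy, so attribute sets on it are lost.
        return copy.copy(self._slots[index])


class ElementProxy(object):
    """Hypothetical ValueProxy: remembers (storage, index), not a value."""

    def __init__(self, slots, index):
        # Bypass our own __setattr__ while setting up the proxy itself.
        object.__setattr__(self, "_slots", slots)
        object.__setattr__(self, "_index", index)

    def __getattr__(self, name):
        # Only called for names not found on the proxy: forward the read.
        return getattr(self._slots[self._index], name)

    def __setattr__(self, name, value):
        # Forward the write to the element that lives inside the array.
        setattr(self._slots[self._index], name, value)


class ProxyingStructArray(StructArray):
    """Same storage, but __getitem__ returns a proxy instead of a copy."""

    def __getitem__(self, index):
        return ElementProxy(self._slots, index)
```

With this, apt[0].X = 0 on a ProxyingStructArray reaches the stored
element, while pt.X = 0 after apt[0] = pt still only changes pt, since
the store is a copy either way.  The proxy goes stale in the usual way
if it is kept around while the array changes underneath it.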

>>> (2) Make value types immutable (or at least the ones you grab from
>>> collections)
>>>
>
> All of the problems with value types stem from their mutability.
> Nobody ever complains that int, double, char, etc. are value types
> because those are all immutable.  For immutable objects there's no
> difference between pass by reference and pass by value.
>
> The CLR team's API Design Guidelines say this:
> - Do not create mutable value types.
> http://blogs.msdn.com/kcwalina/archive/2004/09/28/235232.aspx
> (or see here - http://peter.golde.org/2003/10/13.html#a16)
>
> In some ways, this would be just reflecting in IronPython this good
> design sense.
>
> One advantage of immutability is that it would make failures like the
> following much more obvious:
>
>
>>>> apt[0].X = 0
>>>>
> If value types were immutable this would throw.  The exception message
> might give people enough information to get started tracking down the
> issue and modifying their code to work correctly.
>
> What are the problems with this approach?
>
> 1. C#/VB examples won't port very naturally to IronPython and the docs
> will need a section explaining the various workarounds to the fact
> that IronPython doesn't support this idiom.  This isn't ideal, but I
> could easily live with this doc burden.
>
> 2. There's no way that I know of to make a value type 100% immutable
> without controlling its implementation.  IronPython could block
> setting of fields and properties on value types, but there's no way to
> reliably detect and block all sets that came through methods.  Just
> getting the properties and fields would probably cover 95% of the
> cases where people try to mutate a value type, but it seems pretty
> awkward to me to say that value types in IronPython are sort-of
> immutable unless there are mutating methods.  The fact that this is
> what the C# compiler does for iteration variables is encouraging at
> least in that it's a precedent.
>
> 3. There might be things that are impossible to express with this
> restriction.  I don't think that's true, particularly with the use of
> named parameters to initialize fields and properties in the value
> type's constructor.  However, one of the principles of IronPython is
> that it should be able to use any CLS library and it's possible
> there's some weird library design with value types that wouldn't work
> if they were considered virtually immutable by IronPython.
>
> If we went down the immutable value type route, it would be
> interesting to look at different kinds of sugar that could be provided
> to make the impact on most programs less than it currently is.
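
To make point 2 above concrete, here is a small plain-Python sketch of
"sort-of immutable".  It is only an illustration under assumptions:
Point is an ordinary class standing in for a struct, its Offset method
is modelled on the kind of mutating method a struct can expose, and
ReadOnlyView and try_set_x are hypothetical names.  Field sets are
blocked with an exception, but a mutating method still gets through.

```python
class Point(object):
    """Plain-Python stand-in for a mutable CLI struct (hypothetical)."""

    def __init__(self, x=0, y=0):
        self.X = x
        self.Y = y

    def Offset(self, dx, dy):
        # A mutating method: changes the value without any field set
        # being visible to whoever wraps this object.
        self.X += dx
        self.Y += dy


class ReadOnlyView(object):
    """Hypothetical wrapper that refuses attribute sets on a value."""

    def __init__(self, value):
        # Bypass our own __setattr__ while storing the wrapped value.
        object.__setattr__(self, "_value", value)

    def __getattr__(self, name):
        # Reads (including method lookups) are forwarded unchanged.
        return getattr(self._value, name)

    def __setattr__(self, name, value):
        # Every field or property set is rejected with a clear message.
        raise AttributeError(
            "cannot set %r: value types are treated as immutable here"
            % name)


def try_set_x(view, new_x):
    """Return True if the set went through, False if it was blocked."""
    try:
        view.X = new_x
    except AttributeError:
        return False
    return True
```

Wrapping a Point this way makes view.X = 0 fail loudly, which is the
"much more obvious" failure described above, yet view.Offset(5, 5)
still changes the wrapped value because the wrapper cannot tell that
the method mutates.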

In PyObjC we have similar problems to this.  The mutable value type
problem exists, but isn't a problem in practice because people Just
Don't Do That.  What *is* a problem is that Foundation has a mutable
string type.

Now this sounds like a small problem at first, but since Foundation
NSDictionary is key-copying, mutable strings are hashable and are
allowed to pass for a regular string anywhere.  Also, since unicode
objects are immutable in Python and their hash cannot change, weird
things can happen.

In practice, this is also not a problem (anymore).  From Python, the
NSMutableString is bridged to a subclass of unicode.  So, it has a
copy of the contents at the time of its creation, and all of the
Python methods will behave as documented since they are using
Python's implementation.  However, it also has all of the methods of
NSMutableString and they also act correctly.  In order to get an
updated Python representation, you simply call some
Objective-C-implemented method that will return the object again and
you'll get a new proxy (normally proxies are guaranteed unique so "is"
works, but this is not true for most classes that we conveniently
bridge to immutable Python built-in types).  Fortunately, the NSObject
protocol has a "self" instance method that will return that instance:

 >>> from Foundation import *
 >>> s = NSMutableString.string()
 >>> s
u''
 >>> hash(s)
0
 >>> s.description()
u''
 >>> s.appendString_('foo')
 >>> hash(s)
0
 >>> s
u''
 >>> s.description()
u'foo'
 >>> s.self()
u'foo'

It looks confusing in a contrived example like this, but in practice
you're generally either using one set of methods or the other, so
I've never been confused by it and we haven't had any complaints.

You could provide some similar workaround, with a function or method  
that mutates a field (because unlike in the PyObjC case, you're not  
guaranteed mutating methods).
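
Something like the following is what I mean by a function that mutates
a field.  It is a plain-Python sketch, not IronPython API: set_field
is a hypothetical helper, and Point / StructArray are stand-ins that
imitate a struct and the copy semantics of an array of structs.

```python
import copy


class Point(object):
    """Plain-Python stand-in for a mutable CLI struct (hypothetical)."""

    def __init__(self, x=0, y=0):
        self.X = x
        self.Y = y


class StructArray(object):
    """Imitates an array of structs: values are copied in and out."""

    def __init__(self, length):
        self._slots = [Point() for _ in range(length)]

    def __setitem__(self, index, value):
        self._slots[index] = copy.copy(value)

    def __getitem__(self, index):
        return copy.copy(self._slots[index])


def set_field(container, key, name, value):
    """Hypothetical helper: read-modify-write of one field in place.

    Fetches the (copied) element, changes the named field on the copy,
    then stores the copy back so the container sees the change.
    """
    item = container[key]
    setattr(item, name, value)
    container[key] = item
```

So instead of apt[0].X = 0, which silently changes a temporary copy,
you would write set_field(apt, 0, "X", 0).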

-bob