PEP 218 Re: ANN: set-0.1 module available

Fernando Pérez fperez528 at yahoo.com
Sat May 18 04:26:20 EDT 2002


James J. Besemer wrote:

> I understood you perfectly and I (re)submit that your needs as stated can be
> met equally well by a set implementation that happens to be immutable.

Well, I think you misunderstood me: I'm perfectly aware of the idea you're 
stating here, and I thought I had made that much clear (I'm obviously not 
having much success with clarity today). In fact, another post of mine on 
this same thread included:

> Maybe for implementation reasons it will be practical to have them be
> _internally_ immutable and have them 'simulate' mutability like strings do
> (s = 'hi'; s+='ho' works 'in place' even though it doesn't ;)

So I know perfectly well that immutable objects can simulate in-place 
operations. The question is whether the performance hit you take is 
significant or not. Since it _can_ be (try a large loop on a string with += 
and watch your machine hit a wall), I tend to think that unless there's a 
compelling reason to make the objects immutable, one is better off not 
forcing the performance penalty, simply because that makes the language more 
scalable and worry-free.
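
Here's a minimal timing sketch of that wall (note: some newer CPython 
releases special-case str += when the string isn't shared, so the gap there 
may be much smaller, but nothing in the language guarantees it):

        import timeit

        # Each += on an (immutable) str conceptually builds a brand-new
        # string, so n appends copy O(n**2) characters in total.
        def concat(n):
            s = ''
            for _ in range(n):
                s += 'x'
            return s

        # The mutable alternative appends in amortized O(1) per element.
        def join(n):
            parts = []
            for _ in range(n):
                parts.append('x')
            return ''.join(parts)

        for n in (10000, 100000):
            print(n,
                  timeit.timeit(lambda: concat(n), number=1),
                  timeit.timeit(lambda: join(n), number=1))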

I personally would rather have mutable sets which can't be used as dict keys, 
but I'm sure someone else might prefer the exact opposite.
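
The same trade-off already lives in the core language: lists are mutable and 
can't be dict keys, while tuples are their immutable, hashable twins. A quick 
sketch (modern except-as syntax):

        key = [1, 2, 3]                # mutable, hence unhashable
        try:
            d = {key: 'value'}
        except TypeError as err:
            print(err)                 # "unhashable type: 'list'"

        d = {tuple(key): 'value'}      # the immutable twin works fine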

So just to make sure I'm clear: yes, I know that mutability can be simulated 
to death with immutable objects, and I've known all along how strings work. 
The point is that there is a non-negligible penalty to be paid by doing 
things that way for large objects. So the issue is not whether it can be done 
or not, but whether the existence of that penalty tilts the decision in favor 
of making them mutable and thus paying the price of losing hashability.

And by the way, when you say 'Creating a new one each time SOUNDS like a lot 
more work but in practice I bet it won't be', I hope you don't really mean 
that. If you deal routinely with sets with 2**N (N>>30) elements, a copy 
operation for simulating += can be prohibitive. Why do you think that 
Numeric's arrays implement [a:b] slicing as a reference and not a copy 
operation (which bites most newbies _hard_)? 
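
To see the bite for yourself (shown here with NumPy, Numeric's present-day 
successor, which kept the same slicing semantics):

        import numpy as np

        a = np.arange(10)
        b = a[2:5]            # a *view*: no data is copied
        b[0] = 99
        print(a[2])           # 99 -- the original array changed too

        c = list(range(10))
        d = c[2:5]            # Python lists copy on slicing
        d[0] = 99
        print(c[2])           # 2 -- the original list is untouched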

If Numeric arrays behaved like Python lists (where [a:b] is a copy 
operation), they would be perfectly useless for numerical computing. Imagine 
writing the standard relaxation solution for the Laplace equation --a handful 
of lines of Numeric code-- with your 'I bet in practice it's not more work' 
approach, and guess what will happen ;)  That is the crux I'm trying to point 
you to, not whether you can _simulate_ in-place operations with immutable 
objects. We've known that all along.


Cheers,

f.

PS. In case you are curious, for the 2d case the meat of the Laplace code is 
as simple as:
        # The actual iteration: each interior point becomes the
        # dx2/dy2-weighted average of its four nearest neighbours
        u[1:-1, 1:-1] = ((u[0:-2, 1:-1] + u[2:, 1:-1])*dy2 +
                         (u[1:-1, 0:-2] + u[1:-1, 2:])*dx2)*dnr_inv
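
For a self-contained version, here's a minimal sketch; everything outside 
the two lines above (grid size, step count, boundary condition) is my own 
illustrative setup, written against NumPy, Numeric's present-day successor:

        import numpy as np

        nx = ny = 100                      # illustrative grid size
        dx = dy = 1.0/(nx - 1)
        dx2, dy2 = dx*dx, dy*dy
        dnr_inv = 0.5/(dx2 + dy2)          # i.e. 1/(2*(dx2 + dy2))

        u = np.zeros((nx, ny))
        u[0, :] = 1.0                      # an arbitrary boundary condition

        for step in range(1000):           # fixed step count, for brevity
            u[1:-1, 1:-1] = ((u[0:-2, 1:-1] + u[2:, 1:-1])*dy2 +
                             (u[1:-1, 0:-2] + u[1:-1, 2:])*dx2)*dnr_inv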

You can see a complete, detailed discussion at 
http://www.scipy.org/site_content/weave/python_performance.html


