PEP 218 Re: ANN: set-0.1 module available

James J. Besemer jb at cascade-sys.com
Sat May 18 03:24:03 EDT 2002


Fernando Pérez wrote:

> Ok, I should have expressed myself more carefully. I'm not a mathematician by
> training,[...].

Me neither.

> My only formal contact with the subject had been in long ago
> history of mathematics courses I took for fun.

Same here.

Fascinating stuff, no?

> What I had in mind is that in the context of an evolving problem (such as is a
> computer program which executes in time), the elements which satisfy the
> definition of a given set can increase. What I had in mind was the difference
> between saying 'x=5', which fairly well 'fixes' x and saying
>
> 's={f | f is a file in '.' with size > 10k} '
>
> which can change if new files are added to the directory as the program
> evolves (assuming it checks periodically). So while I agree with your strict
> mathematical definition of immutability, I hope what I had in mind is clearer
> now.

I understood you perfectly and I (re)submit that your needs as stated can be met
equally well by a set implementation that happen to be immutable.  E.g., in your
above example, the syntax in the {} is a constructor.  Each time it is evaluated
it would create a NEW set object and assign it to s.  The object itself could be
immutable and yet you could express the computations you desire.  What you really
need is to be able to change the set valued object assigned to s, and you can do
that by changing the existing object or by assigning a new one.  Creating a new
one each time SOUNDS like a lot more work but in practice I bet it won't be.  The
set operations themselves are fairly expensive and duping the object probably
won't take significantly longer.

A better example might be something like

    s = { some set }
    ...
    while 1:
        s += { file.getname() }

Here you presumably have an existing set assigned to s and you're maybe adding a
new element by reading a name out of a file.  By += we want to union the new
element into the set.  Again, there is NO problem if the set is immutable.  Set
objects would define a "+" operator and a "+=".  The fact that a new set object
gets created doesn't matter, same as if a new string is created by its "+="
operator.

> Shame on me for the lack of precision, I'll be more careful next time.

Ha!  Don't worry.  *I* have made much bigger mistakes here.

Anyway, I objected to your rationale.  As we discuss this further, my position is
softening on the ultimate outcome.

> Here you completely misunderstood me.

I don't think I misunderstood.  Rather I simply disagree.  Everything you want can
be provided by a set implementation that is immutable or mutable.  Doesn't matter.

Implementation details aside, "immutable" means a few specific things in Python:

(a) if objects a and b are of the same immutable type, then a == b <=> a is b.

(b) immutable objects export a __hash__() function such that

    a == b <=> __hash__( a ) == __hash__( b )

(c) Operation ons immutable objects always produce new objects, rather than
changing the state of an existing one.  A chief way this comes into play is in
function calls:

    def f( s ):
        s.modification()

    s = { some set }
    f( s )

At the end of this sequence, if s is immutable then the side-effect in f() is NOT
propigated to the variable s passed in.

To propigate the side effect, f() needs to expressly return a result, e.g.,

    def f( s ):
        s.modify()
        return s

    s = { some set }
    s = f( s )

Thus if sets were immutable then functions that modify sets would have to be true
functions.  You would not be able to write functions that alter arguments in place
via side effects.

Chief benefit of immutability is that sets could be used as keys to dictionaries.
One voice from the frontier says most ops do NOT mutate sets.  Another says he'd
do it all the time if he could.  There is not a lot of clamor for needing set
indicies in dictionaries.

Downside, is functions (a) and (b) above are non-trivial to implement and if you
give up immutability then there's no reason to.  The copying implied at each
operation could be unnecessarily expensive for very large sets.  Some would argue
the loss of functional side effects is a bad thing.  The present prototype
implementation is mutable.

All things considered, I withdraw my guilty plea and change my own vote to
mutable.

Regards

--jb

--
James J. Besemer  503-280-0838 voice
http://cascade-sys.com  503-280-0375 fax
mailto:jb at cascade-sys.com







More information about the Python-list mailing list