[Python-Dev] Single- vs. Multi-pass iterability

Tim Peters tim.one@comcast.net
Sun, 21 Jul 2002 16:14:43 -0400


[Ping]
> Name any protocol for which the question "does this mutate?" has
> no answer.

[Tim]
>> Heh -- you must not use Zope much <0.6 wink>.  I'm hard pressed to
>> think of a protocol where that does have a reliable answer.  Here:
>>
>>     x1 = y.z
>>     x2 = y.z
>>
>> Are x1 and x2 the same object after that?  At least equal?  Did
>> either line mutate y?  You simply can't know without knowing how y's
>> type implements __getattr__, and with the introduction of computed
>> attributes (properties) it's just going to get muddier.

[Ping]
> That's not the point.

It answered the question you asked.

> You could claim that *any* polymorphism in Python is useless by the
> same argument.

It's your position that "for" is semi-useless because of the possibility for
mutation.  That isn't my position, and that some people write mutating
__getattr__ (etc) doesn't make y.z (etc) unattractive to me either.
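
To make the y.z example concrete, here is a minimal sketch in modern Python.
LazyRecord and FreshRecord are hypothetical classes invented for
illustration (they are not taken from Zope or any real library); the only
assumption is that __getattr__ runs when normal attribute lookup fails.

```python
class LazyRecord:
    """Hypothetical: fills in a missing attribute on first access."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, so once per name here.
        value = [name]
        self.__dict__[name] = value   # side effect: the instance is mutated
        return value


class FreshRecord:
    """Hypothetical: computes a new object on every access, caching nothing."""

    def __getattr__(self, name):
        return [name]


y = LazyRecord()
before = dict(vars(y))
x1 = y.z
x2 = y.z
after = dict(vars(y))
lazy_same_object = x1 is x2        # second fetch hits the cached value
lazy_mutated = before != after     # the first fetch changed y's __dict__

f = FreshRecord()
f1 = f.z
f2 = f.z
fresh_same_object = f1 is f2       # two distinct lists
fresh_equal = f1 == f2             # with equal contents
```

Nothing at the call site distinguishes the two classes: both are spelled
y.z, and the answers to "same object?", "equal?" and "did it mutate?"
depend entirely on the implementation.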

> But Python is not useless; Python code really is reusable;

Provided you play along with code's often-undocumented preconditions,
absolutely.

> and that's because there are good conventions about what the behaviour
> *should* be.  People who do really find this upsetting should go use a
> strongly-typed language.

Sorry, I couldn't follow this part.  It's a fact that mutating __getattr__
(etc) implementations exist, and it's a fact that I'm not much bothered by
it.  I don't suggest they move to a different language, either (assuming
that by "strongly-typed" you meant "statically typed" -- Python is already
strongly typed).

> In general, getting "y.z" should be idempotent, and should not mutate y.
> I think everyone would agree on the concept.  If it does mutate y with
> visible effects, then the implementor is breaking the convention.

No argument, although I have to emphasize that it's *just* "a convention",
and repeat my prediction that the introduction of properties is going to
make this particular convention less reliable in real life over time.

> Sure, Python won't prevent you from writing a file-like class where you
> write the string "blah" to the file by fetching f.blah and you close the
> file by mentioning f[42].  But when users of this class then come running
> after you with pointed sticks, i'm not going to fight them off. :)

While properties aren't going to stop you from saying

    self.transactionid = self.session_manager.newid

and getting a new result each time you do it.  Spelling no-argument method calls
without parens is popular in some other languages, and it's "a feature" that
properties make that easy to spell in Python 2.2 too.
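
A minimal sketch of such a property, written with the decorator syntax of
later Python versions (2.2 itself would spell it newid = property(...)).
SessionManager and its counter are hypothetical stand-ins for whatever
session_manager really is; only the attribute name newid comes from the
line above.

```python
import itertools


class SessionManager:
    """Hypothetical stand-in for the session_manager object."""

    def __init__(self):
        self._ids = itertools.count(1)

    @property
    def newid(self):
        # Looks like a plain attribute fetch, acts like a no-argument call.
        return next(self._ids)


session_manager = SessionManager()
first = session_manager.newid
second = session_manager.newid   # same spelling, different result
```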

> This is a list of all the type slots accessible from Python, before
> iterators (i.e. pre-2.2).  Beside each is the answer to the question:
>
>     Suppose you look at the value of x, then do this operation to x,
>     then look at the value of x.  Should we expect the two observed
>     values to be the same or different?
> ...

I don't know why you're bothering with this, but it's got holes.  For
example, some people overly fond <wink> of C++ enjoy overloading "<<" in
highly non-functional ways.  For another, the section on the inplace
operators seems confused; after

    x1 = x
    x += y

there's no single best answer to whether

    x is x1

is true, or to whether the value of x1 before is == to the value of x1
after.  The most popular convention*s* for the inplace operators are

- If x is of a mutable type, then x is x1 after, and the pre- and post-
  values of x1 are !=.

- If x is of an immutable type, then x is not x1 after, and the pre- and
  post- values of x1 are ==.

The second case is forced, but the first one isn't.  In light of all that,
the intended meaning of "different" in

>     nb_inplace_add                      different

is either incorrect, or so weak that it's not worth much.  I suppose you
mean that, in Python code

    x += y

the object bound to the name "x" before the operation most likely has a
different (!=) value than the object bound to the name "x" after the
operation.  That's true, but relies on what the generated code does *with*
the result of nb_inplace_add.  If you just call the method

    x.__iadd__(y)

there's simply no guessing whether x is "different" as a result (it never is
for x of an immutable type, it usually is for x of a mutable type, and
there's no way to tell the difference just by staring at x).
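
The two conventions can be checked directly with builtin types. This is
only a sketch using list and tuple as representatives of "mutable" and
"immutable"; note that at the Python level builtin immutable types such as
tuple expose no __iadd__ at all, so += falls back to + and rebinding.

```python
# Mutable: list.__iadd__ extends in place and returns the same object.
x = [1, 2]
x1 = x
x1_before = list(x1)                  # snapshot of the value beforehand
x += [3]
list_same_object = x is x1
list_value_changed = x1 != x1_before

# Immutable: a new tuple is built and the name is rebound to it.
t = (1, 2)
t1 = t
t += (3,)
tuple_same_object = t is t1
tuple_alias_unchanged = t1 == (1, 2)

# Calling the method directly skips the rebinding that += performs.
z = [1, 2]
result = z.__iadd__([3])
direct_call_returns_self = result is z
tuple_has_iadd = hasattr((1, 2), "__iadd__")
```

A user-defined mutable class is free to return a brand-new object from
__iadd__, which is why the first convention is not forced.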

>     nb_hex                              same

I sure hope so <wink>.

> ...
> In every case except for __call__, there exists a canonical answer.

If by "canonical" you mean "most common", sure, with at least the exceptions
noted above.

> We all rely on these conventions every time we write a Python program.
> And learning these conventions is a necessary part of learning Python.
>
> You can argue, as Guido has, that in the particular case of for-loops
> distinguishing between mutating and non-mutating behavior is not worth
> the trouble.  But you can't say that we should give up on the whole
> concept *in general*.

To the contrary, in a language with state it's crucial for the programmer to
know when they're mutating state.  If you use a mutating __getattr__, you'd
better be careful that the code you call doesn't rely on __getattr__ not
mutating; if you use an iterator object, you'd better be careful that the code
you call doesn't require something stronger than an iterator object.  It's
all the same to me, and as Guido repeated until he got tired of it, the
possibility for "for" and "x in y" (etc) to mutate has always been there,
and has always been used.
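
A small sketch of "x in y" mutating, using nothing but a builtin list
iterator (the variable names are illustrative only): the membership test
advances the iterator as far as it needs to, and that state change is
visible afterwards.

```python
data = [1, 2, 3, 4]

it = iter(data)
found = 2 in it            # consumes 1 and 2 while searching
rest = list(it)            # whatever the search left behind
found_again = 2 in it      # the iterator is now exhausted

# The same test against the list itself leaves the list untouched.
found_in_list = 2 in data
still_found_in_list = 2 in data
```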

I didn't and still don't have any notable real-life problems dealing with
this, although I too have gotten bit when passing a generator-iterator to
code that required a sequence.  I suppose the difference is that I said
"oops! I screwed up!", fixed it, and moved on.  It would have helped most if
Python had a scheme for declaring and enforcing interfaces, *and* I bothered
to use it (doubtful); second-most if the docs for the callee had spelled out
its preconditions better; I doubt it would have helped at all if a variant
spelling of "for" had been used, because I didn't eyeball the body of the
callee first.  As is, I just stuffed the generator-iterator object inside
tuple() at the call site, and everything was peachy.  That took a lot less
effort than reading this thread <0.9 wink>.
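
For the record, a sketch of that kind of mistake and the tuple() fix.
mean_and_spread and squares are hypothetical functions made up for
illustration; the callee's undocumented precondition is that it walks its
argument twice.

```python
def mean_and_spread(values):
    """Hypothetical callee: needs something re-iterable, not a bare iterator."""
    total = 0
    count = 0
    for v in values:                              # first pass
        total += v
        count += 1
    mean = total / count
    spread = sum(abs(v - mean) for v in values)   # second pass
    return mean, spread


def squares(n):
    """Hypothetical generator function yielding 0, 1, 4, ..."""
    for i in range(n):
        yield i * i


wrong = mean_and_spread(squares(4))          # second pass sees nothing
right = mean_and_spread(tuple(squares(4)))   # materialized at the call site
```

The first call raises no error; it just quietly reports a spread of zero,
which is what makes the missing precondition easy to overlook.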