Re: [Python-Dev] accumulator display syntax

Greg> What you're proposing hijacks the indexing syntax and uses it to
Greg> mean something completely different from indexing, which is a much
Greg> bigger change, and potentially a very confusing one.
Greg> So, no, sorry, it doesn't overcome my objection!

I agree.  Any expression bracketed by '[' and ']', no matter how many
other clues to the ultimate result it might contain, ought to result in
a list as far as I'm concerned.

Skip

On Friday 17 October 2003 06:12 pm, Skip Montanaro wrote:
Hmmm, how is, e.g.

    foo[x*x for x in bar]

any more an "expression bracketed by [ and ]" than, say,

    foo = {'wot': 'tow'}
    foo['wot']

...?  Yet the latter doesn't involve any lists that I can think of.  Nor
do I see why the former need "mean something completely different from
indexing" -- it means to call foo's __getitem__ with the appropriately
constructed object, just as e.g.

    foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ]

today calls it with a tuple of two weird slice objects (and doesn't
happen to involve any lists whatsoever).

Alex

>> I agree.  Any expression bracketed by '[' and ']', no matter how many
>> other clues to the ultimate result it might contain, ought to result
>> in a list as far as I'm concerned.

Alex> Hmmm, how is, e.g.
Alex>     foo[x*x for x in bar]
Alex> any more an "expression bracketed by [ and ]" than, say,
Alex>     foo = {'wot': 'tow'}
Alex>     foo['wot']
Alex> ...?

When I said "expression bracketed by '[' and ']'" I agree I was thinking
of list construction sorts of things like:

    foo = ['wot']

not indexing sorts of things like:

    foo['wot']

I'm not in a mood to try and explain anything in more precise terms this
morning (for other reasons, it's been a piss poor day so far) and must
trust your ability to infer my meaning.  I have no idea at this point how
to interpret

    foo[x*x for x in bar]

That looks like a syntax error to me: a probable identifier followed by
a list comprehension.  Here's a slightly more precise formulation: if a
'['...']' construct appears in a context where a list constructor would
be legal today, it ought to evaluate to a list, not to something else.

Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ...

I have absolutely no idea how to interpret this.  Is this existing or
proposed Python syntax?

Skip

On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: ...
Perfectly valid, currently existing Python syntax:
Not particularly _sensible_, mind you, and I hope nobody's yet written any container that IS to be indexed by such tuples of slices of multifarious nature. But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). Alex
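Alex's "perfectly valid" claim is easy to check; a minimal sketch in
modern Python, with a throwaway class (`Show` is an illustrative name,
not anything from the thread) standing in for foo:

```python
class Show:
    """Toy container whose __getitem__ simply hands back the key it receives."""
    def __getitem__(self, key):
        return key

foo = Show()
key = foo['va':23:2j, {'zip': 'zop'}:45:(3, 4)]

# key is a tuple of two slice objects -- no list in sight
print(type(key).__name__)                       # tuple
print([type(part).__name__ for part in key])    # ['slice', 'slice']
```

The slice components can be arbitrary expressions (strings, dicts,
tuples, complex numbers), which is exactly the "stretch" Alex is
pointing at.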

Which is why I didn't like the 'sum[x for x in S]' notation much.

Let's look for an in-line generator notation instead.  I like

    sum((yield x for x in S))

but perhaps we can make this work:

    sum(x for x in S)

(Somebody posted a whole bunch of alternatives that were mostly picking
random delimiters; it didn't look like the right approach.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
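At the time of this thread the inline form did not exist, but its
intended meaning can be sketched with an explicit generator function
(`squares` is an illustrative name); the bare form Guido hopes for was
eventually made valid in Python 2.4:

```python
S = [1, 2, 3, 4]

# The explicit spelling available when this thread was written:
def squares(seq):
    for x in seq:
        yield x * x

assert sum(squares(S)) == 30

# The inline form under discussion (valid from Python 2.4 onward):
assert sum(x * x for x in S) == 30
```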

Guido van Rossum wrote:
but perhaps we can make this work:
sum(x for x in S)
Being able to use generator comprehensions as an expression would be
useful.  In that case, I assume the following would be possible as well:

    mygenerator = x for x in S

    for y in x for x in S:
        print y

    return x for x in S

Thanks,
-Shane Holloway

[Shane Holloway]
You'd probably have to add extra parentheses around (x for x in S) to help the poor parser (and the human reader). --Guido van Rossum (home page: http://www.python.org/~guido/)

Hello, On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote:
Interesting but potentially confusing: we could expect the last one to
mean that we execute 'return' repeatedly, i.e. return a value more than
once, which is not what occurs.  Similarly,

    yield x for x in g()

in a generator would be quite close to the syntax discussed some time
ago to yield all the values yielded by a sub-generator g, but in your
proposal it wouldn't have that meaning: it would only yield a single
object, which happens to be an iterator with the same elements as g().
Even with parentheses, and assuming a syntax to yield from a
sub-generator for performance reasons, the two syntaxes would be
dangerously close:

    yield x for x in g()      # means: for x in g(): yield x
    yield (x for x in g())    # means: yield g()

Armin

I'm not sure what you mean by executing 'return' repeatedly; the closest thing in Python is returning a sequence, and this is pretty close (for many practical purposes, returning an iterator is just as good as returning a sequence).
IMO this is not at all similar to what it suggests for return, as executing 'yield' multiple times *is* a defined thing. This is why I'd prefer to require extra parentheses; yield (x for x in g()) is pretty clear about how many times yield is executed.
I don't see why we need

    yield x for x in g()

when we can already write

    for x in g():
        yield x

This would be a clear case of "more than one way to do it".

--Guido van Rossum (home page: http://www.python.org/~guido/)

Armin Rigo wrote:
Yes, this is one of the things I was trying to get at -- if gencomps are
expressions, then they must be expressions everywhere, or my poor brain
will explode.  As for the subgenerator "unrolling", I think there has to
be something added to the yield statement to accomplish it -- because it
is also useful to yield a generator itself and not have it unrolled.  My
favorite from that discussion was "yield *S".

-Shane Holloway
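The distinction Armin and Shane are drawing can be made concrete with
plain generator functions (`inner`, `unrolled`, and `wrapped` are
illustrative names):

```python
def inner():
    yield 1
    yield 2

def unrolled():
    # "yield x for x in g()" semantics: yield each element of the sub-generator
    for x in inner():
        yield x

def wrapped():
    # "yield (x for x in g())" semantics: yield ONE object, an iterator
    yield (x for x in inner())

assert list(unrolled()) == [1, 2]

items = list(wrapped())
assert len(items) == 1          # a single yielded object...
assert list(items[0]) == [1, 2] # ...which is itself an iterator over 1, 2
```

(Much later, Python 3.3's `yield from` gave the unrolling case its own
syntax, essentially the "yield *S" idea.)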

At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote:
Offhand, it seems like the grammar might be rather tricky, but it
actually does seem more Pythonic than the "yield" syntax, and it
retroactively makes listcomps shorthand for 'list(x for x in s)'.
However, if gencomps use this syntax, then what does:

    for x in y*2 for y in z if y<20:
        ...

mean?  ;)  It's a little clearer with parentheses, of course, so perhaps
they should be required:

    for x in (y*2 for y in z if y<20):
        ...

It would be more efficient to code that stuff inline in the loop, if the
gencomp creates another frame, but it *looks* more efficient to put it
in the for statement.  But maybe I worry too much, since you could slap
a listcomp in a for loop now, and I've never even thought of doing so.

"Phillip J. Eby" <pje@telecommunity.com> writes:
At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote:
I like the look of this. In this context, it looks very natural.
It means you're trying to be too clever, and should use parentheses :-)
I'd rather not require parentheses in general.  Guido's example of

    sum(x for x in S)

looks too nice for me to want to give it up without a fight.  But I'm
happy to have cases where the syntax is ambiguous, or even out-and-out
unparseable, without the parentheses.  Whether it's possible to express
this in a way that Python's grammar can deal with, I don't know.

Paul.
-- 
This signature intentionally left blank

>>> sum((yield x for x in S))
>>>
>>> but perhaps we can make this work:
>>>
>>> sum(x for x in S)

Paul> I like the look of this.  In this context, it looks very natural.

How would it look if you used the optional start arg to sum()?  Would
either of these work?

    sum(x for x in S, start=5)
    sum(x for x in S, 5)

or would you have to parenthesize the first arg?

    sum((x for x in S), start=5)
    sum((x for x in S), 5)

Again, why parens?  Why not

    sum(<x for x in S>, start=5)
    sum(<x for x in S>, 5)

or something similar?  Also, sum(x for x in S) and sum([x for x in S])
look very similar.  I don't think it would be obvious to the casual
observer what the difference between them was, or why the first form
didn't raise a SyntaxError.

>> It's a little clearer with parentheses, of course, so perhaps they
>> should be required:
>>
>> for x in (y*2 for y in z if y<20):
>>     ...

Paul> I'd rather not require parentheses in general.

Parens are required in certain situations within list comprehensions
around tuples (probably for syntactic reasons, but perhaps to aid the
reader as well), even though tuples can often be defined without
enclosing parens.  Here's a contrived example:

    >>> [(a,b) for (a,b) in zip(range(5), range(10))]
    [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
    >>> [a,b for (a,b) in zip(range(5), range(10))]
      File "<stdin>", line 1
        [a,b for (a,b) in zip(range(5), range(10))]
               ^
    SyntaxError: invalid syntax

Paul> Guido's example of sum(x for x in S) looks too nice for me to want
Paul> to give it up without a fight.  But I'm happy to have cases where
Paul> the syntax is ambiguous, or even out-and-out unparseable, without
Paul> the parentheses.  Whether it's possible to express this in a way
Paul> that Python's grammar can deal with, I don't know.

I rather suspect parens would be required for tuples if they were added
to the language today.  I see no reason to make an exception here.

Skip
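As it turned out, Skip's question got a definite answer in the syntax
that was eventually adopted: the bare form is allowed only when the
generator expression is the sole argument, and any extra argument forces
the parentheses:

```python
S = [1, 2, 3]

assert sum(x for x in S) == 6         # bare form: sole argument, allowed
assert sum((x for x in S), 5) == 11   # with a start value, parens are mandatory

# The unparenthesized two-argument form is rejected at compile time:
try:
    compile("sum(x for x in S, 5)", "<example>", "eval")
except SyntaxError:
    pass
else:
    raise AssertionError("expected a SyntaxError")
```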

In article <16272.22369.546606.870697@montanaro.dyndns.org>, Skip Montanaro <skip@pobox.com> wrote:
This one has bitten me several times.  When it does, I discover the
error quickly due to the syntax error, but it would be bad if this
became valid syntax and returned a list [a, X] where X is an iterator.
I don't think you could count on this getting caught by 'a' being
unbound, because the variables in list comprehensions are often single
letters that shadow previous bindings.

-- 
David Eppstein            http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

Generally, when we talk about something "biting", we mean something that *doesn't* give a syntax error, but silently does something quite different than what you'd naively expect. This was made a syntax error specifically because of this ambiguity.
No, [a,X] would be a syntax error if X was an iterator comprehension. --Guido van Rossum (home page: http://www.python.org/~guido/)

Because the parser doesn't know whether the > after S is the end of the <...> brackets or a binary > operator. (Others can answer your other questions.) --Guido van Rossum (home page: http://www.python.org/~guido/)

>> But, indexing does stretch quite far in the current Python syntax and
>> semantics (in Python's *pragmatics* you're supposed to use it far
>> more restrainedly).

Guido> Which is why I didn't like the 'sum[x for x in S]' notation much.
Guido> Let's look for an in-line generator notation instead.  I like
Guido>     sum((yield x for x in S))
Guido> but perhaps we can make this work:
Guido>     sum(x for x in S)

Forgive my extreme density on this matter, but I don't understand what

    (yield x for x in S)

is supposed to do.  Is it supposed to return a generator function which
I can assign to a variable (or pass to the builtin function sum() as in
your example) and call later, or is it supposed to turn the current
function into a generator function (so that each executed yield
statement returns a value to the caller of the current function)?

Assuming the result is a generator function (a first-class object I can
assign to a variable, then call later), is there some reason the current
function notation is inadequate?  This seems to me to suffer the same
expressive shortcomings as lambda.  Lambda seems to be hanging on by the
hair on its chinny chin chin.  Why is this construct gaining traction?
If you don't like lambda, I can't quite see why this syntax is all that
appealing.  OTOH, if

    lambda x: x+1

is okay, then why not:

    yield: x for x in S

?

Skip

At 01:57 PM 10/17/03 -0500, Skip Montanaro wrote:
Neither.  It returns an *iterator*, conceptually equivalent to:

    def temp():
        for x in S:
            yield x
    temp = temp()

except of course without creating a 'temp' name.  I suppose you could
also think of it as:

    (lambda: for x in S: yield x)()

except of course that you can't make a generator lambda.  If you look at
it this way, then you can consider [x for x in S] to be shorthand syntax
for list(x for x in S), as they would both produce the same result.
However, IIRC, the current listcomp implementation actually binds 'x' in
the current local namespace, whereas the generator version would not.
(And the listcomp version might be faster.)

"Phillip J. Eby" <pje@telecommunity.com>:
Are we sure about that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

On Monday 20 October 2003 02:08 am, Greg Ewing wrote:
We are indeed sure (sadly) that list comprehensions leak control variable names. We can hardly be sure of what iterator comprehensions would be defined to do, given they don't exist, but surely we can HOPE that in an ideal world where iterator comprehensions were part of Python they would not be similarly leaky:-). Alex

We are indeed sure (sadly) that list comprehensions leak control variable names.
But they shouldn't. It can be fixed by renaming them (e.g. numeric names with a leading dot).
It's highly likely that the implementation will have to create a generator function under the hood, so they will be safely contained in that frame. --Guido van Rossum (home page: http://www.python.org/~guido/)
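This is, in fact, how the story eventually ended: in Python 3 the
comprehension's control variable lives in its own hidden scope and never
leaks.  A quick check:

```python
x = 'outer'
result = [x * x for x in range(3)]  # this x is confined to the comprehension

assert result == [0, 1, 4]
assert x == 'outer'  # the loop variable did not clobber the outer binding
```

(In the Python 2 of this thread, the second assertion would have failed:
after the listcomp, x would have been 2.)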

At 05:45 PM 10/20/03 +0200, Alex Martelli wrote:
He was talking about having the bytecode compiler generate "hidden" names for the variables... ones that can't be used from Python. There's one drawback there, however... If you're stepping through the listcomp generation with a debugger, you won't be able to print the current item in the list, as (I believe) is possible now.

Good point. But this could be addressed in many ways; the debugger could grow a way to quote nonstandard variable names, or it could know about the name mapping, or we could use a different name-mangling scheme (e.g. prefix the original name with an underscore, and optionally append _1 or _2 etc. as needed to distinguish it from a real local with the same name). Or we could simply state this as a deficiency (I'm not sure I've ever needed to debug that situation). --Guido van Rossum (home page: http://www.python.org/~guido/)

I meant that the compiler should rename it.  Just like when you use a
tuple argument:

    def f(a, (b, c), d): ...

this actually defines a function of three (!) arguments whose second
argument is named '.2'.  And the body starts with something equivalent
to

    b, c = .2

For list comps, the compiler could maintain a mapping for the listcomp
control variables so that if you write

    [x for x in range(3)]

it knows to generate bytecode as if x was called '.7'; at the bytecode
level there's no requirement for names to follow the identifier syntax.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org> writes:
Implementing this might be entertaining.  In particular, what happens if
the iteration variable is a local in the frame anyway?  I presume that
would inhibit the renaming, but then there's a potentially confusing
dichotomy as to whether renaming gets done.  Of course you could
*always* rename, but then code like

    def f(x):
        r = [x+1 for x in range(x)]
        return r, x

becomes even more incomprehensible (and changes in behaviour).  And what
about horrors like

    [([x for x in range(10)], x) for x in range(10)]

vs:

    [([x for x in range(10)], y) for y in range(10)]

?  I suppose you could make a case for throwing out (or warning about)
all these cases at compile time, but that would require significant
effort as well (I think).

Cheers,
mwh
-- 
  This song is for anyone ... fuck it.  Shut up and listen.
                                             -- Eminem, "The Way I Am"

Here's the rule I'd propose for iterator comprehensions, which list
comprehensions would inherit:

    [<expr1> for <vars> in <expr2>]

The variables in <vars> should always be simple variables, and their
scope only extends to <expr1>.  If there's a variable with the same name
in an outer scope (including the function containing the comprehension)
it is not accessible (at least not by name) in <expr1>.  <expr2> is not
affected.

In comprehensions you won't be able to do some things you can do with
regular for loops:

    a = [1,2]
    for a[0] in range(10):
        print a
I think the semantics are crisply defined, users who write these deserve what they get (confusion and the wrath of their readers). --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido> Here's the rule I'd propose for iterator comprehensions, which
Guido> list comprehensions would inherit:
Guido>     [<expr1> for <vars> in <expr2>]
Guido> The variables in <vars> should always be simple variables, and
Guido> their scope only extends to <expr1>.  If there's a variable with
Guido> the same name in an outer scope (including the function
Guido> containing the comprehension) it is not accessible (at least not
Guido> by name) in <expr1>.  <expr2> is not affected.

I thought the definition for list comprehension syntax was something
like

    '[' <expr> for <vars> in <expr> [ for <vars> in <expr> ]* [ if <expr> ]* ']'

The loop <vars> in an earlier for clause should be visible in all nested
for clauses and conditional clauses, not just in the first <expr>.

Skip

Absolutely, good point! --Guido van Rossum (home page: http://www.python.org/~guido/)

Michael Hudson <mwh@python.net>:
In particular what happens if the iteration variable is a local in the frame anyway? I presume that would inhibit the renaming
Why?
Anyone who writes code like that *deserves* to have the behaviour
changed on them!

If this is really a worry, an alternative would be to simply forbid
using a name for the loop variable that's used for anything else outside
the loop.  That could break existing code too, but at least it would
break it in a very obvious way, by making it fail to compile.

Greg Ewing, Computer Science Dept, University of Canterbury

Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
Well, because then you have the same name for two different bindings.
This was not my impression of the Python way.  I know I'd be pretty
pissed if this broke my app.  I have no objection to breaking the above
code, just to breaking it silently!  Having code *silently change in
behaviour* (not die with an exception, post a warning at compile time,
or fail to compile at all) is about as evil a change as it's possible to
contemplate, IMO.
This would be infinitely preferable! Cheers, mwh -- I like silliness in a MP skit, but not in my APIs. :-) -- Guido van Rossum, python-dev

Not so fast. We introduced nested scopes, which could similarly subtly change the meaning of code without giving an error. Instead, we had at least one release that *warned* about situations that would change meaning silently under the new semantics. The same release also implemented the new semantics if you used a __future__ import. We should do that here too (both the warning and the __future__). I don't want this kind of code to cause an error; it's not Pythonic to flag an error when a variable name in an inner scope shields a variable of the same name in an outer scope. --Guido van Rossum (home page: http://www.python.org/~guido/)

>> We can hardly be sure of what iterator comprehensions would be
>> defined to do, given they don't exist, but surely we can HOPE that in
>> an ideal world where iterator comprehensions were part of Python they
>> would not be similarly leaky:-).

Guido> It's highly likely that the implementation will have to create a
Guido> generator function under the hood, so they will be safely
Guido> contained in that frame.

Which suggests they aren't likely to be a major performance win over
list comprehensions.  If nothing else, they would push the crossover
point between list comprehensions and iterator comprehensions toward
much longer lists.  Is performance the main reason this addition is
being considered?  They don't seem any more expressive than list
comprehensions to me.

Skip

[Skip]
They are more expressive in one respect: you can't use a list
comprehension to express an infinite sequence (that's truncated by the
consumer).  They are more efficient in a related situation: a list
comprehension buffers all its items before the next processing step
begins; an iterator comprehension doesn't need to do any buffering.  So
iterator comprehensions win if you're pipelining operations, just like
Unix pipes are a huge win over temporary files in some situations.  This
is particularly important when the consumer is some accumulator like
'average' or 'sum'.  Whether there is an actual gain in speed depends on
how large the list is.  You should be able to time examples like

    sum([x*x for x in R])

vs.

    def gen(R):
        for x in R:
            yield x*x
    sum(gen(R))

for various lengths of R.  (The latter would be a good indication of how
fast an iterator comprehension could run.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

On Monday 20 October 2003 07:21 pm, Guido van Rossum wrote: ...
with a.py having:

    def asum(R):
        sum([ x*x for x in R ])

    def gen(R):
        for x in R:
            yield x*x

    def gsum(R, gen=gen):
        sum(gen(R))

I measure:

    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)'
    10000 loops, best of 3: 96 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)'
    10000 loops, best of 3: 60 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)'
    1000 loops, best of 3: 930 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)'
    1000 loops, best of 3: 590 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)'
    100 loops, best of 3: 1.28e+04 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)'
    100 loops, best of 3: 8.4e+03 usec per loop

not sure why gsum's advantage ratio over asum seems to be roughly
constant, but, this IS what I measure!-)

Alex
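Alex's command-line experiment can be repeated from inside Python with
the timeit module; absolute numbers will of course differ by machine and
interpreter version, so only the relative comparison is meaningful:

```python
import timeit

setup = "R = range(1000)"

# Eager: build the whole list, then sum it.
t_list = timeit.timeit("sum([x*x for x in R])", setup=setup, number=1000)

# Lazy: feed items to sum() one at a time.
t_gen = timeit.timeit("sum(x*x for x in R)", setup=setup, number=1000)

print("listcomp: %.4fs   genexp: %.4fs" % (t_list, t_gen))
```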

Great! This is a plus for iterator comprehensions (we need a better term BTW). I guess that building up a list using repeated append() calls slows things down more than the frame switching used by generator functions; I knew the latter was fast but this is a pleasant result. BTW, if I use a different function that calculates list() instead of sum(), the generator version is a few percent slower than the list comprehension. But that's because list(a) has a shortcut in case a is a list, while sum(a) always uses PyIter_Next(). So this is actually consistent: despite the huge win of the shortcut, the generator version is barely slower. I think the answer lies in the bytecode:
    def lc(a):
        return [x for x in a]
The list comprehension executes 7 bytecodes per iteration; the generator version only 5 (this could be more of course if the expression was more complicated than 'x'). The YIELD_VALUE does very little work; falling out of the frame is like falling off a log; and gen_iternext() is pretty sparse code too. On the list comprehension side, calling the list's append method has a bunch of overhead. (Some of which could be avoided if we had a special-purpose opcode which called PyList_Append().) But the executive summary remains: the generator wins because it doesn't have to materialize the whole list. --Guido van Rossum (home page: http://www.python.org/~guido/)
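The per-iteration opcodes Guido is counting can be inspected with the
dis module (the exact opcodes and counts vary across CPython versions,
so the numbers in the message above reflect the CPython of 2003):

```python
import dis

def lc(a):
    return [x for x in a]

def ge(a):
    return (x for x in a)

dis.dis(lc)   # the loop body appends each item to the list being built
dis.dis(ge)   # the loop body ends in a YIELD_VALUE instead

# Both forms produce the same elements:
assert lc(range(3)) == [0, 1, 2]
assert list(ge(range(3))) == [0, 1, 2]
```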

Guido:
But the executive summary remains: the generator wins because it doesn't have to materialize the whole list.
But what would happen if the generator were replaced with in-line code
that computes the values and feeds them to an accumulator object, such
as might result from an accumulator syntax that gets inline-expanded in
the same way as a list comp?

Greg Ewing, Computer Science Dept, University of Canterbury

I'd worry that writing an accumulator would become much less natural.
The cool thing about iterators and generators is that you can write both
the source (generator) and the destination (iterator consumer) as a
simple loop, which is how you usually think about it.

--Guido van Rossum (home page: http://www.python.org/~guido/)

>> [Alex measures speed improvements] Guido> Great! This is a plus for iterator comprehensions (we need a Guido> better term BTW). Here's an alternate suggestion. Instead of inventing new syntax, why not change the semantics of list comprehensions to be lazy? They haven't been in use that long, and while they are popular, the semantic tweakage would probably cause minimal disruption. In situations where laziness wasn't wanted, the most that a particular use would have to change (I think) is to pass it to list(). Skip

On Tuesday 21 October 2003 03:57 pm, Skip Montanaro wrote:
Well, yes, the _most_ one could ever have to change is move from [ ... ]
to list([ ... ]) to get back today's semantics.  But any use NOT so
changed may break, in general; any perfectly correct program coded with
Python 2.1 to Python 2.3 -- several years' worth of "current Python", by
the time 2.4 comes out -- might break.

I think we should keep the user-observable semantics as now, BUT maybe
an optimization IS possible if all the user code does with the LC is
loop on it (or anyway just get its iter(...) and nothing else).  Perhaps
a _variant_ of "The Trick" MIGHT be practicable (since I don't believe
the "call from C holding just one ref" IS a real risk here).  Again it
would be based on the reference count being 1 at a certain point.

The LC itself _might_ just build a generator and wrap it in a
"pseudolist" object.  Said pseudolist object, IF reacting to a tp_iter
when its reference count is one, NEED NOT "unfold" itself.  But for ANY
other operation, it must generate the real list and "get out of the way"
as much as possible.  Note that this includes a tp_iter WITH rc>1.  For
example:

    x = [ a.strip().upper() for a in thefile if len(a)>7 ]
    for y in x: blah(y)
    for z in x: bluh(z)

the first 'for' implicitly calls iter(x), but that must NOT be allowed
to "consume" thefile in a throwaway fashion -- because x can be used
again later (e.g. in the 2nd for).  This works fine today and has worked
for years, and I would NOT like it to break in 2.4... if LC's had been
lazy from the start (just as they are in Haskell), that would have been
wonderful, but, alas, we didn't have the iterator protocol then...:-(

As to whether the optimization is worth this complication, I dunno.  I'd
rather have "iterator literals", I think -- simpler and more explicit.
That way when I see

    [x.bah() for x in someiterator]

I KNOW the iterator is consumed right then and there; I don't need to
look at the surrounding context... context-dependent semantics is not
Python's most normal and usual approach, after all...
Alex
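Alex's two-loop example is the crux of the objection: a real list can be
iterated any number of times, while a lazy replacement is exhausted by
the first pass.  A sketch, with a list of strings standing in for
thefile:

```python
thefile = ["short\n", "a longer line\n", "yet another long line\n"]

# Today's (eager) list comprehension: x is a real list, reusable.
x = [a.strip().upper() for a in thefile if len(a) > 7]
first_pass = [y for y in x]
second_pass = [z for z in x]
assert first_pass == second_pass == ["A LONGER LINE", "YET ANOTHER LONG LINE"]

# The lazy equivalent (a generator expression): one pass and it's gone.
g = (a.strip().upper() for a in thefile if len(a) > 7)
assert list(g) == ["A LONGER LINE", "YET ANOTHER LONG LINE"]
assert list(g) == []   # nothing left for a second loop
```

This is exactly why silently making list comprehensions lazy would have
broken correct existing programs.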

>> Here's an alternate suggestion.  Instead of inventing new syntax, why
>> not change the semantics of list comprehensions to be lazy?

Alex> Well, yes, the _most_ one could ever have to change is move from
Alex> [ ... ] to list([ ... ]) to get back today's semantics.  But any
Alex> use NOT so changed may break, in general; any perfectly correct
Alex> program coded with Python 2.1 to Python 2.3 -- several years'
Alex> worth of "current Python", by the time 2.4 comes out -- might
Alex> break.

I understand all that.  Still, the "best" syntax for these so-called
iterator comprehensions might have been the current list comprehension
syntax.  I don't know how hard it would be to fix existing code --
probably not a massive undertaking -- but the bugs lazy list
comprehensions introduced would probably be a bit subtle.

Let's perform a little thought experiment.  We already have the current
list comprehension syntax, and the people thinking about lazy list
comprehensions seem to be struggling a bit to find syntax for them which
doesn't appear cobbled together.  Direct your attention to Python 3.0,
where one of the things Guido has said he would like to do is eliminate
some bits of the language he feels are warts.  Given two similar
language constructs implementing two similar sets of semantics, I'd have
to think he would like to toss one of each.  The list comprehension
syntax seems the more obvious (to me) syntax to keep, while it would
appear there are some advantages to the lazy list comprehension
semantics (enumerating (parts of) infinite sequences, better memory
usage, some performance improvements).  I don't know when the 3.0 alpha
will (conceptually) become the CVS trunk.  Guido may not know either,
but it is getting nearer every day.
Unless he likes one of the proposed new syntaxes well enough to conclude
now that he will keep both syntaxes and both sets of semantics in 3.0, I
think we should look at other alternatives which don't introduce new
syntax, including morphing list comprehensions into lazy list
comprehensions, or leaving lazy list comprehensions out of the language,
at least in 2.x.  As I think people learned when considering ternary
operators and switch statements, adding constructs to the language in a
Pythonic way is not always possible, no matter how compelling the
feature might be.  In those situations it makes sense to leave the
construct out for now and see if syntax restructuring in 3.0 will make
addition of such desired features possible.

Anyone for [x for x in S]L ?  <lazy wink>

Skip

On Tuesday 21 October 2003 06:34 pm, Skip Montanaro wrote: ...
Yes to both points. Hmmm...
should look at other alternatives which don't introduce new syntax, including morphing list comprehensions into lazy list comprehensions or
...as long as this can be done WITHOUT breaking a ton of my code...
leaving lazy list comprehensions out of the language, at least in 2.x. As
Eeek. Maybe. Sigh. 3 years or so (best case, assuming 2.4 is the last of the 2.*'s) before I can teach and deploy lazy comprehensions?-( Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-) Alex

Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-)
Not until I can quit my job at ES and spend a year or so on PSF funds on it. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Skip Montanaro]
Skip is right about returning to the basics.  Before considering some of
the wild syntaxes that have been submitted, I suggest re-examining the
very first proposal with brackets and yield.

At one time, I got a lot of feedback on this from comp.lang.python.
Just about everyone found the brackets to be helpful and not misleading;
the immediate presence of "yield" was more than enough to signal that an
iterator was being returned instead of a list:

    g = [yield (len(line),line) for line in file if len(line)>5]

This syntax is instantly learnable from existing knowledge about list
comprehensions and generators.  The advantage of a near zero learning
curve should not be easily dismissed.  Also, this syntax makes it
trivially easy to convert an existing list comprehension into an
iterator comprehension if needed to help the application scale up or to
improve performance.

Raymond Hettinger

On Tuesday, October 21, 2003, at 01:58 PM, Raymond Hettinger wrote:
FWIW, that g is an iterator is *far* less surprising than the fact that yield turns a function into a generator. If it's okay that a yield in the body of a function change the function, why can't a yield in the body of a list comprehension change the list comprehension? It's a lot more noticeable, and people should know that "yield" signals something a little more tricky is going on. Also has good symmetry with the current meaning of yield. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org

-1. I expect that most iterator comprehensions (we need a better term!) are not stored in a variable but passed as an argument to something that takes an iterable, e.g. sum(len(line) for line in file if line.strip()) I find that in such cases, the 'yield' distracts from what is going on by focusing attention on the generator (which is really just an implementation detail). We can quibble about whether double parentheses are needed, but this syntax is just so much clearer than the version with square brackets and yield, that there is no contest IMO. --Guido van Rossum (home page: http://www.python.org/~guido/)
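The genexp-as-argument form Guido prefers runs verbatim in any Python from 2.4 onward; here is a small self-contained check, using an in-memory list of lines (hypothetical sample data) in place of a file:

```python
lines = ["hello\n", "\n", "hi there\n", "   \n"]

# Sum the lengths of non-blank lines; the generator expression is
# consumed directly by sum() without building an intermediate list.
total = sum(len(line) for line in lines if line.strip())
print(total)  # 15 (len("hello\n") + len("hi there\n"))
```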

Iterator expression?
Better. Or perhaps generator expression? To maintain the link with generator functions, since the underlying mechanism *will* be mostly the same. Yes, I like that even better. BTW, while Alex has shown that a generator function with no free variables runs quite fast, a generator expression that uses variables from the surrounding scope will have to use the nested scopes machinery to access those, unlike a list comprehension; not only does this run slower, but it also slows down all other uses of that variable in the surrounding scope (because it becomes a "cell" throughout the scope). Someone could time how well

    y = 1
    sum([x*y for x in R])

fares compared to

    y = 1
    def gen():
        for x in R:
            yield x*y
    sum(gen())

for R in (range(N) for N in (100, 1000, 10000)). --Guido van Rossum (home page: http://www.python.org/~guido/)
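Guido's suggested experiment can be sketched with the timeit module (a hypothetical harness; absolute numbers will of course vary by machine and Python version):

```python
import timeit

def use_listcomp(R, y=1):
    # List comprehension: y is an ordinary local of this function.
    return sum([x * y for x in R])

def use_genfunc(R, y=1):
    # Generator function: y is a free variable, accessed via a cell.
    def gen():
        for x in R:
            yield x * y
    return sum(gen())

for N in (100, 1000, 10000):
    R = range(N)
    t_list = timeit.timeit(lambda: use_listcomp(R), number=200)
    t_gen = timeit.timeit(lambda: use_genfunc(R), number=200)
    print(N, t_list, t_gen)
```

Both versions compute the same sum, so any timing difference isolates the cost of the nested-scope access.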

[Tim]
The implementation could synthesize a generator function abusing default arguments to give the generator's frame locals with the same names.
Yes, I think that could work -- I see no way that something invoked by the generator expression could possibly modify a variable binding in the surrounding scope. <thinks> Argh, someone *could* pass around a copy of locals() and make an assignment into that. But I think we're already deprecating non-read-only use of locals(), so I'd like to ban that as abuse. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 14:04 21.10.2003 -0700, Guido van Rossum wrote:
so this, if I understand:

    def h():
        y = 0
        l = [1,2]
        it = (x+y for x in l)
        y = 1
        for v in it:
            print v

will print 1,2 and not 2,3 unlike:

    def h():
        y = 0
        l = [1,2]
        def gen(S):
            for x in S:
                yield x+y
        it = gen(l)
        y = 1
        for v in it:
            print v

[Guido]
[Samuele]
Argh. Of course. No, I think it should use the actual value of y, just like a nested function. Never mind that idea then. --Guido van Rossum (home page: http://www.python.org/~guido/)
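The semantics Guido settles on here -- the generator expression should use the actual, current value of y, just like a nested function -- is what Python ultimately implemented, so Samuele's first h() yields the "late" values. A runnable sketch (Python 3 spelling):

```python
def h():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # y is looked up when the genexp is iterated
    y = 1
    return [v for v in it]

print(h())  # [2, 3] -- the rebinding of y to 1 is visible
```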

At 15:14 21.10.2003 -0700, Guido van Rossum wrote:
this is a bit OT and too late, but given that our closed over variables are read-only, I'm wondering whether, having a 2nd chance, using cells and following mutations in the enclosing scopes is really worth it. We kind of mimic Scheme and relatives, but there outer scope variables are also rebindable. Maybe copying semantics, not using cells for our closures, would not be too insane, and people would not be burnt by trying things like this:

    for msg in msgs:
        def onClick(e):
            print msg
        panel.append(Button(msg, onClick=onClick))

which obviously doesn't do what one could expect today. OTOH as for general mutability, using a mutable object (list, ...) would allow for mutability when one really needs it (rarely).

At 00:43 22.10.2003 +0200, Samuele Pedroni wrote:
of course OTOH cells make it easier to cope with recursive references:

    def g():
        def f(x):
            ... f refers to f ...
        return f

but this seems more an implementation detail, although not using cells would make this rather trickier to support.

[Changing the subject.] [Samuele]
It was done this way because not everybody agreed that closed-over variables should be read-only, and the current semantics allow us to make them writable (as in Scheme, I suppose?) if we can agree on a syntax to declare an "intermediate scope" global. Maybe "global x in f" would work?

    def outer():
        x = 1
        def intermediate():
            x = 2
            def inner():
                global x in outer
                x = 42
            inner()
            print x   # prints 2
        intermediate()
        print x       # prints 42

--Guido van Rossum (home page: http://www.python.org/~guido/)
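For reference, Python 3 eventually grew a 'nonlocal' statement (PEP 3104) that is a more limited form of this idea: it rebinds the *nearest* enclosing binding, rather than naming a specific outer function as "global x in outer" would. A minimal sketch:

```python
def outer():
    x = 1
    def inner():
        nonlocal x  # rebinds x in outer's scope, the nearest binding
        x = 42
    inner()
    return x

print(outer())  # 42
```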

Guido van Rossum wrote:
Why not make local variables attributes of the function, i.e. replace:

    def inner():
        global x in outer
        x = 42

with:

    def inner():
        outer.x = 42

Global variables could then be assigned via:

    global.x = 42

Could this be made backwards compatible? Bye, Walter Dörwald

Because this already means something! outer.x refers to the attribute x of function outer. That's quite different than local variable x of the most recent invocation of outer on the current thread's call stack!
Global variables could then be assigned via: global.x = 42
This has a tiny bit of appeal, but not enough to bother. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido:
Hmmm, maybe x of outer = 42 Determined-to-get-an-'of'-into-the-language-somehow-ly, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

In article <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz>, Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
scope(outer).x = 42 Almost implementable now by using the inspect module to find the first matching scope, except that inspect can't change the local variable values, only look at them. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

On Wednesday 22 October 2003 00:51, Guido van Rossum wrote: ...
Maybe "global x in f" would work?
Actually, I would rather like to DO AWAY with the anomalous 'global' statement and its weird anomalies such as:

    x = 23

    def f1(u):
        if u:
            global x
        x = 45

    def f2():
        if 0:
            global x
        x = 45

    print x
    f2()
    print x
    f1(0)
    print x

"if u:" when u is 0, and "if 0:", should have the same effect to avoid violating the least-astonishment rule -- but when the if's body has a global in it, they don't. Eeek.

Plus, EVERY newbie makes the mistake of taking "global" to mean "for ALL modules" rather than "for THIS module", uselessly using global in toplevel, etc. It's a wart and I'd rather work to remove it than to expand it, even though I _would_ like rebindable outers.

I'd rather have a special name that means "this module" available for import (yes, I can do that with an import hook today). Say that __this_module__ was deemed acceptable for this. Then:

    import __this_module__
    __this_module__.x = 23

lets me rebind the global-to-this-module variable x without 'global' and its various ills. Yeah, the name isn't _too_ cool. But I like the idea, and when I bounced it experimentally in c.l.py a couple weeks ago the reaction was mildly positive and without flames. Making globals a TAD less handy to rebind from within a function would not be exactly bad, either. (Of course 'global' would stay until 3.0 at least, but having an alternative I could explain it as obsolescent:-).

Extending this idea (perhaps overstretching it), some other name "special for import" might indicate outer scopes. Though reserving the whole family of names __outer_<name>__ is probably overdoing it... plus, the object thus 'imported' would not be a module and would raise errors if you tried setattr'ing in it a name that's NOT a local variable of <name> (the import itself would fail if you were not lexically nested inside a function called <name>). Thus this would allow *re-binding* existing local outer names but not *adding* new ones, which feels just fine to me (but maybe not to all).
OK, this is 1/4-baked for the closure issue. BUT -- I'd STILL love to gradually ease 'global' out, think the "import __this_module__" idea is 3/4-baked (lacks a good special name...), and would hate to see 'global' gain a new lease of life for sophisticated uses...;-) Alex

Eek. Global statement inside flow control should be deprecated, not abused to show that global is evil. :-)
Plus. EVERY newbie makes the mistake of taking "global" to mean "for ALL modules" rather than "for THIS module",
Only if they've been exposed to languages that have such globals.
uselessly using global in toplevel,
Which the parser should reject.
I think it's not unreasonable to want to replace global with attribute assignment of *something*. I don't think that "something" should have to be imported before you can use it; I don't even think it deserves to have leading and trailing double underscores. Walter suggested 'global.x = 23' which looks reasonable; unfortunately my parser can't do this without removing the existing global statement from the Grammar: after seeing the token 'global' it must be able to make a decision about whether to expand this to a global statement or an assignment without peeking ahead, and that's impossible.
If we removed global from the language, how would you spell assignment to a variable in an outer function scope? Remember, you can *not* use 'outer.x' because that already refers to a function attribute. --Guido van Rossum (home page: http://www.python.org/~guido/)

Yes, I think if we go this path, global should behave as a predefined variable. Maybe we should call it __globals__ after all, consistent with __file__ and __name__ (it would create a cycle, but we have plenty of those already). Though I still wish it didn't need underscores. Maybe 'globals' could sprout __getattribute__ and __setattr__ methods that would delegate to the current global module? --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Another idea: We could replace the function globals() with an object that provides __call__ for backwards compatibility, but also has a special __setattr__. Then global assignment would be 'globals.x = 23'. Would this be possible? Bye, Walter Dörwald

Yes, I just proposed this in my previous response. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

So maybe the idea of using function attributes isn't totally nuts, if you use a special name. E.g. outer.__locals__.x and outer.__globals__.x
-1. Way too ugly. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Wednesday 22 October 2003 01:40, Guido van Rossum wrote: ...
Eek. Global statement inside flow control should be deprecated, not abused to show that global is evil. :-)
OK, let's (deprecate them), shall we...?
Actually, I've seen that happen to complete newbies too. "global" is a VERY strong word -- or at least perceived as such.
uselessly using global in toplevel,
Which the parser should reject.
Again: can we do that in 2.4?
Using attribute assignment is my main drive here. I was doing it via import only to be able to experiment with that in today's Python;-).
So it can't be global, as it must stay a keyword for backwards compatibility at least until 3.0. What about:

    this_module
    current_module
    sys.modules[__name__]   [[hmmm this DOES work today, but...;-)]]
    __module__

...?
scope(outer).x , making 'scope' a suitable built-in factory function. I do think this deserves a built-in. If we have this, maybe scope could also be reused as e.g. scope(global).x = 23 ? I think the reserved keyword 'global' SHOULD give the parser no problem in this one specific use (but, I'm guessing...!). Alex

(Changing the subject yet again)
We can't expect everybody to guess the rules of the language purely based on the symbols used. But I appreciate the argument; 'global' comes from ABC's SHARE, but ABC doesn't have modules. (It does have workspaces, but AFAIR there is no communication at all between workspaces, so it isn't unreasonable that a SHAREd name in one workspace isn't visible in another workspace.)
Submit a patch. It'll probably break plenty of code though (I bet you including Zope :-), so you'll have to start with a warning in 2.4.
You could have written an import hook that simply inserted __globals__ in each imported module. :-)
__module__ can't work because it has to be a string. (I guess it could be a str subclass but that would be too perverse.) Walter and I both suggested hijacking the 'globals' builtin. What do you think of that?
Hm. I want it to be something that the compiler can know about reliably, and a built-in function doesn't work (yet). The compiler currently knows enough about nested scopes so that it can implement locals that are shared with inner functions differently (using cells). It's also too asymmetric -- *using* x would continue to be just x.

Hmm. That's also a problem I have with changing global assignment -- I think the compiler should know about it, just like it knows about *using* globals. And it's not just the compiler. I think it requires more mental gymnastics of the human reader to realize that

    def outer():
        def f():
            scope(outer).x = 42
            print x
        return f
    outer()()

prints 42 rather than being an error. But how does the compiler know to reserve space for x in outer's scope?

Another thing is that your proposed scope() is too dynamic -- it would require searching the scopes that (statically) enclose the call for a stack frame belonging to the argument. But there's no stack by the time f gets called in the last example! (The current machinery for nested scopes doesn't reference stack frames; it only passes cells.)
I don't want to go there. :-) (If it wasn't clear, I'm struggling with this subject -- I think there are good reasons for why I'm resisting your proposal, but I haven't found them yet. The more I think about it, the less I like 'globals.x = 42'.) --Guido van Rossum (home page: http://www.python.org/~guido/)

At 17:42 21.10.2003 -0700, Guido van Rossum wrote:
. suggests runtime, for compile time then maybe

    global::x = 42
    module::x = 42
    outer::x = 42

(I don't like those, and personally I don't see the need to get rebinding for closed-over variables, but anyway) another possibility is that today <name> <name> is a syntax error, so maybe

    global x = 42

or

    module x = 42

they would not be statements; this for symmetry would also be legal:

    y = module x + 1

then

    outer x = 42

and also

    y = g x + 1

the problems are also clear: in some other languages x y is function application, etc.

[Samuele]
. suggests runtime, for compile time then maybe
Right, that's what I don't like about it.
I don't like these either.
Juxtaposition of names opens a whole lot of cans of worms -- for one, it makes many more typos pass the parser. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Tue, 2003-10-21 at 20:42, Guido van Rossum wrote:
I think it's good that attribute assignment and variable assignment look different. An object's attributes are more dynamic than the variables in a module. I don't see much benefit to conflating two distinct concepts. Jeremy

Guido> (If it wasn't clear, I'm struggling with this subject -- I think Guido> there are good reasons for why I'm resisting your proposal, but I Guido> haven't found them yet. The more I think about it, the less I Guido> like 'globals.x = 42' . How about __.x = 42 ? Skip

How about
__.x = 42
Too much line-noise, so too Perlish. :-) I don't like to use a mysterious symbol like __ for a common thing like a global variable. I also don't think I want global variable assignments to look like attribute assignments. --Guido van Rossum (home page: http://www.python.org/~guido/)

Go easy on me for piping up here, but aren't they attribute assignments or at least used as such? After reading the other posts in this thread I wonder if it would be helpful to have more information on how "global" is used in practice (and maybe some of those practices will be deemed bad, who knows).
From my (a user of Python) perspective, "global" has two uses:
1) Attaching objects to the module, so that other modules do a module.name to get at the object
2) Putting objects in some namespace visible to the rest of the module.

Now whether or not #1 is "good" or "bad" - I don't know, but it sure looks like attribute assignment to me. Again, please regard this as just feedback from a user, but especially outside of the module it looks and acts like attribute assignment; I would expect the same to be true inside the module, and any distinction would seem arbitrary or artificial (consider, for example, that it is not an uncommon practice to write a module instead of a class if the class would be a singleton).

As for #2, I personally don't use global at all because it just rubs me the wrong way (the same way it would if you removed "self." in a method and made bind-to-instance implicit like in C++). Instead, many of my modules have this at the top:

    class GV:
        someVar1 = None
        someVar2 = 5

(where GV = "global variables") I felt _really_ guilty doing this the first few times and I continue to think it's yucky, but I don't know of a better alternative, and this approach reads better, especially compared to:

    global foo
    <more than a few lines of code>
    foo = 10

Seeing GV.foo = 10 adds a lot to readability.
Shutting up, -Dave

On Wednesday 22 October 2003 07:07 pm, Dave Brueck wrote: ...
I entirely agree with this "user of Python" perspective, and I think it's a pity it's been ignored in the following discussion.
and any distinction would seem arbitrary or artificial (consider, for
Yes! If the compiler needs to be aware of global assignments (which IS a good idea) we can do so by either introducing a new "operator keyword", OR something like Barry's suggestion of "import __me__" with __me__ as a magicname recognized by the compiler (hey, if it can recognize __future__ why not __me__?-). But to the Python user, making things look similar when their semantics and use ARE similar is a wonderful idea.
example, that it is not an uncommon practice to write a module instead of a class if the class would be a singleton).
Indeed, that IS the officially recommended practice (and Guido emphasized that in rather adamant words after he had recovered from the shock of seeing the Borg nonpattern presented at a Python-UK session...:-). Alex

On Sat, Oct 25, 2003 at 04:03:17PM +0200, Alex Martelli wrote:
Yes! If the compiler needs to be aware of global assignments (which IS a good idea) we can do so by either introducing a new "operator keyword"
One thing that I've always wondered about, why can't one do:

    def reset_foo():
        global foo = []    # declare as global and do assignment

As Alex pointed out in another mail (I'm paraphrasing liberally): redundancy is bad. By having to declare foo as global, there's a guaranteed redundancy of the variable when foo is also assigned. I don't know if this solution would make Alex dislike global less. But it changes global to look more like a statement, rather than a declaration. Neal

On Saturday 25 October 2003 04:29 pm, Neal Norwitz wrote:
Indeed, you can see 'global', in this case, as a kind of "operator keyword", modifying the scope of foo in an assignment statement. I really have two separate peeves against global (not necessarily in order of importance, actually):

-- it's the wrong keyword, doesn't really _mean_ "global"
-- it's a "declarative statement", the only one in Python (ecch) (leading to weird uncertainty about where it can be placed)
-- "side-effect" assignment to globals, such as in def, class &c statements, is quite tricky and error-prone, not useful

Well, OK, _three_ peeves... usual Spanish Inquisition issue...:-)

Your proposal is quite satisfactory wrt solving the second issue, from my viewpoint. It would still create a unique-in-Python construct, but not (IMHO) a problematic one. As you point out, it _would_ be more concise than having to separately [a] say foo is global then [b] assign something. It would solve any uncertainty regarding placement of 'global', and syntactically impede using global variables in "rebinding as side-effect" cases such as def &c, so the third issue disappears. The first issue, of course, is untouched:-). It can't be touched without choosing a different keyword, anyway. So, with 2 resolutions out of 3, I do like your idea.

However, I don't think we can get there from here. Guido has explained that the parser must be able to understand a statement that starts with 'global' without look-ahead; I don't know if it can keep accepting, for bw compat and with a warning, the old

    global xx

while also accepting the new and improved

    global xx = 23

But perhaps it's not quite as hard as the "global.xx = 23" would be. I find Python's parser too murky & mysterious to feel sure.

Other side issues: if you rebind a module-level xx in half a dozen places in your function f, right now you only need ONE "global xx" somewhere in f (just about anywhere); with your proposal, you'd need to flag "global xx = 23" at each of the several assignments to that xx.
Now, _that suits me just fine_: indeed, I LOVE the fact that a bare "xx = 23" is KNOWN to set a local, and you don't have to look all over the place for declarative statements that might affect its semantics (hmmm, perhaps a 4th peeve vs global, but I see it as part and parcel of peeve #2:-). But globals-lovers might complain that it makes using globals a TAD less convenient. (Personally, I would not mind THAT at all, either: if as a result people use 10% fewer globals and replace them with arguments or classes etc, I think that will enhance their programs anyway;-). So -- +1, even though we may need a different keyword to solve [a] the problem of getting there from here AND [b] my peeve #1 ...:-). Alex

[Neal]
def reset_foo(): global foo = [] # declare as global and do assignment
[Alex]
I haven't heard anyone else in this thread agree with you on that one. I certainly don't think it's of earth-shattering ugliness.
-- it's a "declarative statement", the only one in Python (ecch) (leading to weird uncertainty about where it can be placed)
I'd be happy to entertain proposals for reasonable restrictions on where 'global' can be placed. (Other placements would have to be deprecated at first.)
-- "side-effect" assignment to globals, such as in def, class &c statements, is quite tricky and error-prone, not useful
Agreed; nobody uses these, but again this can be fixed if we want to (again we'd have to start deprecating existing use first). Note that this is also currently allowed and probably shouldn't be:

    def f():
        global x
        for x in ...:
            ...
Well, *every* construct is "unique in Python", isn't it? Because Python has only one of each construct, in line with the TOOWTDI zen. Or do you mean "not seen in other languages"? I'd disagree -- lots of languages have something similar, e.g. "int x = 5;" in C or "var x = 5" in JavaScript. IMO, "global x = 5" is sufficiently similar that it will require no time to learn.
I don't think that Neal's proposal solves #3, unless 'global x = ...' becomes the *only* way. Also, I presume that the following:

    def f():
        global x = 21
        x *= 2
        print x

should continue to be valid, and all three lines should reference the same variable. But #3 is moot IMO, it can be solved without mucking with global at all, by simply making the parser reject 'class X', 'def X', 'import X' and 'for X' when there's also a 'global X' in effect. Piece of cake.
There is absolutely no problem recognizing this.
But perhaps it's not quite as hard as the "global.xx = 23" would be. I find Python's parser too murky & mysterious to feel sure.
If you can understand what code can be recognized by a pure recursive descent parser with one token lookahead and no backtracking, you can understand what Python's parser can handle.
You may love this for assignments, but for *using* variables there is already no such comfort. Whether "print xx" prints a local or global variable depends on whether there's an assignment to xx anywhere in the same scope. So I don't think that is a very strong argument.
--Guido van Rossum (home page: http://www.python.org/~guido/)

It seems no one liked (or remembered) an idea I proposed last February, but I'm going to repost it anyway: How about adding a "rebinding" operator, for example spelled ":=":

    a := 2

It would mean: bind the value 2 to the nearest scope that defines 'a'. Original post: http://mail.python.org/pipermail/python-dev/2003-February/032764.html A better summary by someone else who liked it: http://groups.google.com/groups?selm=mailman.1048248875.10571.python-list%40python.org

Advantages: no declarative statement (I don't like global much to begin with, but much less for scope declarations other than what it means now). It's a nice addition to the current scoping rule: an assignment IS a scope declaration. Possible disadvantage: you can only rebind to the nearest scope that defines the name. If there's a farther scope that also defines that name you can't reach that. But that's nicely symmetrical with how _reading_ values from nested scopes works today; shadowing is nothing new. Ideally, augmented assignments would also become "rebinding". However, this may have compatibility problems. Just

How about adding a "rebinding" operator, for example spelled ":=":
a := 2
I expect Guido would object to that on the grounds that it's conferring arbitrary semantics on a symbol. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

Hardly arbitrary (I have fond memories of several languages that used :=). But what is one to make of a function that uses both a := 2 and a = 2 ??? --Guido van Rossum (home page: http://www.python.org/~guido/)

Hardly arbitrary (I have fond memories of several languages that used :=).
But all the ones I know of use it for ordinary assignment. We'd be having two kinds of assignment, and there's no prior art to suggest which should be = and which :=. That's the "arbitrary" part. The only language I can remember seeing which had two kinds of assignment was Simula, which had := for value assignment and :- for reference assignment (or was it the other way around? :-) I always thought that was kind of weird. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

On Sunday 26 October 2003 10:13, Greg Ewing wrote:
VB6 had LET x = y for value assignment and SET x = y for reference assignment. Yes, very confusing, particularly because the LET keyword could be dropped. Fortunately we're not proposing anything like that;-). Icon had := for irreversible and <- for reversible assignment (also :=: and <-> for exchanges, and different comparisons for == and ===, so maybe it HAD gone a bit overboard:-). I do recall an obscure language where <op>= was always augmented assignment equivalent to a = a <op> b. But in particular the : operator meant to evaluate two exprs and take the RH one, like comma in C, so a := b did turn out to mean the same as a = b BUT fail if a couldn't first be evaluated, which (sort of randomly) is sort of close to Just's proposal. Unfortunately I don't remember the language's name:-(. Googling a bit does show other languages distinguishing global from local variable assignments. E.g., in MUF, http://www.muq.org/~cynbe/muq/muf1_24.html , --> (arrow with TWO hyphens) assigns globally, -> (arrow with ONE hyphen) assigns locally. It appears that this approach is slightly less popular than the 'qualification' one I suggested (e.g. in Javascript you can assign window.x to assign the global x; in Beanshell, super.x to assign to x from enclosing scope) which in turn is less popular than declarations. Another not very popular idea is distinguishing locals and globals by name rules, as in Ruby $glob vs loc or KVirc Glob (upper initial) vs loc (lower initial). Alex

Guido van Rossum wrote:
Hardly arbitrary (I have fond memories of several languages that used :=).
I think augmented assignment should (ideally) also be rebinding, and := kindof looks like an augmented assignment, so I don't think it's all that bad. I'd be used to it in a snap. But: let's not get carried away with this particular spelling, the main question is: "is it a good idea to have a rebinding assignment operator?" (regardless of how that operator is spelled). Needless to say, I think it is.
Simple, "a = 2" means 'a' is local to that function, so "a := 2" will rebind in the same scope. So the following example will raise UnboundLocalException: def foo(): a := 3 a = 2 And this will just work (but is kindof pointless): def foo(): a = 2 a := 3 And this would be a substitute for the global statement: a = 2 def foo(): a := 3 (Alex noted in private mail that one disadvantage of this idea is that it makes using globals perhaps TOO easy...) Just

On Sunday 26 October 2003 04:29, Guido van Rossum wrote:
Now, operator :=) MIGHT indeed be worth considering -- "rebinding assignment with a smile"! Yes, of course := IS a very popular way to denote assignment.
What would astonish me least: the presence of a normal rebinding would ensure a is local. I would prefer, therefore, if the compiler AT LEAST warned about the presence of := at the same scope, and probably I'd be even happier if the compiler flagged it as an outright error. I just can't think of good use cases for wanting both at the same scope on the same name. I can think of a dubious one: a style where = would be used as "initializing declaration" for a name at function start, and all further re-bindings of the name systematically always used := -- I can think of people who might prefer that style, but it might be best for Python to avoid style variance by forbidding it (since it obviously can't be _mandated_, thanks be:-). By forbidding compresence of = and := on the same name at the same scope, := becomes an unmistakable yet unobtrusive symbol saying "this assignment here is to a NON-local name", and thus amply satisfies my long-debated unease wrt "global". Alex

On Sunday 26 October 2003 01:09, Just van Rossum wrote:
In the light of the current discussion, this looks beautiful. At least if compresence of := and other bindings (= , class, def, for, import, ...) for the same name at the same scope is flagged as an error. I would also suggest for simplicity that := be only allowed in the simplest form of assignment: to a single bare name -- no packing, unpacking, chaining, nor can the LHS be an indexing, slicing, nor dotted name.
Yes. Neat. := becomes an unobtrusive but unmistakable indication "I'm binding this name in NON-local scope" and -- if defined with the restrictions I suggest -- meets all of my issues wrt 'global'.
I agree. Reaching other scopes but the "closest" outer one is not a use case of any overriding importance, IMHO.
Ideally, augmented assignments would also become "rebinding". However, this may have compatibility problems.
Unfortunately yes. It might have been better to define them that way in the first place, but changing them now is dubious. Besides, we could not load them with the restrictions I think should be put on := to make it simplest, sharpest, and most useful. Alex

Alex Martelli <aleaxit@yahoo.com>:
I'm not so sure. You need an existing binding before an augmented assignment will work, so I don't think there can be any correct existing usages that would be broken by this. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

[attribution lost]
Ideally, augmented assignments would also become "rebinding". However, this may have compatibility problems.
[Alex]
Unfortunately yes. It might have been better to define them that way in the first place, but changing them now is dubious.
[Greg]
Indeed. If x is neither local nor declared global, x+=... is always an error, even if an x at an intermediate level exists, so THAT shouldn't be used as an argument against this. --Guido van Rossum (home page: http://www.python.org/~guido/)
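Guido's point still holds in current Python and is easy to demonstrate: an augmented assignment to a name bound only in an enclosing function fails at runtime, because the augmented assignment itself makes the name local to the inner function:

```python
def outer():
    x = 23
    def inner():
        x += 1  # raises UnboundLocalError: x is local here but unbound
    inner()

try:
    outer()
    print("no error")
except UnboundLocalError as e:
    print("raised:", e)
```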

On Monday 27 October 2003 04:58 am, Guido van Rossum wrote:
Actually, if the compiler were able to diagnose that, it would be wonderful -- but I don't think it can, because it can make no assumptions regarding what might be defined in global scope (or at least it definitely can't make any such assumptions now). So, yes, any sensible program that works today would keep working. I dunno about NON-sensible programs such as:

    def outer():
        x = 23
        def inner():
            exec 'x = 45'
            x += 1
        # etc etc

but then I guess the presence of 'exec' might be defined to change semantics of += and/or disallow := or whatever else, just as today it turns off local-variable optimizations.

My slight preference for leaving += and friends alone is that a function using them to rebind nonlocals would be hard to read; that since the change only applies when the LHS is a bare name, the important use cases for augmented assignment don't apply anyway; and that it's a bit subtle to explain that foo.bar += baz ( += on a dotted name) implies a plain assignment (setattr) on foo.bar, while foo_bar += baz ( += on a bare name) might imply a := assignment (rebinding a nonlocal) IF there are no "foo_bar = baz" elsewhere in the same function BUT would imply a plain assignment if there ARE other plain assignments to the same name in the same function. IOW, it seems to me that we're getting into substantial amounts of subtlety in explaining (and thus maybe in implementing) a functionality change that's not terribly useful anyway and may damage rather than improve readability when it's used.

Taking the typical P. Graham accumulator example, say: with += rebinding, we can code this:

    def accumulator(n=0):
        def increment(i):
            n += i
            return n
        return increment

but without it, we would code:

    def accumulator(n=0):
        def increment(i):
            n := n + i
            return n
        return increment

and it doesn't seem to me that the two extra keystrokes are to be considered a substantial price to pay.
Admittedly in such a tiny example readability is just as good either way, as it's obvious which n we're talking about (there being just one, and extremely nearby wrt the point of use of either += or := ). Suppose we wanted to have the accumulator "saturate" -- if the last value it returned was > m it must restart accumulating from zero. Now, without augmented assignment:

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n := i
            else:
                n := n + i
            return n
        return increment

we have a pleasing symmetry and no risk of errors -- if we mistakenly use an = instead of := in either branch the compiler will be able to let us know immediately. (Actually I'd be quite tempted to code the if branch as "n := 0 + i" to underscore the symmetry, but maybe I'm just weird:-). If we do rely on augmented assignment being "rebinding":

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n = i
            else:
                n += i
            return n
        return increment

the error becomes a runtime rather than compile-time one, and does take a (small but non-zero) time to discover it. The += 's subtle new semantics (rebinds either a local or nonlocal, depending on how other assignments elsewhere in the function are coded) do make it slightly harder to understand and explain, compared to my favourite approach, which is:

    := is the ONLY way to rebind a nonlocal name (and only ever does
    that, only with a bare name on LHS, etc, etc)

which can't be beaten in terms of how simple it is to understand and explain. The compiler could then diagnose an error when it sees := and += used on the same bare name in the same function (and perhaps give a clear error message suggesting non-augmented := usage in lieu of the augmented assignment). Can somebody please show a compelling use case for some "nonlocal += expr" over "nonlocal := nonlocal + expr", sufficient to override all the "simplicity" arguments above?
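[Editor's aside: neither ':=' nor a rebinding '+=' exists in today's Python, but for what it's worth the accumulator can already be written without either, by closing over a mutable object instead of rebinding a bare name. A sketch, not part of any proposal:]

```python
# A one-element list serves as a writable "cell" in the enclosing
# scope, sidestepping the need to rebind the bare name n.
def accumulator(n=0):
    cell = [n]
    def increment(i):
        cell[0] += i   # mutates the cell; no nonlocal rebinding needed
        return cell[0]
    return increment

acc = accumulator(10)
assert acc(5) == 15
assert acc(5) == 20
```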
I guess there must be some, since popular feeling appears to be in favour of having augmented-assignment as "rebinding", but I can't see them. Alex

Alex Martelli wrote:
To an extent you're only making it _more_ difficult by saying "x := ..." rebinds to a non-local name" instead of "x := rebinds to x in whichever scope x is defined (which may be the local scope)". With the latter definition, there's less to explain regarding "x += ..." as a rebinding operation. I find that _if_ we were to add a rebinding operator, it would be extremely silly not to allow augmented assignments to be rebinding, perhaps even patronizing: "yes you can assign to outer scopes, but no you can't use augmented assignments for that since we think it makes it too difficult for you." We should either _not_ allow assignments to outer scopes at all, _or_ allow it and make it as powerful as practically possible. I don't think allowing it with non-obvious (arbitrary) limitations is a good idea. For example, the more I think about it, the more I am _against_ disallowing "a, b := b, a". That said, someone made a point here that rebinding is a behavior of a variable, not the assignment operation: that's a very good one indeed, and does make me less certain of whether adding := would be such a good idea after all. Just

I think you're making this sound more complicated than it is. I don't think you'll ever *have* to explain this anyway, as long as := and += use the same rules to find their target (I'd even accept rejecting the case where the target is a global for which the compiler can't find a single assignment, breaking an utterly minuscule amount of bad code, if any). I'm *not* saying that I like := (so far I still like 'global x in f' better) but I think that either way of allowing rebinding nonlocals will also have to allow rebinding them through += and friends. I think the main weakness (for me) of := and other approaches that try to force you to say you're rebinding a nonlocal each time you do it is beginning to show: there are already well-established rules for deciding whether a bare name is local or not, and those rules have always worked "at a distance". The main reason for disallowing rebinding nonlocals in the past has been that one of those rules was "if there's a bare-name assignment to it it must be local (unless there's also a global statement for it)" (and I couldn't find a satisfactory way to add a nonlocal declarative statement and I didn't think it was a huge miss -- actually I still think it's not a *huge* miss).
That's the argument that has always been used against += by people who don't like it. The counterargument is that (a) the savings in typing isn't always that small, and (b) += *expresses the programmer's thought better*. Personally I expect that as soon as nonlocal rebinding is supported in any way, people would be hugely surprised if += and friends were not.
Hah. Another argument *against* rebinding by :=, and *for* a nonlocal declaration. With 'nonlocal n, m' in increment() (or however it's spelled :-) the intent is clear.
--Guido van Rossum (home page: http://www.python.org/~guido/)

On Mon, Oct 27, 2003 at 07:11:16AM -0800, Guido van Rossum wrote:
I dislike := very much. I think it will confuse newbies and thus be abused. While I dislike the global declaration, I don't feel strongly about changing or removing it. The best alternative I've seen that addresses nested scope and the global declaration is to borrow :: from C++:

    foo = DEFAULT_VALUES
    counter = 0

    def reset_foo():
        ::foo = DEFAULT_VALUES

    def inc_counter():
        ::counter += 1

    def outer():
        counter = 5
        def inner():
            ::counter += outer::counter   # increment global from outer
            outer::counter += 2           # increment outer counter

The reasons why I like this approach:

  * each variable reference can be explicit when necessary
  * no separate declaration
  * concise, no wording issues like global
  * similarity between global and nested scopes (ie, ::foo is global, scope::foo is some outer scope); both the global and nested issues are handled at once
  * doesn't prevent augmented assignment
  * it reads well to me and the semantics are pretty clear (although that's highly subjective)

Neal

The only problem with using :: is a syntactic ambiguity: a[x::y] already means something (an extended slice with start=x, no stop, and step=y). --Guido van Rossum (home page: http://www.python.org/~guido/)
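[Editor's aside: the clash Guido points out is easy to verify in current Python, where 'a[x::y]' already parses as an extended slice:]

```python
a = list(range(10))
# start=2, no stop, step=3 -- exactly the form Guido mentions
assert a[2::3] == [2, 5, 8]
# and with start omitted, '::y' is itself a valid slice:
assert a[::4] == [0, 4, 8]
```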

On Mon, Oct 27, 2003 at 08:51:16AM -0800, Guido van Rossum wrote:
I'm not wedded to the :: digraph, I prefer the concept. :: was nice because it re-used a similar concept from C++. No other digraph jumps out at me. Some other possibilities (I don't care for any of these):

    Global        Nested
    ------        ------
    :>variable    scope:>variable
    *>variable    scope*>variable
    ->variable    scope->variable
    ?>variable    scope?>variable
    &>variable    scope&>variable

Or perhaps variations using <. Neal

On Monday 27 October 2003 06:08 pm, Neal Norwitz wrote:
Does it have to be a digraph? We could use one of the ASCII chars Python doesn't use. For example, $ would give us exactly the same way as Ruby to strop global variables (though, differently from Ruby, we'd only _have_ to strop them on rebinding -- more-common "read" accesses would stay clean) -- $variable meaning 'global'. And scope$variable meaning 'outer'. OTOH, if we used @ instead, it would read better the other way 'round -- variable@scope DOES look like a pretty natural way to indicate "said variable at said scope" -- though it doesn't read quite as well _without_ a scope. Alex

On Monday 27 October 2003 07:00 pm, Just van Rossum wrote:
Sorry, just, but I really don't understand the "don't see immediate problem". As I understand the proposal:

    y = 23
    biglist = range(999)

    def f():
        y = 45    # sets a local
        ::y = 67  # sets the global
        print biglist[::y]

should this print the 67-th item of biglist, or every 45th one? a[x::y] is similarly made ambiguous (slice from x step y, or index at y in scope x?), at least for human readers if not for the compiler -- to have the same expression mean either thing depending on whether x names an outer function, a local variable, or neither, or both, for example, would seem very confusing to me.
I like Neal's proposal, including the "::" digraph.
I just don't see how :: can be used nonconfusingly due to the 'clash' with "slicing with explicit step and without explicit stop" (ambiguity with slices with implicit 0 start for prefix use, a la ::y -- ambiguity with slices with explicit start for infix use, a la x::y). A digraph, single character, or other operator that could be used (and look nice) in lieu of :: either prefix or infix -- aka "stropping by any other name", even though the syntax sugar may look different from Ruby's use of prefix $ to strop globals -- would be fine. But I don't think :: can be it. Alex

Walter Dörwald:
I think ':=' is too close to '='. The default assignment should be much easier to type than the special case.
Well, typing "outer x = value" would require 6 more keystrokes than "x = value". Would that be difficult enough for you? :-) Greg Ewing, greg@cosc.canterbury.ac.nz

The best alternative I've seen that addresses nested scope and the global declaration. Is to borrow :: from C++:
-1000! I hate it whenever an otherwise sensible language borrows this ugly piece of syntax. Greg Ewing, greg@cosc.canterbury.ac.nz

On Monday 27 October 2003 04:11 pm, Guido van Rossum wrote: ...
I don't think you'll ever *have* to explain this anyway, as long as := and += use the same rules to find their target (I'd even accept
Actually, I'd like to make a := ... an error when there's an a = ... in the same function, so it can't be exactly the same rules for a += ... in my opinion.
I'm *not* saying that I like := (so far I still like 'global x in f'
Ah well.
There are, but they represent a wart (according to AMK's python-warts page, http://www.amk.ca/python/writing/warts.html , and I agree with him on this, although NOT with his suggested fix of having the compiler "automatically adding a global when needed" -- I don't like too-clever compilers that make subtle inferences behind my back, and I think that the fact that Python's compiler doesn't is a strength, not a weakness). The "well-established rules" also cause one of the "10 Python pitfalls" listed at http://zephyrfalcon.org/labs/python_pitfalls.html . My personal experience teaching/consulting/mentoring confirms this, although I, personally, don't remember having been bitten by this (but then, I recall only 2 of those 10 pitfalls as giving trouble to me personally, as opposed to people I taught/advised/etc: mutable default arguments, and "loops of x=x+y" performance traps for sequences). It seemed to me that introducing := (or other approaches that require explicit denotation of "I'm binding a nonlocal here") was a chance to FIX the warts/pitfalls of those "already well-established rules". Albeit with a heavy heart, I would consider even a Rubyesque stropping of nonlocals (Ruby uses $foo to mean foo is nonlocal, others here have suggested :foo, whatever, it's not the sugar that matters most to me here) preferable to using "declarative statements" for the purpose. Oh well.
Agreed, not huge, just probably marginally worth doing. Should it make "declarative statements" more popular and widely used than today's bare "global", I don't even know if it would be worth it. I don't like declarative statements. I don't understand why you like them here, when, in your message of Thursday 23 October 2003 06:25:49 on "accumulator display syntax", you condemned a proposal "because it feels very strongly like a directive to the compiler". "A directive to the compiler" is exactly how "global" and other proposed declarative-statements feel to me: statements that don't DO things (like all other statements do), but strictly and only are "like a directive to the compiler".
The saving in typing is not always small _when on the left of the augmented assignment operator you have something much more complicated than just a bare name_. For example,

    counter[current_row + current_column * delta] += current_value

Without += this statement would be too long, and it would be hard to check that the LHS and RHS match exactly -- in practice one would end up breaking it in two,

    current_index = current_row + current_column * delta
    counter[current_index] = counter[current_index] + current_value

which IS still substantially more cumbersome than the previous version using += . But this counterargument does not apply to uses of += on bare names: the saving is strictly limited to the length of the bare name, which should be reasonably small.
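[Editor's aside: a quick check, using Alex's hypothetical names, that the two spellings really do compute the same thing -- the += form just writes the index expression once:]

```python
current_row, current_column, delta, current_value = 3, 2, 10, 7

counter = [0] * 30
counter[current_row + current_column * delta] += current_value

# the cumbersome two-statement spelling without +=
other = [0] * 30
current_index = current_row + current_column * delta
other[current_index] = other[current_index] + current_value

assert counter == other
assert counter[23] == 7   # index is 3 + 2*10 = 23
```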
We could try an opinion poll, but it's probably worth it only if this measure of "expected surprise" was the key point for your decision; if you're going to prefer declarative statements anyway, there's no point going through the aggravation.
I disagree that the example is "an argument for declarations": on the contrary, it's an argument for := without "rebinding +=". The erroneous example just reposted gives a runtime error anyway (I don't know why I wrote it would give a compile-time error -- just like a bare "def f(): x+=1" doesn't give a compile-time error today, so, presumably, wouldn't this reposted example). If "n := n + i" WAS used in lieu of the augmented assignment, THEN -- and only then -- could we give the preferable compile- time error, for forbidden mixing of "n = ..." and "n := ..." in different spots in the same function. Alex

At 07:11 27.10.2003 -0800, Guido van Rossum wrote:
I'm *not* saying that I like := (so far I still like 'global x in f' better)
if I understand, 'global x in f' will introduce a local x in f even if there is none, for symmetry with global. Maybe this has already been answered (this thread is getting too long, and is this change scheduled for 2.4 or 3.0?) but:

    x = 'global'

    def f():
        def init():
            global x in f
            x = 'in f'
        init()
        print x

    f()

will this print 'global' or 'in f' ? I can argue it both ways, which is not a good thing. Thanks.

The compiler does a full analysis so it will know that init() refers to a cell for x in f's locals, and hence it will print 'in f'. For the purposes of deciding which variables live where, the presence of 'global x in f' inside an inner function (whether or not there's a matching assignment) is equivalent to the presence of an assignment to x in f's body. --Guido van Rossum (home page: http://www.python.org/~guido/)
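[Editor's aside: since 'global x in f' is only proposed syntax, here is a sketch of the semantics Guido describes, emulated with a mutable stand-in for f's local x; the dict 'cell' is our invention, not part of the proposal:]

```python
x = 'global'

def f():
    cell = {'x': None}          # stand-in for f's local x
    def init():
        cell['x'] = 'in f'      # emulates: global x in f; x = 'in f'
    init()
    return cell['x']            # f sees the value init() bound

assert f() == 'in f'
assert x == 'global'            # the module-level x is untouched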

Guido:
If we adopt a method of nonlocal assignment that allows the deprecation of "global", then we have a chance to change this, if we think that such "at-a-distance" rules are undesirable in general. Do we think that? Einstein-certainly-seemed-to-ly, Greg Ewing

Alex certainly seems to be arguing this, but I think it's a lost cause. Even Alex will have to accept the long-distance effect of

    def f():
        x = 42
        .
        .  (hundreds of lines of unrelated code)
        .
        print x

And at some point in the future Python *will* grow (optional) type declarations for all sorts of things (arguments, local variables, instance variables) and those will certainly have effect at a distance. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Tuesday 28 October 2003 03:55 am, Guido van Rossum wrote:
I must have some Don Quixote in my blood. Ah, can anybody point me to the nearest windmill, please...?-) Seriously, I realize by now that I stand no chance of affecting your decision in this matter. Nevertheless, and that attitude may indeed be quixotical, I still have (just barely) enough energy not to let your explanation of your likely coming decision stand as if it was ok with me, or as if I had no good response to your arguments. If it's a lost cause, I think it's because I'm not being very good at marshaling the arguments for it, not because those arguments are weak. So, basically for the record, here goes, once more...
I have absolutely no problem with that -- except that it's bad style, but the language cannot, in general, force good style. The language can and should ALLOW good style, but enforcing it is not always possible. In (old) C, there was often no alternative to putting a declaration far away from the code that used the variable, because declarations had to come at block start. Sometimes you could enclose declaration and use in a nested sub-block, but not always. C++ and modern C have removed this wart by letting declarations come at any point before the variable's used, and _encouraging_ (stylistically -- no enforcement) the declaration to come always together with the initialization. That's about all a language can be expected to do in this regard: not forbid "action at a distance" (that would be too confining), but _allow_ and _encourage_ most programs to avoid it. Python is and always has been just as good or even better: there being no separate declaration, you _always_ have the equivalent of it "at the first initialization" (as C++ and modern C encourage but can't enforce), and it's perfectly natural in most cases to keep that close to the region in a function where the name is of interest, if that region comprises only a subset of the function's body. But this, to some extent, is a red herring. "Reading" (accessing) the value referred to by a name looks the name up by rules I mostly _like_, even though it is quite possible that the name was set "far away". As AMK suggests in his "Python warts" essay, people don't often get in trouble with that because _most_ global (module-level, and even more built-in) names are NOT re-bound dynamically. 
So, when I see, e.g., print len(phonebook) it's most often fine that phonebook is global, just as it's fine that len is built-in (it may be argued that we have "too many" built-in names, and similarly that having "too many" global names is not a good thing, but having SOME such names is just fine, and indeed inevitable -- perhaps Python may remedy the "too many built-ins" in 3.0, and any programmer can refactor his own code to alleviate the "too many globals" -- no deep problem here, in either case). Re-binding names is different. It's far rarer than accessing them, of course. And while all uses of "print x" mean (semantics equivalent to) "look x up in the locals, then if not found there in outer scopes, then if not found there in the globals, then if not found there in the builtins" -- a single, reasonably simple and uniform rule, independent from any "purely declarative statement", which just determines where the value will come from -- the situation for "x=42" is currently different. It's a rarer situation than just accessing x; it's _more_ important to know where x will be bound, because that will affect its future lifetime -- which we don't particularly care about when we're just accessing it, but is more important when we're setting it; _and_ (alas!) it's affected by a _possible_, purely-declarative, instruction-to-the-compiler "global" statement SOMEwhere. "Normally", "x=42" binds or rebinds x locally. That's the common case, as rebinding nonlocals is rare. It's therefore a little trap that some (a small % of) the time we are instead rebinding a nonlocal _with no nearby reminder of the fact_. 
No "nearby reminder" is really needed for the _more common_ case of _accessing_ a name -- partly because "where is this being accessed from" is often less crucial (while it IS crucial when _binding_ the name), partly because it's totally common and expected that the "just access" may be doing lookup in other namespaces (indeed, when I write len(x), it's the rare case where len HAS been rebound that may be a trap!-).
Can we focus on the locals? Argument passing, and setting attributes of objects with e.g. "x.y = z" notation, are already subject to rather different rules than setting bare names, e.g. "x.y = z" might perfectly well be calling a property setter x.setY(z) or x.__setattr__('y', z), so I don't think refining those potentially-subtle rules will be a problem, nor that the situation is parallel to "global". However, optional type declarations for local variables might surely be (both problems and parallel:-), depending on roughly what you have in mind for that. E.g., are you thinking, syntax sugar apart, of some new statement "constrain_type" which might go something like...:

    def f():
        constrain_type(int) x, y, z, t
        x = 23      # ok
        y = 2.3     # ??? a
        z = "23"    # ??? b
        t = "foo"   # raise subclass of (TypeError ?)

If so, what semantics do you have in mind for cases a and b? I can imagine either an implicit int() call around the RHS (which is why I guess the assignment to t would fail, though I don't know whether it would fail with a type or value error), or an implicit isinstance check, in which case a and b would also fail (and then no doubt with a type error). I may be weird, but -- offhand, and not having had time to reflect on this in depth -- it seems to me that having assignment to bare names 'fail' in some circumstances, while revolutionary in Python, would not be particularly troublesome in the "action at a distance" sense. After all the constrain_type would have the specific purpose of forbidding some assignments that would otherwise succeed, would be used specifically for that, and making "wrong" assignment fail immediately and noisily would be exactly what it's for. I may not think it a GOOD idea to introduce it (for local variables), but if I argued against it it would not be on the lines of "one can't tell by just looking at y=2.3 whether it succeeds or fails". If the concept is to make y=2.3 implicitly do y=int(2.3) I would be much more worried.
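[Editor's aside: 'constrain_type' is purely hypothetical syntax from the discussion; the "implicit isinstance check" reading of cases a and b can be sketched with an ordinary helper function -- our invention, for illustration only:]

```python
# Under the isinstance reading, both 2.3 and "23" are rejected
# outright rather than implicitly converted.
def checked(value, required_type):
    if not isinstance(value, required_type):
        raise TypeError("expected %s, got %s"
                        % (required_type.__name__, type(value).__name__))
    return value

x = checked(23, int)            # ok
for bad in (2.3, "23", "foo"):  # cases a, b, and the t = "foo" case
    try:
        checked(bad, int)
    except TypeError:
        pass
    else:
        raise AssertionError("should have been rejected")
assert x == 23
```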
THEN, with no clear indication to the contrary, we'd have "y=2.3" leave y with a value of 2.3, or 2, or maybe something else for sufficiently weird values of X in a "constrain_type(X) y" -- the semantics of a CORRECT program would suddenly grow subtle dependencies on "nonlocal" ``declarations''. So, if THAT is your intention -- and indeed that would be far closer to the way "global" works: it doesn't FORBID assignments, rather it changes their semantics -- then I admit the parallel is indeed strict, and I would be worried on basically the same grounds as I'm grumbling about 'global' and its planned extensions. Yes, I realize this seems to be arguing _against_ adaptation -- surely if we had "constrain_type(X) y", and set "y = z", we might like an implicit "y = adapt(z, X)" to be the assignment's semantics? My answer (again, this is a first-blush reaction, haven't thought deeply about the issues) is that adaptation is good, but implicit rather than explicit is ungood, and I'm not sure the good is stronger than the ungood here; AND, adaptation is not typecasting: e.g y=adapt("23", int) should NOT succeed. So, while I might be more intrigued than horrified by such novel suggestions, I would surely see the risks in them -- and some of the risks I'd see WOULD be about "lack of local indication of nonobvious semantics shift". Just like with 'global', yes. Alex

The current situation: Rebinding a variable at module scope:
If I try to write "global x.y" inside the function, Idle spits the dummy (and rightly so). I can rebind x.y quite happily, since I am only referencing x, and the lookup rules find the scope I need. I don't see any reason for 'global <var> in <scope>' or syntactic sugar for nonlocal binding (i.e. ":=" ) to accept anything that the current global does not. Similarly, consider the following from Idle:
    >>> def f():
    ...     x += 1
    ...
    >>> x = 1
    >>> f()
    Traceback (most recent call last):
      File "<pyshell#12>", line 1, in -toplevel-
        f()
      File "<pyshell#10>", line 2, in f
        x += 1
    UnboundLocalError: local variable 'x' referenced before assignment

Augmented assignment does not currently invoke a "global" definition automatically, so why should that change, no matter the outcome of this discussion? Guido's suggestion of "nonlocal" for a variant of global that searches intervening namespaces first seems nice - the term "non-local variable" certainly strikes me as the most frequently used way of referring to variables from containing scopes in this thread.
'nonlocal' could be allowed only to _rebind_ variables, rather than create them at some other scope (probably advisable, since 'nonlocal' merely says 'somewhere other than here', which means there is no obvious suggestion for where to create the new variable - I could argue for either "module scope" or "nearest enclosing scope"). Defining it this way also allows catching problems at compile time instead of runtime (YMMV on whether that's a good thing or not). At this point, Just's "rebinding variable from outer scope only" assignment operator "x := 1" might seem like nice syntactic sugar for "nonlocal x; x = 1" (it wouldn't require a new keyword, either). Is there really any need to allow anything more than replicating the search order for variable _reference_? Code which nests sufficient scopes that a simple 'inside-out' search is not sufficient would just seem sorely in need of a redesign to me. . .

Regards, Nick.
--
Nick Coghlan | Brisbane, Australia
ICQ#: 68854767 | ncoghlan@email.com
Mobile: 0409 573 268 | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

Nick Coghlan strung bits together to say: ::snip:: Saw the rather sensible suggestion to shelve this discussion only _after_ making my previous post. Ah well. Cheers, Nick.

Because of the fair user expectation that if you can write "x = x + 1" you should also be able to write "x += 1".
I just realized one thing that explains why I prefer explicitly designating the scope (as in 'global x in f') over something like 'nonlocal'. It matches what the current global statement does, and it makes it crystal clear that you *can* declare a variable in a specific scope and assign to it without requiring there to be a binding for that variable in the scope itself. EIBTI when comparing these two. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 07:27 28.10.2003 -0800, Guido van Rossum wrote:
looking at:

    x = 'global'

    def f():
        def init():
            global x in f
            x = 'in f'
        def g():
            print x
        init()
        g()

I don't really know whether to call explicit or implicit the fact that x in g is not the global one. And contrast with:

    x = 'global'

    def f():
        x = 0
        def init():
            global x
            x = 'in f'
        def g():
            print x
        init()
        g()

or consider:

    x = 'global'

    def f():
        global x
        def init():
            global x in f
            x = 'in f'
        def g():
            print x
        init()
        g()

At 09:56 AM 10/28/03 +0100, Alex Martelli wrote:
AND, adaptation is not typecasting: e.g y=adapt("23", int) should NOT succeed.
Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and str doesn't have __conform__. But why would you intend that they not have them in future? And, why do you consider adaptation *not* to be typecasting? I always think of it as "give me X, rendered as a Y", which certainly sounds like a description of typecasting to me.

On Tuesday 28 October 2003 02:57 pm, Phillip J. Eby wrote:
I'd be delighted to have the int type sprout __adapt__ and the str type sprout __conform__ -- but neither should accept this case, see below.
typecasting (in Python) makes a NEW object whose value is somehow "built" (possibly in a very loose sense) from the supplied argument[s], but need not have any more than a somewhat tangential relation with them. adaptation returns "the same object" passed as the argument, or a wrapper to it that makes it comply with the protocol. To give a specific example:

    x = file("foo.txt")

now (assuming this succeeds) x is a readonly object which is an instance of file. The argument string "foo.txt" has "indicated", quite indirectly, how to construct the file object, but there's really no true connection between the value of the argument string and what will happen as that object x is read. Thinking of what should happen upon:

    x = adapt("foo.txt", file)

what I envision is DEFINITELY the equivalent of:

    x = cStringIO.StringIO("foo.txt")

i.e., the value (aka object) "foo.txt", wrapped appropriately so as to conform to the (readonly) "file protocol" (I can call x.read(3) and get "foo", then x.seek(0) then x.read(2) and get "fo", etc). Hmmm, that PEP definitely needs updating (including mentions of PyProtocols as well as of this issue...)...! I've been rather remiss about it so far -- sorry. Alex
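[Editor's aside: the behaviour Alex envisions can be tried directly; cStringIO is the Python-2 module, whose modern equivalent is io.StringIO:]

```python
from io import StringIO   # py3 spelling of cStringIO.StringIO

x = StringIO("foo.txt")   # the string itself, wrapped to satisfy the
                          # read-only file protocol -- NOT open("foo.txt")
assert x.read(3) == "foo"
x.seek(0)
assert x.read(2) == "fo"
```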

At 05:55 PM 10/28/03 +0100, Alex Martelli wrote:
You didn't actually give any example of why 'adapt("23",int)' shouldn't return 23, just why adapt("foo",file) shouldn't return a file. Currently, Python objects may possess an __int__ method for conversion to integers, and a __str__ method for conversion to string. So, it seems to me that for such objects, 'adapt(x,int)' should be equivalent to x.__int__() and 'adapt(x,str)' should be equivalent to x.__str__(). So, there is already a defined protocol within Python for conversion to specific types, with well-defined meaning. One might argue that since it's already possible to call the special method or invoke the type constructor, that it's not necessary for there to be an adapt() synonym for them. However, it's also possible to get an object's attribute or call an arbitrary function by exec'ing a dynamically constructed string instead of using getattr() or having functions as first class objects. So, I don't see any problem with "convert to integer" being 'int(x)' and yet still being able to spell it 'adapt(x,int)' in the circumstance where 'int' is actually a variable or parameter, just as one may use 'getattr(x,y)' when the attribute to be gotten is a variable.
I don't understand the dividing line here. Perhaps that's because Python doesn't really *have* an existing notion of typecasting as such, there are just constructors (e.g. int) and conversion methods (e.g. __int__). However, conversion methods and even constructors of immutable types are allowed to be idempotent. 'int(x) is x' can be true, for example. So, how is that different?
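[Editorial aside: Phillip's "'int(x) is x' can be true" point is observable in CPython -- an implementation detail, not a language guarantee:]

```python
x = 23
# Constructors of immutable types may return the argument unchanged
# when it is already exactly the right type (CPython behavior):
assert int(x) is x

s = "abc"
assert str(s) is s
```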
I don't see how any of this impacts the question of whether adapt(x,int) == int(x). Certainly, I agree with you that adapt("foo",file) should not equal file("foo"), but I don't understand what one of these things has to do with the other.

On Tuesday 28 October 2003 07:47 pm, Phillip J. Eby wrote: ...
You didn't actually give any example of why 'adapt("23",int)' shouldn't return 23, just why adapt("foo",file) shouldn't return a file.
Which is sufficient to show that, IN GENERAL, adapt(x, sometype) should not just be the equivalent of sometype(x), as you seemed (and, below, still seem) to argue. Now, if you want to give positive reasons -- specific, compelling use-cases -- to show why for SOME combinations of type(x) and sometype that general rule should be violated, go ahead, but the burden of proof is on you.

If you do want to try and justify such specific-case exceptions, remember: "adapt(x, foo)" is specified as returning "x or a wrapper around x", and clearly a new object of type foo with no actual connection to x is neither of those. That's a "formal" reasoning from the PEP's actual current text.

But perhaps informal reasoning may prove more convincing -- let's try. Adaptation is *NOT* conversion -- it's not the creation of a new object that will thereafter live a life separate from the original one. This part is not relevant when the objects are immutable, but it's quite relevant to your GENERAL idea of, e.g.:
But, still on that general idea of yours that I quote above, there is worse, MUCH worse. Consider: an object's type often supports a __str__ that, as per its specs in the docs, is "the ``informal'' string representation of an object ... convenient or concise representation may be used instead". The docs make it AMPLY clear that the purpose of __str__ is STRICTLY for the object's type to give a (convenient, concise, possibly quite incomplete and inaccurate) HUMAN-READABLE representation of the object.

To assert that this is in any way equivalent to a claim, on the object type's part, that its instances can "adapt themselves to the string protocol", beggars belief. It borders, I think, on the absurd, to maintain that, for example, "<open file '/goo/bag', mode 'r' at 0x402cbae0>" *IS* my open file object "adapted to" the string protocol. It's clearly a mere human-readable representation, a vague phantom of the object itself.

It should be obvious that, just as "adapting a string to the (R/O) file protocol" means wrapping it in cStringIO.StringIO, so the reverse adaptation, "adapting a file to the string protocol", should utilize a wrapper object that presents the file's data with all string object methods, for example via mmap.
So, there is already a defined protocol within Python for conversion to specific types, with well-defined meaning. One might argue that since it's
Conversion is one thing, adaptation is a different thing. Creating a new object "somehow related" to an existing one -- i.e., conversion -- is a very different thing from "wrapping" an existing object to support a different protocol -- adaptation. Consider another typical case:
See the point? CONVERSION, aka construction, aka typecasting, i.e. list(x), has created a new object, based on what WERE the contents of x at the time of conversion, but INDEPENDENT from it henceforwards. Adaptation should NOT work that way: adapt(x, list) would, rather, return a wrapper, providing listlike methods (some, like pop or remove, would delegate to x's own methods -- others, like sort, would require more work) and _eventually performing actual operations on x_, NOT on a separate thing that once, a long time ago, was constructed by copying it.

Thus, I see foo(x) and adapt(x, foo) -- even in cases where foo is a type -- as GENERALLY very different. If you have SPECIFIC use cases in mind where it would be clever to make the two operations coincide, you still haven't made them; I have only heard vague generalities about how adapt(x, y) "should" work, without ANY real support for them.

If the code that requests adaptation is happy, as a fall-back, to have (e.g.) "<open file '/goo/bag', mode 'r' at 0x402cbae0>" as the "ersatz adaptation" of a file instance to str, for example, it can always do the fall-back itself, e.g.:

    try:
        z = adapt(x, y)
    except TypeError:
        try:
            z = y(x)
        except (TypeError, ValueError):
            # whatever other desperation measures it wants to try
            ...

To have adapt itself imply such measures would be a disaster, and would make adaptation basically unusable in all cases where one might have (e.g.) "y is str".
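[Editorial aside: the conversion-vs-adaptation distinction Alex draws can be seen with plain list() -- a minimal illustration, not from the thread:]

```python
x = {'a': 1}
y = list(x)        # conversion: a snapshot of x's keys at this moment
x['b'] = 2         # a later mutation of x...
assert y == ['a']  # ...does not affect the converted copy at all
# An *adapter*, by contrast, would delegate to x and see the new key.
```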
Yeah, that's much like C++, except C++ is more general in terms of conversion methods -- not only can a constructor for type X accept a Y argument (or const Y&, equivalently), but type Y can also always choose to provide an "operator X()" to typecast its instances to the other type [I think I recall that if BOTH types try to cooperate in such ways you end up with an ambiguity error, though :-)]. That's in contrast to the specific few 'conversion methods' that Python supports only for a small set of numeric types as the destination of the conversion. Either the single-argument constructor or the operator may get used when you typecast (static_cast<X>(y) where y is an instance of Y). There isn't all that much difference between C++'s approach and Python's here, except for C++'s greater generality and the fact that in Python you always use the notation X(y) to indicate the typecasting request. ("typecast" is not a C++ term any more than it's Python's -- I think it's only used in some obscure languages such as CIAO, tools like Flex/Harpoon, Mathworks, etc -- so, sorry if my use was obscure).

One important difference: in C++, you get to define whether a one-argument constructor gets evaluated "implicitly", when an object of type X is required and one of type Y is supplied instead, or not. If the constructor is declared explicit, then it ONLY gets called for EXPLICIT typecasts such as X(y). In Python, we think EIBNI ("explicit is better than implicit"), and therefore typecasts are explicit. We do NOT "adapt" a float f to int when an int is required, as in somelist[f]: we raise a TypeError -- if you want COERCION, aka CONVERSION, to an int, with possible loss of information etc, you EXPLICITLY code somelist[int(f)]. Your proposal that adaptation be, when possible, implemented by conversion goes against the grain of that good habit and principle.
Adaptation in general is not conversion -- when you know you want, or at least can possibly tolerate as a fallback, total conversion, ASK for it, explicitly -- perhaps as a fallback if adaptation fails, as above. Having "adapt(x, y)" just basically duplicate some possible cases of y(x) would be a serious diminution of adaptation's potential and usefulness.
it's part of the PEP that, if isinstance(x, y), then normally x is adapt(x, y) [[ with a specific exception for "non substitutable subclasses" whose usecases I do not know -- anyway, such subclasses would need to be _specifically_ "exempted" from the general rule, e.g. by providing an __adapt__ that raises as needed ]]. So, calling y(x) will be wrong EXCEPT when type y is immutable AND it's EXACTLY the case that "type(x) is y", NOT a subclass, otherwise:
... the 'is' constraint is lost, despite the fact that xx IS quite obviously "substitutable" and has requested NO exception to the rule, AT ALL. Again: adaptation is not conversion -- and this is NOT about the admitted defects in the PEP, because this case is VERY specifically spelled out there.

Implementing adapt(x, y) as y(x) may perhaps be of some practical use in some cases, but I am still waiting for you to show any such use case of practical, compelling interest. I hope I have _amply_ shown that the implementation strategy is absolutely out of the question as a general one, so it matters only up to a point whether some very specific subcases are well served by that strategy. The key issue is, such cases, if any, will need to be very specifically identified and justified one by one.

Alex
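[Editorial aside: the example elided by the quoting can be reconstructed along these lines -- hypothetical class name; the original used xx for the subclass instance:]

```python
class TaggedInt(int):
    """A perfectly substitutable int subclass (hypothetical name)."""

xx = TaggedInt(23)
# Per the PEP, adapt(xx, int) must return xx itself, since isinstance holds:
assert isinstance(xx, int)
# But implementing adapt(xx, int) as int(xx) breaks that guarantee:
assert int(xx) is not xx        # a new object -- the 'is' constraint is lost
assert type(int(xx)) is int     # and the subclass information is gone too
```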

At 09:55 PM 10/28/03 +0100, Alex Martelli wrote:
I'm not arguing that, nor have I ever intended to. I merely questioned your appearing to argue that adapt(x,sometype) should NEVER equal sometype(x).
Great, so now you know what you'd like file.__conform__(str) to do. This has nothing to do with what I was asking about. You said, in the post I originally replied to: "y=adapt("23", int) should NOT succeed." And I said, "why not?" This is not the same as me saying that adapt(x,y) for all y should equal y(x). Such an idea is patently absurd.

I might, however, argue that adapt(x,int) should equal int(x) for any x whose __conform__ returns None. Or more precisely, that int.__adapt__(x) should return int(x). And that is why I'm asking why you appear to disagree. However, you keep talking about *other* values of y and x than 'int' and "23", so I'm no closer to understanding your original statement than before.
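[Editorial aside: for readers following the __conform__/__adapt__ vocabulary, here is a minimal sketch of the PEP 246 lookup order the thread assumes -- simplified; the real proposal has more machinery:]

```python
def adapt(obj, protocol):
    """Minimal PEP 246-style adaptation sketch (simplified)."""
    # 1. Let the object try to conform to the protocol.
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # 2. Let the protocol try to adapt the object.
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    # 3. If the object already satisfies the protocol, return it unchanged.
    if isinstance(obj, protocol):
        return obj
    raise TypeError("can't adapt %r to %r" % (obj, protocol))
```

Under this sketch adapt(23, int) returns 23 itself (step 3), while adapt("23", int) raises TypeError unless someone registers a hook -- which is exactly the point under dispute.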
For protocols whose contract includes immutability (such as 'int') this distinction is irrelevant, since a snapshot is required. Or are you saying that adaptation cannot be used to adapt a mutable object to a protocol that includes immutability?
It's you who has proposed how they work, and I who asked a question about your statement.
I'm not aware that I made such a proposal. I asked why you thought that adapt('23',int) should *not* return 23.
[lots more snipped]
We seem to be having two different conversations. I haven't proposed *anything*, only asked questions. Meanwhile, you keep debating my supposed proposal, and not answering my questions! Specifically, you still have not answered my question: Why do you think that 'adapt("23",int)' should not return 23? That is all I am asking, and trying to understand. It is a question, not a proposal for anything, of any kind. Now, it is possible I misunderstood your original statement, and you were not in fact proposing that it should not. If so, then that clarification would be helpful.

All the rest of this about why adapt(x,y) may have nothing to do with y(x) isn't meaningful to me. The fact that 2+2 == 4 and 2*2 == 4 doesn't mean that multiplication is the same as addition! So why would adapt(x,y) and y(x) being equal for some values of x and y mean that adaptation is conversion? You seem to be arguing, however, that that's what I'm saying.

Further, you seem to me to be saying, "Because addition is not multiplication, adding 2 and 2 should not equal 4. That's what multiplication is for, so you should always multiply 2 and 2 to get 4, never add them." And that seems so wrong to me, that I have to ask, "Why would you say a thing like that?" Then, you answer me by saying, "But addition is not multiplication, so why are you proposing that adding two numbers should always produce the same result as multiplying them?" When in fact I have not proposed any such thing, nor would I!

Alex Martelli <aleaxit@yahoo.com>:
Using my "outer" suggestion, augmented assignments to nonlocals would be written outer x += 1 which would make the intention pretty clear, I think. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

On Tuesday 28 October 2003 02:48 am, Greg Ewing wrote:
Absolutely clear, and wonderful. Pity that any alternative to 'global' has been declared "a lost cause" by Guido. I'd still like to forbid "side effect rebinding" via statements such as class, def, import, for -- i.e., no

    outer def f(): ...

and the like. That is, the 'outer' statement should be

    'outer' expr_stmt

(in Grammar/Grammar terms) with the further constraint that the expr_stmt must be an assignment (augmented or not); and the outer statement should not be a 'small_stmt', so as to avoid the ambiguity of

    outer x=1; y=2

(is this binding a local or nonlocal name 'y'?).

Alex

Alex Martelli <aleaxit@yahoo.com>:
i.e., the 'outer' statement should be 'outer' expr_stmt
The way I was thinking, "outer" wouldn't be a statement at all, but a modifier applied to an identifier in a binding position. So, e.g.

    x, outer y, z = 1, 2, 3

would be legal, meaning that x and z are local and y isn't, and

    outer x = 1; y = 2

would mean y is local and x isn't. To make both x and y non-local you would have to write

    outer x = 1; outer y = 2

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

Just> How about adding a "rebinding" operator, for example spelled ":=": Just> a := 2 Just> It would mean: bind the value 2 to the nearest scope that defines Just> 'a'. I see a couple problems: * Would you be required to use := at each assignment or just the first? All the toy examples we pass around are very simple, but it seems that the name would get assigned to more than once, so the programmer might need to remember the same discipline all the time. It seems that use of x := 2 and x = 4 should be disallowed in the same function so that the compiler can flag such mistakes. * This seems like a statement which mixes declaration and execution. Everyone seems to abhor the global statement. Perhaps its main saving grace is that it doesn't pretend to mix execution and declaration. I think to narrow the scope of possible alternatives it would be helpful to know if what we're looking for is a way to allow the programmer only bind in the nearest enclosing scope or if she should be able to bind to an arbitrary enclosing scope. The various ideas seem to be falling into those two categories. Guido, do you have a preference or a pronouncement on that idea? Knowing that would eliminate one category of solutions. Skip

Skip Montanaro wrote:
Just the first; "a = 2" still means "a is local to this scope".
I don't see it as a mistake. := would mean: "bind to whichever scope the name is defined in", and that includes the current scope. I disagree with Alex when he says := should mean "I'm binding this name in NON-local scope".
* This seems like a statement which mixes declaration and execution.
How is that different from "regular" assignment? It mixes declaration and execution in the same way. Just

>> * Would you be required to use := at each assignment or just the
>> first?

Just> Just the first; "a = 2" still means "a is local to this scope".

That seems like a very subtle error waiting to happen...

>> All the toy examples we pass around are very simple, but it seems
>> that the name would get assigned to more than once, so the programmer
>> might need to remember the same discipline all the time. It seems
>> that use of x := 2 and x = 4 should be disallowed in the same
>> function so that the compiler can flag such mistakes.

Just> I don't see it as a mistake. := would mean: "bind to whichever
Just> scope the name is defined in", and that includes the current
Just> scope. I disagree with Alex when he says := should mean "I'm
Just> binding this name in NON-local scope".

Yeah, but if you come back to the code in six months and the nested function is 48 lines long and assigns to x using a variety of ":=" and "=" assignments, it seems to me like it will be hard to tell if there's a problem.

>> * This seems like a statement which mixes declaration and execution.

Just> How is that different from "regular" assignment? It mixes
Just> declaration and execution in the same way.

Not in the way of saying, "this is global and here's its value".

Skip

Skip Montanaro wrote:
Since I said the wrong thing, I'm not sure how to respond to this... Do you still feel the same way with my corrected reply?
In a way := is the opposite of "this is local and here's its value". It says: "this is defined _somewhere_ and here's its new value". Just

"Just" == Just van Rossum <just@letterror.com> writes:
Just> Skip Montanaro wrote:
>> >> * Would you be required to use := at each assignment or just
>> >> the first?
>>
Just> Just the first; "a = 2" still means "a is local to this scope".
>>
>> That seems like a very subtle error waiting to happen...

Just> Since I said the wrong thing, I'm not sure how to respond to
Just> this... Do you still feel the same way with my corrected reply?

Nope.

Skip

Skip Montanaro wrote:
Nope.
Ok :). Yet I think I'm starting to agree with you and Alex that := should mean "this name is NON-local". A couple more things:

- I think augmented assignments CAN be made "rebinding" without breaking code, since currently a += 1 fails if a is neither local nor global.

- Would := be allowed in statements like "self.a := 2"? It makes no sense, but since "(a, b) := (2, 3)" IS meaningful, what about "(a, b, self.c) := (1, 2, 3)"?

Just

On Sunday 26 October 2003 13:37, Just van Rossum wrote:
The more I think about it, the more I like it in its _simplest_ form.
You are right about the breaking code, but I would still slightly prefer to eschew this just for simplicity -- see also below.
I would not allow := in any but the SIMPLEST case: simple assignment to a bare name -- no unpacking (I earlier said "no packing" but that's silly and I misspoke there -- "a := 3, 4, 5" WOULD of course be fine), no chaining, no := when the LHS is an indexing, slicing, or attribute access.

Keeping := Franciscan in its simplicity would make it easiest to implement, easiest to explain, AND would avoid all sorts of confusing cases where the distinction between := and = would otherwise be confusingly nonexistent. It would also make it most effective because it always means the same thing -- "assignment to an (already-existing) nonlocal".

This is much the spirit in which I'd forego the idea of making += etc access nonlocals too, though I guess I'm only -0 on that; it seems simplest and most effective to have the one concept "rebinding a nonlocal name" correspond in a strict 1-1 way to the one notation := . Simplicity and effectiveness feel very Pythonic to me.

I think rebinding nonlocals should be rare enough that having to write e.g. "a := a+1" rather than "a += 1" is a very minor problem. The important use case of += & friends,

    xop[flap].quip(glop).nip[zap] += 1

gets no special benefit from += being deemed "rebinding" -- the rebinding concept applies usefully to bare names, and for a bare name writing

    name := name <op> RHS

is no big deal wrt

    name <op>= RHS

If name's a huge list, name.extend(anotherlist) is a fine substitute for name += anotherlist if you want to keep name nonlocal AND get some efficiency gain. Other containers with similar issues should also always supply a more readable synonym for __iadd__ in such uses, e.g. sets do, supplying union_update. So, keeping += &c just like today seems acceptable and preferable.

Alex

On Sun, Oct 26, 2003, Alex Martelli wrote:
Sounds good to me. Question: what does this do?

    def f():
        def g(x):
            z := x
        g(3)
        print z
        return g

    g = f()
    print z
    g('foo')
    print z

That is, in the absence of a pre-existing binding, where does the binding for := go? I think it should be equivalent to global, going to the module scope.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"It is easier to optimize correct code than to correct optimized code." --Bill Harlan

On Sunday 26 October 2003 04:46 pm, Aahz wrote:
I think it should raise some subclass of NameError, because it's not an assignment to an _already-existing_ nonlocal, as per my text quoted above. It does not seem to me that "nested functions able to rebind module-level names" has compelling use cases, so I would prefer the simplicity of forbidding this usage. Alex

> Sounds good to me. Question: what does this do?
>
> def f():
>     def g(x):
>         z := x
...
> That is, in the absence of a pre-existing binding, where does the
> binding for := go? I think it should be equivalent to global, going to
> the module scope.

This is one place I think an extension of the global statement has a definite advantage:

    def f():
        def g():
            global z in f
            z = x

Skip

Skip Montanaro strung bits together to say:
Alternately (using Just's 'rebinding non-local' syntax):

    def f():
        z = None
        def g():
            z := x

Cheers, Nick.

-- Nick Coghlan | Brisbane, Australia
ICQ#: 68854767 | ncoghlan@email.com
Mobile: 0409 573 268 | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

Alex Martelli wrote:
Minor, sure, but I think it's an unnecessary restriction, just like many people think Python's current inability to assign to outer scopes is unnecessary. If we have a rebinding operator, it'll be very surprising if augmented assignment ISN'T rebinding. It's just such a natural fit.

Just

Just> - Would := be allowed in statements like "self.a := 2"? It makes
Just> no sense, but since "(a, b) := (2, 3)" IS meaningful, what about
Just> "(a, b, self.c) := (1, 2, 3)"?

Ummm... This doesn't seem to be strengthening your argument. ;-)

Skip

On Sunday 26 October 2003 11:42, Skip Montanaro wrote: ...
I entirely agree with you. There is no good use case that I can see for this mixture, and prohibiting it helps the compiler help the programmer.
* This seems like a statement which mixes declaration and execution.
That's actually the PLAIN assignment statement, which mixes assigning a value with telling the compiler "this name is local" (other binding statements such as def, class etc also do that). Alex

Nothing deep -- it just never occurred to me. I was mimicking ABC's "SHARE foo", which doesn't have this because its syntax for assignment is the more verbose "PUT value IN variable". I don't think it'll entice Alex though. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

On Sunday 26 October 2003 12:20 am, Guido van Rossum wrote:
Ah, you haven't seen my answer to it? I think it meets most of my objections -- all but the distaste for the keyword 'global' itself -- and I could definitely live with this more happily than with any other use of 'global'. Please see my direct response to Neal for more details. Alex

Alex Martelli wrote:
So it can't be global, as it must stay a keyword for backwards compatibility at least until 3.0.
Why? Removing keywords should be much simpler than adding them. I have no idea how hard it is to hack the parser to adjust, but I can't imagine how having 'global' no longer be a keyword, as far as the parser is concerned, would break backward compatibility. What am I missing?

[David]
I don't recall the context, but I think the real issue with removing 'global' is that there's too much code out there that uses the global syntax to remove the global statement before 3.0. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote: [Alex]
So it can't be global, as it must stay a keyword for backwards compatibility at least until 3.0.
[David]
[GvR]
I would never have suggested that. Just that we can evolve the parser to retain the old usage

    global a, b, c

while allowing a new usage

    global.a = value

by removing 'global' from the list of reserved words and doing "fancy stuff" in the parser. Note that I very much don't know the details of the "fancy stuff".

--david

[David]
Ah. *If* we want to parse both it would be easier to keep global as a keyword and do fancy stuff to recognize the second form... But I think somewhere in the mega-thread about this topic is hidden the conclusion that there are better ways to do this. --Guido van Rossum (home page: http://www.python.org/~guido/)

"Guido van Rossum" <guido@python.org> wrote in message news:200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com...
Eek. Global statement inside flow control should be deprecated, not abused to show that global is evil. :-)
Is there any good reason to ever use globals anywhere other than as the first statement (after doc string) of a function? If not, could its usage be so restricted (like __future__ import)?
Plus. EVERY newbie makes the mistake of taking "global" to mean "for ALL modules" rather than "for THIS module",
Part of my brain still thinks that, and another part has to say, 'no, just modular or mod_vars()'.
Only if they've been exposed to languages that have such globals.
Like Python with __builtins__? which I think of as the true globals. Do C or Fortran count as such a source of 'infection'?
uselessly using global in toplevel,
Which the parser should reject.
Good. The current nonrejection sometimes leads beginners astray because they think it must be doing something. While I use global/globals() just fine, I still don't like the names. I decided a while ago that they must predate import, when the current module scope would have been 'global'.
[from another post] But I appreciate the argument; 'global' comes from ABC's SHARE, but ABC doesn't have modules.
Aha! Now I can use this explanation as fact instead of speculation. Terry J. Reedy

Is there any good reason to ever use globals anywhere other than as the first statement (after doc string) of a function?
If the use of the global is fairly localized, I sometimes like to have the global declaration immediately precede the first use, assuming all other uses are in the same indented block. (This means that I sometimes *do* have global inside flow control, but then all uses are also inside the same branch.) But I'm not sure this is a *good* reason.
If not, could its usage be so restricted (like __future__ import)?
This would break way too much stuff. It would have been a good idea for 0.1. But then I was trying to keep the grammar small while keeping syntactic checks out of the compilation phase if at all possible, and I thought "screw it -- if import can go anywhere, so can global."
Hardly, since they aren't normally thought of as variables.
Do C or Fortran count as such a source of 'infection'?
C, definitely -- it has the concept and the terminology. In Fortran, it's called common blocks (similar in idea to ABC's SHARE).
Just like x + 1 I suppose. I'm sure PyChecker catches this.
No, they were both there from day one. Frankly, I don't think in this case newbie confusion is enough of a reason to switch from global to some other keyword or mechanism. Yes, this means I'm retracting my support for Alex's "replace-global-with-attribute-assignment" proposal -- Jeremy's objection made me realize why I don't like it much.

--Guido van Rossum (home page: http://www.python.org/~guido/)

On Tue, 2003-10-21 at 18:51, Guido van Rossum wrote:
I think copying semantics would be too surprising.
Woo hoo. I'm happy to hear you've had a change of heart on this topic. I think a simple, declarative statement would be clearer than assigning to an attribute of a special object. If a special object, like __global__, existed, could you create an alias, like:

    surprise = __global__
    surprise.x = 1
    print __global__.x

? It would apparently also allow you to use a local and global variable with the same name in the same scope. That's odd, although I suppose it would be clear from context whether the local or global was intended.
I would prefer to see a separate statement similar to global that meant "look for the nearest enclosing binding." Rather than specifying that you want to use x from outer, you could only say you don't want x to be local. That means you'd always get intermediate. I think this choice is more modular. If you can re-bind a non-local variable, then the name of the function where it is initially bound isn't that interesting. It would be safe, for example, to move it to another place in the function hierarchy without affecting the semantics of the program -- except that in the case of "global x in outer" you'd have to change all the referring global statements. Or would the semantics be to create a binding for x in outer, even if it didn't already exist? Jeremy
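[Editorial aside: Jeremy's "nearest enclosing binding" rule is how Python 3's nonlocal (PEP 3104) eventually behaved; a sketch with hypothetical function names echoing his outer/intermediate example:]

```python
def outer():
    x = 'outer'
    def intermediate():
        x = 'intermediate'
        def inner():
            nonlocal x   # binds the NEAREST enclosing x: intermediate's
            x = 'rebound'
        inner()
        return x
    return intermediate(), x

# intermediate's x is rebound; outer's x is untouched
print(outer())   # -> ('rebound', 'outer')
```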

[Guido]
Maybe "global x in f" would work?
[Jeremy]
Right.
I don't care about that argument; it's no more confusing to have globals.x and x as it is to have self.x and x, and the latter happens all the time.
That would be fine; I think that code where you have a choice of more than one outer variable with the same name is seriously insane. An argument for naming the outer function is that explicit is better than implicit, and it might help the reader if there is more than one level; OTOH it is a pain if you decide to rename the outer function (easily caught by the parser, but creates unnecessary work). I admit that I chose this mostly because the syntax 'global x in outer' reads well and doesn't require new keywords.
I'm not sure what you mean here. Move x around, or move outer around? In both cases I can easily see how the semantics *would* change, in general.
-- except that in the case of "global x in outer" you'd have to change all the referring global statements.
Yes, that's the main downside.
Or would the semantics be to create a binding for x in outer, even if it didn't already exist?
That would be the semantics, right; just like the current global statement doesn't care whether the global variable already exists in the module or not; it will create it if necessary. But a relative global statement would be fine too; it would be an error if there's no definition of the given variable in scope. But all this is moot unless someone comes up with a way to spell this that doesn't require a new keyword or change the meaning of 'global x' even if there's an x at an intermediate scope (i.e. you can't change 'global x' to mean "search for the next outer scope that defines x"). And we still have to answer Alex's complaint that newbies misinterpret the word 'global'. --Guido van Rossum (home page: http://www.python.org/~guido/)

I'm not averse to introducing a new keyword, which would address both concerns. yield was introduced with apparently little problem, so it seems possible to add a keyword without causing too much disruption. If we decide we must stick with global, then it's very hard to address Alex's concern about global being a confusing word choice <wink>. Jeremy

[Guido]
[Jeremy]
OK, the tension is mounting. Which keyword do you have in mind? And would you use the same keyword for module-globals as for outer-scope variables? --Guido van Rossum (home page: http://www.python.org/~guido/)

At 14:27 09.12.2000 -0500, Jeremy Hylton wrote:
why exactly do we want write access to outer scopes? for completeness, to avoid the overhead of introducing a class here and there, to facilitate people using Scheme textbooks with Python?

so far I have not been missing it, I don't find:

    def accgen(n):
        def acc(i):
            global n in accgen
            n += i
            return n
        return acc

particularly more compelling than:

    class accgen:
        def __init__(self, n):
            self.n = n
        def __call__(self, i):
            self.n += i
            return self.n

I'm not asking in order to polemize, I just would like to see the rationale spelled out.

regards.

[Samuele]
Probably the latter; I think Jeremy Hylton does know more Scheme than I do. :-)
Some people have "fear of classes". Some people think that a function's scope can be cheaper than an object (someone should time this). Looking at the last example in the itertools docs: def tee(iterable): "Return two independent iterators from a single iterable" def gen(next, data={}, cnt=[0]): dpop = data.pop for i in count(): if i == cnt[0]: item = data[i] = next() cnt[0] += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) This would have been clearer if the author didn't have to resort to representing his counter variable as a list of one element. Using 'global* x' to mean 'find x in an outer scope', and also moving data into the outer scope, again to emphasize that it is shared between multiple calls of gen() without abusing default arguments, it would become: def tee(iterable): "Return two independent iterators from a single iterable" data = {} cnt = 0 def gen(next): global* cnt dpop = data.pop for i in count(): if i == cnt: item = data[i] = next() cnt += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) which is IMO more readable. But in 2.4 this will become a real object implemented in C. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

At 10:57 22.10.2003 -0700, Guido van Rossum wrote:
it's a subtle piece of code. I wouldn't mind a more structured syntax with both the outer function declaring that it is ok for some inner function to rebind some of its locals, and the inner function declaring that a local is coming from an outer scope:

    def tee(iterable):
        "Return two independent iterators from a single iterable"
        data = {}
        # cnt = 0 here would be ok
        share cnt = 0:      # the assignment is opt,
                            # inner functions in the suite can rebind cnt
            def gen(next):
                use cnt     # OR outer cnt
                dpop = data.pop
                for i in count():
                    if i == cnt:
                        item = data[i] = next()
                        cnt += 1
                    else:
                        item = dpop(i)
                    yield item
        # cnt = 0 here would be ok
        next = iter(iterable).next
        return (gen(next), gen(next))

yes it's heavy and unpythonic, but it makes very clear that something special is going on with cnt. no time to add anything else to the thread. regards.

Might as well declare a class then. :-)
no time to add anything else to the thread.
Ditto. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 16:22 23.10.2003 -0700, Guido van Rossum wrote:
well, no, it's probably that I expect rebindable closed-over vars to be introduced by some kind of structured construct instead of the usual Python freeform. I think for this kind of situation I miss the Lisp-y 'let'.

    def counter(startval):
        share cnt = startval:
            def inc(i):
                use cnt
                cnt += i
                return cnt
            def dec(i):
                use cnt
                cnt -= i
                return cnt
        return inc, dec

vs.

    def counter(startval):
        cnt = startval
        def inc(i):
            global cnt in counter
            cnt += i
            return cnt
        def dec(i):
            global cnt in counter
            cnt -= i
            return cnt
        return inc, dec

vs.

    def counter(startval):
        class Counter:
            def __init__(self, startval):
                self.cnt = startval
            def inc(self, i):
                self.cnt += i
                return self.cnt
            def dec(self, i):
                self.cnt -= i
                return self.cnt
        newcounter = Counter(startval)
        return newcounter.inc, newcounter.dec

vs.

    (defun counter (startval)
      (let ((cnt startval))
        (flet ((inc (i) (incf cnt i))
               (dec (i) (decf cnt i)))
          (values #'inc #'dec))))

<wink>

Why does rebindability make a difference here? Local vars are already visible in inner scopes, and if they are mutable, they are already being modified from inner scopes (just not rebound, but to most programmers that's an annoying detail). --Guido van Rossum (home page: http://www.python.org/~guido/)

At 09:05 24.10.2003 -0700, Guido van Rossum wrote:
most Python programmers or most Python programmers using closures? Well, it's a gut feeling, let's try to articulate it. Because a) parametrizing a closure with some read-only variable and b) possibly shared mutable state with indefinite extent are very different things. I think that people should resort to b) instead of using classes only sparingly, and make it clear when they do so. b) can feel like global variables with their problems; I think that's why I would prefer a syntax that still points out: this is some state and these are functions to manipulate it. Classes are fine for that, and knowing that it is common style/idiom in Lisp variants this is also fine there:

    (let ... introduces vars
      ... function defs)

I think it is also about expectations when reading some code. Right now, reading Python code I expect at most to encounter a), although b) can be obtained using mutable objects, but also in that case IMHO an explicit uniform idiom would be preferable, like some Ref object inspired by ML references. I can live with all solutions, although I'm still unconvinced, apart from the Scheme textbook argument (which was serious), that this addition is really necessary. regards.

[Samuele]
[Guido]
[Samuele]
most Python programmers or most Python programmers using closures?
I meant both categories.
Raymond's tee() example is an unfortunate one in this category. (Unfortunate because it is obfuscated code for speed reasons and because it appears in an examples section of official docs.)
I don't think the Scheme textbook argument should weigh much, since that's such a small audience. My original approach has been to discourage (b) by not allowing rebinding. Maybe this should stay the way it is. But the use of 'global x in f' might be enough to tip the reader off -- not quite at the start of f, when x is defined, but at least at the start of the inner function that declares x global in f. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Wednesday 22 October 2003 07:57 pm, Guido van Rossum wrote: ...
I need to simulate the "rebinding name in outer scope" with some kind of item or attribute, of course, but, given this, here comes: given this b.py:

    def accgen_attr(n):
        def acc(i):
            acc.n += i
            return acc.n
        acc.n = n
        return acc

    def accgen_item(n):
        n = [n]
        def acc(i):
            n[0] += i
            return n[0]
        return acc

    class accgen_clas(object):
        def __init__(self, n):
            self.n = n
        def __call__(self, i):
            self.n += i
            return self.n

    def looper(accgen, N=1000):
        acc = accgen(100)
        x = map(acc, xrange(N))
        return x

I measure:

    [alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_attr)'
    1000 loops, best of 3: 1.86e+03 usec per loop
    [alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_item)'
    1000 loops, best of 3: 1.18e+03 usec per loop
    [alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_clas)'
    100 loops, best of 3: 2.1e+03 usec per loop

So, yes, a function IS slightly faster anyway (accgen_attr vs accgen_clas), AND simulating outer-scope-rebinding with a list item is somewhat faster than doing so with an attr (a class always uses an attr, and most of its not-too-terrible performance handicap presumably comes from that fact). I just don't think such closures would typically be used in bottlenecks SO tight that a 10%, or even a 40%, extra overhead is going to be crucial. So, I find it hard to get excited either way by this performance issue. Alex
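The same comparison can be scripted with the timeit module instead of the shell; a sketch on the assumption that relative, not absolute, timings are what matter (recast for modern Python, so map/xrange become a list comprehension):

```python
import timeit

def accgen_item(n):
    n = [n]                 # one-element list simulates a rebindable outer name
    def acc(i):
        n[0] += i
        return n[0]
    return acc

class accgen_clas(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, i):
        self.n += i
        return self.n

def looper(accgen, N=1000):
    acc = accgen(100)
    return [acc(i) for i in range(N)]

t_item = timeit.timeit(lambda: looper(accgen_item), number=100)
t_clas = timeit.timeit(lambda: looper(accgen_clas), number=100)
print(t_item, t_clas)   # machine-dependent
```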

In article <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch>, Samuele Pedroni <pedronis@bluewin.ch> wrote:
I am currently working on implementing an algorithm with the following properties:

- It is an algorithm, not a data structure; that is, you run it, it returns an answer, and it doesn't leave any persistent state afterwards.
- It is sufficiently complex that I prefer to break it into several different functions or methods.
- These functions or methods need to share various state variables.

If I implement it as a collection of separate functions, then there's a lot of unnecessary code complexity involved in passing the state variables from one function to the next, returning the changes to the variables, etc. Also, it doesn't present a modular interface to the rest of the project -- code outside this algorithm is not prevented from calling the internal subroutines of the algorithm.

If I implement it as a collection of methods of an object, I then have to include a separate function which creates an instance of the object and immediately destroys it. This seems clumsy and also doesn't fit with my intuition about what objects are for (representing persistent structure). Also, again, modularity is violated -- outside code should not be making instances of this object or accessing its methods.

What I would like to do is to make an outer function, which sets up the state variables, defines inner functions, and then calls those functions. Currently, this sort of works: most of the state variables consist of mutable objects, so I can mutate them without rebinding them. But some of the state is immutable (in this case, an int) so I need to somehow encapsulate it in mutable objects, which is again clumsy. Write access to outer scopes would let me avoid this encapsulation problem. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

I know the problem, I've dealt with this many times. Personally I would much rather define a class than a bunch of nested functions. I'd have a separate master function that creates the instance, calls the main computation, and then extracts and returns the result. Yes, the class may be accessible at the toplevel in the module. I don't care: I just add a comment explaining that it's not part of the API, or give it a name starting with "_". My problem with the nested functions is that it is much harder to get a grasp of what the shared state is -- any local variable in the outer function *could* be part of the shared state, and the only way to tell for sure is by inspecting all the subfunctions. With the class, there's a strong convention that all state is initialized in __init__(), so __init__() is self-documenting. --Guido van Rossum (home page: http://www.python.org/~guido/)
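The shape Guido describes -- a private class holding the shared state, plus a small driver that creates, runs, and discards the instance -- might look like this (the names _Solver and solve, and the toy computation, are hypothetical):

```python
class _Solver:
    """Not part of the module API; helper for solve() below."""
    def __init__(self, data):
        # all shared state is initialized here, so __init__ documents it
        self.data = data
        self.count = 0
    def run(self):
        for x in self.data:
            self._step(x)
        return self.count
    def _step(self, x):
        if x > 0:
            self.count += 1   # rebinds shared state with no special syntax

def solve(data):
    """Create the instance, run the computation, return the answer."""
    return _Solver(data).run()

print(solve([3, -1, 4, -1, 5]))  # 3
```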

Guido:
That would be solved if, instead of marking variables in inner scopes that refer to outer scopes, it were the other way round, and variables in the outer scope were marked as being rebindable in inner scopes.

    def f():
        rebindable x
        def inc_x_by(i):
            x += i    # rebinds outer x
        x = 39
        inc_x_by(3)
        return x

Greg Ewing, Computer Science Dept, University of Canterbury,
Christchurch, New Zealand | greg@cosc.canterbury.ac.nz
"A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc."

[Greg]
This would only apply to *assignment* from inner scopes, not to *use* from inner scopes, right? (Otherwise it would be seriously backwards incompatible.) I'm not sure I like it much, because it gives outer scopes (some) control over inner scopes. One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) --Guido van Rossum (home page: http://www.python.org/~guido/)

>> That would be solved if, instead of marking variables in inner scopes
>> that refer to outer scopes, it were the other way round, and
>> variables in the outer scope were marked as being rebindable in inner
>> scopes. ...

Guido> This would only apply to *assignment* from inner scopes, not to
Guido> *use* from inner scopes, right? (Otherwise it would be seriously
Guido> backwards incompatible.)

Given that the global keyword or something like it is here to stay (being preferable over some attribute-style access) and that global variable writes need to be known to the compiler for future efficiency reasons, I think we need to consider modifications of the current global statement. The best thing I've seen so far (I forget who proposed it) is

    'global' vars ['in' named_scope]

where named_scope can only be the name of a function which encloses the function containing the declaration. In Greg's example of inc_x_by nested inside f, he'd have declared "global x in f" in inc_x_by. The current global statement (without a scoping clause) would continue to refer to the outermost scope of the module. This should be compatible with existing usage. The only problem I see is whether the named_scope needs to be known at compile time or if it can be deferred until run time. For example, should this

    import random

    def outer(a):
        x = a
        def inner(a):
            x = 42
            def innermost(r):
                if r < 0.5:
                    global x in inner
                else:
                    global x in outer
                x = r
            print " inner, x @ start:", x
            innermost(random.random())
            print " inner, x @ end:", x
        print "outer, x @ start:", x
        inner(a)
        print "outer, x @ end:", x

    outer(12.73)

be valid? My thought is that it shouldn't. Skip

Skip Montanaro wrote:
How about (to abuse a keyword that's gone unmolested for too long)

    global foo from def

to declare that foo refers to a variable in a lexically enclosing function definition? This avoids the need to name a specific function (which IMHO is just a source of confusion over the semantics of strange cases) while still having some mnemonic value (foo "comes from" an enclosing function definition). jw

John> How about (to abuse a keyword that's gone unmolested for too long)
John>
John>     global foo from def
John>
John> to declare that foo refers to a variable in a lexically enclosing
John> function definition? This avoids the need to name a specific
John> function (which IMHO is just a source of confusion over the
John> semantics of strange cases) while still having some mnemonic value
John> (foo "comes from" an enclosing function definition).

How do you indicate the particular scope to which foo will be bound (there can be many lexically enclosing function definitions)? Using my example again:

    def outer(a):
        x = a
        def inner(a):
            x = 42
            def innermost(r):
                global x from def    # <--- your notation
                x = r
            print " inner, x @ start:", x
            innermost(random.random())
            print " inner, x @ end:", x
        print "outer, x @ start:", x
        inner(a)
        print "outer, x @ end:", x

how do you tell Python that x inside innermost is to be associated with the x in inner or the x in outer? Skip

Skip Montanaro <skip@pobox.com> writes:
Maybe "global foo from <function_name>" ? Or, "from function_name global foo" is consistent with import, albeit somewhat weird. I would never use this feature; I avoid nested functions entirely. However, as long as we're talking about this stuff, I wish I could write "global foo" at module scope and have that mean "this variable is to be treated as global in all functions in this module". zw

This is similar to Greg Ewing's proposal to have 'rebindable x' at an outer function scope. My problem with it remains: It gives outer scopes (some) control over inner scopes. One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org> writes:
Frankly, I wish Python required one to write explicit declarations for all variables in the program:

    var x, y, z           # module scope

    class bar:
        classvar I, J, K  # class variables
        var i, j, k       # instance variables
        def foo(...):
            var a, b, c   # function scope
            ...

It's extra bondage and discipline, yeah, but it's that much more help comprehending the program six months later, and it also gets rid of the "how was this variable name supposed to be spelled again?" question. zw

On Friday 24 October 2003 12:27 am, Zack Weinberg wrote: ...
Seems like a great way to get uninitialized variables to me. Might as well mandate initialization, getting a hard-to-read

    classvar I=2.3, J=(2,3), K=23

or, to force more readability, one might say only one name per classvar statement

    classvar I = 2.3
    classvar J = (2,3)
    classvar K = 23

But then what added value is that 'classvar' boilerplate dirtying things up? Might as well take it off and get

    I = 2.3
    J = (2, 3)
    K = 23

which is just what we have now.
It's extra bondage and discipline, yeah, but it's that much more help comprehending the program six months later, and it also gets rid of
There is absolutely no help (not one minute later, not six months later) "comprehending" the program just because some silly language mandates redundancy, such as a noiseword 'classvar' in front of the assignments.
the "how was this variable name supposed to be spelled again?" question.
I disagree that the 'classvar' boilerplate would provide any help with that question. Just put the initializing assignment there and it's only clearer for NOT being obscured by that 'classvar' thingy. Document with docstrings or comments, not by changing the language. A language which, I suspect, MIGHT let you do exactly what you want, is Ruby. I don't know for sure that you can tweak Ruby into giving (at least) warnings for assignment to symbols outside of a certain set, but I suspect you might; you _can_ change the language's semantics pretty deeply. Yet in most other ways it's close enough to Python that the two are almost equivalent. I do believe (and hope!) you stand very little chance of ever getting into Python something as alien to its tradition and principles as variable declarations, so, if they're important to you, considering Ruby might be a more productive option for you. Alex

Alex Martelli <aleaxit@yahoo.com> writes:
No, they get a magic cookie value that triggers an exception on use. Which, incidentally, disambiguates the present UnboundLocalError -- is that a typo, or is that a failure to initialize the variable on this code path? Consider, e.g.

    def foo(x):
        s = 2
        if x:
            a = 1
        return a

...
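Zack's foo really does raise on the uninitialized path, and today's exception already hints at the distinction he wants; runnable as is (modern print syntax):

```python
def foo(x):
    s = 2          # unused, kept from the original example
    if x:
        a = 1
    return a       # 'a' is local (assigned somewhere in foo), so an
                   # unbound read raises UnboundLocalError, not NameError

print(foo(True))   # 1
try:
    foo(False)
except UnboundLocalError:
    print("UnboundLocalError")
```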
Understand that I do almost all my programming in typed languages, where that keyword isn't noise, it's a critical part of the declaration. I see where you're coming from with regard to noisewords. There are plausible alternatives, although they're all more complicated to implement and explain, compared to

    var a, b = 2, c = foo()  # a throws UninitializedLocalError if used
                             # before set
    ...
    d                        # throws UnboundLocalError
    e = 1                    # ALSO throws UnboundLocalError

But in this domain, I am mostly content with the language as is. I think there really *is* a language deficiency with regard to declaring class versus instance variables.

    class foo:
        A = 1    # these are class variables
        B = 2
        C = 3

        def __init__(self):
            self.a = 4    # these are instance variables
            self.b = 5
            self.c = 6

I find this imperative syntax for declaring instance variables profoundly unintuitive. Further, on my first exposure to Python, I thought A, B, C were instance variables, although it wasn't hard to understand why they aren't. People like to rag on the popularity of __slots__ (for reasons which are never clearly spelled out, but never mind) -- has anyone considered that it's popular because it's a way of declaring the set of instance variables, and there is no other way in the language? zw
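For what it's worth, __slots__ does behave like the declaration being asked for here: the set of instance attributes is fixed and anything else is rejected. A small sketch (hypothetical Point class):

```python
class Point:
    __slots__ = ('x', 'y')   # the only instance attributes allowed
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(p.x, p.y)   # 1 2
try:
    p.z = 3       # not declared in __slots__
except AttributeError:
    print("no such slot")
```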

At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
A, B, and C *are* instance variables. Why do you think they aren't?
What good does declaring the set of instance variables *do*? This seems to be more of a mental comfort thing than anything else. I've spent most of my career in declaration-free languages, though, so I really don't understand why people get so emotional about being able to declare their variables.
and there is no other way in the language?
Actually, there are a great many ways to implement such a thing. One way might be something like:

    class RestrictedVars:
        vars = ()
        def __setattr__(self, attr, value):
            if attr not in self.vars:
                raise AttributeError("No such attribute", attr)
            self.__dict__[attr] = value

    class SomeClass(RestrictedVars):
        vars = 'a', 'b', 'c'

"Phillip J. Eby" <pje@telecommunity.com> writes:
You prove my point! I got it wrong! This is a confusing part of the language!
Yeah, it's a mental comfort thing. Mental comfort is important. Having the computer catch your fallible human mistakes is also important. zw

"Phillip J. Eby" <pje@telecommunity.com> wrote in message news:5.1.1.6.0.20031024170245.03260160@telecommunity.com...
What? They are class attributes that live in the class dictionary, not the instance dictionary. They can be directly accessed as foo.A, etc., while foo.a, etc., don't work. While they *may* serve as default or backup same-for-all-instances values for when there is no instance-specific value of the same name, that's not the same thing, which is why they are defined differently. And a class attribute like number_of_instances would, conceptually, only be a class variable. Let's not confuse Zack further. Terry J. Reedy

In article <bncf99$366$1@sea.gmane.org>, "Terry Reedy" <tjreedy@udel.edu> wrote:
They are instance variables on the class object, which is an instance of type 'class'. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

David Eppstein wrote:
I think the confusion that is brewing here is how Python masks class attributes when you do an assignment on an instance:
Python's resolution order checks the instance first and then the class (this is ignoring a data descriptor somewhere in this chain; for the details read Raymond's essay on descriptors @ http://users.rcn.com/python/download/Descriptor.htm#invoking-descriptors ). -Brett
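The masking Brett describes is easy to watch happen; a minimal sketch (hypothetical Foo class):

```python
class Foo:
    A = 1              # class attribute

f = Foo()
print(f.A)             # 1  -- lookup falls through to the class
f.A = 99               # creates an instance attribute that masks Foo.A
print(f.A, Foo.A)      # 99 1  -- the class attribute is untouched
del f.A                # removes the instance attribute, unmasking Foo.A
print(f.A)             # 1
```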

On Friday 24 October 2003 23:10, Phillip J. Eby wrote:
They're _accessible AS_ instance attributes (self.B will be 2 in a method), but they have the same value in all instances and to _rebind_ them you need to do so on the class object (you can bind an instance variable with the same name to shadow each and any of them, of course).
What good does declaring the set of instance variables *do*? This seems
It decreases productivity -- that's the empirical result of Prechelt's study and the feeling of people who have ample experience with both kinds of language (cf. Robert Martin's well-known blog for an authoritative one, but my own experience is quite similar). If you subscribe to the popular fallacy known as "lump of labour" -- there is a total fixed amount of work that needs to be done -- it would follow that diminishing productivity increases the number of jobs available. Any economist would be appalled, of course, but, what do THEY know?-)
Most of MY work has been with mandatory-declaration languages, and my theory is that a "Stockholm Syndrome" is in effect (google for a few tens of thousands of explanations of that syndrome).
and there is no other way in the language?
Actually, there are a great many ways to implement such a thing. One way
For instance variables, yes. Fewer for class variables (you need a custom metaclass). None for module variables (also misleadingly known as 'global' ones) nor for local variables. Alex

On Friday 24 October 2003 22:39, Zack Weinberg wrote: ...
I have a vast experience of typed languages, and, even there, the mandatory redundancy of declarations is just a cop-out. A _well-designed_ strictly typed language, such as Haskell or ML, lets the compiler infer all types, so you don't _have_ to provide declarations -- you can, much in the same spirit as you can use assert in Python, but you need not.
I think there really *is* a language deficiency with regard to declaring class versus instance variables.
I don't: there is no declaration at all (save for the accursed 'global'), only _statements_. They DO things, and what they do is simple and obvious.
I find this imperative syntax for declaring instance variables profoundly unintuitive. Further, on my first exposure to Python, I
That's because you keep thinking of "declaring". Don't. There is no such thing. There is documenting (docstrings, comments) and actions. Period. Entities must not be multiplied beyond need: we don't NEED enforced redundancy. We don't WANT it: if we did, we could choose among a huge host of languages imposing it in a myriad of ways -- but we've chosen Python exactly BECAUSE it has no such redundancy. When I write in some scope

    x = 1

I am saying: x is a name in this scope and it refers to value 1. I have said all that is needed by either the compiler, or a reader who knows the language, to understand everything perfectly. Forcing me to say AGAIN "and oh by the way x is REALLY a name in this scope, I wasn't kidding, honest" is abhorrent. If you really like that why stop at ONE forced useless redundancy? Why not force me to provide a THIRD redundant "I really REALLY truly mean it, please DO believe me!!!", or a fourth one, or...? *ONCE, AND ONLY ONCE*. A key principle of agile programming.
thought A, B, C were instance variables, although it wasn't hard to understand why they aren't.
Reducing the productivity of all language users to (perhaps) help a few who hadn't yet understood one "not hard to understand" detail would be a disastrous trade-off.
Yes, or more precisely, at least it looks that way, and it's efficient (saves some per-instance memory). Much the same way as "type(x) is int" looks like a way to "declare a type" and so does isinstance(x, int) later on in one's study of the language (though no saving accrues there). But then, "Extraordinary Popular Delusions and the Madness of Crowds" IS quite deservedly a best-seller for the last 160+ years. Fortunately, Python need not pander to such madness and delusions, however popular:-). Alex

On Friday 24 October 2003 12:08 am, Guido van Rossum wrote:
I must be missing something, because I don't understand the value of that guideline. I see outer and inner functions as tightly coupled anyway; it's not as if they could be developed independently -- not even lexically, surely not semantically. I do prefer to have the reminder "this is _assigning_ a NON-local variable" _closer_ to the assignment -- and I DO think it would be great if such rebinding HAD to be an assignment, not some kind of "side effect" from statements such as def, class, for, btw. (Incidentally, we'd get the latter for free if the nonlocal was "an attribute of some object" -- outer.x = 23 YES, "def outer.x():..." NO. But I'd still feel safer, even with a deuced 'declarative statement', if it could somehow be allowed to rebind nonlocals ONLY with an explicit assignment). So, anyway, the closer to the assignment the reminder, the better, so if it has to be a "declarative statement" I'd rather have it in the inner function than in the outer one. But for reasons very different from that guideline which I don't grasp... (probably just sleepiness and tiredness on my part...). Alex

[Guido]
[Alex]
It's the same as the reason why name lookup (whether at compile time or at run-time) always goes from inner scope to outer. While you and I see nested functions as small amounts of closely-knit code, some people will go overboard and write functions hundreds of lines long containing dozens of inner functions, which may be categorized into several functional groups. A decision to share a variable 'foo' between one group of inner functions shouldn't mean that none of the other inner functions can have a local variable 'foo'. Anyway, I hope you'll have a look at my reasons for why the compiler needs to know about rebinding variables in outer scopes from inside an inner scope. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 24 October 2003 23:32, Guido van Rossum wrote: ...
This doesn't look like a legitimate use case to me; i.e., I see no need to distort the language if the benefit goes to such "way overboard" uses. I think they will have serious maintainability problems anyway. Fortunately, I don't think of placing the "indication to the compiler" as close to the assignment-to-outer-variable as a distortion;-)
Sure! I do understand this. What I don't understand is why, syntactically, the reserved word that indicates this to the compiler should have to be a "statement that does nothing" -- the ONLY "declaration" in the language -- rather than e.g. an _operator_ which specifically flags such uses. Assume for the sake of argument that we could make 'scope' a reserved word. Now, what are the tradeoffs of using a "declaration" scope x in outer which makes all rebidings of x act in the scope of containing function outer (including 'def x():', 'class x:', 'import x', ...); versus an "operator" that must be used to indicate "which x" when specifically assigning it (no "side effect rebinding" via def &c allowed -- I think it helps the reader of code a LOT to require specific assignment!), e.g. scope(outer).x = 23 Don't think of scope as a built-in function, but as a keyword in either case (and we could surely have other syntax for the "scope operator", e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE where x is being assigned). So the compiler can catch on to the info just as effectively. The tradeoffs are: -- we can keep thinking of Python as declaration-free and by gradually deprecating the global statement make it more so -- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations -- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused -- no casual rebinding of nonlocals via def, class, import -- once we solve the general problem of allowing non-bare-names as iteration variables in 'for', nonlocals benefit from that resolution automatically, since nonlocals are never assigned-to as bare-names I see this as the pluses. The minus is, we need a new keyword; but I think we do, because stretching 'global' to mean something that ISN'T global in any sense is such a hack. 
Cutting both ways is the fact that this allows using the same name from more than one scope (since each use is explicitly qualified as coming from a specific scope). That's irrelevant for small compact uses of nesting, but it may be seen as offering aid and succour to those wanting to "go overboard" as you detail earlier (bad); OTOH, if I ever need to maintain such "overboard" code written by others, and refactoring it is not feasible right now, it may be helpful. In any case, the confusion is reduced by having the explicit qualification on assignment. Similarly for _accesses_ rather than rebindings -- access to the barename will keep using the same rules as today, of course, but I think the same syntax that MUST be used to assign nonlocals should also be optionally usable to access them -- not important either way in small compact functions, but more regular and offering a way to make code non-ambiguous in large ones. I don't see having two ways to access a name -- barename x or qualified scope(foo).x -- as a problem, just like today from inside a method we may access a classvariable as "self.x" OR "self.__class__.x" indifferently -- the second form is needed for rebinding and may be chosen for clarity in some cases where the first simpler ("barer") one would suffice. Alex

Alex Martelli <aleaxit@yahoo.com> writes:
I'm skimming this, so I apologise if I've missed something obvious. However, one significant issue with your notation scope(outer).x = 23 is that, although scope(outer) *looks like* a function call, it isn't - precisely because scope is a keyword. I think that, if you're using a keyword, you need something syntactically distinct. Now maybe you can make something like (x in f scope) work as an expression (I've deliberately used "f" not "outer" to highlight the fact that it may not always look as "nice" as your example), but I'm not sure it's as intuitive as you imply. But then again, I've no problem with "global x in f". Paul -- This signature intentionally left blank

On Saturday 25 October 2003 17:49, Paul Moore wrote: ...
Existing operator keywords, such as, e.g., 'not', get away without it. One can use parentheses, write not(x), or not (preferable style); and what's the problem if "not(x)" CAN indeed look like a function call while in fact it's not? It really makes no deep difference here that 'not' is a keyword and not a built-in function (it does matter when it's used with other syntax, of course, such as "x is not y" or "x not in y" or "not x" and so on -- but then, were 'scope' to be introduced, it, too, like other operator keywords, might admit of slightly different syntax uses). Similarly, that 'scope' is a keyword known to the compiler is not deeply important to the user coding scope(f) -- it might as well be a built-in, from the user's viewpoint. It's important to the compiler, it becomes important if the user erroneously tries to rebind "scope = 23", but those cases don't give problems. Alex

[Alex]
Assume for the sake of argument that we could make 'scope' a reserved word. Now, what are the tradeoffs of using a "declaration"

    scope x in outer

which makes all rebindings of x act in the scope of containing function outer (including 'def x():', 'class x:', 'import x', ...); versus an "operator" that must be used to indicate "which x" when specifically assigning it (no "side effect rebinding" via def &c allowed -- I think it helps the reader of code a LOT to require specific assignment!), e.g.

    scope(outer).x = 23

I don't see how either of your scope statements is really any better than "global". If I say

    global x in outer

I am declaring to the compiler that x is global to the current function, and in particular I want you to bind x to the x which is local to the function outer. Maybe "global" isn't perfect, but it seems to suit the situation fairly well and avoids a new keyword to boot. With the "scope(outer).x = 23" notation you are mixing apples and oranges (declaration and execution). It looks like an executable statement but it's really a declaration to the compiler. Guido has already explained why the binding has to occur at compile time.

    The tradeoffs are:
    -- we can keep thinking of Python as declaration-free and by gradually
       deprecating the global statement make it more so

How do you propose to subsume the current global statement's functionality?

    -- the reader of code KNOWS what's being assigned to without having to
       scroll up "hundreds of lines" looking for possible declarations

As he would with an extension of the current global statement. I presume you mean for your scope pseudo-function to be used at the "point of attack", so there would likely be less separation between the declaration and the assignment. Of course, using your argument about redundancy against you, would I have to use scope(outer).x = ... each time I wanted to change the value of x? What if I rename outer?
-- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused

I don't see this as a big problem now. In my own code I rarely use global, and never use nested functions. I suspect that's true for most people. Skip
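The tradeoff being argued over can be made concrete. At the time of this thread an inner function could read, but not rebind, a variable of an enclosing function; a minimal sketch (in modern syntax) of the standard workaround -- mutating a container instead of rebinding a name:

```python
def outer():
    state = {'x': 0}          # container stands in for a rebindable nonlocal
    def bump():
        state['x'] += 1       # mutation, not rebinding: no declaration needed
    bump()
    bump()
    return state['x']

print(outer())  # 2
```

The various proposals in this thread ("global x in outer", "scope(outer).x = 23") are all ways to make the rebinding direct instead of routed through a mutable object.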

On Saturday 25 October 2003 06:09 pm, Skip Montanaro wrote: ...
I don't see this as a big problem now. In my own code I rarely use global, and never use nested functions. I suspect that's true for most people.
No doubt it's true that most people only care about their own code, and don't have much to do with teaching and advising others, mentoring them, maintaining and enhancing code originally written by others, etc. So, since my professional activity typically encompasses these weird activities, not of interest to most people, and that gives me a different viewpoint from that of most people, I guess it's silly of me to share it. Sorry if my past well-meant eagerness caused problems; it's obviously more sensible for people who never use nested functions to help shape their syntax and semantics, than for those who DO use them, after all -- and similarly, for people who only care about their own code to help determine if 'global' is, or isn't, a cause of problems out there in the wide world of Python newbies and users far from python-dev. Alex

Alex> Sorry if my past well-meant eagerness caused problems; it's Alex> obviously more sensible for people who never use nested functions Alex> to help shape their syntax and semantics, than for those who DO Alex> use them, after all -- and similarly, for people who only care Alex> about their own code to help determine if 'global' is, or isn't, a Alex> cause of problems out there in the wide world of Python newbies Alex> and users far from python-dev. Pardon me? Just because I don't use a particular feature of the language doesn't mean I have no interest in how the language evolves. I don't believe I ever disrespected your ideas or opinions. Why are you disrespecting mine? Hell, why are you disrespecting me? I would be more than happy if nested scopes weren't in the language. Their absence would also make your teaching, advising, mentoring, maintenance and enhancing simpler. I haven't proposed that they be removed, though that would be a rather clean way to solve this problem. Alex, if a qualification for discussing improvements to Python is that one use every aspect of the language, please pronounce. I'll be happy to butt out of your turf. Skip

On Sunday 26 October 2003 11:51, Skip Montanaro wrote: ...
Just because I don't use a particular feature of the language doesn't mean I have no interest in how the language evolves. I don't believe I
Absolutely true. Any feature added to the language brings some weight to all, even those who will not use it (perhaps not much to those who will not use it AND only care about their own code, but I do believe that most should also care about _others'_ code, even if they don't realize that -- reusing others' code from the net, &c, are still possibilities).
ever disrespected your ideas or opinions. Why are you disrespecting mine? Hell, why are you disrespecting me?
I had no intention of expressing any disrespect to you. If I miscommunicated in this regard, I owe you an apology. As for opinions based on only caring about one's own code, I am, however, fully entitled to meta-opine that such opinions are too narrowly based, and that not considering the coding behavior of others is near-sighted.
Of course such a proposal would have to wait for 3.0 (i.e. who knows when) given backwards incompatibility. Personally, I think that would just bring back all the "foo=foo, bar=bar" default-argument abuse that we used to have before nested scopes appeared, and therefore would not make any of my activities substantially simpler nor more productive (even discounting the large work of porting code across such a jump in semantics -- I think that could be eased by tools systematically _introducing_ default-argument abuse, but the semantics of such 'snapshotting' is still far enough from today's nested scopes to require plenty of manual inspection and changing).
You got the wrong guy: I don't get to pronounce, and this ain't my turf. I only get to plead, cajole, whine, argue, entreaty, advocate, propose, appeal, supplicate, contend, suggest, insist, agree, and disagree, just like everybody else. Alex

One person here brought up (maybe David Eppstein) that they used this approach for coding up extensive algorithms that are functional in nature but have a lot of state referenced *during* the computation. Whoever it was didn't like using classes because the internal state would persist past the lifetime of the calculation. When I visited Google I met one person who was advocating the same coding style -- he was adamant that if he revealed any internal details of his algorithm then the users of his library would start using them, and he wouldn't be able to change the details in another revision. AFAICT these were both very experienced Python developers who had thought about the issue and chosen to write large nested functions. So I don't think you can dismiss this so easily.
Maybe because I haven't seen such an operator proposed that I liked. :) And in its normal usage, I don't find 'global x' offensive; that it can be abused and sometimes misunderstood doesn't matter to me, that's the case for sooooo many language constructs...
What bugs me tremendously about this is that this isn't symmetric with usage: you can *use* the x from the outer scope without using all that verbiage, but you must *assign* to it with a special construct. This would be particularly confusing if x is used on the right hand side of the assignment, e.g.: scope(outer).x = x.lower()
Somehow I don't see "declaration-free" as an absolute goal, where 100% is better than 99%.
-- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations
Yeah, but you can still *use* a variable that was set "hundreds of lines" before, so it's not a full solution (and will never be -- allowing *use* of nonlocals is clearly a much-wanted and very useful feature).
-- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused
If we don't add "global x in f" or some equivalent, you can't assign to nonlocals except for module globals, where I don't see a problem.
-- no casual rebinding of nonlocals via def, class, import
I don't think that's a real issue.
This is obscure -- most readers here didn't even know you could do that, and all except Tim (whom I cut a certain amount of slack because he's from Wisconsin) said they considered it bad style. So again the argument is weak.
Well, if for some reason the entire Python community suddenly leaned on me to allow assignment to non-locals with a syntactic construct to be used in every assignment to a non-local, I would much favor the C++ style of <scope>::<name>.
There is no need for this even among those folks; a simple renaming allows access to all variables they need. (My earlier argument wasn't about this, it was about accidental shadowing when there was *no* need to share.)
Actually, self.__class__.x is probably a mistake, usually one should name the class explicitly. But I don't see that as the same, because the name isn't bare in either case. --Guido van Rossum (home page: http://www.python.org/~guido/)

In article <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com>, Guido van Rossum <guido@python.org> wrote:
Yes, that was me. You recommended refactoring the stateful part of the algorithm as an object despite its lack of persistence. It worked and my code is much improved thereby. Specifically, I recognized that one of the outer level functions of my code was appending to a sequence of strings, so I turned that function into the next() method of an iterator object, and the other nested functions became other methods of the same object. I'm not sure how much of the improvement was due to using an object-oriented architecture and how much was due to the effort of refactoring in general, but you convinced me that using an object to represent shared state explicitly rather than doing it implicitly by nested function scoping can be a good idea. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science
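A sketch of the kind of refactoring David describes: state that was shared via nested-function scoping becomes instance attributes, and the string-appending outer function becomes an iterator's next()/__next__ method. All names here are invented for illustration; this is not David's actual code:

```python
class LineBuilder:
    """State formerly held in an enclosing function's scope now lives
    in instance attributes with the same lifetime as the computation."""
    def __init__(self, items):
        self.items = iter(items)
        self.count = 0            # was a variable in the enclosing scope

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self.items)   # propagates StopIteration when exhausted
        self.count += 1
        return "%d: %s" % (self.count, item)

print(list(LineBuilder(["a", "b"])))  # ['1: a', '2: b']
```

The object can simply be dropped when the calculation finishes, so the "state persists past the lifetime of the calculation" objection largely evaporates.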

On Saturday 25 October 2003 20:05, David Eppstein wrote: ...
Great testimony, David -- thanks!!! So, maybe, rather than going out of our way to facilitate coding very large and complicated closures, it might be better to keep focusing on _simple_, small closures as the intended, designed-for use case, and convince users of complicated closures that refactoring, as David has done, into OO terms, can indeed be preferable. Alex

At 09:40 25.10.2003 -0700, Guido van Rossum wrote:
[seen David Eppstein's post, discarded obsolete comment]
I may be missing the details, but it seems someone unhappy with the encapsulation support in Python wants to get it back using closures. Yes, closures can be used to get strong encapsulation. If Python again wanted to directly support some form of sandboxed execution, then better support for encapsulation would very likely play a role. But as I said, I may be missing something; if the point is stronger encapsulation, I would add it to the OO part of the language. The schizophrenic split -- use objects, but if you want encapsulation use closures -- seems odd. Aside: I have the perhaps mistaken impression that having fully complete functional programming support in Python was not the point, but that the addition of generators has increased the interest in more functional programming support.
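Samuele's point that closures give strong encapsulation can be sketched in a few lines; the counter state below is unreachable except through the returned functions (names are invented for illustration):

```python
def make_counter():
    count = [0]               # list cell, since the inner functions can
                              # mutate it but not rebind a plain name
    def incr():
        count[0] += 1
        return count[0]
    def value():
        return count[0]
    return incr, value

incr, value = make_counter()
incr()
incr()
print(value())  # 2 -- and no outside code can touch count directly
```

Unlike an attribute such as obj._count, there is no handle at all by which client code can reach the hidden state, which is exactly the "backdoor encapsulation" being discussed.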
AFACT these were both very experienced Python developers who had thought about the issue and chosen to write large nested functions.
they seem to want to import idioms that before 2.1 were not even imaginable, and maybe I'm wrong, but idioms that come from somewhere else. Personally, e.g., I would like multi-method support in Python and I know where they come from <wink>. Every experienced Python developer probably knows some other language, and misses or would like something from there. Sometimes I have the impression that seeing the additions incrementally, and already knowing the language well, makes it hard to consider the learning curve for someone encountering the language for the first time. I think that evaluating whether an addition really enhances expressivity, or makes the language more uniform, versus the ref-man growth is very important. IMHO generators were a clear win; generator expressions seem an add/subtract thing because list comprehension explanation becomes just list(gen expr). regards.

Zack> Maybe "global foo from <function_name>" ? Sounds just about like the "global foo in named_scope" (where "named_scope" means enclosing function) that I described earlier. I like "in" better than "from" because it tells you more clearly that you are messing with the variable in-place, not making a copy of it into the local scope. Zack> Or, "from function_name global foo" is consistent with import, Zack> albeit somewhat weird. That reads a bit weird to me. The nice thing about the other way is that "global foo" without any qualifiers means the same thing it does today. There's also no reason to use the from form as "global foo in function" doesn't imply that you will refer to foo as "function.foo". Zack> I would never use this feature; I avoid nested functions entirely. Zack> However, as long as we're talking about this stuff, I wish I could Zack> write "global foo" at module scope and have that mean "this Zack> variable is to be treated as global in all functions in this Zack> module". I've never actually used nested scopes either, nor have I ever felt the urge. Maybe it has something to do with not having done much recent programming in a language before Python which supported them. (Pascal does, but my last Pascal experience was nearly 20 years ago.) Skip

Skip Montanaro wrote:
I can think of two reasonable possibilities--either it refers to the innermost possible variable, or the compiler rejects this case outright. Either way the problem is easy to solve by renaming one of the variables. Sorry I wasn't clear--I really only meant to propose a new syntax for the already-proposed "global foo in def". For some reason I can't quite put my finger on, "in def" looks to me like it's referring to the function where the statement occurs, but "from def" looks like it refers to some other function. jw

[Skip]
Given that the global keyword or something like it is here to stay (being preferable over some attribute-style access)
(Actually I expect more pushback from Alex once he's back from his trip. He seems to feel strongly about this. :-)
That was my first suggestion earlier this week. The main downside (except from propagating 'global' :-) is that if you rename the function defining the scope you have to fix all global statements referring to it. I saw a variant where the syntax was 'global' vars 'in' 'def' which solves that concern (though not particularly elegantly).
Definitely compile time. 'f' has to be a name of a lexically enclosing 'def'; it's not an expression. The compiler needs to know which scope it refers to so it can turn the correct variable into a cell. --Guido van Rossum (home page: http://www.python.org/~guido/)

>> 'global' vars [ 'in' named_scope ]
>>
>> where named_scope can only be the name of a function which encloses the function containing the declaration.

Guido> That was my first suggestion earlier this week. The main downside (except for propagating 'global' :-) is that if you rename the function defining the scope you have to fix all global statements referring to it.

Well, the listed variables are "global" to the current local scope. I find the rename argument a bit specious. If I rename a function I have to change all the references to it today. This is just one more. Since "global" is a declarative statement, the compiler can tell you immediately that it can't find the old function name.

Guido> I saw a variant where the syntax was
Guido>
Guido>     'global' vars 'in' 'def'
Guido>
Guido> which solves that concern (though not particularly elegantly).

I don't see how that can work though. What does 'def' mean in this case? There can be multiple lexically enclosing functions, any of which have the same local variable x which you might want to modify.

>> This should be compatible with existing usage. The only problem I see is whether the named_scope needs to be known at compile time or if it can be deferred until run time.

Guido> Definitely compile time. 'f' has to be a name of a lexically enclosing 'def'; it's not an expression. The compiler needs to know which scope it refers to so it can turn the correct variable into a cell.

Okay, that was easily settled. ;-) Skip

Right, I tend to agree.
Yeah, but usually that's not a problem. The compiler knows about all those x-es, and uses the innermost (nearest) one. This matches what it does when *referencing* a non-local variable, which doesn't need a global statement. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 24 October 2003 12:06 am, Guido van Rossum wrote:
I do: I dislike "declarative statements" and I also dislike "global" as a spelling for anything that isn't actually global. But after a 3-day Bologna->Munich->Gothenburg->Stockholm->Amsterdam->Bologna whirl I'm just too bushed -- and have too many hundreds of msgs to go through (backwards as usual) -- to be very effective;-). With luck, I may be able to do better in the weekend...:-).
I seem to have seen many others say that the "renaming the function" downside is not a serious problem, and I concur with them; you're just as likely to rename e.g. the variable (where you have to hunt down and change every assignment and access as well as the "declarative stmt", AND get no compiler support for errors) as the function (where you only need to fix the "declarative stmts" AND the compiler will tell you if you miss some) Alex

"David Eppstein" <eppstein@ics.uci.edu> wrote in message news:eppstein-567571.16030622102003@sea.gmane.org...
So why not define the class inside the master function to keep it private? For a complex algorithm, re-setup time should be relatively negligible. Terry J. Reedy

At 03:14 PM 10/21/03 -0700, Guido van Rossum wrote:
Why?
Never mind that idea then.
Actually, I consider Samuele's example a good argument in *favor* of the idea. Because of the similarity between listcomps and generator expressions (gen-X's? ;) ) it seems late binding of locals would lead to people thinking the behavior is a bug. Since a genex is not a function (at least in form) a late binding would be very non-obvious and counterintuitive relative to other kinds of expressions.

Hm. We do late binding of globals. Why shouldn't we do late binding of locals? There are lots of corners or the language where if you expect something else the actual behavior feels like a bug, until someone explains it to you. That's no reason to compromise. It's an opportunity for education about scopes! --Guido van Rossum (home page: http://www.python.org/~guido/)

At 04:30 PM 10/21/03 -0700, Guido van Rossum wrote:
Wha? Oh, you mean in a function. But that's what I'm saying, it's *not* a function. Sure, it's implemented as one under the hood, but it doesn't *look* like a function. In any normal (non-lambda) expression, whether a variable is local or global, its value is retrieved immediately. Also, even though there's a function under the hood, that function is *called* and its value returned immediately. This seems consistent with an immediate binding of parameters.
So far, I haven't seen you say any reason why the "arguments" approach is bad, or why the "closure" approach is good. Both are certainly Pythonic in some circumstances, but why do you feel that one is better than the other, here? I will state one pragmatic reason for using the default arguments approach: code converted from using a listcomp to a genex can immediately have bugs as a result of rebinding a local. Those bugs won't happen if rebinding the local has no effect on the genex's evaluation. (Obviously, an aliasing problem can still be created if one modifies a mutable used in the genex, but there's no way to remove that possibility and still end up with a lazy iterator.) Given that one of the big arguments in favor of genexes is to make "upgrading" from listcomps easy, it shouldn't fail so quickly and obviously. E.g., converting from:

    x = {}
    for i in range(10):
        x[i] = [y^i for y in range(10)]

to:

    x = {}
    for i in range(10):
        x[i] = (y^i for y in range(10))

shouldn't result in all of x's elements iterating over the same values!
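For the record, the semantics that PEP 289 ultimately adopted evaluate only the outermost iterable immediately; all other free variables are looked up when the generator runs. So in today's Python the conversion Phillip worries about does change behavior, exactly as he feared:

```python
x = {}
for i in range(10):
    x[i] = (y ^ i for y in range(10))

# By the time any generator is consumed, i is 9, and i is looked up
# late, so every entry yields the same sequence:
print(list(x[0]))  # [9, 8, 11, 10, 13, 12, 15, 14, 1, 0]
print(list(x[5]))  # [9, 8, 11, 10, 13, 12, 15, 14, 1, 0] -- identical
```

The listcomp version, by contrast, evaluates eagerly inside the loop and gives ten different lists.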

No, everywhere. Globals in generator expressions also have late binding:

    A = 1
    def f():
        return (x+A for x in range(3))
    g = f()
    A = 2
    print list(g)   # prints [2, 3, 4]; not [1, 2, 3]
That's because the expression is evaluated immediately. When passing generator expressions around that reference free variables (whether global or from a function scope), the expression is evaluated when it is requested. Note that even under your model,

    A = []
    g = (A for x in range(3))
    A.append(42)
    print list(g)   # prints [[42], [42], [42]]
But it's a generator function, and the call suspends immediately, and continues to execute only when the next() method on the result is called.
Unified semantic principles. I want to be able to explain generator expressions as a shorthand for defining and calling generator functions. Invoking default argument semantics makes the explanation less clean: we would have to go through the trouble of finding all references to free variables. Do you want globals to be passed via default arguments as well? And what about builtins? (Note that the compiler currently doesn't know the difference.)
Hm. I think most generator expressions should be finished before moving on to the next line, as in

    for n in range(4):
        print sum(x**n for x in range(1, 11))

Saving a generator expression for later use should be something you rarely do, and you should really think of it as a shorthand for a generator function just as lambda is a shorthand for a regular function. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 05:23 PM 10/21/03 -0700, Guido van Rossum wrote:
For a technical explanation, I would say, "any name that is not defined by the generator expression itself has the binding that was in effect for that name at the time the generator expression occurs." (Note that this statement is equally true for any other non-lambda expression.) For a non-technical explanation, I wouldn't say anything, because I don't think anybody is going to assume the late-binding behavior, who doesn't already have the mental model that "this is a shortcut for a generator function". IOW, the issue I see here is that if somebody runs into the problem, they need to learn about the free variables and closures concept in order to understand why their code is breaking. But, if it doesn't break, then why do they need to learn that?
This sounds like "if the implementation is hard to explain" grounds, which I agree with in principle. I'm not positive it's that hard to explain, though, mainly because I don't see how anyone would *question* it in the first place. I find it hard to imagine somebody *wanting* changes to the variable bindings to affect an iterator expression, and thus the issue of why that doesn't work should be *much* rarer than the other way around. Past this point I think I'll be duplicating either my or Tim's arguments for this, so I'll leave off now.

[Samuele Pedroni]
That is what I had in mind, and that if the first assignment to "y" were commented out, the assignment to "it" would raise UnboundLocalError.
Yes, but would you like it if you replaced the "def gen" and the line following it with:

    def gen(y=y, l=l):
        for x in l:
            yield x+y
    it = gen()

This is worth some thought. My intuition is that we *don't* want "a closure" here. If generator expressions were reiterable, then (probably obnoxiously) clever code could make some use of tricking them into using different inherited bindings on different (re)iterations. But they're one-shot things, and divorcing the values actually used from the values in force at the definition site sounds like nothing but trouble to me (error-prone and surprising). They look like expressions, after all, and after

    x = 5
    y = x**2
    x = 10
    print y

it would be very surprising to see 100 get printed. In the rare cases that's desirable, creating an explicit closure is clear(er):

    x = 5
    y = lambda: x**2
    x = 10
    print y()

I expect creating a closure instead would bite hard especially when building a list of generator expressions (one of the cases where delaying generation of the results is easily plausible) in a loop. The loop index variable will probably play some role (directly or indirectly) in the intended operation of each generator expression constructed, and then you severely want *not* for each generator expression to see "the last" value of the index variable. For concreteness, test_generators.Queens.__init__ creates a list of rowgen() generators, and rowgen uses the default-arg trick to give each generator a different value for rowuses; it would be an algorithmic disaster if they all used the same value. Generator expressions are too limited to do what rowgen() does (it needs to create and undo side effects as backtracking proceeds), so it's not perfectly relevant as-is. I *suspect* that if people work at writing concrete use cases, though, a similar thing will hold. BTW, Icon can give no guidance here: in that language, the generation of a generator's result sequence is inextricably bound to the lexical occurrence of the generator.
The question arises in Python because definition site and generation can be divorced.
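The default-arg trick Tim refers to can be shown concretely with a list of generators built in a loop; each generator captures the loop variable's value at definition time instead of seeing its final value:

```python
gens = []
for i in range(3):
    def gen(i=i):          # default arg freezes the current value of i
        for x in range(2):
            yield x + i
    gens.append(gen())

# each generator saw its own i, not the final i == 2
print([list(g) for g in gens])  # [[0, 1], [1, 2], [2, 3]]
```

Drop the `i=i` and make i a free variable instead, and all three generators would use i == 2 by the time they run, which is the "algorithmic disaster" Tim describes for Queens.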

Urgh, we need this sorted out before Raymond can rewrite PEP 289 and present it to c.l.py...
Right.
So, do you want *all* free variables to be passed using the default-argument trick (even globals and builtins), or only those that correspond to variables in the immediately outer scope, or only those corresponding to function scopes (as opposed to globals)?

    n = 0
    def f():
        global n
        n += 1
        return n
    print list(n+f() for x in range(10))

--Guido van Rossum (home page: http://www.python.org/~guido/)
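Under the late-binding semantics Python eventually adopted, Guido's example has a well-defined answer via the left-to-right rule: in each term n is read first, then f() increments it, so the terms are 0+1, 1+2, 2+3, ... A runnable sketch:

```python
n = 0
def f():
    global n
    n += 1
    return n

# n and f are both globals, looked up each time the generator produces
# a value; left-to-right evaluation reads n before f() bumps it
print(list(n + f() for x in range(10)))
# [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
```

Under an "all free variables captured by default args" rule, n would be frozen at 0 and the result would instead be [1, 2, 3, ..., 10], which is the semantic fork in the road being debated.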

[Guido]
Urgh, we need this sorted out before Raymond can rewrite PEP 289 and present it to c.l.py...
That would be good <wink>. I don't feel a sense of urgency, though, and will be out of town the rest of the week. I sure *expect* that most generator expressions will "get consumed" immediately, at their definition site, so that there's no meaningful question to answer then (as in, e.g., the endless sum(generator_expression) examples, assuming the builtin sum). That means people have to think of plausible use cases where evaluation is delayed. There are some good examples of lists-of-generators in test_generators.py, and I'll just note that they use the default-arg mechanism to force a particular loop-variant non-local value, or use an instance variable, and/or use lexical scoping but know darned well that the up-level binding will never change over the life of each generator. That's all the concrete stuff I have to stare at now (& recalling that the question can't be asked in Icon -- no "divorce" possible there, and no lexical nesting even if it were possible to delay generation).
All or none make sense to me, as semantic models (not ruling out that a clever implementation may take shortcuts). I'm not having a hard time imagining that "all" will be useful; I haven't yet managed to dream up a plausible use case where "none" actually helps.
Like I just said <wink>. There's no question that semantics can differ between "all" and "none" (and at several points between to boot). Stick a "global f" inside f() and rebind f based on the current value of n too, if you like. I'm having a hard time imagining something *useful* coming out of such tricks combined with "none". Under "all", I look at the print and think "f is f, and n is 0, and that's it". I'm not sure it's "a feature" that print [n+f() for x in range(10)] looks up n and f anew on each iteration -- if I saw a listcomp that actually relied on this, I'd be eager to avoid inheriting any of author's code.

[Tim]
It's just a direct consequence of Python's general rule for name lookup in all contexts: variables are looked up when used, not before. (Note: lookup is different from scope determination, which is done mostly at compile time. Scope determination tells you where to look; lookup gives you the actual value of that location.) If n is a global and calling f() changes n, f()+n differs from n+f(), and both are well-defined due to the left-to-right rule. That's not good or bad, that's just *how it is*. Despite having some downsides, the simplicity of the rule is good; I'm sure we could come up with downsides of other rules too. Despite the good case that's been made for what would be most useful, I'm loathe to drop the evaluation rule for convenience in one special case. Next people may argue that in Python 3.0 lambda should also do this; arguably it's more useful than the current semantics there too. And then what next -- maybe all nested functions should copy their free variables? Oh, and then maybe outermost functions should copy their globals into locals too -- that will speed up a lot of code. :-) There are other places in Python where some rule is applied to "all free variables of a given piece of code" (the distinction between locals and non-locals in functions is made this way). But there are no other places where implicit local *copies* of all those free variables are taken. I'd need to find a unifying principle to warrant doing that beyond utility. --Guido van Rossum (home page: http://www.python.org/~guido/)
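Guido's point that f()+n differs from n+f() under the left-to-right rule is easy to demonstrate:

```python
n = 0
def f():
    global n
    n += 1
    return n

a = n + f()   # n read as 0, then f() returns 1 -> a == 1
n = 0
b = f() + n   # f() returns 1 and sets n to 1, then n read -> b == 2
print(a, b)   # 1 2
```

Operand order changes the result only because lookup happens at use time; that is the evaluation rule he is reluctant to abandon for generator expressions.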

[Tim]
[Guido]
Sorry, but none of that follows unless you first insist that a listcomp is semantically equivalent to a particular for-loop. Which we did do at the start, and which is now being abandoned in part ("well, except for the for target(s) -- well, OK, they still work like exactly like the for-loop would work if the target(s) were renamed in a particular 'safe' way"). I don't mind the renaming trick there, but by the same token there's nothing to stop explaining the meaning of a generator expression as a particular way of writing a generator function either. It's hardly a conceptual strain to give the function default arguments, or even to eschew that technical implementation trick and just say the generator's frame gets some particular initialized local variables (which is the important bit, not the trick used to get there).
Despite the good case that's been made for what would be most useful,
I don't see that any good case had been made for or against it: the only cases I care about are real use cases. A thing stands or falls by that, purity be damned. I have since posted the first plausible use case that occurred to me while thinking about real work, and "closure semantics" turned out to be disastrous in that example (see other email), while "capture the current binding" semantics turned out to be exactly right in that example. I suspected that would be so, but I still want to see more not-100%-fabricated examples.
It's not analogous: when I'm writing a lambda, I can *choose* which bindings to capture at lambda definition time, and which to leave free. Unless generator expressions grow more hair, I have no choice when writing one of those, so the implementation-forced choice had better be overwhelmingly most useful most often. I can't judge the latter without plausible use cases, though.
And then what next -- maybe all nested functions should copy their free variables?
Same objection as to the lambda example.
Oh, and then maybe outermost functions should copy their globals into locals too -- that will speed up a lot of code. :-)
It would save Jim a lot of thing=thing arglist typing in Zope code too <wink>.
I didn't suggest to copy anything, just to capture the bindings in use at the time a generator expression is evaluated. This is easy to explain, and trivial to explain for people familiar with the default-argument trick. Whenever I've written a list-of-generators, or in the recent example a generator pipeline, I have found it semantically necessary, without exception so far, to capture the bindings of the variables whose bindings wouldn't otherwise be invariant across the life of the generator. If it turns out that this is always, or almost always, the case, across future examples too, then it would just be goofy not to implement generator expressions that way ("well, yes, the implementation does do a wrong thing in every example we had, but what you're not seeing is that the explanation would have been a line longer had the implementation done a useful thing instead" <wink>).
I'd need to find a unifying principle to warrant doing that beyond utility.
No you don't -- you just think you do <wink>.

(This is drawing to a conclusion. Summary: Tim has convinced me.)
Sorry, I meant a pointer copy, not an object copy. That's a binding capture.
This is easy to explain, and trivial to explain for people familiar with the default-argument trick.
Phillip Eby already recommended not bothering with that; the default-argument rule is actually confusing for newbies (they think the defaults are evaluated at call time) so it's best not to bring this into the picture.
OK, I got it now. I hope we can find another real-life example; but there were some other early toy examples that also looked quite convincing. I'll take a pass at updating the PEP. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Tim]
This is easy to explain, and trivial to explain for people familiar with the default-argument trick.
[Guido]
Of course it works equally well to pass regular (non-default) arguments, it just makes a precise explanation a little longer to type (because the arglist needs to be typed out in two places).
I expect we will find more, although I haven't had more time to think about it (and today was devoted to puzzling over excessive rates of ZODB conflict errors, where generator expressions didn't seem immediately applicable <wink>). I do think it's related to non-reiterability. If generator expressions were reiterable, then a case could be made for them capturing a parameterized computation, reusable for different things by varying the bindings of the free variables. Like, say, you wanted to plot the squares of various functions at a set of points, and then:

    squares = (f(x)**2 for x in inputs)  # assuming reiterability here
    for f in math.sin, math.cos, math.tan:
        plot(squares)

But that doesn't make sense for a one-shot (not reiterable) generator, and even if it were reiterable I can't think of a real example that would want the bindings of free variables to change *during* a single pass over the results. For that matter, if it were reiterable, the "control by obscure side effect" style of the example is hard to like anyway.

Tim> squares = (f(x)**2 for x in inputs)  # assuming reiterability here
Tim> for f in math.sin, math.cos, math.tan:
Tim>     plot(squares)

How much more expensive would this be than

    for f in math.sin, math.cos, math.tan:
        squares = (f(x)**2 for x in inputs)
        plot(squares)

which would work without reiterability, right? The underlying generator function could still be created at compile-time and it (or its code object?) stored in the current function's constants. 'f' is simply an argument to it when the iterator is instantiated. Skip

No, the code object would be stored in the constants; the function object would be created each time around the loop. Good thing it came from an example that Tim himself didn't like. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

[Tim]
[Skip Montanaro]
How much more expensive
Stop right there. I must have been unclear. The only point of the example was semantic, not cost: even if generator expressions used closure semantics, the example *still* wouldn't work the way it appears to read, and because generator expressions aren't reiterable. What the example would do under closure semantics:

1. Plot the square of math.sin(x), for each x in inputs.

then

2. Probably nothing more than that. The "squares" GE is exhausted after #1 completes, and no matter how often it's run again it's simply going to raise StopIteration at once each time it's tried. A reasonable plot() would probably do nothing when fed an exhausted iterable, but maybe it would raise an exception. That's up to plot(). What it *won't* do under any scheme here is go on to plot the squares of math.cos(x) and math.tan(x) over the inputs too.

The lack of reiterability (which is fine by me!) thus seems to make a plausible use for closure semantics hard to imagine. The example was one where closure semantics suck despite misleading appearance.

Closures are very often used (in languages other than Python, and in Python too by people who haven't yet learned to write Python <0.9 wink>) to hold mutable state in outer scopes, for the use of functions in inner scopes, very much like an instance's data attributes hold mutable state for the use of methods defined in the instance's class. In those common cases, the power comes from being able to run functions (methods) more than once, or to reuse the mutable state among functions (methods). But generator expressions are always one-shot computations (you get to run a GE to completion no more than once). There may be some use for closure semantics in a collection of GEs that reference each other (similar to instance data being visible to multiple methods), but so far I've failed to dream up a plausible case of that nature either.
Despite the similar appearance, that does something very different, plotting all 3 functions (not just math.sin), and regardless of whether closure or capture semantics are used. I expect the body of the loop in real life would be a one-liner, though:

    plot(f(x)**2 for x in inputs)
which would work without reiterability, right?
Yup.
Guido expanded on that already. The code is compiled only once (at "compile time"), and there's a small runtime cost per outer-loop iteration to build a function object from the (pre-compiled) code object, and a possibly larger runtime cost per outer-loop iteration to start the GE. Passing 'f' and 'inputs' may be part of either of those costs, depending on how it's implemented -- but giving the synthesized generator function some initialized locals is the least of the runtime costs.
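The one-shot behavior Tim describes is easy to demonstrate with an ordinary generator function (a stand-in here for a generator expression, since those didn't exist yet when this was written):

```python
# Generators, and hence generator expressions, are one-shot: once a
# generator is exhausted, every further iteration attempt produces
# nothing (StopIteration is raised immediately).
def squares(inputs):
    for x in inputs:
        yield x * x

s = squares([1, 2, 3])
print(list(s))   # [1, 4, 9]
print(list(s))   # [] -- exhausted; a second pass yields nothing
```

This is why the plot-three-functions example cannot work no matter which binding semantics are chosen.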

Tim> [Skip Montanaro]
>> How much more expensive

Tim> Stop right there.

Okay, but I couldn't resist. ;-)

>> for f in math.sin, math.cos, math.tan:
>>     squares = (f(x)**2 for x in inputs)
>>     plot(squares)

Tim> Despite the similar appearance, that does something very different, ...

>> which would work without reiterability, right?

Tim> Yup.

I shouldn't have mentioned performance. The above was really the point I was getting at. The mention of performance was simply because I couldn't understand why reiterability would be necessary in your example. I see you were just pointing out that someone not understanding the underlying nature of the generator would assume your example would work *and* save cycles because the definition of the generator expression was hoisted out of the loop. Skip

Guido van Rossum <guido@python.org>:
And what about

    foo = (f(x) for x in stuff)

    def f(x):
        ...

    for blarg in foo:
        ...

? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

I had a large file today, and needed to find lines matching several patterns simultaneously. It seemed a natural application for generator expressions, so let's see how that looks. Generalized a bit:

Given:
    "source", an iterable producing elements (like a file producing lines)
    "predicates", a sequence of one-argument functions, mapping element to truth (like a regexp search returning a match object or None)

Create:
    a generator producing the elements of source for which each predicate is true

This is-- or should be --an easy application for pipelining generator expressions. Like so:

    pipe = source
    for p in predicates:
        # add a filter over the current pipe, and call that the new pipe
        pipe = e for e in pipe if p(e)

Now I hope that

    for e in pipe:
        print e

prints the desired elements. It will if the "p" and "pipe" in the generator expression use the bindings in effect at the time the generator expression is assigned to pipe. If the generator expression is instead a closure, it's a subtle disaster. You can play with this today like so:

    pipe = source
    for p in predicates:
        # pipe = e for e in pipe if p(e)
        def g(pipe=pipe, p=p):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()

    for e in pipe:
        print e

Those are the semantics for which "it works". If "p=p" is removed (so that the implementation of the generator expression acts like a closure wrt p), the effect is to ignore all but the last predicate. Instead predicates[-1] is applied to source, and then applied redundantly to the survivors len(predicates)-1 times each. It's not obvious then that the result is wrong, and for some inputs may even be correct. If "pipe=pipe" is removed instead, it should produce a "generator already executing" exception, since the "pipe" in the final for-loop is bound to the same object as the "pipe" inside g then (all of the g's, but only the last g matters).
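Tim's "play with this today" workaround runs as-is; here is a complete version (Python 3 syntax; the sample source lines and regexp predicates are invented for illustration):

```python
# Tim's predicate pipeline with both bindings captured as default
# arguments, so each filter stage keeps its own predicate and its own
# upstream pipe.
import re

source = ["spam and eggs", "spam", "eggs", "green spam and green eggs"]
predicates = [re.compile("spam").search, re.compile("eggs").search]

pipe = iter(source)
for p in predicates:
    def g(pipe=pipe, p=p):        # capture the current bindings now
        for e in pipe:
            if p(e):
                yield e
    pipe = g()

result = list(pipe)
print(result)   # ['spam and eggs', 'green spam and green eggs']
```

Dropping `p=p` from the sketch reproduces the failure mode Tim describes: every stage would see the last predicate.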

Tim Peters <tim_one@email.msn.com>:
Lying awake thinking about this sort of thing last night, I found myself wondering if there should be a way of explicitly requesting that a name be evaluated at closure creation time, e.g.

    pipe = source
    for p in predicates:
        pipe = e for e in pipe if ^p(e)

where the ^ means that p is evaluated in the enclosing scope when the closure is created, and bound to a slot which behaves like a default-argument slot (but is separate from the default arguments). This would allow the current delayed-evaluation semantics to be kept as the default, while eliminating any need for using the default-argument hack when you don't want delayed evaluation.

Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
Does anybody actually have a use case for delayed evaluation? Why would you ever *want* it to be that way? (Apart from the BDFL's desire to have the behavior resemble function behavior.) And, if there's no use case for delayed evaluation, why make people jump through hoops to get the immediate binding?

Phillip J. Eby strung bits together to say:
The other thing to consider is that if generator expressions provide immediate evaluation, then anyone who wants delayed evaluation semantics still has the option of writing an actual generator function - at which point, it ceases to be an expression, and becomes a function. Which seems to fit with the way Python works at the moment:

This displays '1':

    x = 0
    y = x + 1
    x = 1
    print y

This displays '2':

    x = 0
    y = lambda: x + 1
    x = 1
    print y

(I think someone already gave a similar example) Actually, the exact same no-argument-lambda trick used above would be enough to get you late binding of all of the elements in your generator expression. Being selective still requires writing a real generator function, though. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."

Nick Coghlan strung bits together to say:
D'oh! That last line should be "print y()"! Regards, Nick. Still has to reinstall Python on new OS installation. . . -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."
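Nick's two snippets run as stated once the last line is corrected to call the lambda, as his follow-up notes (Python 3 print syntax used here):

```python
# A plain expression is evaluated immediately; a lambda body is
# evaluated only when called, so it sees the later rebinding of x.
x = 0
y = x + 1            # evaluated now: y is 1
x = 1
print(y)             # 1

x = 0
y = lambda: x + 1    # body evaluated when y() is called
x = 1
print(y())           # 2
```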

On Thursday 23 October 2003 04:12 am, Phillip J. Eby wrote:
I have looked far and wide over my code, present and desired, and can only find one example that seems perhaps tangentially relevant -- and I don't think it's a _good_ example. Anyway, here it comes:

    def smooth(N, sequence):
        def fifo(N):
            window = []
            while 1:
                if len(window) < N:
                    yield None
                else:
                    yield window
                    window.pop(0)
                window.append(item)
        latest = iter(fifo(N)).next
        for item in sequence:
            window = latest()
            if window is None:
                continue
            yield sum(window) / N

as I said, I don't like it one bit; the non-transparent "argument passing" of item from the loop "down into" the generator is truly yecchy. There are MUCH better ways to do this, such as

    def fifo(N, sequence):
        it = iter(sequence)
        window = list(itertools.islice(it, N))
        while 1:
            yield window
            window.pop(0)
            window.append(it.next())

    def smooth(N, sequence):
        for window in fifo(N, sequence):
            yield sum(window) / N

It's not clear that this would generalize to generator expressions, anyway. But I could imagine it might, e.g. IF we had "closure semantics" rather than "snapshot-binding" somebody COULD be tempted to such murky cases of "surreptitious argument passing down into genexprs"... and we're better off without any such possibility, IMHO.
And, if there's no use case for delayed evaluation, why make people jump through hoops to get the immediate binding?
I understand Guido's position that simplicity and regularity of the rules count (a LOT). But in this case I think Tim's insistence on practicality should count for more: the "bind everything at start time" semantics are NOT a weird special case, and the "lookup everything each time around the loop" ones don't seem to yield any non-weird use... Alex
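Alex's cleaner fifo/smooth pair still runs today with two small adaptations (assumptions of this sketch): Python 3's next() in place of it.next(), and an explicit return on StopIteration, which PEP 479 now requires inside generators; true division makes the averages floats.

```python
# Sliding-window smoothing: fifo yields each full window of N items,
# smooth averages each window.
import itertools

def fifo(N, sequence):
    it = iter(sequence)
    window = list(itertools.islice(it, N))
    while 1:
        yield window
        window.pop(0)
        try:
            window.append(next(it))
        except StopIteration:   # explicit return required since PEP 479
            return

def smooth(N, sequence):
    for window in fifo(N, sequence):
        yield sum(window) / N

print(list(smooth(3, [1, 2, 3, 4, 5])))   # [2.0, 3.0, 4.0]
```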

[Greg Ewing]
As explained in the original email, the example is also a disaster if pipe's binding isn't captured at creation-time too.
Well, I have yet to see an example where delayed evaluation is of any use in a generator expression, except for a 100%-contrived example that simply illustrated that the semantics can in fact differ (which I hope isn't something anyone questioned to begin with <wink>). Try writing a real example. If it needs delayed evaluation in a plausible way, great. I'm still batting 0 at trying to find such a thing; I confess I wasn't moved by the

    it = f(x) for x in whatever

    def f(x):
        blah

example (there being no apparent need to contort the order of the assignments except, again, to illustrate that semantics have consequences).

Bah. Arbitrary semantics bound to line-noise characters. Guess what that reminds me of. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

If anyone can think of anything less line-noisy, I'm open to suggestions. The important thing is the idea of explicitly capturing an enclosing binding, however it's expressed. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

I think that no matter what notation you invent, this will remain an unpythonic thing. I can't quite explain why I feel that way. Maybe it's because it feels very strongly like a directive to the compiler -- Python's compiler likes to stay out of the way and not need help. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Thursday 23 October 2003 06:25, Guido van Rossum wrote: ...
it's because it feels very strongly like a directive to the compiler -- Python's compiler likes to stay out of the way and not need help.
*YES*!!! So, what about that 'declarative statement' g****l, hmm...?-) Alex

On Wednesday 22 October 2003 09:18 pm, Tim Peters wrote: <snip>
Those are the semantics for which "it works".
I'm convinced -- not only that free variables should be frozen as if they'd been passed into a generator function as keyword arguments, but also of the utility of generator expressions as a whole -- that code is just beautiful :) Jeremy

At 02:04 PM 10/21/03 -0700, Guido van Rossum wrote:
Argh, someone *could* pass around a copy of locals() and make an assignment into that.
Not when the locals() is that of a CPython function, and I expect the same is true of Jython functions.
But I think we're already deprecating non-read-only use of locals(), so I'd like to ban that as abuse.
FWIW, both Zope 3 and PEAK currently make use of 'locals()' (actually, sys._getframe()) to modify locals of a class or module scope (i.e. non-functions). For both class and module scopes, it seems to be implied by the language definition that the local namespace is the __dict__ of the corresponding object. So, is this deprecated usage for class and module objects too?

Well, the effect is undefined; there may be things you can do that would force the changes out to the real local variables.
It isn't. I'm not sure it shouldn't be; at some point it might be attractive to lock down the namespace of certain modules and classes, and in fact new-style classes already attempt to lock down their __dict__. Fortunately the __dict__ you see when executing a function during the class definition phase is not the class dict; the class dict is a copy of it taken by the class creation code. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Tuesday 21 October 2003 10:46 pm, Guido van Rossum wrote:
module a.py being:

    R = [range(N) for N in (10, 100, 10000)]

    def lc(R):
        y = 1
        sum([x*y for x in R])

    def gen1(R):
        y = 1
        def gen():
            for x in R:
                yield x*y
        sum(gen())

    def gen2(R):
        y = 1
        def gen(R=R, y=y):
            for x in R:
                yield x*y
        sum(gen())

i measure:

for N=10:

    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[0])'
    100000 loops, best of 3: 12.3 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[0])'
    100000 loops, best of 3: 10.4 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[0])'
    100000 loops, best of 3: 9.7 usec per loop

for N=100:

    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[1])'
    10000 loops, best of 3: 93 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[1])'
    10000 loops, best of 3: 59 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[1])'
    10000 loops, best of 3: 55 usec per loop

for N=10000:

    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[2])'
    100 loops, best of 3: 9.4e+03 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[2])'
    100 loops, best of 3: 5.6e+03 usec per loop
    [alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[2])'
    100 loops, best of 3: 5.2e+03 usec per loop

I think it's well worth overcoming some "community resistance to new syntax" to get this kind of advantage easily. The trick of binding outer-scope variables as default args is neat but buys less than the pure idea of just using a generator rather than a list comprehension. Alex
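Alex's command-line timings can be reproduced from within a script using the timeit module; this is a sketch of the same list-comprehension-versus-generator comparison in today's syntax (generator expressions exist from Python 2.4 on), not a claim about the exact speedups he measured:

```python
# Compare summing via a list comprehension (materializes the full list)
# against a generator expression (consumes items one at a time).
import timeit

N = 10000
lc = timeit.timeit('sum([x*2 for x in R])', setup='R = range(%d)' % N, number=100)
ge = timeit.timeit('sum(x*2 for x in R)', setup='R = range(%d)' % N, number=100)
print("list comp: %.4fs  genexp: %.4fs" % (lc, ge))
```

Absolute numbers depend on the interpreter and machine; the interesting part is the trend as N grows.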

Thanks for the measurements! Is someone interested in writing up a PEP and taking it to the community? Or do I have to do it myself (and risk another newsgroup meltdown)? --Guido van Rossum (home page: http://www.python.org/~guido/)

On Wednesday 22 October 2003 00:11, Guido van Rossum wrote:
I'm interested, if it can wait until next week (in a few hours I'm flying off for a trip and I won't even have my laptop along). What's the procedure for requesting a PEP number, again? Alex

Raymond is going to give PEP 289 an overhaul. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido> Raymond is going to give PEP 289 an overhaul. Since you rejected PEP 289 at one point, it might be worth having a short explanation of why you've changed your mind. Skip

Guido> I expect that most iterator comprehensions (we need a better Guido> term!) You didn't like "lazy list comprehensions"? Guido> We can quibble about whether double parentheses are needed, ... You haven't convinced me that you're not going to want to toss out one of the two comprehension syntaxes and only retain the lazy semantics in Py3k. If that's the case and the current list comprehension syntax is better than the current crop of proposals, why even add (lazy list|iterator) comprehensions now? Just make do without them until Py3k and make all list comprehensions lazy at that point. There will be enough other bullets to bite that this shouldn't be a big deal (many programs will probably require significant rewriting anyway). Skip

No, because list comprehensions are no longer the fundamental building blocks. Generator expression sounds good to me now.
Too many double negatives. :-) Right now I feel like keeping both syntaxes, but declaring list comprehensions syntactic sugar for list(generator expression).
It's likely that generator expressions won't make it into Python 2.x for any x, just because of the effort to get the community to accept new syntax in general. --Guido van Rossum (home page: http://www.python.org/~guido/)
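Guido's "syntactic sugar" framing above can be made concrete: a list comprehension gives the same result as list() applied to the corresponding generator expression (shown in today's syntax, since generator expressions did land, in Python 2.4):

```python
# A list comprehension is equivalent to list() over a generator
# expression; an aggregate consumer like sum() can skip the list
# entirely.
data = range(10)
assert [x * x for x in data] == list(x * x for x in data)

total = sum(x * x for x in data)
print(total)   # 285
```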

[Guido]
I expect that most iterator comprehensions (we need a better term!) ...
Well, calling it an iterator Aussonderungsaxiom would continue emphasizing the wrong thing <wink>.

"Set comprehensions" in a programming language originated with SETL, and are named in honor of the set-theoretic Axiom of Comprehension (Aussonderungsaxiom). In its well-behaved form, that says roughly that given a set X, then for any predicate P(x), there exists a subset of X whose elements consist of exactly those elements x of X for which P(x) is true (in its ill-behaved form, it leads directly to Russell's Paradox -- the set of all sets that don't contain themselves).

So "comprehension" emphasizes the "if" part of list comprehension syntax, which often isn't the most interesting thing. More interesting more often are (a) the computation done on the objects gotten from the for-iterator, and (b) that the results are generated one at a time. Put that all in a pot and stir, and the name "generator expression" seems natural and useful to me.

In the Icon language, *all* expressions are generators, so maybe I'm biased by that. OTOH, "the results are generated one at a time" is close to plain English, and "generator expression" then brings to my mind an expression capable of delivering a sequence of results. Or you could call it an Orlijn flourish.

Thanks for an independent validation of "generator expressions"! It's a perfect term.
Or you could call it an Orlijn flourish.
No, that term is already reserved for something else (the details of which I'll spare you, as they involve intimate details about toddler hygiene :-). --Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido]
That is somewhat beautiful. So, I drop my request for bracketed yields and throw my tiny weight behind this idea for an iterator expression.
We can quibble about whether double parentheses are needed
I vote for not requiring the outer parentheses unless there is an adjacent comma. That would unnecessarily complicate the simple, elegant proposal. Otherwise, I would anticipate frequent questions to the help list or tutor list on why something coded like your example doesn't work. Also, the double paren form just looks funny, like there is something wrong with it but you can't tell what.

Timing
------

Based on the extensive comp.lang.python discussions when I first floated a PEP on the subject, I conclude that the user community will very much accept the new form and that there is no reason to not include it in Py2.4. If there is any doubt on that score, I would be happy to update the PEP to match the current proposal for iterator expressions and solicit more community feedback. Raymond Hettinger

On Tue, 2003-10-21 at 17:59, Raymond Hettinger wrote:
Indeed, as is the term "generator expression" and the relegation to syntactic sugar of list comprehensions.
I like that too. It mirrors other situations where the parentheses aren't needed except to disambiguate syntax. In the above example, there's no ambiguity. -Barry

OK. I think I can pull it off in the Grammar.
Wonderful! Rename PEP 289 to "generator expressions" and change the contents to match this proposal. Thanks for being the fall guy! --Guido van Rossum (home page: http://www.python.org/~guido/)

[Raymond]
[Guido]
Wonderful! Rename PEP 289 to "generator expressions" and change the contents to match this proposal. Thanks for being the fall guy!
Here is a rough draft on the resurrected PEP. I'm sure it contains many flaws and I welcome suggested amendments. In particular, the following needs attention:

* Precise specification of the syntax including the edge cases with commas where enclosing parentheses are required.
* Making sure the acknowledgements are correct and complete.
* Verifying my understanding of the issues surrounding late binding, modification of locals, and returning generator expressions.
* Clear articulation of the expected benefits. There are so many, it was difficult to keep it focused.

Raymond Hettinger

----------------------------------------------------------------------

PEP: 289
Title: Generator Expressions
Version: $Revision: 1.2 $
Last-Modified: $Date: 2003/08/30 23:57:36 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003

Abstract

    This PEP introduces generator expressions as a high performance,
    memory efficient generalization of list expressions and generators.

Rationale

    Experience with list expressions has shown their wide-spread
    utility throughout Python. However, many of the use cases do not
    need to have a full list created in memory. Instead, they only
    need to iterate over the elements one at a time.

    For instance, the following dictionary constructor code will build
    a full item list in memory, iterate over that item list, and, when
    the reference is no longer needed, delete the list:

        d = dict([(k, func(v)) for k in keylist])

    Time, clarity, and memory are conserved by using a generator
    expression instead:

        d = dict((k, func(v)) for k in keylist)

    Similar benefits are conferred on the constructors for other
    container objects:

        s = Set(word for line in page for word in line.split())

    Having a syntax similar to list comprehensions makes it easy to
    switch to an iterator expression when scaling up an application.
    Generator expressions are especially useful in functions that
    reduce an iterable input to a single value:

        sum(len(line) for line.strip() in file if len(line)>5)

    Accordingly, generator expressions are expected to partially
    eliminate the need for reduce() which is notorious for its lack
    of clarity. And, there are additional speed and clarity benefits
    from writing expressions directly instead of using lambda.

    List expressions greatly reduced the need for filter() and map().
    Likewise, generator expressions are expected to minimize the need
    for itertools.ifilter() and itertools.imap(). In contrast, the
    utility of other itertools will be enhanced by generator
    expressions:

        dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

BDFL Pronouncements

    The previous version of this PEP was REJECTED. The bracketed
    yield syntax left something to be desired; the performance gains
    had not been demonstrated; and the range of use cases had not
    been shown. After much discussion on the python-dev list, the PEP
    has been resurrected in its present form. The impetus for the
    discussion was an innovative proposal from Peter Norvig.

The Gory Details

    1) In order to achieve a performance gain, generator expressions
       need to be run in the local stackframe; otherwise, the
       improvement in cache performance gets offset by the time spent
       switching stackframes. The upshot of this is that generator
       expressions need to be both created and consumed within the
       context of a single stackframe. Accordingly, the generator
       expression cannot be returned to another function:

           return (k, func(v)) for k in keylist

    2) The loop variable is not exposed to the surrounding function.
       This both facilitates the implementation and makes typical use
       cases more reliable. In some future version of Python, list
       comprehensions will also hide the induction variable from the
       surrounding code (and, in Py2.4, warnings will be issued for
       code accessing the induction variable).
    3) Variables referenced in the generator expressions will exhibit
       late binding just like other Python code. In the following
       example, the iterator runs *after* the value of y is set to
       one:

           def h():
               y = 0
               l = [1,2]
               def gen(S):
                   for x in S:
                       yield x+y
               it = gen(l)
               y = 1
               for v in it:
                   print v

    4) List comprehensions will remain unchanged. So, [x for x in S]
       is a list comprehension and [(x for x in S)] is a list
       containing one generator expression.

    5) It is prohibited to use locals() for other than read-only use
       in generator expressions. This simplifies the implementation
       and precludes a certain class of obfuscated code.

Acknowledgements:

    Peter Norvig resurrected the discussion in his proposal for
    "accumulation displays".

    Alex Martelli provided critical measurements that proved the the
    performance benefits of generator expressions.

    Samuele Pedroni provided the example of late binding.

    Guido van Rossum suggested the bracket free, yield free syntax.

    Raymond Hettinger first proposed "generator comprehensions" in
    January 2002.

References

    [1] PEP 255 Simple Generators
        http://python.sourceforge.net/peps/pep-0255.html

    [2] PEP 202 List Comprehensions
        http://python.sourceforge.net/peps/pep-0202.html

    [3] Peter Norvig's Accumulation Display Proposal
        http://www.norvig.com/pyacc.html

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End:
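The draft's late-binding example (Gory Detail 3) can be run directly; collecting the yielded values makes the behavior visible -- gen reads y only when iterated, after y has been rebound to 1 (Python 3 syntax used here):

```python
# Samuele Pedroni's late-binding example from the draft: the generator
# body looks up y at iteration time, so it sees y == 1, not y == 0.
def h():
    y = 0
    l = [1, 2]
    def gen(S):
        for x in S:
            yield x + y   # y resolved when the generator runs
    it = gen(l)
    y = 1                 # rebinding happens before iteration
    return list(it)

print(h())   # [2, 3]
```

Under the capture-at-creation semantics argued for elsewhere in the thread, the result would instead be [1, 2].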

Here is a rough draft on the resurrected PEP.
Thanks -- that was quick!
Um, please change "list expressions" back to "list comprehensions" everywhere. Global substitute gone awry? :-)
I'd prefer to use the example sum([x*x for x in range(10)])
which becomes sum(x*x for x in range(10)) (I find the dict constructor example sub-optimal because it starts with two parentheses, and visually finding the match for the second of those is further complicated by the use of func(v) for the value.)
Similar benefits are conferred on the constructors for other container objects:
(Here you can use the dict constructor example.)
^^^^^^^^ generator
^^^^^^^^^^^^ That's not valid syntax; my example was something like sum(len(line) for line in file if line.strip())
Heh? Did you keep this from the old PEP? Performance tests show that a generator function is already faster than a list comprehension, and the semantics are now defined as equivalent to creating an anonymous generator function and calling it. (There's still discussion about whether that generator function should copy the current value of all free variables into default arguments.)

We need a Gory Detail item explaining the exact syntax. I propose that a generator expression always needs to be inside a set of parentheses and cannot have a comma on either side. Unfortunately this is different from list comprehensions; while [1, x for x in R] is illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)]. With reference to the file Grammar/Grammar in CVS, I think these changes are suitable:

(1) The rule

        atom: '(' [testlist] ')'

    changes to

        atom: '(' [listmaker1] ')'

    where listmaker1 is almost the same as listmaker, but only allows a single test after 'for' ... 'in'.

(2) The rule for arglist is similarly changed so that it can be either a bunch of arguments possibly followed by *xxx and/or **xxx, or a single generator expression. This is even hairier, so I'm not going to present the exact changes here; I'm confident that it can be done though using the same kind of breakdown as used for listmaker. Yes, maybe the compiler may have to work a little harder to distinguish all the cases. :-)
There is still discussion about this one.
I wouldn't mention this. assigning into locals() has an undefined effect anyway.
Can you do inline URLs in the final version? Maybe an opportunity to learn reST. :-) Or else at least add [3] to the text.
Alex Martelli provided critical measurements that proved the the performance benefits of generator expressions.
And also argued with great force that this was a useful thing to have (as have several others).
Samuele Pedroni provided the example of late binding.
(But he wanted generator expressions *not* to use late binding!)
Guido van Rossum suggested the bracket free, yield free syntax.
I don't need credits, and I wouldn't be surprised if someone else had suggested it first.
Raymond Hettinger first proposed "generator comprehensions" in January 2002.
Phillip Eby suggested "iterator expressions" as the name and subsequently Tim Peters suggested "generator expressions".
I'd point to the thread in python-dev too. BTW I think the idea of having some iterators support __copy__ as a way to indicate they can be cloned is also PEPpable; we've pretty much reached closure on that one. PEP 1 explains how to get a PEP number. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido, thanks for the quick edits of the first draft. Here is a link to the second: http://users.rcn.com/python/download/pep-0289.html The reST version is attached. [Guido]
That one sounds like a job for Alex.

Raymond Hettinger

------------------------------------------------------------------

PEP: 289
Title: Generator Expressions
Version: $Revision: 1.3 $
Last-Modified: $Date: 2003/08/30 23:57:36 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003

Abstract
========

This PEP introduces generator expressions as a high performance,
memory efficient generalization of list comprehensions [1]_ and
generators [2]_.

Rationale
=========

Experience with list comprehensions has shown their wide-spread
utility throughout Python. However, many of the use cases do not need
to have a full list created in memory. Instead, they only need to
iterate over the elements one at a time.

For instance, the following summation code will build a full list of
squares in memory, iterate over those values, and, when the reference
is no longer needed, delete the list::

    sum([x*x for x in range(10)])

Time, clarity, and memory are conserved by using a generator
expression instead::

    sum(x*x for x in range(10))

Similar benefits are conferred on constructors for container
objects::

    s = Set(word for line in page for word in line.split())
    d = dict( (k, func(v)) for k in keylist)

Generator expressions are especially useful in functions that reduce
an iterable input to a single value::

    sum(len(line) for line in file if line.strip())

Accordingly, generator expressions are expected to partially eliminate
the need for reduce() which is notorious for its lack of clarity.
And, there are additional speed and clarity benefits from writing
expressions directly instead of using lambda.

List comprehensions greatly reduced the need for filter() and map().
Likewise, generator expressions are expected to minimize the need for
itertools.ifilter() and itertools.imap().
In contrast, the utility of other itertools will be enhanced by generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

Having a syntax similar to list comprehensions also makes it easy to convert existing code into a generator expression when scaling up an application.

BDFL Pronouncements
===================

The previous version of this PEP was REJECTED. The bracketed yield syntax left something to be desired; the performance gains had not been demonstrated; and the range of use cases had not been shown. After much discussion on the python-dev list, the PEP has been resurrected in its present form. The impetus for the discussion was an innovative proposal from Peter Norvig [3]_.

The Gory Details
================

1. The semantics of a generator expression are equivalent to creating an anonymous generator function and calling it. There's still discussion about whether that generator function should copy the current value of all free variables into default arguments.

2. The syntax requires that a generator expression always be inside a set of parentheses and cannot have a comma on either side. Unfortunately, this is different from list comprehensions. While [1, x for x in R] is illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1, 2, 3)]. With reference to the file Grammar/Grammar in CVS, two rules change:

   a) The rule::

          atom: '(' [testlist] ')'

      changes to::

          atom: '(' [listmaker1] ')'

      where listmaker1 is almost the same as listmaker, but only allows a single test after 'for' ... 'in'.

   b) The rule for arglist needs similar changes.

3. The loop variable is not exposed to the surrounding function. This facilitates the implementation and makes typical use cases more reliable. In some future version of Python, list comprehensions will also hide the induction variable from the surrounding code (and, in Py2.4, warnings will be issued for code accessing the induction variable).

4. There is still discussion about whether variables referenced in generator expressions will exhibit late binding just like other Python code. In the following example, the iterator runs *after* the value of y is set to one::

       def h():
           y = 0
           l = [1,2]
           def gen(S):
               for x in S:
                   yield x+y
           it = gen(l)
           y = 1
           for v in it:
               print v

5. List comprehensions will remain unchanged::

       [x for x in S]      # This is a list comprehension.
       [(x for x in S)]    # This is a list containing one generator expression.

Acknowledgements
================

* Raymond Hettinger first proposed the idea of "generator comprehensions" in January 2002.
* Peter Norvig resurrected the discussion in his proposal for Accumulation Displays [3]_.
* Alex Martelli provided critical measurements that proved the performance benefits of generator expressions. He also provided strong arguments that they were a desirable thing to have.
* Phillip Eby suggested "iterator expressions" as the name.
* Subsequently, Tim Peters suggested the name "generator expressions".
* Samuele Pedroni argued against late binding and provided the example shown above.

References
==========

.. [1] PEP 202 List Comprehensions
       http://python.sourceforge.net/peps/pep-0202.html

.. [2] PEP 255 Simple Generators
       http://python.sourceforge.net/peps/pep-0255.html

.. [3] Peter Norvig's Accumulation Display Proposal
       http://www.norvig.com/pyacc.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
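The late-binding example from the Gory Details can be run as ordinary Python today using a nested generator function (the expression syntax itself is the PEP's proposal): because y is looked up when the iterator is consumed, the values yielded are 2 and 3, not 1 and 2.

```python
def h():
    y = 0
    l = [1, 2]
    def gen(S):
        for x in S:
            yield x + y    # y is looked up when the loop runs
    it = gen(l)
    y = 1                  # rebinds y before the iterator is consumed
    return list(it)

values = h()
```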

On Wednesday 22 October 2003 03:57 am, Raymond Hettinger wrote:
I probably missed it in this monster of a thread, but how do generator expressions do this? It seems that they'd only make reduce more efficient, but it would still be just as needed as before. Jeremy

All we need is more standard accumulator functions like sum(). There are many useful accumulator functions that aren't easily expressed as a binary operator but are easily done with an explicit iterator argument, so I am hopeful that the need for reduce will disappear. 99% of the use cases for reduce were with operator.add, and that's replaced by sum() already. --Guido van Rossum (home page: http://www.python.org/~guido/)
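Guido's point is easy to check for the dominant case: the `reduce(operator.add, ...)` idiom and the `sum()` accumulator compute the same value. A small sketch (note that in today's Python, `reduce` has moved to `functools`; it was a builtin at the time of this thread):

```python
import operator
from functools import reduce  # builtin in the Python of this thread

data = [1, 2, 3, 4]
via_reduce = reduce(operator.add, data)  # the classic reduce idiom
via_sum = sum(data)                      # the accumulator-function form
```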

Guido:
But this would still be true even if we introduced such functions *without* generator expressions, i.e. given some new standard accumulator foo_accumulator which accumulates using foo_function, you can write r = foo_accumulator(some_seq) instead of r = reduce(foo_function, some_seq) regardless of whether some_seq is a regular list or a generator expression. So it seems to me that generator expressions have *no* effect on the need or otherwise for reduce, and any suggestion to that effect should be removed from the PEP as misleading and confusing. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

After some thinking, I agree. The only (indirect) link is that generator expressions make it more attractive to start writing accumulator functions, and having more accumulator functions available eliminates the need for reduce(). I'll update the PEP as needed (Raymond already toned down its mention of reduce()). --Guido van Rossum (home page: http://www.python.org/~guido/)

At 23:02 21.10.2003 -0700, Guido van Rossum wrote:
Samuele Pedroni provided the example of late binding.
(But he wanted generator expressions *not* to use late binding!)
To be honest, no: I was just arguing for coherent behavior between generator expressions and closures; Tim and Phillip J. Eby argued (are arguing) against late binding. It is true that later, in an off-topic way, I mildly proposed non-late-binding semantics for _all_ closures wrt free variables apart from globals, but I gathered that a fraction of people would still like rebinding support for closed-over vars (something I don't miss personally), and there are subtle issues wrt recursive references which, while solvable, would make the semantics rather DWIMish, not a good thing. Samuele.

On Wed, Oct 22, 2003 at 01:19:58AM -0400, Raymond Hettinger wrote:
I see generator expressions as making the iterator guts of list comprehensions available as a first-class object. The list() call is not always wanted.
1) In order to achieve a performance gain, generator expressions need to be run in the local stackframe
[...]
Accordingly, the generator expression cannot be returned to another function:
That would be unacceptable, IMHO. Generator expressions should be first class. Luckily, generator functions are speedy little buggers. :-) Neil

[Skip]
Not necessarily. Maybe the time machine's stuck. :-)
Thanks for trying to bang some sense into this. Personally, I still like the idea best to make (x for x in S) be an iterator comprehension and [x for x in S] syntactic sugar for the common operation list((x for x in S)) I'm not 100% sure about requiring the double parentheses, but I certainly want to require extra parentheses if there's a comma on either side, so that if we want to pass a 2-argument function a list comprehension, it will have to be parenthesized, e.g. foo((x for x in S), 42) bar(42, (x for x in S)) This makes me think that it's probably fine to also require sum((x for x in S)) Of course, multiple for clauses and if clauses are still supported just like in current list comprehensions; they add no new syntactic issues, except if we also were to introduce conditional expressions. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

Yes. (Raymond, you might mention this in the PEP.) --Guido van Rossum (home page: http://www.python.org/~guido/)

[Skip]
Sorry, too late. You're hugely underestimating the backwards compatibility issues. And they have been in use at least since 2000 (they were introduced in 2.0). [Alex]
But that's not very common, so I don't see the point of putting in the effort, plus it's not safe. Using a LC as the sequence of a for loop is ugly, and usually for x in [y for y in S if P(y)]: ... means the same as for x in S: if P(x): ... except when it doesn't, and then making the list comprehension lazy can be a mistake: the following example for key in [k for k in d if d[k] is None]: del d[key] is *not* the same as for key in d: if d[key] is None: del d --Guido van Rossum (home page: http://www.python.org/~guido/)
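The hazard Guido points at can be demonstrated directly (writing the loop body as `del d[key]`): the list comprehension snapshots the matching keys, so deletion is safe, whereas deleting while lazily iterating over the dict itself raises RuntimeError in CPython.

```python
d = {'a': None, 'b': 1, 'c': None}
# Snapshot the keys first, then mutate: safe.
for key in [k for k in d if d[k] is None]:
    del d[key]

d2 = {'a': None, 'b': 1}
try:
    for k in d2:            # lazy iteration over the dict itself
        if d2[k] is None:
            del d2[k]       # mutating while iterating
    raised = False
except RuntimeError:
    raised = True
```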

On Tuesday 21 October 2003 06:53 pm, Guido van Rossum wrote: ...
It IS common, at least in the code I write, e.g.: d = dict([ (f(a), g(a)) for a in S ]) s = sets.Set([ a*a for a in S ]) totsq = sum([ x*x for x in S ]) etc. I detest the look of those ([ ... ]), but that's the closest I get to dict comprehensions, set comprehensions, etc.
Well, no, but even if that last statement was "del d[key]" you'd still be right:-). Even in a situation where the list comp is only looped over once, code MIGHT still be relying on the LC having "snapshotted" and/or exhausted iterators IT uses. I was basically thinking of passing the LC as argument to something -- the typical cases where I use LC now and WISH they were lazy, as above -- rather than about for loops. And even when the LC _is_ an argument there might be cases where its current strict (nonlazy) semantics are necessary. Oh well! Alex

OK, but you have very little hope of the compiler optimizing the incarnation away (especially since our attempts at warning about surreptitious changes to builtins had to be withdrawn before 2.3 went out).
:-(
Yes, this is why iterator comprehensions (we need a better term!!!) would be so cool to have (I think much cooler than conditional expressions). --Guido van Rossum (home page: http://www.python.org/~guido/)

At 08:57 AM 10/21/03 -0500, Skip Montanaro wrote:
If you make it a list that's lazy, it doesn't lose the memory allocation overhead for the list. If I understand Alex's benchmarks, making a lazy list would end up being *slower* than list comprehension is now. I previously proposed a different solution earlier in this thread, where you get a pseudo-list that, if iterated, runs the underlying generator function. But there were issues with possible side-effects (not to mention reiterability) of the underlying iterator on which the comprehension was based.
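A minimal sketch of the pseudo-list Phillip describes (the class name and shape here are hypothetical, not his implementation): each iteration re-runs the underlying generator function, which buys reiterability but also re-triggers any side effects of the source — exactly the issue he raises.

```python
class LazyList:
    """Hypothetical pseudo-list: each iteration re-runs the generator function."""
    def __init__(self, genfunc):
        self.genfunc = genfunc
    def __iter__(self):
        return self.genfunc()

runs = []
def squares():
    runs.append(1)        # a visible side effect of each re-run
    for x in range(3):
        yield x * x

lazy = LazyList(squares)
first = list(lazy)
second = list(lazy)       # reiterable, but the side effect fires again
```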

Skip Montanaro wrote:
In an attempt to make sure I understand what is being discussed, I am going to take a stab at this. That way, when someone corrects me, two people get their questions answered; two birds, one shotgun.
It returns a generator function.
Extreme shorthand for a common idiom?
OTOH, if lambda x: x+1 is okay, then why not:
yield: x for x in S
I was actually thinking that myself, but I would rather keep lambda as this weird little child of Python who can always be spotted by its predisposition toward pink hot pants (images of "Miami Vice" flash in my head...). Personally I am not seeing any extreme need for this feature. I mean the example I keep seeing is ``sum((yield x*2 for x in foo))``. But how is this such a huge win over ``sum([x*2 for x in foo])``? I know there is a memory perk since the entire list won't be constructed, but unless there is a better reason I see abuse on the horizon. The misuse of __slots__ has shown that when something is added that seems simple and powerful, it will be abused by a lot of programmers thinking it is the best thing to use for anything they can shoehorn it into. I don't see this as being as big an abuse issue as __slots__, mind you, but I can still see people using it where a list comp may have been better. Or even having people checking themselves on whether to use this or a list comp and just using this because it seems cooler. I know I am personally +0 on this even after my above worries, since I don't see my above arguments as back-breakers, and those of us who do know how to use it properly will get a perk out of it. -Brett

At 12:46 PM 10/17/03 -0700, Brett C. wrote:
No, it returns an iterator. Technically a generator-iterator, but definitely not a generator function, just as [x for x in y] doesn't return a function that returns a list. :)
It's not an extreme need; if it were, it'd have been added in 2.2, where all extreme Python needs were met. ;)
I'm sort of +0 myself; there are probably few occasions where I'd use a gencomp. But I'm -1 on creating special indexing or listcomp-like accumulator syntax, so gencomps are a fallback position. I'm not sure gencomp is the right term for these things anyway... calling them iterator expressions probably makes more sense. Then there's not the confusion with generator functions, which get called. And this discussion has made it clearer that having 'yield' in the syntax is just plain wrong, because yield is a control flow statement. These things are really just expressions that act over iterators to return another iterator. In essence, an iterator expression is just syntax for imap and ifilter, in the same way that a listcomp is syntax for map and filter. Really, you could now write imap and ifilter as functions that compute iterator expressions, e.g.: imap = lambda func,items: func(item) for item in items ifilter = lambda func, items: item for item in items if func(item) Which of course means there'd be little need for imap and ifilter, just as there's now little need for map and filter. Anyway, if you look at '.. for .. in .. [if ..]' as a ternary or quaternary operator on an iterator (or iterable) that returns an iterator, it makes a lot more sense than thinking of it as having anything to do with generator(s). (Even if it might be implemented that way.)

"Phillip J. Eby" <pje@telecommunity.com> writes:
I've reached the point of skimming this discussion, but this struck a chord. I think the original proposal (for special syntax for accumulators) is too limited, and if anything is needed (not clear on that) it should be a generalised iterator comprehension construct. In that context, it seems to me that iterator comprehensions bear a relationship to imap/ifilter very similar to the one list comprehensions bear to map/filter. Paul. -- This signature intentionally left blank

On Friday 17 October 2003 08:57 pm, Skip Montanaro wrote: ...
Neither: it returns an iterator, _equivalent_ to the one that would be returned by _calling_ a generator such as def xxx(): for x in S: yield x like xxx() [the result of the CALL to xxx, as opposed to xxx itself], (yield: x for x in S) is not callable; rather, it's loopable-on.
you don't like lambda, I can't quite see why this syntax is all that appealing.
I don't really like the current state of lambda (and it will likely never get any better), I particularly don't like the use of the letter lambda for this idea (Church's work notwithstanding, even Paul Graham in his new lispoid language has chosen a more sensible keyword, 'func' I believe), but I like comprehensions AND iterators, and the use of the word yield in generators. I'm not quite sure what parallels you see between the two cases. Alex
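Alex's equivalence can be written out concretely: the explicit generator function, once called, produces an iterator interchangeable with the one the proposed expression form would yield (`xxx` follows his naming; the expression shown uses the syntax as eventually adopted):

```python
S = [1, 2, 3]

def xxx():
    for x in S:
        yield x

from_function = list(xxx())       # calling xxx() gives the iterator
from_expression = list(x for x in S)  # the expression is the iterator directly
```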

>> Is it supposed to return a generator function which I can assign to a >> variable (or pass to the builtin function sum() as in your example) >> and call later, or is it supposed to turn the current function into a >> generator function (so that each executed yield statement returns a >> value to the caller of the current function)? Alex> Neither: it returns an iterator, _equivalent_ to the one that Alex> would be returned by _calling_ a generator such as Alex> def xxx(): Alex> for x in S: Alex> yield x All the more reason not to like this. Why not just define the generator function and call it? While Perl sprouts magical punctuation, turning its syntax into line noise, Python seems to be sprouting multiple function-like things. We have * functions * unbound methods * bound methods * generator functions * iterators (currently invisible via syntax, but created by calling a generator function?) * instances magically callable via __call__ and now this new (rather limited) syntax for creating iterators. I am beginning to find it all a bit confusing and unsettling. Skip
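The distinction Skip is circling can be shown in two lines: the generator function is the callable, and calling it produces a generator-iterator, which is iterated rather than called:

```python
def gen_function():
    yield 1
    yield 2

gen_iter = gen_function()          # a generator-iterator, not a function

func_callable = callable(gen_function)   # the function gets called
iter_callable = callable(gen_iter)       # the iterator gets iterated
values = list(gen_iter)
```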

At 03:20 PM 10/17/03 -0500, Skip Montanaro wrote:
The last item on the list encompasses at least the first three. But you also left out __init__ and __new__, which are really ClassType.__call__ or type.__call__, though. :) To me (and the interpreter, actually), there's just tp_call, tp_iter, and tp_iternext (or whatever their actual names are). Callability, iterability, and iterator-next. Many kinds of objects may have these aspects, just as many kinds of objects may be addable with '+'. Of the things you mention, however, most don't actually have different syntax for creating them, and some are even the same object type (e.g. unbound and bound methods). And the syntax for *using* them is always uniform: () always calls an object, for ... in ... creates an iterator from an iterable, .next() goes to the next item.
and now this new (rather limited) syntax for creating iterators.
Actually, as now being discussed, list comprehensions would be a special case of an iterator expression.
I am beginning to find it all a bit confusing and unsettling.
Ironically, with iterator comprehension in place, a list comprehension would now look like a list containing an iterator, which I agree might be confusing. Too bad we didn't do iterator comps first, or list(itercomp) would be the idiomatic way to make a listcomp. That's really the only confusing bit I see about itercomps... that you have to be careful where you put your parentheses, in order to make your intentions clear in some contexts. However, that's true for many kinds of expressions even now.

In article <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com>, "Phillip J. Eby" <pje@telecommunity.com> wrote:
Along with that confusion, (x*x for x in S) would look like a tuple comprehension, rather than a bare iterator. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

Along with that confusion, (x*x for x in S) would look like a tuple comprehension, rather than a bare iterator.
Well, () is already heavily overloaded, so I can live with that. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 17 October 2003 10:38 pm, Phillip J. Eby wrote: ...
Yes. But don't mind me, I'm still sad that we have range and xrange when iter(a:b) and list(a:b:c) would be SUCH good replacements for them if slicing-notation was accepted elsewhere than in indexing, or iter[a:b] and list[a:b:c] if some people didn't so strenuously object to certain perfectly harmless uses of indexing...;-)
Yes. But since iterator comprehensions are being designed from scratch I think we can MANDATE parentheses around them, and a 'yield' right after the open parenthesis for good measure, to ensure they are not ambiguous to human readers as well as to parsers. Alex

This has been proposed more than once (I think the last time by Paul Dubois, who wanted x:y:z to be a general expression), and has a certain elegance, but is probably too terse. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 17 October 2003 11:50 pm, Guido van Rossum wrote:
Perhaps mandatory parentheses around it (as sole argument in a function call, say) might make it un-terse enough for acceptance...? The frequency of counted loops IS such that replacing for x in range(9): ... with for x in (0:9): ... WOULD pay for itself soon in reduced wear and tear on keyboards...;-) [Using iter(0:9) instead would be only "conceptually neat", no typing advantage over range -- conceded]. Alex

On 17-okt-03, at 22:20, Skip Montanaro wrote:
And you even forgot lambda:-) I agree with Skip here: there's all this magic that has crept into Python since 2.0 (approximately) that really hampers readability for novices. And here I mean novices in the broad sense of the word, i.e. including myself (novice to the new concepts). Some of these look like old concepts but are really something completely different (generators versus functions), some are really little more than keystroke savers (list comprehensions). -- Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman

On Friday 17 October 2003 10:20 pm, Skip Montanaro wrote: ...
The usual problems: having to use several separate statements, and name something that you are only interested in using once, is conceptually a bit cumbersome when you could use a clear inline expression "right where you need it" for the same purpose. Moreover, it seems a bit strange to be able to use the well-liked comprehension syntax only at the price of storing all intermediate steps in memory -- and have to zoom up to several separate statements + a name if you'd rather avoid the memory overhead, e.g.:

    sum( [x+x*x for x in short_sequence if x > 0] )

is all right, BUT if the sequence becomes too long then:

    def gottagiveitaname():
        for x in long_sequence:
            if x > 0:
                yield x + x*x
    sum( gottagiveitaname() )

That much being said, I entirely agree that the proposal is absolutely NOT crucial to Python -- it will not enormously expand its power nor its range of applicability. I don't think it's SO terribly complicated as to require application of such extremely high standards, though. But if the consensus is that ONLY lists are important enough to deserve the beauty of comprehensions, and EVERY other case must either pay the memory price of a list or the conceptual one of defining and then invoking a one-use-only generator, so be it, I guess.
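In the syntax the PEP ended up with, Alex's gottagiveitaname collapses back into a single expression (`long_sequence` is just a stand-in iterable for illustration):

```python
long_sequence = range(-5, 5)   # stand-in for a sequence too big to copy
total = sum(x + x*x for x in long_sequence if x > 0)
```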
Every one of these was in Python when I first met it, except generators -- and iterators, which are NOT function-like in the least, nor "invisible" (often, an iterator is an instance of an explicitly coded class or type with a next() method). You seem to have forgotten lambda, though -- and classes/types (all callable -- arguably via __call__ in some sense, but you could say just the same of functions &c). Which ALSO were in Python when I first met it. So, I see no "sprouting" -- Python has "always" (from my POV) had a wide variety of callables.
and now this new (rather limited) syntax for creating iterators.
...which isn't function-like either, neither in syntax nor in semantics. Yes, it's limited -- basically to the same cases as list comprehensions, except that (being an iterator and not a list) there is no necessary implication of finiteness.
I am beginning to find it all a bit confusing and unsettling.
I hear you, and I worry about this general effect on you, but I do not seem to be able to understand the real reasons. Any such generalized objection from an experienced Pythonista like you is well worthy of making everybody sit up and care, it seems to me. But exactly because of that, it might help if you were able to articulate your unease more precisely. Python MAY well have accumulated a few too many things in its long, glorious history -- because (and for good reason!) we keep the old cruft around for backwards compatibility, any change means (alas) growth. Guido is on record as declaring that release 3.0 will be about simplification: removing some of the cruft, taking advantage of the 2->3 bump in release number to break a little bit (not TOO much) backwards compatibility. Is Python so large today that we can't afford another release, 2.4, with _some_ kind of additions to the language proper, without confusing and unsettling long-time, experienced, highly skilled Pythonistas like you? Despite the admirable _stationarity_ of the language proper throughout the 2.2 and 2.3 eras...? If something like that is your underlying feeling, it may be well worth articulating -- and perhaps we need to sit back and listen and take stock (hey, I'd get to NOT have to write another edition of the Nutshell for a while -- maybe I should side strongly with this thesis!-). If it's something else, more specific to this set of proposals for accumulators / comprehensions, then maybe there's some _area_ in which any change is particularly unwelcome? But I can't guess with any accuracy... Alex

On Friday 17 October 2003 07:15 pm, Guido van Rossum wrote:
Let it rest in peace, then.
Let's look for an in-line generator notation instead. I like
sum((yield x for x in S))
So do I, _with_ the mandatory extra parentheses and all, and in fact I think it might be even clearer with the extra colon that Phil had mentioned, i.e. sum((yield: x for x in S))
but perhaps we can make this work:
sum(x for x in S)
Perhaps the parser can be coerced to make this work, but the mandatory parentheses, the yield keyword, and possibly the colon, too, may all help, it seems to me, in making this syntax stand out more. Yes, some uses may "read" more naturally with as few extras as feasible, notably [examples that might be better done with list comprehensions except for _looks_...]: even_digits = Set(x for x in range(0, 10) if x%2==0) versus even_digits = Set((yield: x for x in range(0, 10) if x%2==0)) but that may be because the former notation leads back to the "set comprehensions" that list comprehensions were originally derived from. I don't think it's that clear in other cases which have nothing to do with sets, such as, e.g., Peter Norvig's original examples of "accumulator displays". And as soon as you consider the notation being used in any situation EXCEPT as the ONLY argument in a call...: foo(x, y for y in glab for x in blag) yes, I know this passes ONE x and one iterator, because to pass one iterator of pairs one would have to write foo((x, y) for y in glab for x in blag) but the distinction between the two seems quite error-prone to me. BTW, semantically, it WOULD be OK for these iterator comprehensions to NOT "leak" their control variables to the surrounding scope, right...? I do consider the fact that list comprehensions "leak" that way a misfeature, and keep waiting for some fanatic of assignment-as-expression to use it IN EARNEST, e.g., to code his or her desired "while c=beep(): boop(c)", use while [c for c in [beep()] if c]: boop(c) ...:-). Anyway, back to the subject, those calls to foo seem very error-prone, while: foo(x, (yield: y for y in glab for x in blag)) (mandatory extra parentheses, 'yield', and colon) seems far less likely to cause any such error. Alex
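Alex's semantic wish is what released Python does: the induction variable of a generator expression does not leak into the surrounding scope (and since Python 3, list comprehensions hide theirs too).

```python
g = (tmp for tmp in range(3))
try:
    tmp                    # the induction variable is not visible out here
    leaked = True
except NameError:
    leaked = False
consumed = list(g)
```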

[Guido]
Let's look for an in-line generator notation instead. I like
sum((yield x for x in S))
[Alex]
Hm. I'm not sure that it *should* stand out more. The version with the yield keyword and the colon draws undue attention to the mechanism. I bet that if you showed sum(x for x in range(10)) to a newbie they'd have no problem understanding it (their biggest problem would be that range(10) is [0, 1, ..., 9] rather than [1, 2, ..., 10]) but if you showed them sum((yield: x for x in S)) they would probably scratch their heads. I also note that if it wasn't for list comprehensions, the form <expr> for <vars> in <expr> poses absolutely no problems to the parser, since it's just a ternary operator (though the same is true for the infamous <expr> if <test> else <expr> :-). List comprehensions make this a bit difficult because they use the same form in a specific context for something different; at the very best this would mean that [x for x in S] and [(x for x in S)] are completely different beasts: the first would be equivalent to list(S) while the second would be equivalent to [iter(S)] i.e. a list whose only element is an iterator over S (not a very useful thing to have, except perhaps if you had a function taking a list of iterators as an argument).
Let's go over the examples from http://www.norvig.com/pyacc.html :

    [Sum: x*x for x in numbers]
    sum(x*x for x in numbers)

    [Product: Prob_spam(word) for word in email_msg]
    product(Prob_spam(word) for word in email_msg)

    [Min: temp(hour) for hour in range(24)]
    min(temp(hour) for hour in range(24))

    [Mean: f(x) for x in data]
    mean(f(x) for x in data)

    [Median: f(x) for x in data]
    median(f(x) for x in data)

    [Mode: f(x) for x in data]
    mode(f(x) for x in data)

So far, these can all be written as simple functions that take an iterable argument, and they look as good with an iterator comprehension as with a list argument.

    [SortBy: abs(x) for x in (-2, -4, 3, 1)]

This one is a little less obvious, because it requires the feature from Norvig's PEP that if add() takes a second argument, the unadorned loop control variable is passed in that position. It could be done with this:

    sortby((abs(x), x) for x in (-2, -4, 3, 1))

but I think that Raymond's code in CVS is just as good. :-) Norvig's Top poses no problem:

    top(humor(joke) for joke in jokes)

In conclusion, I think this syntax is pretty cool. (It will probably die the same death as the ternary expression though.)
And as soon as you consider the notation being used in any situation EXCEPT as the ONLY argument in a call...:
Who said that? I fully intended it to be an expression, acceptable everywhere, though possibly requiring parentheses to avoid ambiguities (in list comprehensions) or excessive ugliness (e.g. to the right of 'in' or 'yield').
It would require extra parentheses here: foo(x, (y for y in glab for x in blag))
Yes. (I think list comprehensions shouldn't do this either; it's just a pain to introduce a new scope; maybe such control variables should simply be renamed to "impossible" names like the names used for the anonymous first argument to f below: def f((a, b), c): ...
Yuck. Fortunately that would be quite slow, and the same fanatics usually don't like that. :-)
I could live with the extra parentheses. Then we get: (x for x in S) # iter(S) [x for x in S] # list(S) --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 17 October 2003 11:45 pm, Guido van Rossum wrote: ...
In conclusion, I think this syntax is pretty cool. (It will probably die the same death as the ternary expression though.)
Ah well -- in this case I guess I won't go to the bother of deciding whether I like your preferred "lighter" syntax or the "stands our more" one. The sad, long, lingering death of the ternary expression was too painful to repeat -- let's put this one out of its misery sooner. Alex

Alex Martelli <aleaxit@yahoo.com> writes:
The saddest thing about the ternary operator saga (and it may be the fate of this as well) was that the people who wanted the *semantics* destroyed their own case by arguing over *syntax*. I suspect that the only way out of this would be for someone to have just implemented it, with whatever syntax they preferred. Then it either goes in or not, with Guido's final veto applying, as always. Possibly the same is the case here. Unless someone implements iterator comprehensions, with whatever syntax they feel happiest with, arguments about syntax are sterile, and merely serve to fragment the discussion, obscuring the more fundamental question of whether the semantics is wanted or not. Paul -- This signature intentionally left blank

I don't see it that way. There were simply too many people who didn't want it in *any* form (and even if they weren't a strict majority, there were certainly too many to ignore).
It was implemented (several times). That wasn't the point at all.
Not true. There are only two major syntax variations contending (with or without yield) and some quibble about parentheses, and everybody here seems to agree that either version could work. The real issue is whether it adds enough to make it worthwhile to change the language (again). My current opinion is that it isn't: for small datasets, the extra cost of materializing the list using a list comprehension is negligible, so there's no need for a new feature, and if you need to support truly large datasets, you can afford the three extra lines of code it takes to make a custom iterator or generator. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido on iterator comprehensions:
Maybe it's time to get back to what started all this, which was a desire for an accumulation syntax. (Actually it was a proposal to abuse a proposed accumulation syntax to get sorting, if I remember correctly, but let's ignore that detail for now...) Most of us seem to agree that having list comprehensions available as a replacement for map() and filter() is a good thing. But what about reduce()? Are there equally strong reasons for wanting an alternative to that, too? If not, why not? And if we do, maybe a general iterator comprehension syntax isn't the best way to go. It seemed that way at first, but that seems to have led us into a bit of a quagmire. So, taking the original accumulator display idea, and incorporating some of the ideas that have come up along the way, such as getting rid of the square brackets, how about sum of x*x for x in xvalues average of g for g in grades maximum of f(x, y) for x in xrange for y in yrange top(10) of humour(joke) for joke in comedy etc.? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

On Monday 20 October 2003 06:44 am, Greg Ewing wrote: ...
Wow. I'm speechless. [later, having recovered speech] IF (big if) we could pull THAT off, it WOULD be well worth making 'of' a keyword (and thus requiring a "from __future__ import"). It's SO beautiful, SO pythonic, the only risk I can see is that we'd have newbie people coding: sum of the_values rather than: sum(the_values) or: sum of x for x in the_values We could (and hopefully will) quibble about the corresponding semantics (particularly for the top(10) example, implicitly requiring some "underlying sequence" to be made available while all other uses require no such black magic). But this is the first proposed new syntax I've seen in a long time -- not just on this thread -- that is SO pretty it makes me want it in the language FOR ITSELF -- to reinforce the "Python is executable pseudocode" idea!!! -- rather than just as a means to the end of having the underlying semantics available. I can but hope others share my fascination with it... in any case, whatever happens to it, *BRAVO*, Greg!!! Alex

Alex Martelli strung bits together to say:
Except, if it was defined such that you wrote:

    sum of [x*x for x in the_values]

then:

    sum of the_values

would actually be a valid expression, and Greg's examples would become:

    sum of xvalues
    average of grades
    maximum of [f(x, y) for x in xrange for y in yrange]
    top(10) of [humour(joke) for joke in comedy]

Either way, that's some seriously pretty executable pseudocode he has happening! And a magic method "__of__" that takes a list as an argument might be enough to do the trick, too. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."

On Monday 20 October 2003 03:22 pm, Nick Coghlan wrote: ...
Yes, you COULD extend the syntax from Greg's NAME 'of' listmaker to _also_ accept NAME 'of' test or thereabouts (in the terms of dist/src/Grammar/Grammar of course); I don't think it would have any ambiguity. As to whether it's worth it, I dunno.
sum of xvalues
Nope, he's summing the _squares_ -- sum of x*x for x in xvalues it says.
average of grades
Yes, this one would then work.
maximum of [f(x, y) for x in xrange for y in yrange]
Yes, you could put brackets there, but why?
top(10) of [humour(joke) for joke in comedy]
Ditto -- and it doesn't do the job unless the magic becomes even blacker. top(N) is supposed to return jokes, not their humour values; so it needs to get an iterable or iterator of (humour(joke), joke) PAIRS -- I think it would DEFINITELY be better to have this spelled out, and in fact I'd prefer:

    top(10, key=humour) of comedy

or

    top(10, key=humour) of joke for joke in comedy

using the same neat syntax "key=<callable>" just sprouted by lists' sort method.
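[Editor's aside: Alex's top(10, key=humour) idea corresponds closely to what later landed in the standard library as heapq.nlargest. A minimal sketch, where humour() and comedy are hypothetical stand-ins for the thread's example names:]

```python
import heapq

# Hypothetical stand-ins for the thread's example names.
def humour(joke):
    return len(joke)  # toy scoring: longer joke, funnier

comedy = ["knock knock", "a horse walks into a bar", "ni"]

# Returns the jokes themselves (not their humour values), which is
# exactly the behaviour Alex argues top(N, key=...) should have.
best = heapq.nlargest(2, comedy, key=humour)
```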
Agreed on the prettiness. I would prefer to have the special method be defined to receive "an iterator or iterable" -- so we can maybe put together a prototype where we just make and pass it a list, BUT keep the door open to passing it an "iterator comprehension" in the future. Or maybe make it always an iterator (in the prototype we can just build the list and call iter on it anyway... so it's not any harder to get started playing with it). Oh BTW, joining another still-current thread -- for x in sorted_copy of mylist: ... now doesn't THAT read just wonderfully, too...?-) Alex

Alex Martelli strung bits together to say:
Actually, I was suggesting that if 'of' is simply designated as taking a list* on the right hand side, then you can just write a list comprehension there, without needing the parser to understand the 'for' syntax in that case. But I don't know enough about the parser to really know if that would be a saving worth making. (* a list is what I was thinking, but as you point out, an iterable would be better)
D'oh - and I got that one right higher up, too. Ah, well.
maximum of [f(x, y) for x in xrange for y in yrange]
Yes, you could put brackets there, but why?
I thought it would be easier on the parser (only accepting a list/iterable on the right hand side). I don't know if that's actually true, though.
Yes, that would make it a lot clearer what was going on.
Well, I think we've established that at least two people on the planet love this idea. . . and agreed on the iterator/iterable vs lists, too. I only thought of that distinction after I'd already hit send :)
Not to mention: for x in sorted_copy of reversed_copy of my_list: ... for x in sorted_copy(key=len) of my_list: ... Indeed, _that_ is a solution that looks truly Pythonic! Hmm, just had a strange thought: y = copy of x How would that be for executable pseudocode? It's entirely possible to do all the iterator related things without having this last example work. But what if it did? Cheers, Nick. __of__: just a single-argument function call? -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."

On Monday 20 October 2003 04:37 pm, Nick Coghlan wrote: ...
Well, I think we've established that at least two people on the planet love
Right, hopefully 3 with Greg (though it's not unheard of for posters to this list to change their minds about their own proposals). So I told myself I should stay out of the thread to let others voice their opinion, BUT...:
for x in sorted_copy of reversed_copy of my_list:
Ooops -- sorting a reversed copy of my_list is just like sorting my_list... I think for x in sorted_copy(reverse=True) of my_list: ... (again borrowing brand-new keyword syntax from lists' sort method) is likely to work better...:-)
Awesomely pseudocoder (what a comparative...!-) wrt the current "y = copy.copy(x)". You WOULD need to "from copy import copy" first, presumably, but still...
all the iterator related things without having this last example work. But what if it did?
Then the special method would have to be passed the right-hand operand verbatim, NOT an iterator on it, for the "NAME 'of' test" case; otherwise, this would be a terrible "attractive nuisance" in such cases as x = copy of my_dict (if the hypothetical special method was passed iter(my_dict), it would only get the KEYS -- shudder -- so x would presumably end up as a list -- a trap for the unwary, and one I wouldn't want to have to explain to newbies!-). However, if I had to choose, I would forego this VERY attractive syntax sugar, and go for Greg's original suggestion -- 'of' for iterator comprehensions only. Syntax sugar is all very well (at least in this case), but if it _only_ amounts to a much neater-looking way of doing what is already quite possible, it's a "more-than-one-way-to-do-itis". [Just to make sure I argue both sides: introducing "if key in mydict:" as a better way to express "if mydict.has_key(key):" was a HUGE win, and so was letting "if needle in haystack:" be used as a better way to express "haystack.find(needle) >= 0" for substring checks -- so, 'mere' syntax sugar DOES sometimes make an important difference...] Alex

Alex Martelli strung bits together to say:
(slightly OT for this thread, but. . .) I got the impression that: l.sort(reverse=True) was stable w.r.t. items that sort equivalently, while: l.reverse() l.sort() was not. I.e. the "reverse" in the sort arguments refers to reversing the order of the arguments to the comparison operation, rather than to reversing the list.
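[Editor's aside: Nick's stability observation can be checked directly in a current interpreter -- under reverse=True, items with equal keys keep their original order, while reversing first and then sorting does not preserve it:]

```python
data = [("a", 1), ("b", 1), ("c", 2)]

# Stable descending sort: the tied items "a" and "b" keep their order.
by_key_desc = sorted(data, key=lambda t: t[1], reverse=True)

# Reversing the list first, then sorting ascending, swaps the tie.
rev = list(reversed(data))
rev.sort(key=lambda t: t[1])
```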
Yes - quite pretty, but ultimately confusing, I think (as a few people have pointed out). However, getting back to Greg's original point - that our goal is to find a syntax that does for "reduce" what list comprehensions did for "map" and "filter" - I realised last night that this "of" syntax isn't it. The "of" syntax requires us to have an existing special operator to handle the accumulation (e.g. sum or max), whereas what reduce does is let us take an existing binary function (e.g. operator.add), and feed it a sequence element-by-element, accumulating the result. If we already have a method that can extract the result we want from a sequence, then list comprehensions and method calls are perfectly adequate. (starts thinking about this from the basics of what the goal is) So what we're really talking about is syntactic sugar for:

    y = 0
    for x in xvalues:
        if (x > 0):
            y = y + (x*x)

We want to be able to specify the object to iterate over, the condition for which elements to consider (filter), an arbitrary function involving the element (map), and the method we want to use to accumulate the elements (reduce). If we had a list comprehension:

    squares_of_positives = [x*x for x in xvalues if x > 0]

the original unrolled form would have been:

    squares_of_positives = []
    for x in xvalues:
        if (x > 0):
            squares_of_positives.append(x*x)

So list comprehensions just fix the accumulation method (appending to the result list). So what we need is a way to describe how to accumulate the result, as well as the ability to initialise the cumulative result:

    y = y + x*x from y = 0 for x in xvalues if x > 0

Yuck. Looks like an assignment, but is actually an accumulation expression. Ah, how about:

    y + x*x from y = 0 for x in xvalues if x > 0

The 'from' clause identifies the accumulation variable, in just the same way the 'for' clause identifies the name of the current value from the iterable. Cheers, Nick.
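[Editor's aside: the filter/map/reduce decomposition Nick describes can be spelled with the reduce function he mentions (a builtin at the time, functools.reduce today); xvalues here is hypothetical sample data:]

```python
from functools import reduce

xvalues = [3, -1, 4, -1, 5]  # hypothetical sample data

# filter (x > 0), map (x*x) and reduce (addition, starting from 0),
# matching the unrolled accumulation loop in the post:
y = reduce(lambda acc, x: acc + x * x,
           (x for x in xvalues if x > 0),
           0)
```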

Except, if it was defined such that you wrote: sum of [x*x for x in the_values]
I don't think that would be a good idea, because the square brackets make it look less efficient than it really is, and leave you wondering why you shouldn't just write a function call with a listcomp as argument instead.

Greg Ewing wrote:
I've thought about this, and I don't think I like it. "of" just seems like a new and confusingly different way to spell a function call. E.g., if I read this max([f(x,y) for x in xrange for y in yrange]) out-loud, I'd say: "the maximum of f of x and y for x in xrange, and y in yrange" So perhaps that third example should be spelt: maximum of f of x, y for x in xrange for y in yrange <wink>. This particularly struck me when I read Alex's comment:
Actually, that strikes me as an odd way of spelling: for x in sorted_copy(mylist): ... I think the lazy iteration syntax approach was probably a better idea. I don't like the proposed use of "yield" to signify it, though -- "yield" is a flow control statement, so the examples using it in this thread look odd to me. Perhaps it would be best to simply use the keyword "lazy" -- after all, that's the key distinguishing feature. I think my preferred syntax would be:

    sum([lazy x*x for x in sequence])

But use of parens instead of brackets, and/or a colon to make the keyword stand out (and look reminiscent of a lambda! which *is* a related concept, in a way -- it also defers evaluation), e.g.:

    sum((lazy: x*x for x in sequence))

would be fine with me as well. -Andrew.

"Andrew Bennetts" <andrew-pythondev@puzzling.org> wrote in message ...
Same here.
I like this the best of suggestions so far. Easy to understand, easy to teach: [lazy ...] = iter([...]) but produced more efficiently
I prefer sticking with [...] for 'make a (possibly virtual) list'. Having removed ':' when abbreviating

    _ = []
    for i in seq:
        _.append(expr)

as an expression, it seems odd to bring it back for a special case. I wish ':' could have also been removed from the lambda abbreviation of def. Terry J. Reedy

-1. An iterator is not a lazy list. A lazy list would support indexing, slicing, etc. while calculating its items on demand. An iterator is inherently sequential and single-use -- a different concept. But maybe some other keyword could be added to ease any syntactic problems, such as "all" or "every":

    sum(all x*x for x in xlist)
    sum(every x*x for x in xlist)

The presence of the extra keyword would then distinguish an iterator comprehension from the innards of a list comprehension.

On Tuesday 21 October 2003 09:27 am, Greg Ewing wrote: ...
Heh, you ARE a volcano of cool syntactic ideas these days, Greg. As between them, to me 'all' sort of connotes 'all at once' while 'every' connotes 'one by one' (so would a third possibility, 'each'); so 'all' is the one I like least. Besides accumulators &c we should also think of normal loops:

    for a in all x*x for x in xlist: ...
    for a in every x*x for x in xlist: ...
    for a in each x*x for x in xlist: ...

Of these three, 'every' looks best to me, personally. Alex

I'd rather reserve these keywords for conditions using quantifiers, like in ABC. --Guido van Rossum (home page: http://www.python.org/~guido/)

Terry> "Andrew Bennetts" <andrew-pythondev@puzzling.org> wrote in message >> ... I think the lazy iteration syntax approach was probably a better >> idea. I don't like the proposed use of "yield" to signify it, though >> -- >> "yield" is a flow control statement, so the examples using it in this >> thread look odd to me. Terry> Same here. And probably contributed to my initial confusion about what the proposed construct was supposed to do. (I'm still not keen on it, but at least I understand it better.) Skip

If anything, the desire there is *more* pressing. Except for operator.add, expressions involving reduce() are notoriously hard to understand (except to experienced APL or Scheme hackers :-). Things like sum, max, average etc. are expressed very elegantly with iterator comprehensions. I think the question is more one of frequency of use. List comps have nothing over e.g.

    result = []
    for x in S:
        result.append(x**2)

except compactness of expression. How frequent is

    result = 0.0
    for x in S:
        result += x**2

??? (I've already said my -1 about your 'sum of ...' proposal.) --Guido van Rossum (home page: http://www.python.org/~guido/)
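[Editor's aside: the accumulation loop Guido asks about is precisely what sum() plus the eventual generator-expression syntax expresses; a sketch with hypothetical sample data:]

```python
S = [1.5, 2.0, 2.5]  # hypothetical sample data

# With a list comprehension: materializes the intermediate list.
result_list = sum([x**2 for x in S])

# With a generator expression: no intermediate list is built.
result_gen = sum(x**2 for x in S)
```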

Guido van Rossum strung bits together to say:
Just so this suggestion doesn't get buried in the part of the thread where I was getting rather carried away about Greg's 'of' syntax (sorry!). What about:

    result + x**2 from result = 0.0 for x in S

Essentially short for:

    result = 0.0
    for x in S:
        result = result + x**2

Cheers, Nick.

On Tuesday 21 October 2003 03:41 pm, Nick Coghlan wrote: ---
Not bad, but I'm not sure I like the strict limitation to "A = A + f(x)" forms (possibly with some other operator in lieu of + etc, of course). Say I want to make a sets.Set out of the iterator, for example:

    result.union([ x**2 ]) from result = sets.Set() for x in theiter

now that's deucedly _inefficient_, consarn it!, because it maps to a loop of:

    result = result.union([ x**2 ])

so I may be tempted to try, instead:

    real_result = sets.Set()
    real_result.union_update([ x**2 ]) from fake_result = None for x in theiter

and hoping the N silly rebindings of fake_result to None cost me less than not having to materialize a list from theiter would cost if I did

    real_result = sets.Set([ x**2 for x in theiter ])

I don't think we should encourage that sort of thing with the "implicit assignment" in accumulation. So, if it's an accumulation syntax we're going for, I'd much rather find ways to express whether we want [a] no assignment at all (as e.g for union_update), [b] plain assignment, [c] augmented assignment such as += or whatever. Sorry, no good idea comes to my mind now, but I _do_ think we'd want all three possibilities... Alex
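[Editor's aside: Alex's efficiency worry is concrete -- rebinding via union() copies the whole set on every step, while the in-place update (sets.Set.union_update then, set.update on today's builtin set) does not. A minimal sketch:]

```python
theiter = range(5)  # hypothetical iterable standing in for the thread's

# Quadratic: every step builds a brand-new set, as in the
# "result = result.union([x**2])" expansion Alex objects to.
slow = set()
for x in theiter:
    slow = slow.union([x**2])

# Linear: mutate in place (union_update in the old sets module).
fast = set()
for x in theiter:
    fast.update([x**2])
```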

Alex Martelli strung bits together to say:
I had a similar thought about 5 minutes after turning my computer off last night. The alternative I came up with was:

    y = (from result = 0.0 do result += x**2 for x in values if x > 0)

The two extra clauses (from & do) are pretty much unavoidable if we want to be able to express both the starting point, and the method of accumulation. And hopefully those clauses would be enough to disambiguate this from the new syntax for generator expressions. The 'from' clause would allow a single plain assignment statement. It names the accumulation variable, and also gives it an initial value (if you don't want an initial value, an explicit assignment to None should suffice). The 'do' clause would allow single plain or augmented assignment statements, as well as allowing any expression. 'from' is already a keyword (thanks to 'from ... import ...') and it might be possible to avoid making 'do' a keyword (in the same way that 'as' is not a keyword despite its use in 'from ... import ... as ...') (And I'll add my vote to pointing out that generator expressions don't magically eliminate the use of the reduce function or accumulation loops any more than list comprehensions did. We still need the ability to express the starting value and the accumulation method). Cheers, Nick. P.S. I'm heading off to Canberra early tomorrow morning, so I won't be catching up on this discussion until the weekend.

I think you're aiming for the wrong thing here; I really see no reason why you'd want to avoid writing this out as a real for loop if you don't have an existing accumulator function (like sum()) to use. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum strung bits together to say:
One interesting thing is that I later realised that iterator comprehensions combined with the sum function would actually cover 90% of the accumulation functions I would ever write. So Raymond turns out to be correct when he suggests that generator expressions may limit the need for reduce functions and accumulation loops. With the sum() built in around, they will cover a large number of the reduction operations encountered in real life. Previously, sum() was not available, and even if it had been the cost of generating the entire list to be summed may have been expensive (if the values to be summed are a function of the stored values, rather than a straight sum). So while I think a concise reduction syntax was worth aiming for, I'm also willing to admit that it seems to be basically impossible to manage without violating Python's maxim of "one obvious way to do it". The combination of generator expressions and the various builtins that operate on iterables (especially sum()) is a superior solution. Still, I learned a few interesting things I didn't know last week :) Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."

The alternative I came up with was:
y = (from result = 0.0 do result += x**2 for x in values if x > 0)
As has been pointed out, this hardly gains you anything over writing it all out explicitly. It seems like nothing more than a Perlesque another-way-to-do-it. This seems to be the fate of all reduce-replacement suggestions that try to be fully general -- there are just too many degrees of freedom to be able to express it all succinctly. The only way out of this I can see (short of dropping the whole idea) is to cut out some of the degrees of freedom by restricting ourselves to targeting the most common cases. Thinking about the way this works in APL, where you can say things like

    total = + / numbers

one reason it's so compact is that the system knows what the identity is for each operator, so you don't have to specify the starting value explicitly. Another is the use of a binary operator. So if we postulate a "reducing protocol" that requires function objects to have a __div__ method that performs reduction with a suitable identity, then we can write

    total = operator.add / numbers

Does that look succinct enough?
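[Editor's aside: the "reducing protocol" Greg postulates can't be bolted onto builtin functions, but its flavour is easy to prototype with a small wrapper class; the Reducer name and its API are illustrative, not anything proposed in the thread:]

```python
import functools
import operator

class Reducer:
    """Pair a binary function with its identity; apply it with '/'."""
    def __init__(self, func, identity):
        self.func = func
        self.identity = identity

    def __truediv__(self, iterable):
        # 'add / numbers' folds the function over the iterable,
        # starting from the identity, as in Greg's APL analogy.
        return functools.reduce(self.func, iterable, self.identity)

add = Reducer(operator.add, 0)
total = add / [1, 2, 3, 4]
```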

It still suffers from my main problem with reduce(), which is not its verbosity (far from it) but that except for some special cases (mainly sum and product) I have to stand on my head to understand what it does. This is even the case for examples like

    reduce(lambda x, y: x + y.foo, seq)

which is hardly the epitome of complexity. Who here knows for sure it shouldn't rather be

    reduce(lambda x, y: x.foo + y, seq)

without going through an elaborate step-by-step execution? This is inherent in the definition of reduce, and no / notation makes it go away for me. --Guido van Rossum (home page: http://www.python.org/~guido/)
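[Editor's aside: the ambiguity Guido describes is real -- in reduce(), the function's first argument is always the accumulator and the second the next element. A small check, where Item is a hypothetical class with a .foo attribute:]

```python
from functools import reduce

class Item:
    def __init__(self, foo):
        self.foo = foo

seq = [Item(1), Item(2), Item(3)]

# With an explicit initializer the accumulator x is a plain int, so
# x + y.foo is the correct spelling; x.foo + y would fail on the very
# first step, since x starts out as the initializer 0.
total = reduce(lambda x, y: x + y.foo, seq, 0)
```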

It occurs to me that, with generator expressions, such cases could be rewritten as

    reduce(lambda x, y: x + y, (z.foo for z in seq))

i.e. any part of the computation that only depends on the right argument can be factored out into the generator. So I might have to take back some of what I said earlier about generator comprehensions being independent of reduce. But if I understand you correctly, what you're saying is that the interesting cases are the ones where there isn't a ready-made binary function that does what you want, in which case you're going to have to spell everything out explicitly anyway one way or another. In that case, the most you could gain from a reduce syntax would be that it's an expression rather than a sequence of statements. But the same could be said of list comprehensions -- and *was* said quite loudly by many people in the early days, if I recall correctly. What's the point, people asked, when writing out a set of nested loops is just about as easy? Somehow we came to the conclusion that being able to write a list comprehension as an expression was a valuable thing to have, even if it wasn't significantly shorter or clearer. What about reductions? Do we feel differently? If so, why?

[Guido]
[Greg]
(And then spelling it out so that it works with reduce() reduces clarity.)
Some people still hate LC's for this reason.
IMO LC's *are* significantly clearer because the notation lets you focus on what goes into the list (e.g. the expression "x**2") and under what conditions (e.g. the condition "x%2 == 1") rather than how you get it there (i.e. the initializer "result = []" and the call "result.append(...)"). This is an incredibly common idiom in the use of loops; for experienced programmers the boilerplate disappears when they read the code, but for less experienced readers it takes more time to recognize the idiom. I think this is at least in part due to the fact that there are more details that can be written differently, e.g. the name of the result variable, and exactly at which point it is initialized. I think that for reductions the gains are less clear. The initializer for the result variable and the call that updates it are no longer boilerplate, because they vary for each use; plus the name of the result variable should be chosen carefully because it indicates what kind of result it is (e.g. a sum or product). So, leaving out the condition for now, the pattern or idiom is:

    <result> = <initializer>
    for <variable> in <iterable>:
        <result> = <expression>

(Where <expression> uses <result> and <variable>.) If we think of this as a template with parameters, there are five parameters! (A LC without a condition only has 3: <expression>, <variable> and <iterable>.) No matter how hard you try, a macro with 5 parameters will have a hard time conveying the meaning of each without being at least as verbose as the full template. We could reduce the number of template parameters to 4 by leaving <result> anonymous; we could then refer to it by e.g. "_" in <expression>, which is more concise and perhaps acceptable, but makes certain uses more strained (e.g. mean() below). 
Just for fun, let me try to propose a macro syntax:

    reduction(<initializer>, <expression>, <variable>, <iterable>)

(I think it's better to have <initializer> as the first parameter, but you can ) For example:

    reduction(0, _+x**2, x, S)

Lavishly sprinkle syntactic sugar, and perhaps it can become this ('reduction' would have to be a reserved word):

    reduction(0, _+x**2 for x in S)

A few more examples using this notation:

    # product(S), if Raymond's product() builtin is accepted
    reduction(1, _*x for x in S)

    # mean of f(x); uses result tuple and needs result postprocessing
    total, n = reduction((0, 0), (_[0]+f(x), _[1]+1) for x in S)
    mean = total/n

    # horner(S, x): evaluate a polynomial over x: [6, 3, 4] => 6*x**2 + 3*x + 4
    reduction(0, _*x + c for c in S)

In each of these cases I have the same gut response as to writing these using reduce(): the notation is too "concentrated", I have to think so hard before I understand what it does that I wouldn't mind having it spread over three lines. Compare the above four examples to:

    sum = 0
    for x in S:
        sum += x**2

    product = 1
    for x in S:
        product *= x

    total, n = 0, 0
    for x in S:
        total += f(x)
        n += 1
    mean = total/n

    horner = 0
    for c in S:
        horner = horner*x + c

I find that these cause much less strain on the eyes. (BTW the horner example shows that insisting on augmented assignment would reduce the power.) Concluding, I think the reduce() pattern is doomed -- the template is too complex to capture in special syntax. --Guido van Rossum (home page: http://www.python.org/~guido/)
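[Editor's aside: for comparison, the Horner example is the one case above where a plain reduce() call reads fairly directly, since the whole update step fits a single binary function; S and x are the post's sample values:]

```python
from functools import reduce

S = [6, 3, 4]  # coefficients: 6*x**2 + 3*x + 4
x = 2

# Horner's rule as a left fold: ((0*x + 6)*x + 3)*x + 4
horner = reduce(lambda acc, c: acc * x + c, S, 0)

# The explicit loop from the post, for comparison:
check = 0
for c in S:
    check = check * x + c
```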

On Wed, Oct 22, 2003, Guido van Rossum wrote:
Actually, even that doesn't quite capture the expressiveness needed, because <expression> needs in some cases to be a sequence of statements and there needs to be an opportunity for a finalizer to run after the for loop (e.g. average()). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan

On Thursday 23 October 2003 07:43, Guido van Rossum wrote: ...
I concur, particularly because the assignment in the pattern sketched above is too limiting. You point out that forcing augmented assignment would lose power (e.g., Horner's polynomials need bare assignment), but the inability to use it would imply inefficiencies -- e.g.,

    flatlist = []
    for sublist in listoflists:
        flatlist += sublist

or flatlist.extend(sublist) is better than forcing a "flatlist = flatlist + sublist" as the loop body. Indeed, that's a spot where even 'sum' can be a performance trap; consider the following z.py:

    lol = [ [x] for x in range(1000) ]
    def flat1(lol=lol): return sum(lol, [])
    def flat2(lol=lol):
        result = []
        for sub in lol:
            result.extend(sub)
        return result

and the measurements:

    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
    100 loops, best of 3: 8.5e+03 usec per loop
    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
    1000 loops, best of 3: 940 usec per loop

sum looks cooler, but it can be an order of magnitude slower than the humble loop of result.extend calls. We could fix this specific performance trap by specialcasing in sum those cases where the result has a += method -- hmmm... would a patch for this performance bug be accepted for 2.3.* ...? (I understand and approve that we're keen on avoiding adding functionality in any 2.3.*, but fixed-functionality performance enhancements should be just as ok as fixes to functionality bugs, right?) Anyway, there's a zillion other cases that sum doesn't cover (well, unless we extend dict to have a += synonym for update, which might be polymorphically useful:-), such as

    totaldict = {}
    for subdict in listofdicts:
        totaldict.update(subdict)

Indeed, given the number of "modify in place and return None" methods of both built-in and user-coded types, I think the variant of "accumulation pattern" which simply calls such a method on each item of an iterator is about as prevalent as the variant with assignment "result = ..." as the loop body. Alex
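[Editor's aside: Alex's z.py measurement can be reproduced with the timeit module directly; the absolute numbers are machine-dependent, but the ranking -- and the fact that both spellings agree on the result -- is the point:]

```python
import timeit

lol = [[x] for x in range(1000)]

def flat1():
    # sum() with a list start value: re-copies the accumulator on
    # every addition, so it is quadratic in the total length.
    return sum(lol, [])

def flat2():
    # The humble extend() loop: linear.
    result = []
    for sub in lol:
        result.extend(sub)
    return result

t1 = timeit.timeit(flat1, number=20)
t2 = timeit.timeit(flat2, number=20)
```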

On Saturday 25 October 2003 11:32 am, Alex Martelli wrote: ...
Ah well -- it's the most trivial fix one can possibly think of, just changing PyNumber_Add to PyNumber_InPlaceAdd -- so the semantics are _guaranteed_ to be equal in all _sane_ cases, i.e. excepting only weird user-coded types that have an __iadd__ with a weirdly different semantic than __add__ -- and DOES make sum's CPU time drop to 490 usec in the above (making it roughly twice as fast as the loop, as it generally tends to be in typical cases of summing lots of numbers). So I went ahead and committed the tiny change on both the 2.4 and 2.3 maintenance branches (easy enough to revert if the "insane" cases must keep working in the same [not sane:-)] way in 2.3.*)... Alex

No way. There's nothing that guarantees that a+=b has the same semantics as a+b, and in fact for lists it doesn't. I wouldn't even want this for 2.4. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Saturday 25 October 2003 10:20 pm, Guido van Rossum wrote:
You mean because += is more permissive (accepts any sequence RHS while + insists the RHS be specifically a list)? I don't see how this would make it bad to use += instead of + -- if we let the user sum up a mix of (e.g.) strings and tuples, why are we hurting him? And it seemed to me that cases in which the current semantics of "a = a + b" would work correctly, while the potentially-faster "a += b" wouldn't, could be classified as "weird" and ignored in favour of avoiding "sum" be an orders-of-magnitude performance trap for such cases (see my performance measurements in other posts of mine to this thread). Still, you're the boss. Sorry -- I'll immediately revert the commits I had made and be less eager in the future.
I wouldn't even want this for 2.4.
Aye aye, cap'n. I'll revert the 2.4 commits too, then. Sorry. Alex
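[Editor's aside: the semantic gap Alex concedes here is easy to demonstrate -- for lists, + requires a list on the right-hand side, while += accepts any iterable with extend() semantics:]

```python
a = [1, 2]

# Plain + rejects a non-list right-hand side...
try:
    a + (3, 4)
    plus_accepted = True
except TypeError:
    plus_accepted = False

# ...while += happily extends from any iterable.
a += (3, 4)
```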

We specifically decided that sum() wasn't allowed for strings, because it's a quadratic algorithm. Other sequences are just as bad, we just didn't expect that to be a common case. Also see my not-so-far-fetched example of a semantic change. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 04:16 PM 10/25/03 -0700, Guido van Rossum wrote:
Maybe I'm confused, but when Alex first proposed this change, I mentally assumed that he meant he would change it so that the *first* addition would use + (in order to ensure getting a "fresh" object) and then subsequent additions would use +=. If this were the approach taken, it seems to me that there could not be any semantic change or side-effects for types that have compatible meaning for + and += (i.e. += is an in-place version of +). Maybe I'm missing something here?
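[Editor's aside: Phillip's suggestion translates into a small helper; accumulate_plus is a hypothetical name, not anything from the thread. Use + for the first addition so the start value gets a fresh object and is never mutated, then += for the rest:]

```python
def accumulate_plus(iterable, start):
    """Hypothetical sketch of the first-+, then-+= strategy."""
    it = iter(iterable)
    for first in it:
        result = start + first  # fresh object; 'start' is untouched
        break
    else:
        return start            # empty iterable: return start as-is
    for item in it:
        result += item          # in-place accumulation from here on
    return result

start = []
flattened = accumulate_plus([[1], [2], [3]], start)
```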

On Sunday 26 October 2003 04:41 pm, Phillip J. Eby wrote:
A better architecture than the initial copy.copy I was now thinking of -- thanks. But it doesn't solve Guido's objection as above shown.
Only the fact that "there's nothing that guarantees" this, as Guido says. alist = alist + x only succeeds if x is also a list, while alist += x succeeds also for tuples and other sequences, for example. Personally, I don't think this would be a problem, but it's not my decision. Alex

In the context of sum(), I think it would be nice to allow iterables to be added together: sum(['abc', range(3), ('do', 're', 'me')], []) This fits in well with the current thinking that the prohibition of adding sequences of unlike types be imposed only on operators and not on functions or methods. For instance, in sets.py, a|b requires both a and b to be Sets; however, a.union(b) allows b to be any iterable. That matches the distinction between list.__add__() and list.extend(), where the former requires a list argument and the latter does not. Raymond Hettinger
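The operator/method asymmetry discussed here can be checked directly; a quick sketch of standard list behavior (all calls below are ordinary built-in list operations):

```python
# '+' insists on matching types, while the in-place and method
# forms accept any iterable.
a = [1, 2]
try:
    a + (3, 4)          # TypeError: can only concatenate list to list
    strict_ok = True
except TypeError:
    strict_ok = False

b = [1, 2]
b += (3, 4)             # list.__iadd__ behaves like extend(): any iterable works

c = [1, 2]
c.extend("xy")          # extend() likewise takes any iterable

print(strict_ok, b, c)  # False [1, 2, 3, 4] [1, 2, 'x', 'y']
```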

You're forgiven, at some point in the past they *were* different. --Guido van Rossum (home page: http://www.python.org/~guido/)

"Guido van Rossum" <guido@python.org> wrote in message news:200310230248.h9N2meb01254@12-236-54-216.client.attbi.com...
I do and Raymond Hettinger should. Doc bug 821701 addressed this confusion. I suggested the addition of "The first (left) argument is the accumulator; the second (right) is the update value from the sequence. The accumulator starts as the initializer, if given, or as seq[0]." but don't know yet what Raymond actually did. As a mnemonic, the arg order corresponds to left associativity: ...(((a op b) op c) op d)... For clarity, the updater should be written with real arg names: lambda sum, item: sum + item.foo. Now sum.foo + item is pretty obviously wrong. I think it a mistake to make the two args of the update function look symmetric when they are not. Even if the same type, the first represents a cumulation of several values (and the last return value) while the second is just one (new) value. Terry J. Reedy

You're kidding right? --Guido van Rossum (home page: http://www.python.org/~guido/)

[GvR]
Which is why I didn't like the 'sum[x for x in S]' notation much.
[Alex]
Let it rest in peace, then.
Goodbye, weird __getitem__ hack! [GvR]
Let's look for an in-line generator notation instead. I like
sum((yield x for x in S))
[Alex]
+1 [David Eppstein, in a separate note]
Along with that confusion, (x*x for x in S) would look like a tuple comprehension, rather than a bare iterator.
Phil's idea cleans that up pretty well: (yield: x*x for x in S) This is no more tuple-like than any expression surrounded by parens. Raymond Hettinger

At 10:55 PM 10/17/03 +0200, Alex Martelli wrote:
That is positively *evil*. Good thing you didn't post it on python-list. :)
And also much uglier. Even though I originally proposed it, I like Guido's version (sans yield) much better. OTOH, I can also see where the "tuple comprehension" and other possible confusing uses seem to shoot it down. Hm. What if list comprehensions returned a "lazy list", such that if you took an iterator of it, you'd get a generator-iterator, but if you tried to use it as a list, it would populate itself? Then there'd be no need to ever *not* use a listcomp, and only one syntax would be necessary. More specifically, if all you did with the list was iterate over it, and then throw it away, it would never actually populate itself. The principal drawback to this idea from a semantic viewpoint is that listcomps can be done over expressions that have side effects. :(

I don't think this can be done without breaking b/w compatibility. Example: a = [x**2 for x in range(10)] for i in a: print i print a Your proposed semantics would throw away the values in the for loop, so what would it print in the third line? --Guido van Rossum (home page: http://www.python.org/~guido/)

At 03:47 PM 10/17/03 -0700, Guido van Rossum wrote:
I should've been more specific... some pseudocode:

    class LazyList(list):
        materialized = False

        def __init__(self, generator_func):
            self.generator = generator_func

        def __iter__(self):
            # When iterating, use the generator, unless
            # we've already computed contents.
            if self.materialized:
                return super(LazyList, self).__iter__()
            else:
                return self.generator()

        def __getitem__(self, index):
            if not self.materialized:
                self[:] = list(self.generator())
                self.materialized = True
            return super(LazyList, self).__getitem__(index)

        def __len__(self):
            if not self.materialized:
                self[:] = list(self.generator())
                self.materialized = True
            return super(LazyList, self).__len__()

        # etc.

So, the problem isn't that the code you posted would fail on 'print a', it's that the generator function would be run *twice*, which would be a no-no if it had side effects, and would also take longer. It was just a throwaway idea, in the hopes that maybe it would lead to an idea that would actually work. Ah well, maybe in Python 3.0, there'll just be itercomps, and we'll use list(itercomp) when we want a list.

On Friday 17 October 2003 11:58 pm, Phillip J. Eby wrote: ...
The big problem I see is e.g. as follows: l1 = range(6) lc = [ x for x in l1 ] for a in lc: l1.append(a) (or insert the LC inline in the for, same thing either way I'd sure hope). Today, this is perfectly well-defined, since the LC "takes a snapshot" when evaluated -- l1 becomes a 12-elements list, as if I had done l1 *= 2. But if lc _WASN'T_ "populated"... shudder... it would be as nonterminating as "for a in l1:" same loop body. Unfortunately, it seems to me that turning semantics from strict to lazy is generally unfeasible because of such worries (even if one could somehow ignore side effects). Defining semantics as lazy in the first place is fine: as e.g. "for a in iter(l1):" has always produced a nonterminating loop for that body (iter has always been lazy), people just don't use it. But once it has been defined as strict, going to lazy is probably unfeasible. Pity... Alex

Hello. Perhaps looking at some examples of what nested itercomps might look like (because they _will_ be used if they're available...) using each of the leading syntaxes would be useful in trying to decide which form, if any, is most acceptable (or least unacceptable, whichever the case may be):

    # (1) without parentheses:
    B(y) for y in A(x) for x in myIterable

    # (2) for clarity, we'll add some optional parentheses:
    B(y) for y in (A(x) for x in myIterable)

    # (3) OK. Now, with required parentheses:
    (B(y) for y in (A(x) for x in myIterable))

    # (4) And, now with the required "yield:" and parentheses:
    (yield: B(y) for y in (yield: A(x) for x in myIterable))

    # (5) And, finally, for completeness, using the rejected PEP 289 syntax:
    [yield B(y) for y in [yield A(x) for x in myIterable]]

Hope that's useful, Sean p.s. I'm only a Python user, and not a developer, so if my comments are not welcome here, please let me know, and I will refrain in future. Thanks for your time.
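For comparison, Sean's nested example in the parenthesized generator-expression form that Python eventually adopted (no yield keyword); A and B here are placeholder functions invented purely for illustration:

```python
# Hypothetical stage functions standing in for Sean's A and B.
def A(x):
    return x + 1

def B(y):
    return y * 2

my_iterable = range(3)
# Nested generator expressions: the inner one feeds the outer lazily.
nested = (B(y) for y in (A(x) for x in my_iterable))
print(list(nested))      # [2, 4, 6]
```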

Sean Ross <seandavidross@hotmail.com>:
# (1) without parentheses: B(y) for y in A(x) for x in myIterable
Er, excuse me, but that had better *not* be equivalent to
# (2) for clarity, we'll add some optional parentheses: B(y) for y in (A(x) for x in myIterable)
because the former ought to be a single iterator expression with two nested loops (albeit an erroneous one, since x is being used before it's bound). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

Guido:
but perhaps we can make this work:
sum(x for x in S)
But if "x for x in S" were a legal expression on its own, returning a generator, then [x for x in S] would have to be a 1-element list containing a generator. Unless you're suggesting that it should be a special feature of the function call syntax? That would be bizarre... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

"Alex Martelli" <aleaxit@yahoo.com> wrote in message news:200310171903.42578.aleaxit@yahoo.com...
In your commercial programming group, would you accept such a slice usage from another programmer, especially without prior agreement of the group? Or would you want to edit, as you would with 'return x<y and True or False' and might with 'return x<z and 4 or 2'? If you would reject it in practice, then it is hardly an argument for something arguably even odder. Terry J. Reedy

On Friday 17 October 2003 08:17 pm, Terry Reedy wrote: ...
Such slice usage would presumably be made necessary by the behavior of foo's type -- that indexing would cause errors with all types I've ever seen actually used, so, if it were used in production code, it would no doubt have to be because foo's type requires weird indexing. Now, I'd _like_ to say that no external component with such weird behavior would ever be tolerated in any group I work for... BUT, given that I've had to write programs using "external components" such as the win32 API's, MS Office, etc (and may well still have to use others such as OO's new Python interface -- almost makes one nostalgic for MS Office...:-), if I said that I'd be lying:-). Seriously, I can't imagine how such weird interface requirements might ever end up piled onto indexing in particular (though, come to think of it, some "indexed properties" in some COM object models... eeek... well, not QUITE that bad!-). But the point is quite another: by composing elementary and perfectly sensible language elements (slices, generalized indexing) it IS quite possible, even today, to write weird things. This is no argument for removing the perfectly sensible elements from the language, just as the possible abuses of 'and' and 'or' don't mean we should in general do without THEM.
would reject it in practice, then it is hardly an argument for something arguably even odder.
I'm happy to be using a language which supplies good elementary components and good general "composability", even though it IS possible to overuse the composition and end up with weird constructs. Personally, I don't think that allowing comprehensions in indices would be particularly odd: just another "good elementary component". So would "iterator comprehensions", as an alternative. Both of them quite usable in composition with other existing components and rules to produce weirdness, sure: but showing that weirdness is already quite possible whether new constructs are allowed or not appears to me to be a perfectly valid argument for a new construct that's liable to be used in either good or weird ways. Alex

In article <16272.6895.233187.510629@montanaro.dyndns.org>, Skip Montanaro <skip@pobox.com> wrote:
foo[ anything ] does not look like an identifier followed by a list, it looks like an indexing operation. So I would interpret foo[x*x for x in bar] to equal foo.__getitem__(i) where i is an iterator of x*x for x in bar. In particular if iter.__getitem__ works appropriately, then iter[x*x for x in bar] could be a generator comprehension and iter[1:n] could be an xrange. Similarly sum and max could be given appropriate __getitem__ methods. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science
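David's suggestion is hypothetical, but the indexing machinery it relies on is real; a minimal sketch of an object whose __getitem__ interprets a slice lazily, the way he suggests iter[1:n] could act like xrange (LazyRange is an invented name, not a real or proposed API):

```python
class LazyRange:
    """Illustrative only: turns a slice subscript into a lazy iterator."""
    def __getitem__(self, index):
        if isinstance(index, slice):
            # Return an iterator rather than a materialized list.
            return iter(range(index.start, index.stop, index.step or 1))
        raise TypeError("expected a slice")

lazy = LazyRange()
print(list(lazy[1:5]))       # [1, 2, 3, 4]
print(list(lazy[0:10:2]))    # [0, 2, 4, 6, 8]
```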

On Friday 17 October 2003 06:12 pm, Skip Montanaro wrote:
Hmmm, how is, e.g. foo[x*x for x in bar] any more an "expression bracketed by [ and ]" than, say, foo = {'wot': 'tow'} foo['wot'] ...? Yet the latter doesn't involve any lists that I can think of. Nor do I see why the former need "mean something completely different from indexing" -- it means to call foo's __getitem__ with the appropriately constructed object, just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] today calls it with a tuple of two weird slice objects (and doesn't happen to involve any lists whatsoever). Alex

>> I agree. Any expression bracketed by '[' and ']', no matter how many >> other clues to the ultimate result it might contain, ought to result >> in a list as far as I'm concerned. Alex> Hmmm, how is, e.g. Alex> foo[x*x for x in bar] Alex> any more an "expression bracketed by [ and ]" than, say, Alex> foo = {'wot': 'tow'} Alex> foo['wot'] Alex> ...? When I said "expression bracketed by '[' and ']'" I agree I was thinking of list construction sorts of things like: foo = ['wot'] not indexing sorts of things like: foo['wot'] I'm not in a mood to try and explain anything in more precise terms this morning (for other reasons, it's been a piss poor day so far) and must trust your ability to infer my meaning. I have no idea at this point how to interpret foo[x*x for x in bar] That looks like a syntax error to me. You have what is probably an identifier followed by a list comprehension. Here's a slightly more precise formulation: If a '['...']' construct exists in a context where a list constructor would be legal today, it ought to evaluate to a list, not to something else. Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... I have absolutely no idea how to interpret this. Is this existing or proposed Python syntax? Skip

On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: ...
Perfectly valid and current existing Python syntax:
Not particularly _sensible_, mind you, and I hope nobody's yet written any container that IS to be indexed by such tuples of slices of multifarious nature. But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). Alex
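To see what such "weird" indexing actually delivers, a tiny demo class whose __getitem__ just returns its key shows the tuple of two slice objects Alex describes (Show is an illustrative name, not anything from the thread):

```python
class Show:
    """Returns the raw subscript so we can inspect what indexing passes in."""
    def __getitem__(self, key):
        return key

s = Show()
# Valid Python syntax then and now: a tuple of two slice objects.
key = s['va':23:2j, {'zip': 'zop'}:45:(3, 4)]
print(type(key), key[0], key[1])
```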

Which is why I didn't like the 'sum[x for x in S]' notation much. Let's look for an in-line generator notation instead. I like sum((yield x for x in S)) but perhaps we can make this work: sum(x for x in S) (Somebody posted a whole bunch of alternatives that were mostly picking random delimiters; it didn't look like the right approach.) --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
but perhaps we can make this work:
sum(x for x in S)
Being able to use generator comprehensions as an expression would be useful. In that case, I assume the following would be possible as well:

    mygenerator = x for x in S

    for y in x for x in S:
        print y

    return x for x in S

Thanks, -Shane Holloway

[Shane Holloway]
You'd probably have to add extra parentheses around (x for x in S) to help the poor parser (and the human reader). --Guido van Rossum (home page: http://www.python.org/~guido/)

Hello, On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote:
Interesting but potentially confusing: we could expect the last one to mean that we execute 'return' repeatedly, i.e. returning a value more than once, which is not what occurs. Similarly, yield x for x in g() in a generator would be quite close to the syntax discussed some time ago to yield all the values yielded by a sub-generator g, but in your proposal it wouldn't have that meaning: it would only yield a single object, which happens to be an iterator with the same elements as g(). Even with parentheses, and assuming a syntax to yield from a sub-generator for performance reasons, the two syntaxes would be dangerously close:

    yield x for x in g()     # means: for x in g(): yield x
    yield (x for x in g())   # means: yield g()

Armin

I'm not sure what you mean by executing 'return' repeatedly; the closest thing in Python is returning a sequence, and this is pretty close (for many practical purposes, returning an iterator is just as good as returning a sequence).
IMO this is not at all similar to what it suggests for return, as executing 'yield' multiple times *is* a defined thing. This is why I'd prefer to require extra parentheses; yield (x for x in g()) is pretty clear about how many times yield is executed.
I don't see why we need

    yield x for x in g()

when we can already write

    for x in g(): yield x

This would be a clear case of "more than one way to do it". --Guido van Rossum (home page: http://www.python.org/~guido/)

Armin Rigo wrote:
Yes, this is one of the things I was trying to get at -- if gencomps are expressions, then they must be expressions everywhere, or my poor brain will explode. As for the subgenerator "unrolling", I think there has to be something added to the yield statement to accomplish it -- because it is also useful to yield a generator itself and not have it unrolled. My favorite was "yield *S" from that discussion. -Shane Holloway
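For the record, the sub-generator "unrolling" debated here eventually got its own spelling as yield from (PEP 380, Python 3.3); a sketch of the distinction Armin and Shane are drawing:

```python
def g():
    yield 1
    yield 2

def unrolled():
    # Yields each value of the sub-generator,
    # equivalent to: for x in g(): yield x
    yield from g()

def wrapped():
    # Yields a single object: the generator itself.
    yield g()

print(list(unrolled()))        # [1, 2]
print(len(list(wrapped())))    # 1
```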

At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote:
Offhand, it seems like the grammar might be rather tricky, but it actually does seem more Pythonic than the "yield" syntax, and it retroactively makes listcomps shorthand for 'list(x for x in s)'. However, if gencomps use this syntax, then what does:

    for x in y*2 for y in z if y<20:
        ...

mean? ;) It's a little clearer with parentheses, of course, so perhaps they should be required:

    for x in (y*2 for y in z if y<20):
        ...

It would be more efficient to code that stuff inline in the loop, if the gencomp creates another frame, but it *looks* more efficient to put it in the for statement. But maybe I worry too much, since you could slap a listcomp in a for loop now, and I've never even thought of doing so.

"Phillip J. Eby" <pje@telecommunity.com> writes:
At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote:
I like the look of this. In this context, it looks very natural.
It means you're trying to be too clever, and should use parentheses :-)
I'd rather not require parentheses in general. Guido's example of sum(x for x in S) looks too nice for me to want to give it up without a fight. But I'm happy to have cases where the syntax is ambiguous, or even out-and-out unparseable, without the parentheses. Whether it's possible to express this in a way that Python's grammar can deal with, I don't know. Paul. -- This signature intentionally left blank

>>> sum((yield x for x in S))
>>>
>>> but perhaps we can make this work:
>>>
>>> sum(x for x in S)

Paul> I like the look of this. In this context, it looks very natural.

How would it look if you used the optional start arg to sum()? Would either of these work?

    sum(x for x in S, start=5)
    sum(x for x in S, 5)

or would you have to parenthesize the first arg?

    sum((x for x in S), start=5)
    sum((x for x in S), 5)

Again, why parens? Why not

    sum(<x for x in S>, start=5)
    sum(<x for x in S>, 5)

or something similar? Also, sum(x for x in S) and sum([x for x in S]) look very similar. I don't think it would be obvious to the casual observer what the difference between them was or why the first form didn't raise a SyntaxError.

>> It's a little clearer with parentheses, of course, so perhaps they
>> should be required:
>>
>> for x in (y*2 for y in z if y<20):
>> ...

Paul> I'd rather not require parentheses in general.

Parens are required in certain situations within list comprehensions around tuples (probably for syntactic reasons, but perhaps to aid the reader as well) where tuples can often be defined without enclosing parens. Here's a contrived example:

    >>> [(a,b) for (a,b) in zip(range(5), range(10))]
    [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
    >>> [a,b for (a,b) in zip(range(5), range(10))]
      File "<stdin>", line 1
        [a,b for (a,b) in zip(range(5), range(10))]
             ^
    SyntaxError: invalid syntax

Paul> Guido's example of sum(x for x in S) looks too nice for me to want Paul> to give it up without a fight. But I'm happy to have cases where Paul> the syntax is ambiguous, or even out-and-out unparseable, without Paul> the parentheses. Whether it's possible to express this in a way Paul> that Python's grammar can deal with, I don't know.

I rather suspect parens would be required for tuples if they were added to the language today. I see no reason to make an exception here. Skip

In article <16272.22369.546606.870697@montanaro.dyndns.org>, Skip Montanaro <skip@pobox.com> wrote:
This one has bitten me several times. When it does, I discover the error quickly due to the syntax error, but it would be bad if this became valid syntax and returned a list [a,X] where X is an iterator. I don't think you could count on this getting caught by a being unbound, because often the variables in list comprehensions can be single letters that shadow previous bindings. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science

Generally, when we talk about something "biting", we mean something that *doesn't* give a syntax error, but silently does something quite different than what you'd naively expect. This was made a syntax error specifically because of this ambiguity.
No, [a,X] would be a syntax error if X was an iterator comprehension. --Guido van Rossum (home page: http://www.python.org/~guido/)

Because the parser doesn't know whether the > after S is the end of the <...> brackets or a binary > operator. (Others can answer your other questions.) --Guido van Rossum (home page: http://www.python.org/~guido/)

>> But, indexing does stretch quite far in the current Python syntax and >> semantics (in Python's *pragmatics* you're supposed to use it far >> more restrainedly). Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. Guido> Let's look for an in-line generator notation instead. I like Guido> sum((yield x for x in S)) Guido> but perhaps we can make this work: Guido> sum(x for x in S) Forgive my extreme density on this matter, but I don't understand what (yield x for x in S) is supposed to do. Is it supposed to return a generator function which I can assign to a variable (or pass to the builtin function sum() as in your example) and call later, or is it supposed to turn the current function into a generator function (so that each executed yield statement returns a value to the caller of the current function)? Assuming the result is a generator function (a first-class object I can assign to a variable and call later), is there some reason the current function notation is inadequate? This seems to me to suffer the same expressive shortcomings as lambda. Lambda seems to be hanging on by the hair on its chinny chin chin. Why is this construct gaining traction? If you don't like lambda, I can't quite see why this syntax is all that appealing. OTOH, if lambda x: x+1 is okay, then why not: yield: x for x in S ? Skip

At 01:57 PM 10/17/03 -0500, Skip Montanaro wrote:
Neither. It returns an *iterator*, conceptually equivalent to:

    def temp():
        for x in S:
            yield x
    temp = temp()

Except of course without creating a 'temp' name. I suppose you could also think of it as:

    (lambda: for x in S: yield x)()

except of course that you can't make a generator lambda. If you look at it this way, then you can consider [x for x in S] to be shorthand syntax for list(x for x in S), as they would both produce the same result. However, IIRC, the current listcomp implementation actually binds 'x' in the current local namespace, whereas the generator version would not. (And the listcomp version might be faster.)

"Phillip J. Eby" <pje@telecommunity.com>:
Are we sure about that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

On Monday 20 October 2003 02:08 am, Greg Ewing wrote:
We are indeed sure (sadly) that list comprehensions leak control variable names. We can hardly be sure of what iterator comprehensions would be defined to do, given they don't exist, but surely we can HOPE that in an ideal world where iterator comprehensions were part of Python they would not be similarly leaky:-). Alex

We are indeed sure (sadly) that list comprehensions leak control variable names.
But they shouldn't. It can be fixed by renaming them (e.g. numeric names with a leading dot).
It's highly likely that the implementation will have to create a generator function under the hood, so they will be safely contained in that frame. --Guido van Rossum (home page: http://www.python.org/~guido/)

At 05:45 PM 10/20/03 +0200, Alex Martelli wrote:
He was talking about having the bytecode compiler generate "hidden" names for the variables... ones that can't be used from Python. There's one drawback there, however... If you're stepping through the listcomp generation with a debugger, you won't be able to print the current item in the list, as (I believe) is possible now.

Good point. But this could be addressed in many ways; the debugger could grow a way to quote nonstandard variable names, or it could know about the name mapping, or we could use a different name-mangling scheme (e.g. prefix the original name with an underscore, and optionally append _1 or _2 etc. as needed to distinguish it from a real local with the same name). Or we could simply state this as a deficiency (I'm not sure I've ever needed to debug that situation). --Guido van Rossum (home page: http://www.python.org/~guido/)

I meant that the compiler should rename it. Just like when you use a tuple argument: def f(a, (b, c), d): ... this actually defines a function of three (!) arguments whose second argument is named '.2'. And the body starts with something equivalent to b, c = .2 For list comps, the compiler could maintain a mapping for the listcomp control variables so that if you write [x for x in range(3)] it knows to generate bytecode as if x was called '.7'; at the bytecode level there's no requirement for names to follow the identifier syntax. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum <guido@python.org> writes:
Implementing this might be entertaining. In particular what happens if the iteration variable is a local in the frame anyway? I presume that would inhibit the renaming, but then there's a potentially confusing dichotomy as to whether renaming gets done. Of course you could *always* rename, but then code like

    def f(x):
        r = [x+1 for x in range(x)]
        return r, x

becomes even more incomprehensible (and changes in behaviour). And what about horrors like

    [([x for x in range(10)],x) for x in range(10)]

vs:

    [([x for x in range(10)],y) for y in range(10)]

? I suppose you could make a case for throwing out (or warning about) all these cases at compile time, but that would require significant effort as well (I think). Cheers, mwh -- This song is for anyone ... fuck it. Shut up and listen. -- Eminem, "The Way I Am"

Here's the rule I'd propose for iterator comprehensions, which list comprehensions would inherit:

    [<expr1> for <vars> in <expr2>]

The variables in <vars> should always be simple variables, and their scope only extends to <expr1>. If there's a variable with the same name in an outer scope (including the function containing the comprehension) it is not accessible (at least not by name) in <expr1>. <expr2> is not affected. In comprehensions you won't be able to do some things you can do with regular for loops:

    a = [1,2]
    for a[0] in range(10):
        print a
I think the semantics are crisply defined, users who write these deserve what they get (confusion and the wrath of their readers). --Guido van Rossum (home page: http://www.python.org/~guido/)
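Python 3 ultimately implemented exactly this containment for comprehensions; a minimal check of the scoping rule Guido proposes:

```python
# In Python 3 the comprehension's loop variable is confined to the
# comprehension itself and does not leak into, or clobber, the
# enclosing scope.
x = "outer"
squares = [x * x for x in range(3)]
print(squares, x)  # [0, 1, 4] outer
```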

Guido> Here's the rule I'd propose for iterator comprehensions, which list Guido> comprehensions would inherit: Guido> [<expr1> for <vars> in <expr2>] Guido> The variables in <vars> should always be simple variables, and Guido> their scope only extends to <expr1>. If there's a variable with Guido> the same name in an outer scope (including the function Guido> containing the comprehension) it is not accessible (at least not Guido> by name) in <expr1>. <expr2> is not affected.

I thought the definition for list comprehension syntax was something like

    '[' <expr> for <vars> in <expr> [ for <vars> in <expr> ]* [ if <expr> ]* ']'

The loop <vars> in an earlier for clause should be visible in all nested for clauses and conditional clauses, not just in the first <expr>. Skip

Absolutely, good point! --Guido van Rossum (home page: http://www.python.org/~guido/)

Michael Hudson <mwh@python.net>:
In particular what happens if the iteration variable is a local in the frame anyway? I presume that would inhibit the renaming
Why?
Anyone who writes code like that *deserves* to have the behaviour changed on them! If this is really a worry, an alternative would be to simply forbid using a name for the loop variable that's used for anything else outside the loop. That could break existing code too, but at least it would break it in a very obvious way by making it fail to compile. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
Well, because then you have the same name for two different bindings.
This was not my impression of the Python way. I know I'd be pretty pissed if this broke my app. I have no objection to breaking the above code, just to breaking it silently! Having code *silently change in behaviour* (not die with an exception, post a warning at compile time, or fail to compile at all) is about as evil a change as it's possible to contemplate, IMO.
This would be infinitely preferable! Cheers, mwh -- I like silliness in a MP skit, but not in my APIs. :-) -- Guido van Rossum, python-dev

Not so fast. We introduced nested scopes, which could similarly subtly change the meaning of code without giving an error. Instead, we had at least one release that *warned* about situations that would change meaning silently under the new semantics. The same release also implemented the new semantics if you used a __future__ import. We should do that here too (both the warning and the __future__). I don't want this kind of code to cause an error; it's not Pythonic to flag an error when a variable name in an inner scope shields a variable of the same name in an outer scope. --Guido van Rossum (home page: http://www.python.org/~guido/)

>> We can hardly be sure of what iterator comprehensions would be >> defined to do, given they don't exist, but surely we can HOPE that in >> an ideal world where iterator comprehensions were part of Python they >> would not be similarly leaky:-). Guido> It's highly likely that the implementation will have to create a Guido> generator function under the hood, so they will be safely Guido> contained in that frame. Which suggests they aren't likely to be a major performance win over list comprehensions. If nothing else, they would push the crossover point between list comprehensions and iterator comprehensions toward much longer lists. Is performance the main reason this addition is being considered? They don't seem any more expressive than list comprehensions to me. Skip

[Skip]
They are more expressive in one respect: you can't use a list comprehension to express an infinite sequence (that's truncated by the consumer). They are more efficient in a related situation: a list comprehension buffers all its items before the next processing step begins; an iterator comprehension doesn't need to do any buffering. So iterator comprehensions win if you're pipelining operations just like Unix pipes are a huge win over temporary files in some situations. This is particularly important when the consumer is some accumulator like 'average' or 'sum'. Whether there is an actual gain in speed depends on how large the list is. You should be able to time examples like

    sum([x*x for x in R])

vs.

    def gen(R):
        for x in R:
            yield x*x
    sum(gen(R))

for various lengths of R. (The latter would be a good indication of how fast an iterator generator could run.) --Guido van Rossum (home page: http://www.python.org/~guido/)
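Both advantages Guido lists can be seen with the generator-expression syntax that was eventually adopted; itertools.islice plays the truncating consumer (the naturals() helper is invented for the example):

```python
import itertools

def naturals():
    """An infinite source: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

# 1) A lazy comprehension over an infinite sequence, truncated
#    by the consumer -- impossible with a list comprehension.
first_squares = list(itertools.islice((x * x for x in naturals()), 5))
print(first_squares)   # [0, 1, 4, 9, 16]

# 2) Pipelining: sum() consumes items one at a time,
#    with no temporary list buffered in between.
total = sum(x * x for x in range(1000))
print(total)           # 332833500
```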

On Monday 20 October 2003 07:21 pm, Guido van Rossum wrote: ...
with a.py having:

    def asum(R):
        sum([ x*x for x in R ])

    def gen(R):
        for x in R:
            yield x*x

    def gsum(R, gen=gen):
        sum(gen(R))

I measure:

    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)'
    10000 loops, best of 3: 96 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)'
    10000 loops, best of 3: 60 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)'
    1000 loops, best of 3: 930 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)'
    1000 loops, best of 3: 590 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)'
    100 loops, best of 3: 1.28e+04 usec per loop
    [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)'
    100 loops, best of 3: 8.4e+03 usec per loop

not sure why gsum's advantage ratio over asum seems to be roughly constant, but, this IS what I measure!-) Alex

Great! This is a plus for iterator comprehensions (we need a better term BTW). I guess that building up a list using repeated append() calls slows things down more than the frame switching used by generator functions; I knew the latter was fast but this is a pleasant result. BTW, if I use a different function that calculates list() instead of sum(), the generator version is a few percent slower than the list comprehension. But that's because list(a) has a shortcut in case a is a list, while sum(a) always uses PyIter_Next(). So this is actually consistent: despite the huge win of the shortcut, the generator version is barely slower. I think the answer lies in the bytecode:
def lc(a): return [x for x in a]
The list comprehension executes 7 bytecodes per iteration; the generator version only 5 (this could be more of course if the expression was more complicated than 'x'). The YIELD_VALUE does very little work; falling out of the frame is like falling off a log; and gen_iternext() is pretty sparse code too. On the list comprehension side, calling the list's append method has a bunch of overhead. (Some of which could be avoided if we had a special-purpose opcode which called PyList_Append().) But the executive summary remains: the generator wins because it doesn't have to materialize the whole list. --Guido van Rossum (home page: http://www.python.org/~guido/)
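The opcode counts Guido quotes come from 2003-era bytecode; on a modern CPython the exact opcodes differ (listcomps are even inlined in recent versions), so this sketch just dumps both disassemblies for eyeball comparison rather than asserting counts:

```python
import dis

def lc(a):
    return [x for x in a]

def gen(a):
    for x in a:
        yield x

# Exact opcodes vary by CPython version; compare the loop bodies by eye.
dis.dis(lc)
dis.dis(gen)
```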
participants (35)
-
Aahz
-
Alex Martelli
-
Andrew Bennetts
-
Armin Rigo
-
Barry Warsaw
-
Bill Janssen
-
Brett C.
-
Dave Brueck
-
David Ascher
-
David Eppstein
-
Greg Ewing
-
Guido van Rossum
-
Ian Bicking
-
Jack Jansen
-
Jeremy Fincher
-
Jeremy Hylton
-
Jeremy Hylton
-
John Williams
-
Just van Rossum
-
Michael Hudson
-
Neal Norwitz
-
Neil Schemenauer
-
Nick Coghlan
-
Paul Moore
-
Phillip J. Eby
-
Raymond Hettinger
-
Samuele Pedroni
-
Sean Ross
-
Shane Holloway (IEEE)
-
Skip Montanaro
-
Terry Reedy
-
Tim Peters
-
Tim Peters
-
Walter Dörwald
-
Zack Weinberg