
Hello all,

I have noticed that sometimes "while" loops produce "unpythonic" patterns, such as the following:

    data = my_file.read(1024)
    while data:
        do_something(data)
        data = my_file.read(1024)

The assignment is repeated, which is less than optimal. Since we don't have a while statement that does the check at the end, would it not be better if the syntax of while could be amended to include something like this (in the spirit of the new "with" keyword)?

    while my_file.read(1024) as data:
        do_something(data)

This would terminate the loop when my_file.read() evaluated to False, and it is more pythonic than repeating oneself.

I contacted GvR about this, and he replied that this syntax would have to be part of the expression rather than part of the while, which I agree would be less than ideal. However, I don't see why it would have to be part of the expression, since the "while" could easily assign the value of the expression to the variable and break if it evaluates to False.

I would appreciate any thoughts on this,
Stavros Korokithakis

It's not currently possible to determine if a file/stream is at its end, is it? If that were the case you could easily do the read before your do_something call. Something like:

    while not my_file.eof:
        data = my_file.read(1024)
        do_something(data)

Can anyone explain why file objects don't support some sort of eof check? Something gives me the impression that it was an intentional decision. IMO something like this would be better than more syntax.

Cheers,
T

Stavros Korokithakis wrote:

I don't think they do; if I'm not mistaken, the only way is to call read() and see if it returns the empty string. I agree that this would be better, but the use case I mentioned is not the only one this would be useful in... Unfortunately I can't think of any others right now, but there have been a few times when I had to initialise things outside the loop, and it always strikes me as ugly.

Thomas Lee wrote:

There doesn't really exist an "end of file" unless you are using a stream (pipe, socket, ...). Real files can be appended to at any time on the three major platforms. Also, if you know that you are going to be reading blocked data, perhaps it's better just to write your blocked reader (with your single non-pythonic idiom) once.

    def blocked_reader(handle, blocksize=4096):
        data = handle.read(blocksize)
        while data:
            yield data
            data = handle.read(blocksize)

Then you can use a better common idiom than a while loop:

    for block in blocked_reader(handle):
        handle_data(block)

- Josiah

On Thu, Jul 3, 2008 at 7:24 AM, Stavros Korokithakis <stavros@korokithakis.net> wrote:

Stavros Korokithakis wrote:
Well, that depends on the situation, really. The only use case I can think of is exactly the one you mentioned. And since you can't think of any other scenarios where such a thing might be handy, I've got no better suggestion to offer. If you can conjure up another scenario, post it back here and we'll see if we can generalize the pattern a little. In the meantime, I'd love a way to check if a file is at its end without having to read data out of it ...

Cheers,
T

P.S. you might be interested in using something like the following to hide the ugliness in a function:

    def reader(stream, size):
        data = stream.read(size)
        while data:
            yield data
            data = stream.read(size)

Then you could use it as follows:

    for block in reader(my_file, 1024):
        do_something(block)
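The generator helpers posted so far still repeat the read() call once inside the function. A minimal sketch of one way to avoid even that, using the two-argument form of the built-in iter(), which nobody in the thread proposes; the name read_blocks and the in-memory sample stream are assumptions made for illustration:

```python
import io
from functools import partial


def read_blocks(stream, size=1024):
    """Return an iterator over successive blocks read from a binary stream."""
    # iter(callable, sentinel) calls stream.read(size) repeatedly and stops
    # as soon as a call returns the sentinel (b"" marks end of data here).
    return iter(partial(stream.read, size), b"")


# Hypothetical usage with an in-memory stream instead of a real file.
blocks = list(read_blocks(io.BytesIO(b"abcdefghij"), size=4))
```

For a text-mode stream the sentinel would be the empty string "" rather than b"".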

This was apparently resuggested in PEP 315 but they couldn't get the syntax right: http://www.python.org/dev/peps/pep-0315/ I believe that my proposal deals with that issue by keeping the syntax coherent with the rest of Python (especially with the "with" keyword) and has all the advantages of the restructured code. There are also some use cases in that PEP. Stavros Thomas Lee wrote:

The alternate idiom I've seen for "do-while"-like loops such as this in Python is (using the original example):

    while True:
        data = my_file.read(1024)
        if not data:
            break
        do_something(data)

which seems perfectly acceptable and in keeping with TOOWTDI by minimizing the number of loop constructs in the language. It obviously avoids the distasteful duplication of "data = my_file.read(1024)".

- Chris

On Thu, Jul 3, 2008 at 7:58 AM, Thomas Lee <tom@vector-seven.com> wrote:

On Jul 3, 2008, at 11:21 AM, Stavros Korokithakis wrote:
The style with the check "in the middle", however, is fully general: you can do different checks at different points to determine when to break or skip the rest of the indented suite; the "while <expr> as <var>:" form only handles a specific case (though common). The "while <expr> as <var>:" form has a certain level of attraction, because it does deal very well with a very common case, but I don't know that it would make that much difference in my code. (Of course, for the specific use case being used to seed this discussion, the fileinput module may be helpful.)

-Fred

-- Fred Drake <fdrake at acm.org>

2008/7/3 Fred Drake <fdrake@acm.org>:
Note that what I like in it is *not* that it solves a lot of cases, but that it makes the syntax more coherent, and solves a common case in the way, :) -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On Thu, Jul 3, 2008 at 8:45 AM, Facundo Batista <facundobatista@gmail.com> wrote:
That's not how I would define "coherent". It adds one more special case. See the zen of Python for that. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

On Thu, Jul 3, 2008 at 8:45 AM, Facundo Batista <facundobatista@gmail.com> wrote:
Typically a good way of making a point that a new syntax proposal will have impact is to find as many places as possible in the standard library where the code would be improved by the change. If a bunch of places become easier to read, then that will squash the need for it to solve a ton of different cases, because it solves a very common case instead.

-Brett

You're right, it's the way it has to be done, but I personally find it awful code. "while True" is a do-forever, so that's what you read first, but that's not really what we want to write. Then we read, OK, but then we break out of the forever loop, then we don't, and do something with the data we read. It's a mess. What I'd like to be able to write is something like this:

    while my_file.has_more_data():
        do_something_with_data(my_file.read(1024))

For instance:

    for line in open(filename):
        do_something_with_data(line)

Which, in its non-line form, is pretty much what Josiah suggested. +1 for adding "blocked_reader" as a static method on the "file" class, or perhaps one of the IOStream ABC classes.

Bill

On Thu, Jul 3, 2008 at 2:58 PM, Thomas Lee <tom@vector-seven.com> wrote:
Randomly choose a number 1-100 that is not already picked:

    # Current Python
    number = random.randint(1, 100)
    while number in picked_numbers:
        number = random.randint(1, 100)

    # Do-Until
    do:
        number = random.randint(1, 100)
    until number not in picked_numbers

    # While-As

Doesn't seem to be doable. Maybe with the proposed syntax either:

    while random.randint(1, 100) as number in picked_numbers:
        pass

or

    while random.randint(1, 100) in picked_numbers as number:
        pass

would work. But it doesn't look pretty anymore. I would definitely prefer a do-until or do-while construct instead.

-- mvh Björn
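For comparison, the same task can be written with the "while True ... break" idiom mentioned earlier in the thread, so that the randint() call appears only once. This is a rough sketch; the function name pick_unused and its low, high and rng parameters are assumptions added for illustration:

```python
import random


def pick_unused(picked_numbers, low=1, high=100, rng=random):
    """Return a number in [low, high] that is not in picked_numbers."""
    # Note: this loops forever if every number in the range is already picked.
    while True:
        number = rng.randint(low, high)
        if number not in picked_numbers:
            return number


# Hypothetical usage: only 100 is still free, so 100 must come back.
chosen = pick_unused(set(range(1, 100)), rng=random.Random(0))
```

Wrapping the loop in a function turns the break into a return, which keeps the single-assignment shape of the do-until version.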

From: "BJörn Lindqvist" <bjourne@gmail.com>
I would definitely prefer a do-until or do-while construct instead.
I co-authored a PEP for a do-while construct and it died solely because we couldn't find a good pythonic syntax. The proposed while-as construct will neatly solve some of the use cases -- the rest will have to live with what we've got. Raymond

On Thu, Jul 3, 2008 at 8:21 PM, Fred Drake <fdrake@acm.org> wrote:
Same here. The while ... as ... syntax introduces a special case that doesn't handle enough cases to be worth the cost of a special case. Zen of Python: "Special cases aren't special enough to break the rules."

To summarize why while...as... doesn't handle enough cases: it only works if the <setup> can be expressed as a single assignment and the value assigned is also the value to be tested. Like it or not, many uses of assignment expressions in C take forms like these:

    if ((x = <expr>) > 0 && (y = <expr>) < 0) {
        ...
    }

    while ((p = foo()) != NULL && p->bar) {
        ...
    }

If we were to add "while x as y:" in 3.1, I'm sure we'd get pressure to add "if x as y:" in 3.2, and then "if (x as y) and (p as q):" in 3.3, and so on. Assignment-in-test: Just Say No.

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
All that could be short-circuited by making 'expr as name' a valid expression in general from the outset. However, that would make 'with x as y' inconsistent, because it doesn't bind the value of x to y, but rather the result of x.__enter__(). (Often it's the same thing, but it doesn't have to be.) Then there's the new form of except clause that uses 'as' in another slightly different way. So while this proposal might at first appear to make things more consistent, it actually doesn't. The way things are, each use of 'as' is unique, so it can be regarded as consistently inconsistent. Introducing a general 'expr as name' construct would set up an expectation that 'as' is simply a binding operator, and then the existing uses would stick out more by being inconsistently inconsistent. :-) -- Greg

Guido van Rossum wrote:
I'm actually more interested in the "as" clause being added to "if" rather than "while". The specific use case I am thinking of is matching regular expressions, like so:

    def match_token(s):
        # Assume we have regexes precompiled for the various tokens:
        if re_comment.match(s) as m:
            # Do something with m
            process(TOK_COMMENT, m.group(1))
        elif re_ident.match(s) as m:
            # Do something with m
            process(TOK_IDENT, m.group(1))
        elif re_integer.match(s) as m:
            # Do something with m
            process(TOK_INT, m.group(1))
        elif re_float.match(s) as m:
            # Do something with m
            process(TOK_FLOAT, m.group(1))
        elif re_string.match(s) as m:
            # Do something with m
            process(TOK_STRING, m.group(1))
        else:
            raise InvalidTokenError(s)

In other words, the general design pattern is one where you have a series of tests, where each test can return either None, or an object that you want to examine further. Of course, if it's just a single if-statement, it's easy enough to assign the variable in one statement and test in the next.

There are a number of workarounds, of course: put the tests in a function and use return instead of elif; use 'else' instead of elif and put the subsequent 'if' in the else block (and deal with indentation crawl); and so on. So it's not a huge deal, I'm just pointing out one possible use case where it might make the code a bit prettier.

(Also, there's a much faster way to do the above example, which is to smush all of the patterns into a single massive regex, and then check the last group index of the match object to see which one matched.)

-- Talin
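The single-regex approach mentioned in the closing parenthesis can be sketched roughly as follows. The token patterns and group names here are invented for illustration, and the sketch reads the match object's lastgroup attribute (the name of the group that matched) rather than the numeric last group index:

```python
import re

# Hypothetical token patterns, joined into one alternation of named groups.
# FLOAT is listed before INT so that "3.14" is not cut short at "3".
TOKEN_RE = re.compile(
    r"(?P<COMMENT>#.*)"
    r"|(?P<FLOAT>\d+\.\d+)"
    r"|(?P<INT>\d+)"
    r"|(?P<IDENT>[A-Za-z_]\w*)"
    r'|(?P<STRING>"[^"]*")'
)


def match_token(s):
    """Return (token_name, matched_text) for the token at the start of s."""
    m = TOKEN_RE.match(s)
    if m is None:
        raise ValueError("invalid token: %r" % (s,))
    # lastgroup names the alternative that matched, since the named groups
    # are top-level and contain no nested groups.
    return m.lastgroup, m.group(m.lastgroup)
```

Because only one match call is made, there is no chain of tests left to bind a name in, which sidesteps the "if ... as m" question for this particular case.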

On Mon, Jul 7, 2008 at 7:10 PM, Talin <talin@acm.org> wrote:
I'm actually more interested in the "as" clause being added to "if" rather than "while".
Thanks for the support. :-) The more examples we get suggesting that this might also come in handy in other contexts the more ammo I have against the request to add it *just* to the while loop. And adding it everywhere is not an option, as has been explained previously. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Talin wrote:
One of the things I do not like above is the repetition of the assignment target, and statement ends, where it is a little hard to see if it is spelled the same each time. I think it would be more productive to work on adding to the stdlib a standardized store-and-test class -- something like what others have posted on c.l.p. In the below, I factored out the null test value(s) from the repeated if tests and added it to the pocket instance as a nulls test attribute.

    class pocket:
        def __init__(self, nulls):
            if type(nulls) != tuple or nulls == ():
                nulls = (nulls,)
            self.nulls = nulls
        def __call__(self, value):
            self.value = value
            return self
        def __bool__(self):
            return not (self.value in self.nulls)

then

    def match_token(s):
        # Assume we have regexes precompiled for the various tokens:
        p = pocket(None)  # None from above, I don't know match return
        if p(re_comment.match(s)):
            # Do something with p.value
            process(TOK_COMMENT, p.value.group(1))
        elif p(re_ident.match(s)):
            # Do something with p.value
            process(TOK_IDENT, p.value.group(1))
        elif p(re_integer.match(s)):
            # etc

Pocket could be used for the while case, but I will not show that because I strongly feel 'for item in iterator' is the proper idiom.

Your specific example is so regular that it also could be written (better in my opinion) as a for loop by separating what varies from what does not:

    match_tok = (
        (re_comment, TOK_COMMENT),
        (re_ident, TOK_IDENT),
        ...
        (re_string, TOK_STRING),
    )
    # if <re_x> is actually re.compile(x), then tokens get bundled with
    # matchers as they are compiled. This strikes me as good.

    for matcher, tok in match_tok:
        m = matcher.match(s)
        if m:
            process(tok, m.group(1))
            break
    else:
        raise InvalidTokenError(s)

Notice how nicely Python's rather unique for-else clause handles the default case.

Terry Jan Reedy
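A self-contained variant of the table-driven loop above, filled in just far enough to run. The three patterns, the string token names and the InvalidTokenError class are placeholders assumed for the sketch, and it returns the result instead of calling process():

```python
import re

# Hypothetical token table: (compiled pattern, token name) pairs.
MATCH_TOK = (
    (re.compile(r"#(.*)"), "TOK_COMMENT"),
    (re.compile(r"([A-Za-z_]\w*)"), "TOK_IDENT"),
    (re.compile(r"(\d+)"), "TOK_INT"),
)


class InvalidTokenError(ValueError):
    """Raised when no pattern in the table matches."""


def match_token(s):
    for matcher, tok in MATCH_TOK:
        m = matcher.match(s)
        if m:
            result = (tok, m.group(1))
            break
    else:
        # The else suite runs only if the loop ended without a break.
        raise InvalidTokenError(s)
    return result
```

Adding a new token type then means adding one row to the table, with no new branch in the function.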

If I'm understanding you correctly, this would address something I've wanted list comprehensions to do for a little while:

    [x for x in range(0,10) until greaterthanN(4,x)]

or, trying to take from the above example,

    tokenchecks = [token for regex,token in match_tok until regex.match(s)]
    # do whatever from this point forward.
    return tokenchecks[-1]

This ignores the case where you might want to bind a name to the expression in an if and until to prevent redundancy, like so:

    tokenchecks = [token for regex,token in match_tok until regex.match(s) != None if regex.match(s)]
    # "!= None" included to make the point that you can evaluate more than simple presence of a value
    if len(tokenchecks) == 0:
        # do stuff
    else:
        process(tokenchecks[-1])

Perhaps you could add some variable binding in there (...maaaaybe.):

    # Bad example?
    tokenchecks = [token for regex,token in match_tok until MatchP != None if MatchP with MatchP as regex.match(s)]
    # Better example?
    tokenchecks = [token for regex,token in match_tok until MatchP.group(0) == 'somethingspecific' if MatchP with MatchP as regex.match(s)]

There are arguments for or against making list comprehensions more complex to accommodate what I hope I understand this thread is getting at, but this is my two cents on a possible syntax. It seems the pattern (or idiom, or whatever the proper term is) of stopping a loop that operates on data before you're done is common enough to justify it. One could argue that this complicates list comprehensions needlessly, or one could argue that list comprehensions already do what for loops can, but serve as shortcuts for common patterns.

It also occurs to me that this may not net as many benefits in parallel code (or OTOH it could make it easier), but as I do not have significant experience in parallel programming I'll shy away from further commenting.

I'm tired and this email has undergone several revisions late at night on an empty stomach. I apologize for the resulting strangeness or disjointedness, if any. Come to think of it, my example might be worse than my fatigued brain is allowing me to believe.

--Andy

Afterthought: what would happen if the 'until' and 'if' conditions were opposite? Presumably, if the until were never fulfilled, all elements would be present, but if the reverse were true, would it filter out the result, then, and stop?

On Tue, Jul 8, 2008 at 11:16 AM, Terry Reedy <tjreedy@udel.edu> wrote:

Clearly, the list comprehension is intended to map and filter in a functional manner. My [-1] operation was a trivial reduce operation. Certainly another example could be concocted to demonstrate it used in such a manner that would not offend your sensibilities, but as far as I'm concerned it is simple, beautiful, readable, unambiguous, etc. Your counterpoint takes exception to the example rather than the suggestion.

Say someone is going through a flat file with a large list of numbers, and they want to filter out all odd numbers, and stop once they encounter a 0 value.

    [int(line) for line in file if evenP(int(line)) until int(line)==0]

or

    [int(line) for line in file if evenP(int(line)) until int(line)==0 with open(filename) as file]

Bam, one line to open the file, iterate through them, filter out unneeded entries, define the base case, and return value. Do you believe that this also abuses list comprehensions? If so, please explain your reasoning. There's nothing keeping a programmer from writing anything this syntax could do as a normal for loop, but neither is there anything stopping said programmer from using a for loop with ifs and breaks in place of a list comprehension.

One can also make the argument that for infinitely iterable sequences, in order for the shorter syntax to remain useful it should be able to define a stopping point at which it ceases to map over said sequence. This is my justification for suggesting 'until'; say someone created a function that yielded the fibonacci numbers: it would be easy to iterate over this with a full for statement and forget or misplace the base case, which in turn would lead to an infinite loop.

'until' offers a relatively simple solution that at worst adds a few words to a program; on average, could reduce a little bit of processing time; and at best, prevents an infinite for loop -- all in a simple, readable, and compact syntax that, as far as I know (I admit I am very new to python-ideas), would add little complexity to the interpreter.

--Andy

On Wed, Jul 9, 2008 at 4:59 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

Not everything should be a 1-line function or operation. And, in fact, your examples show that you really don't understand *current* Python semantics, so trying to introduce new syntax is a non-starter (never mind that it was killed earlier).

    [int(line) for line in file if evenP(int(line)) until int(line)==0 with open(filename) as file]

Could be replaced by...

    [int(line) for line in open(filename) if evenP(int(line)) until int(line)==0]

If you kept the until syntax. But even then, since Python files iterate lines with trailing newlines, it would raise an exception in the first pass.

    [int(line.rstrip('\r\n')) for line in open(filename) if evenP(int(line.rstrip('\r\n'))) until ...]

But now there's all this .rstrip() crap in there, an explicit evenP call, etc. At this point, it's probably *clearer* (and faster) to pull everything out and be explicit about your cases.

    out = []
    for line in open(filename):
        val = int(line.rstrip('\r\n'))
        if val == 0:
            break
        if not val & 1:
            out.append(val)

- Josiah

On Fri, Jul 11, 2008 at 2:10 AM, Andrew Akira Toulouse <andrew@atoulou.se> wrote:

Andrew Akira Toulouse wrote:
Since you don't like for loops, what's wrong with:

    from itertools import takewhile
    with open(filename) as f:
        intlines = (int(line) for line in f if evenP(int(line)))
        result = list(takewhile(lambda x: x, intlines))

This is perfectly clear, not too much longer than your suggestion, and doesn't require adding large amounts of extra syntax to the language (though using a lambda for the identity function is a bit more verbose than I'd prefer). Personally, I’d rather have this broken up than in one gigantic line that looks like:

    result = [int(line) for line in f if evenP(int(line)) until int(line) == 0 with open(filename) as f]

As for:
tokenchecks = [token for regex,token in match_tok until regex.match(s)] return tokenchecks[-1]
This is much more readably expressed as:

    for regexp, token in match_tok:
        if regexp.match(s):
            return token

Cheers,
Jacob Rus
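A runnable sketch of the takewhile approach from this message, rearranged slightly so that each line is converted once and the stop-at-zero test is applied before the even filter. The names is_even and evens_until_zero, and the in-memory sample data, are assumptions made for illustration:

```python
import io
from itertools import takewhile


def is_even(n):
    return n % 2 == 0


def evens_until_zero(lines):
    """Collect the even integers from lines, stopping at the first 0."""
    values = (int(line) for line in lines)            # convert lazily, one line at a time
    before_zero = takewhile(lambda v: v != 0, values)  # stop at the first 0
    return [v for v in before_zero if is_even(v)]


# Hypothetical usage with an in-memory stand-in for an open file.
sample = io.StringIO("4\n7\n10\n0\n8\n")
result = evens_until_zero(sample)
```

Everything stays lazy until the final list is built, so nothing after the first 0 is converted.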

Jacob Rus wrote:
Andrew Akira Toulouse wrote:
tokenchecks = [token for regex,token in match_tok until regex.match(s)] return tokenchecks[-1]
This is missing either ';' or '\n' between the statements.
Besides which, creating an unbounded list one does not actually want strikes me as conceptually ugly ;-).

hi! Jacob Rus wrote:
Andrew Akira Toulouse wrote:
...
I have not followed the thread, but the line above looks cool and terrible at the same time. I can't think what other operator can be squeezed into expression, maybe try-except, eh?

    result = [int(line) for line in f try if evenP(int(line)) until int(line) == 0 except ValueError("blabla") with open(filename) as f]

I am not sure if this kind of SQLness should enter Python syntax at all...

Regards,
Roman

Point taken -- it's a very valid point, but at the same time building an unnecessary list isn't really all that good-looking. Being able to do query-like things is very useful, and I'd prefer to do it as close to the Python core as possible, rather than (for example) creating an in-memory sqlite database or something. LINQ, I think, somewhat validates this point of view. I feel that an integrated way to query data would be a useful addition. List comprehensions seem to be the natural Pythonic way to do this, but if you can think of something better...

On Tue, Jul 29, 2008 at 3:15 AM, Roman Susi <rnd@onego.ru> wrote:

Andrew Akira Toulouse wrote:
How about something lighter, like a function that searches containers in a way similar to how Yahoo searches are expressed? While this can be done with regular expressions, they aren't something you want to expose to a typical user. It would be a good candidate for a separately developed module that could be included in the library at some point in the future.

Ron

Andrew Akira Toulouse wrote:
What I am not sure about is that Python needs special syntactic provisions for "query-like" things, even though it's tempting and I am sure a lot of use cases can be facilitated (not only SQL queries but Prolog-like, SPARQL-like, etc, etc), because it makes the language unnecessarily complex. Python already has dict/list literals + list comprehensions. And universal functional syntax could be used for the same things if only Python had easier "laziness", because all those proposed untils and whiles could then be simply realized as functions. The below is not a syntax proposition, but illustrates what I mean:

    my_qry = select(` from(t1=`table1),
                    ` where(`t1.somefield < 5),
                    ...)

The select function can eval those things after the backticks if needed. Similarly, there could be a triple backtick for lazy expressions as a whole:

    qry = ```select(from(t2=table), where(...), ... ```

which could then be evaluated according to some domain-specific language (Python syntax for a domain-specific language):

    res = sqlish(qry)

One obvious usage is regular expressions. Right now ``` is just r""" for them. And sqlish() above is re.compile().

In short, I do not like ad hoc additions to the language (even inline if was imho a compromise), but I do like more generic features (syntactic support for embedded declarative/query languages) like simple lazy evaluation. (PEP 312 "Simple implicit lambda", co-authored by me, is just one example toward the same goal of simpler laziness.)

Please do not misunderstand me. I am not against adding features that you want/need and that many people who design ORMs and the like will be glad to see in Python. I am against making particular syntactic changes to Python to enable those features instead of thinking about a more generic feature which would cover many cases. And I think that feature boils down to something spiritually like Lisp's quote. The syntax above reuses the backtick dropped by Py3k to mean not just a string to evaluate but a closure.

It's possible to do those things with lambdas even now, but the readability will suffer to the degree of unpythonic code. Also, it's possible to do the same (and there are a lot of examples like PyParsing) with specially crafted classes which overload all operations to perform their logic in a lazy evaluation fashion (that is, function calls are not really function calls any more but data structure specifiers/builders).

Sorry for going too far from the original topic. I just wanted to point out that IMHO the while/until feature is not quite good for Python, being a very specific syntactic addition. I do not believe GvR will ever make list comprehensions into complete SQL with wheres, groupings, orderbys, havings, limits, even though for (SQL's from) and if (SQL's where) are there.

Regards,
Roman

Roman Susi wrote:
Python already has dict/list literals + list comprehensions.
Better to think that Python has generator expressions which can be used in list, set, and dict comprehensions (the latter two in 3.0 and maybe 2.6).
Generator functions using loops and alternation are easy laziness for cases not easily abbreviated by generator expressions. I see no reason to complicate the language with new keywords. tjr

On Tue, Jul 29, 2008 at 3:19 PM, Terry Reedy <tjreedy@udel.edu> wrote:
You probably don't want to think about it that way - a list/set/dict comprehension does not actually create a generator. Instead, it basically just inlines the equivalent for loop. Note that there's no YIELD_VALUE opcode for the comprehensions:
Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Steven Bethard wrote:
Yes I do. For normal purposes, in 3.0, [genexp] == list(genexp), and so on. That it does something else internally for efficiency is an implementation detail, not semantics.
Instead, it basically just inlines the equivalent for loop.
In 3.0, not quite. The manual says "Note that the comprehension is executed in a separate scope, so names assigned to in the target list don’t “leak” in the enclosing scope." The difference between this and the conceptual equivalence above is one function call versus many. Were you using 2.6? tjr

On Tue, Jul 29, 2008 at 8:55 PM, Terry Reedy <tjreedy@udel.edu> wrote:
But the semantics are pretty different. The semantics are basically:

    def _(seq):
        _1 = []
        for x in seq:
            _1.append(<expression with x>)
        return _1
    return _(seq)

    def _(seq):
        for x in seq:
            yield <expression with x>
    return list(_(seq))

Yes, they produce the same results, but semantically they're different. In one case, a list is created and items are appended to it and then that list is returned. In the other case, a generator object is created, and then that generator is repeatedly resumed to yield items by the list() constructor. I think it's fine to say they achieve the same result, but I think saying they're really the same thing is a mistake.

Steve

P.S. Yes, it's executed in a different scope - that's why the code I showed you before had to do ``dict_comp.__code__.co_consts[1]`` instead of just ``dict_comp.__code__``.
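The two expansions can be written out as ordinary functions and compared directly. In this sketch "x * 2" stands in for the arbitrary <expression with x>, and the function names are invented for illustration:

```python
def build_with_append(seq):
    """Mirror of the first expansion: append to a list inside a plain loop."""
    result = []
    for x in seq:
        result.append(x * 2)
    return result


def build_with_generator(seq):
    """Mirror of the second expansion: list() drains a generator."""
    def gen():
        for x in seq:
            yield x * 2
    return list(gen())


data = range(5)
# All four spellings produce equal lists for ordinary input.
same = (
    build_with_append(data)
    == build_with_generator(data)
    == [x * 2 for x in data]
    == list(x * 2 for x in data)
)
```

Equality of the results says nothing about how each list was built, which is exactly the distinction being drawn in this exchange.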

Steven Bethard wrote:
First of all, the difference in semantics is not that crucial for the point I made (no extra syntax for a Python built-in "query language"). Both ways (list comprehensions and generators) are ways to abstract iterative processing of items of some data structures (they are iterative), where generators are much more generic. The problem is not on the result side. The syntax proposed for while/until and other possible additions (try-except, with) is basically filters in the pipeline, which are possible to do with a functional approach (using the generator protocol or not, and whether or not it is good "laziness", is a matter of implementation) without introducing obscure syntactic additions and new or reused keywords. As someone pointed out, maybe more advanced itertools / "gentools" can achieve the desired results without losing efficiency, while helping readable iterator/generator pipeline construction for modeling query languages.

-Roman

Steven Bethard wrote:
I could have mentioned here that Python does not have dict or list *literals*. It only has number and string/bytes literals. It has set, dict, and list *displays*. One form of display content is a sequence of expressions, a tuple list (with a special form for dicts). Another form of display content is a genexp (with a special form for dict comprehensions).
Better to think that Python has generator expressions which can be used in list, set, and dict comprehensions (the latter two in 3.0 and maybe 2.6).
The above is a slightly better way to say this.
You probably don't want to think about it that way - a list/set/dict comprehension does not actually create a generator.
I never said it necessarily did. The fact that [tuple_list] == list((tuple_list)) does not mean that [tuple_list] actually creates a tuple object. An interpreter could, but CPython does not. Similarly, f(1,2,3) could create a tuple object, but CPython does not.
I never said that listcomp == genexp, I said that listcomp (setcomp,dictcomp) == bracketed genexp == list/set/dict(genexp) The semantics are basically:
The results of the expressions are their semantics. When you write 547 + 222, the semantics is the result 769, not the internal implementation of how a particular interpreter arrives at that. Anyway, in Python, '==' is value comparison, not algorithm comparison.
This is a CPython internal implementation choice. In each case, a list object is created and items appended to it. The internal question is how the objects are generated. The list(genexp) implementation was considered until the faster version was created. [genexp] == list(genexp) was the design goal. I said 'for normal purposes' because it turns out that the faster implementation allows that equality to be broken with somewhat bizarre and abusive code that should never appear in practical code. The developers tried to fix this minor glitch but could not without reverting to the slower list(genexp) version, so they decided to leave things alone.

My point is that if one understands genexp, one really understands the comprehensions also. It is a matter of reducing cognitive load. I happen to be in favor of that.

Terry Jan Reedy

On Wed, Jul 30, 2008 at 9:00 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I see we have a very different definition of semantics. For me, the semantics of ``547 + 222`` are different from those of ``1538 / 2``, which are different from those of ``769``, because the operations the interpreter goes through are different. Given that by "semantics" you mean "result of the expression", I can happily concede that the result of the expression list(<genexp>) is the same as [<listcomp>].

Steve

Precisely: you're talking about two different accepted definitions of semantics: http://en.wikipedia.org/wiki/Formal_semantics_of_programming_languages I'm not going to say one type of formal semantics is better than another, but it's nice when everyone is at least on the same page :) On Wed, Jul 30, 2008 at 09:26, Steven Bethard <steven.bethard@gmail.com>wrote:
This is denotational semantics.
This is operational semantics.
-- It is better to be quotable than to be honest. -Tom Stoppard Borowitz

David Borowitz wrote:
Thank you for the reference. I see that denotational and operational are just the beginning of the semantics of semantics. In writing about and comparing algorithms, I need a term that is slightly in between but closer to operational than merely denotational. I am thinking of using 'operationally equivalent' to describe two algorithms that are denotationally (functionally) identical and which compute the same essential intermediate objects (values) in the same order, while not being 'operationally identical' (which would make them the same thing).

Wikipedia refers 'operational equivalence' to 'observational equivalence'. Other hits from Google suggest that it has been used variably to mean anything from operationally identical to merely denotationally (functionally) identical, but therefore a complete substitute in the user's operations.

In the present case, we agree that [genexp] == list(genexp) in the Python meaning, denotational and operational, of '==', which is to compare the denotational meaning of two expressions (the resulting objects and their values). I also claim a bit more: [genexp] could be but does not have to be, and in current CPython is not, operationally identical to list(genexp). But I do expect that it is operationally equivalent in the sense above. The essential objects and operations are the list contents and the list that starts empty and grows one reference at a time via append().

Terry Jan Reedy

Another good reference is the "as-if" rule in C. It's similar to operational semantics, but may have enough denotational semantics to be what you're looking for. Don't know of a canonical reference, but Google is useful: http://docs.sun.com/app/docs/doc/819-5265/6n7c29d9f?l=ko&a=view On Thu, Jul 31, 2008 at 20:00, Terry Reedy <tjreedy@udel.edu> wrote:
-- It is better to be quotable than to be honest. -Tom Stoppard Borowitz

Boris Borcic wrote:
I don't think it a good idea to abuse StopIteration in this way. It is intended only for use in iterators. The obvious list comp abbreviation does not work in CPython 3.0.
So it breaks the intended identity in Py3: list(genexp) == [genexp]. This was discussed on the Py3 dev list and it was agreed the difference in behavior was a bug but would be difficult to impossible to fix without performance degradation and unnecessary to fix because the problem only arises if someone uses StopIteration outside its documented usage in iterators.

Terry Reedy wrote:
This was discussed on the Py3 dev list and it was agreed the difference in behavior was a bug
I don't think it was agreed that it was a bug. Rather that it's just something that shows that a list comp and list(genexp) are not exactly equivalent in all ways, and that it doesn't matter if they differ in some corner cases like this. -- Greg

Boris Borcic wrote:
Terry Reedy wrote:
So it breaks the intended identity in Py3: list(genexp) == [genexp].
FYI, there is nothing specific to Py3, Python 2.5 and 2.4 behave the same.
FYI, there *is* something specific to Py3: the willingness to break old code. Originally, list comps were defined as having the same result as an equivalent series of for loops and conditional statements:

    x = [f(i) for i in seq if g(i)]
    # same as
    _ = []
    for i in seq:
        if g(i):
            _.append(f(i))
    x = _

About the time 2.5 came out (or maybe before), and generator expressions were in, it was decided that this definition, which results in 'leaking' iteration variables out of comprehensions, was a design mistake, and that the new identity given above would be better. When implementing [genexp] as list(genexp) was found to take much longer (2x?), an alternative was found that was nearly as fast as the 2.x implementation but stopped the leakage and seemed otherwise identical in effect to list(genexp). The only exception I have seen discussed arises from the use of StopIteration outside an iterator .__next__ method. tjr
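The 'leaking' described above can be checked with a short sketch. It assumes Python 3 semantics, and both function names are made up for illustration. A for loop leaves its variable bound after the loop, while a comprehension's variable is not visible afterwards.

```python
def loop_variable_after_for():
    # An ordinary for loop leaves its variable bound once the loop ends.
    for n in range(3):
        pass
    return n


def comprehension_variable_leaks():
    # In Python 3 the comprehension variable is private to the comprehension,
    # so looking it up afterwards raises NameError. Under the original 2.x
    # definition it would still be bound here.
    squares = [n * n for n in range(3)]
    try:
        n
    except NameError:
        return False
    return True


print(loop_variable_after_for())
print(comprehension_variable_leaks())
```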

Boris Borcic wrote:
However, note that this doesn't work if you use a list comprehension instead of list(genexp). The reason is that the StopIteration is not stopping the for-loop, but rather the implicit iteration over the results of the genexp being done by list(). So it only appears to work by accident. -- Greg
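A small sketch of the difference described above, using a hypothetical helper stop() that raises StopIteration. On the interpreters discussed in this thread, list(genexp) treats the exception as the end of the generator and returns a truncated list. Since Python 3.7 (PEP 479), a StopIteration raised inside a generator is converted to RuntimeError, so the sketch accepts either outcome for the generator case.

```python
def stop():
    # Hypothetical helper: abuses StopIteration outside an iterator.
    raise StopIteration


def via_list_comp(seq):
    try:
        return [x if x < 3 else stop() for x in seq]
    except StopIteration:
        # The exception escapes the comprehension; no list is produced.
        return "StopIteration escaped"


def via_genexp(seq):
    try:
        # Older interpreters: list() sees the StopIteration as the normal end
        # of the generator and returns the truncated list.
        return list(x if x < 3 else stop() for x in seq)
    except RuntimeError:
        # Python 3.7 and later: PEP 479 converts it to RuntimeError.
        return "RuntimeError"


print(via_list_comp(range(10)))
print(via_genexp(range(10)))
```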

2008/7/3 Stavros Korokithakis <stavros@korokithakis.net>:
while my_file.read(1024) as data: do_something(data)
Python explicitly disallows inline assignment ("while a=myfile.read():") because of the error propensity when confusing it with the comparison ("=="). But here we're gaining the same advantage, without that risk. So, taking into account that... a) We already have "as" as a statement. b) We already use "as" as an assignment [1] c) This will allow more concise and intuitive code ... I'm definitely +1 to this proposition. In any case, Stavros, this would need a PEP... Thank you! [1] http://docs.python.org/dev/reference/compound_stmts.html#the-with-statement -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On Thu, Jul 3, 2008 at 7:46 AM, Facundo Batista <facundobatista@gmail.com> wrote:
I would argue at a -.5. Not every 5-line function needs to be syntax, and unless someone can really show a common use-case for tossing 'as' in the syntax of the while loop, it smells like a one-off syntax. Don't get me wrong, I've needed to do blocked reading before, but I typically write the block reader generator and call it good. - Josiah

The proposed syntax would be (mostly) equivalent to C's do ... while loop without adding another looping construct, so the usefulness would be that of do ... while. I'm not sure whether everyone considers this to be indispensable, but it can't be totally useless if it was included in C... Stavros Korokithakis Josiah Carlson wrote:

Hallöchen! Facundo Batista writes:
I don't like it. Taken as an English expression, it suggests something misleading. I looked at my own code (not much; 15.000 lines) and found three "while True", and none of them could be expressed with "while ... as ...". It is not worth it in my opinion. However, it *is* an issue in Python. The almost official repeat...until is "while True:", and I got accustomed to reading it as a repeat...until. But actually, it is even more flexible since it is the most general loop of all. Granted that "while True" is not eye-pleasant. Maybe we could introduce the idiom "while not break:". ;-) Tschö, Torsten. -- Torsten Bronger, aquisgrana, europa vetus Jabber ID: torsten.bronger@jabber.rwth-aachen.de

On Jul 3, 2008, at 1:11 PM, Torsten Bronger wrote:
Granted that "while True" is not eye-pleasant. Maybe we could introduce the idiom "while not break:". ;-)
A little tongue-in-cheek, but works today: # in some devious place: import __builtin__ __builtin__.broken = False # in application code: while not broken: break_something() -Fred -- Fred Drake <fdrake at acm.org>

I am against while...as syntax for a reason given above (doesn't map to an english statement nearly as well as with...as does). Besides, you can write much clearer code by making yourself a generator somewhere else that does the nasty bits, and then using a for loop to actually consume the data, like you should: def kilogenerator(file): while True: try: yield file.read(1024) except EOFError: # or whatever return for kilo in kilogenerator(myfile): process(kilo) -- Cheers, Leif

On Thu, Jul 3, 2008 at 11:11 AM, Torsten Bronger <bronger@physik.rwth-aachen.de> wrote:
I had similar results. Around 10 "while True" constructs, and none of them would have worked with "while ... as ..." -1 for me. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Thu, Jul 3, 2008 at 2:22 PM, Stavros Korokithakis <stavros@korokithakis.net> wrote:
The "while True" concept "snuck in" because many of us write this idiom as:: while True: <setup> if not <expression>: break Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

On Thu, Jul 3, 2008 at 1:22 PM, Stavros Korokithakis <stavros@korokithakis.net> wrote:
And "while True" is the currently accepted idiom to replace that, which possibly makes "while as" unnecessary. while True: <setup> if not <expression>: break <stuff> Note that <setup> is no longer repeated. - Chris

2008/7/3 Facundo Batista <facundobatista@gmail.com>:
For me, the main advantage of binding being provided by a statement rather than an expression is not that it eliminates confusion with test for equality. The main advantage is that it prevents *sneaky* (re)binding. You can't bind a name from within an expression, it has to be clearly advertised: foo = 'new value' When I'm scanning code later I'm not going to miss these. IMHO, giving that up would be a mistake. -- Arnaud

On Thu, Jul 3, 2008 at 10:06 AM, Stavros Korokithakis < stavros@korokithakis.net> wrote:
There is already an idiom for this, although admittedly not obvious or well-known: for data in iter(lambda: my_file.read(1024), ''): do_something(data) or in 2.5+: from functools import partial for data in iter(partial(my_file.read,1024), ''): do_something(data) George
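For reference, a runnable sketch of the two-argument iter() idiom shown above. It assumes an in-memory io.BytesIO as a stand-in for my_file and collects the chunks in a list instead of calling do_something. In Python 3 a binary stream returns b"" at end of file, so that is the sentinel used here.

```python
import io
from functools import partial

# Stand-in for my_file: ten bytes read in chunks of four.
stream = io.BytesIO(b"abcdefghij")

chunks = []
# iter(callable, sentinel) calls stream.read(4) repeatedly and stops as soon
# as the call returns the sentinel (the empty bytes object at end of file).
for data in iter(partial(stream.read, 4), b""):
    chunks.append(data)

print(chunks)
```

For a text-mode file the sentinel would be the empty string '' as in the original message.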

2008/7/3 George Sakkis <george.sakkis@gmail.com>:
Yes, but note that these feel like workarounds. One thing that I love in Python is that you can learn a concept, and apply it a lot of times. So, suppose I'm learning Python and find that I can do this... with_stmt ::= "with" expression ["as" target] ":" suite ...why can't I do this?: while_stmt ::= "while" expression ["as" target] ":" suite I mean, I don't care if this solves a lot of issues, or just a few [1]... this just makes the whole syntax more coherent. Regards, [1] Note, however, that one of the issues that it solves is one that always raises a lot of questions for newbies... how to make a "do-while". -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On Thu, Jul 3, 2008 at 11:22 AM, Facundo Batista <facundobatista@gmail.com> wrote: 2008/7/3 George Sakkis <george.sakkis@gmail.com>:
Actually even more newbies look for a general assignment-as-expression (e.g. http://tinyurl.com/6bwmp2), so it's not a far step for someone to suggest for example: if_stmt ::= "if" expression ["as" target] ":" suite FWIW I am not necessarily against (rather +0 for now), I'm just pointing out that striving for coherence just for the sake of it is prone to opening a can of worms. Best, George

Facundo Batista wrote:
I disagree, and will explain. To me, iterators are one of the great unifying concepts of Python, and for statements are the canonical way to iterate. The end of a sequence, if there is one, can be signaled by any sentinel object or exception. The iterator protocol abstracts away the different possible signals and replaces them all with one exception used just for this purpose and which therefore cannot conflict with any object in the sequence nor be confused with any unexpected exception raised by their production. When used in a for statement, it separates the concern of detecting the end of the sequence from the concern of processing the items of the sequence. So, in this view, making a block-reading-and-yielding iterator, if one does not exist out of the box, is the proper thing to do for processing blocks. While statements are for all other loops for which for statements do not work (including, sometimes, the code buried within a generator function).

The suggestion of 'while <expression> as <name>' proposes a new 'expression-iterator' protocol for expression-produced sequences of non-null objects terminated by *any* null object. -1 from me, because

a) It unnecessarily duplicates one of Python's gems.

b) The use of *all* null objects as sentinels is almost never what one wants in any particular situation. If it is, it is easy enough to write

    def non_nulls(f):
        while True:
            o = f()
            if o:
                yield o
            else:
                raise StopIteration  # or just break

c) Using all nulls as sentinels restricts the sequence more than using just one. It is also bug bait. Suppose f returns next_number or None. A slightly naive or careless programmer writes 'while f as x: ...'. At some point, f returns 0. Whoops.

It is better to be specific and only terminate on the intended terminator. For this example, iter(f,None), or iter(f,'') as George suggested for the OP's use case. Careful programming is *not* a workaround. Terry Jan Reedy
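The 'bug bait' in point c) can be made concrete with a short sketch. The names make_source and until_falsy are hypothetical, introduced only for illustration. until_falsy mimics what a truthiness-terminated loop would do, and is compared with iter(f, None), which stops only on the intended terminator.

```python
def make_source(values):
    # Hypothetical producer: returns the next number, or None when exhausted.
    it = iter(values)

    def f():
        return next(it, None)

    return f


def until_falsy(f):
    # Mimics a loop that stops on *any* null value, not just None.
    out = []
    while True:
        x = f()
        if not x:
            break
        out.append(x)
    return out


numbers = [3, 1, 0, 7]

truncated = until_falsy(make_source(numbers))      # stops early at the 0
complete = list(iter(make_source(numbers), None))  # stops only at None

print(truncated)
print(complete)
```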

On Thu, Jul 3, 2008 at 7:06 AM, Stavros Korokithakis <stavros@korokithakis.net> wrote:
What if the condition is ever so slightly more complicated? Then it looks a bit less than natural... while foo() != 17 as baz: do_something(baz) Now it looks like baz should equal the boolean value from the comparison, but what if i want it to be foo() ? And if it is somehow magically foo(), then that's really unintuitive-looking. - Chris

On Thu, Jul 3, 2008 at 10:06 AM, Stavros Korokithakis <stavros@korokithakis.net> wrote:
while my_file.read(1024) as data: do_something(data)
I dislike this for aesthetic reasons. One of the nice things about Python is that it reads like executable pseudocode. It feels more or less intuitively obvious what the various looping constructs do. I'd argue "while foo as bar" is non-obvious. I realize this syntax proposal is strongly parallel to "with foo as bar", but "with foo as bar" just seems to read better. Taken as an English clause, it gives at least a hint as to what's going on. I can imagine an experienced programmer, new to Python, seeing "while foo as bar" and assuming "as" is some sort of binary operator. This would be a minor quibble if this proposed syntax solved a problem. But as Chris Rebert points out, you can get the same effect today at the cost of two or three extra lines by writing it as a "while True" loop with a test and break inside. This form is only slightly longer, and is immediately obvious to programmers coming from many other languages. Greg F

One thing I don't like about using "as" here is that with has very different semantics for that assignment than while does. I'd rather have: while data in [[my_file.read(1024)]]: where [[...]] is a shorthand for iter(lambda: ..., None) since at least the concept of taking an expression and turning it into an iterator is fairly general. OK, that's pretty ugly so maybe I wouldn't want it. :-) If you add this to while, you're going to want this in "if" and there are lot of other cases like: while (read_a_number() as data) > 0: I'd rather see something along the lines of PEP 315. --- Bruce On Thu, Jul 3, 2008 at 7:06 AM, Stavros Korokithakis < stavros@korokithakis.net> wrote:

Stavros Korokithakis wrote:
Well, that depends on the situation, really. The only use case I can think of is exactly the one you mentioned. And since you can't think of any other scenarios where such a thing might be handy, I've got no better suggestion to offer. If you can conjure up another scenario, post it back here and we'll see if we can generalize the pattern a little. In the meantime, I'd love a way to check if a file is at its end without having to read data out of it ... Cheers, T

P.S. you might be interested in using something like the following to hide the ugliness in a function:

    def reader(stream, size):
        data = stream.read(size)
        while data:
            yield data
            data = stream.read(size)

Then you could use it as follows:

    for block in reader(my_file, 1024):
        do_something(block)

This was apparently resuggested in PEP 315 but they couldn't get the syntax right: http://www.python.org/dev/peps/pep-0315/ I believe that my proposal deals with that issue by keeping the syntax coherent with the rest of Python (especially with the "with" keyword) and has all the advantages of the restructured code. There are also some use cases in that PEP. Stavros Thomas Lee wrote:

The alternate idiom I've seen for "do-while"-like loops such as this in Python is (using the original example): while True: data = my_file.read(1024) if not data: break do_something(data) which seems perfectly acceptable and in keeping with TOOWTDI by minimizing the number of loop constructs in the language. It obviously avoids the distasteful duplication of "data = my_file.read(1024)". - Chris On Thu, Jul 3, 2008 at 7:58 AM, Thomas Lee <tom@vector-seven.com> wrote:

On Jul 3, 2008, at 11:21 AM, Stavros Korokithakis wrote:
The style with the check "in the middle", however, is fully general: you can do different checks at different points to determine when to break or skip the rest of the indented suite; the "while <expr> as <var>:" form only handles a specific case (though common). The "while <expr> as <var>:" form has a certain level of attraction, because it does deal very well with a very common case, but I don't know that it would make that much difference in my code. (Of course, for the specific use case being used to seed this discussion, the fileinput module may be helpful.) -Fred -- Fred Drake <fdrake at acm.org>

2008/7/3 Fred Drake <fdrake@acm.org>:
Note that what I like in it is *not* that it solves a lot of cases, but that it makes the syntax more coherent, and solves a common case along the way :) -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On Thu, Jul 3, 2008 at 8:45 AM, Facundo Batista <facundobatista@gmail.com> wrote:
That's not how I would define "coherent". It adds one more special case. See the zen of Python for that. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

On Thu, Jul 3, 2008 at 8:45 AM, Facundo Batista <facundobatista@gmail.com> wrote:
Typically a good way of making a point that a new syntax proposal will have impact is to find as many places as possible in the standard library where the code was improved by the change. If a bunch of places become easier to read then that will squash the need for it to solve a ton of different cases because it solves a very common case instead. -Brett

You're right, it's the way it has to be done, but I personally find it awful code. "while True" is a do-forever, so that's what you read first, but that's not really what we want to write. Then we read, OK, but then we break out of the forever loop, then we don't, and do something with the data we read. It's a mess. What I'd like to be able to write is something like this: while my_file.has_more_data(): do_something_with_data(my_file.read(1024)) For instance: for line in open(filename): do_something_with_data(line) Which, in its non-line form, is pretty much what Josiah suggested. +1 for adding "blocked_reader" as a static method on the "file" class, or perhaps one of the IOStream ABC classes. Bill

On Thu, Jul 3, 2008 at 2:58 PM, Thomas Lee <tom@vector-seven.com> wrote:
Randomly choose a number 1-100 that is not already picked:

    # Current Python
    number = random.randint(1, 100)
    while number in picked_numbers:
        number = random.randint(1, 100)

    # Do-Until
    do:
        number = random.randint(1, 100)
    until number not in picked_numbers

    # While-As
    Doesn't seem to be doable.

Maybe with the proposed syntax either:

    while random.randint(1, 100) as number in picked_numbers:
        pass

or

    while random.randint(1, 100) in picked_numbers as number:
        pass

would work. But it doesn't look pretty anymore. I would definitely prefer a do-until or do-while construct instead. -- mvh Björn
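For comparison, the same task written with the "while True" idiom discussed elsewhere in this thread, which avoids repeating the randint call. This is only a sketch: the function name pick_unused and its parameters are assumptions added for illustration, and it loops forever if every number in the range is already picked.

```python
import random


def pick_unused(picked_numbers, low=1, high=100, rng=random):
    # Draw until we get a number that has not been picked yet.
    # Assumes at least one number in [low, high] is still free.
    while True:
        number = rng.randint(low, high)
        if number not in picked_numbers:
            return number


picked = set(range(1, 100))          # everything except 100 is taken
choice = pick_unused(picked, rng=random.Random(0))
print(choice)
```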

From: "BJörn Lindqvist" <bjourne@gmail.com>
I would definitely prefer a do-until or do-while construct instead.
I co-authored a PEP for a do-while construct and it died solely because we couldn't find a good pythonic syntax. The proposed while-as construct will neatly solve some of the use cases -- the rest will have to live with what we've got. Raymond

On Thu, Jul 3, 2008 at 8:21 PM, Fred Drake <fdrake@acm.org> wrote:
Same here. The while ... as ... syntax introduces a special case that doesn't handle enough cases to be worth the cost of a special case. Zen of Python: "Special cases aren't special enough to break the rules." To summarize why while...as... doesn't handle enough cases: it only works if the <setup> can be expressed as a single assignment and the value assigned is also the value to be tested. Like it or not, many uses of assignment expressions in C take forms like these: if ((x = <expr>) > 0 && (y = <expr>) < 0) { ... } while ((p = foo()) != NULL && p->bar) { ... } If we were to add "while x as y:" in 3.1, I'm sure we'd get pressure to add "if x as y:" in 3.2, and then "if (x as y) and (p as q):" in 3.3, and so on. Assignment-in-test: Just Say No. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
All that could be short-circuited by making 'expr as name' a valid expression in general from the outset. However, that would make 'with x as y' inconsistent, because it doesn't bind the value of x to y, but rather the result of x.__enter__(). (Often it's the same thing, but it doesn't have to be.) Then there's the new form of except clause that uses 'as' in another slightly different way. So while this proposal might at first appear to make things more consistent, it actually doesn't. The way things are, each use of 'as' is unique, so it can be regarded as consistently inconsistent. Introducing a general 'expr as name' construct would set up an expectation that 'as' is simply a binding operator, and then the existing uses would stick out more by being inconsistently inconsistent. :-) -- Greg
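The point about 'with x as y' can be seen in a minimal sketch. The class Resource is hypothetical and exists only to make the difference visible: the name after 'as' is bound to whatever __enter__() returns, which need not be the context manager itself.

```python
class Resource:
    # Hypothetical context manager whose __enter__ returns something
    # other than the manager object.
    def __enter__(self):
        return "value from __enter__"

    def __exit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions


manager = Resource()
with manager as bound:
    pass

# 'bound' is the result of manager.__enter__(), not 'manager'.
print(bound)
print(bound is manager)
```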

Guido van Rossum wrote:
I'm actually more interested in the "as" clause being added to "if" rather than "while". The specific use case I am thinking is matching regular expressions, like so: def match_token(s): # Assume we have regex's precompiled for the various tokens: if re_comment.match(s) as m: # Do something with m process(TOK_COMMENT, m.group(1)) elif re_ident.match(s) as m: # Do something with m process(TOK_IDENT, m.group(1)) elif re_integer.match(s) as m: # Do something with m process(TOK_INT, m.group(1)) elif re_float.match(s) as m: # Do something with m process(TOK_COMMENT, m.group(1)) elif re_string.match(s) as m: # Do something with m process(TOK_STRING, m.group(1)) else: raise InvalidTokenError(s) In other words, the general design pattern is one where you have a series of tests, where each test can return either None, or an object that you want to examine further. Of course, if it's just a single if-statement, it's easy enough to assign the variable in one statement and test in the next. There are a number of workarounds, of course - put the tests in a function and use return instead of elif; Use 'else' instead of elif and put the subsequent 'if' in the else block (and deal with indentation crawl) and so on. So it's not a huge deal, I'm just pointing out one possible use case where it might make the code a bit prettier. (Also, there's a much faster way to do the above example, which is to smush all of the patterns into a single massive regex, and then check the last group index of the match object to see which one matched.) -- Talin
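The 'single massive regex' approach mentioned in the closing parenthesis might look like the sketch below. The token patterns are invented for illustration and are not Talin's actual regexes. It uses named groups with the match object's lastgroup attribute (the name-based counterpart of the last group index), and ValueError stands in for InvalidTokenError to keep the example self-contained.

```python
import re

# One alternation with a named group per token kind. The patterns are
# illustrative assumptions; order matters (FLOAT must come before INT).
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>\#.*)
  | (?P<FLOAT>\d+\.\d+)
  | (?P<INT>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
""", re.VERBOSE)


def match_token(s):
    m = TOKEN_RE.match(s)
    if m is None:
        raise ValueError("invalid token: %r" % (s,))
    # lastgroup is the name of the alternative that matched.
    return m.lastgroup, m.group(m.lastgroup)


print(match_token("3.14"))
print(match_token("spam"))
```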

On Mon, Jul 7, 2008 at 7:10 PM, Talin <talin@acm.org> wrote:
I'm actually more interested in the "as" clause being added to "if" rather than "while".
Thanks for the support. :-) The more examples we get suggesting that this might also come in handy in other contexts the more ammo I have against the request to add it *just* to the while loop. And adding it everywhere is not an option, as has been explained previously. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Talin wrote:
One of the things I do not like above is the repetition of the assignment target, and statement ends, where it is a little hard to see if it is spelled the same each time. I think it would be more productive to work on adding to the stdlib a standardized store-and-test class -- something like what others have posted on c.l.p. In the below, I factored out the null test value(s) from the repeated if tests and added it to the pocket instance as a nulls test attribute.

    class pocket:
        def __init__(self, nulls):
            if type(nulls) != tuple or nulls == ():
                nulls = (nulls,)
            self.nulls = nulls
        def __call__(self, value):
            self.value = value
            return self
        def __bool__(self):
            return not (self.value in self.nulls)

then

    def match_token(s):
        # Assume we have regex's precompiled for the various tokens:
        p = pocket(None)  # None from above, I don't know match return
        if p(re_comment.match(s)):
            # Do something with p.value
            process(TOK_COMMENT, p.value.group(1))
        elif p(re_ident.match(s)):
            # Do something with p.value
            process(TOK_IDENT, p.value.group(1))
        elif p(re_integer.match(s)):
            # etc

Pocket could be used for the while case, but I will not show that because I strongly feel 'for item in iterator' is the proper idiom. Your specific example is so regular that it also could be written (better in my opinion) as a for loop by separating what varies from what does not:

    match_tok = (
        (re_comment, TOK_COMMENT),
        (re_ident, TOK_IDENT),
        ...
        (re_string, TOK_STRING),
    )
    # if <re_x> is actually re.compile(x), then tokens get bundled with
    # matchers as they are compiled. This strikes me as good.

    for matcher, tok in match_tok:
        m = matcher.match(s)
        if m:
            process(tok, m.group(1))
            break
    else:
        raise InvalidTokenError(s)

Notice how nicely Python's rather unique for-else clause handles the default case. Terry Jan Reedy
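A self-contained sketch of the table-driven for/else version follows. The regular expressions, the string token names, and the InvalidTokenError class defined here are assumptions standing in for the names used in the thread, and the function returns the match instead of calling process().

```python
import re


class InvalidTokenError(Exception):
    # Stand-in for the exception named in the thread.
    pass


# Matchers bundled with their tokens; the patterns are illustrative only.
match_tok = (
    (re.compile(r"#(.*)"), "TOK_COMMENT"),
    (re.compile(r"([A-Za-z_]\w*)"), "TOK_IDENT"),
    (re.compile(r"(\d+)"), "TOK_INT"),
)


def match_token(s):
    for matcher, tok in match_tok:
        m = matcher.match(s)
        if m:
            result = (tok, m.group(1))
            break
    else:
        # Runs only if the loop finished without hitting 'break'.
        raise InvalidTokenError(s)
    return result


print(match_token("spam"))
print(match_token("123"))
```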

If I'm understanding you correctly, this would address something I've wanted list comprehensions to do for a little while: [x for x in range(0,10) until greaterthanN(4,x)] or, trying to take from the above example, tokenchecks = [token for regex,token in match_tok until regex.match(s)] # do whatever from this point forward. return tokenchecks[-1] This ignores the case where you might want to bind a name to the expression in an if and until to prevent redundancy, like so: tokenchecks = [token for regex,token in match_tok until regex.match(s) != None if regex.match(s)] #"!= None" included to make the point that you can evaluate more than simple presence of a value if len(tokenchecks) == 0: #do stuff else: process(tokenchecks[-1]) Perhaps you could add some variable binding in there (...maaaaybe.): #Bad example? tokenchecks = [token for regex,token in match_tok until MatchP != None if MatchP with MatchP as regex.match(s)] #Better example? tokenchecks = [token for regex,token in match_tok until MatchP.group(0) == 'somethingspecific' if MatchP with MatchP as regex.match(s)] There are arguments for or against making list comprehensions more complex to accommodate what I hope I understand this thread is getting at, but this is my two cents on a possible syntax. It seems the pattern (or idiom, or whatever the proper term is) of stopping a loop that operates on data before you're done is common enough to justify it. One could argue that this complicates list comprehensions needlessly, or one could argue that list comprehensions already do what for loops can, but serve as shortcuts for common patterns. It also occurs to me that this may not net as much benefits in parallel (or OTOH it could make it easier), but as I do not have significant experience in parallel programming I'll shy away from further commenting. I'm tired and this email has undergone several revisions late at night on an empty stomach. I apologize for the resulting strangeness or disjointedness, if any. 
Come to think of it, my example might be worse than my fatigued brain is allowing me to believe. --Andy Afterthought: what would happen if the 'until' and 'if' conditions were opposite? Presumably, if the until were never fulfilled, all elements would be present, but if the reverse were true, would it filter out the result, then, and stop? On Tue, Jul 8, 2008 at 11:16 AM, Terry Reedy <tjreedy@udel.edu> wrote:

Clearly, the list comprehension is intended to map and filter in a functional manner. My [-1] operation was a trivial reduce operation. Certainly another example could be concocted to demonstrate it used in such a manner that would not offend your sensibilities, but as far as I'm concerned it is simple, beautiful, readable, unambiguous, etc. Your counterpoint takes exception to the example rather than the suggestion. Say someone is going through a flat file with a large list of numbers, and they want to filter out all odd numbers, and stop once they encounter a 0 value. [int(line) for line in file if evenP(int(line)) until int(line)==0] or [int(line) for line in file if evenP(int(line)) until int(line)==0 with open(filename) as file] Bam, one line to open the file, iterate through them, filter out unneeded entries, define the base case, and return value. Do you believe that this also abuses list comprehensions? If so, please explain your reasoning. There's nothing keeping a programmer from writing anything this syntax could do as a normal for loop, but neither is there anything stopping said programmer from using a for loop with ifs and breaks in place of a list comprehension. One can also make the reasoning that for infinitely iterable sequences, in order for the shorter syntax to remain useful it should be able to define a stopping point at which it ceases to map over said sequence. This is my justification for suggesting 'until'; say someone created a function that yielded the fibonacci numbers: it would be easy to iterate over this with a full for statement and forget or misplace the base case which in turn would lead to an infinite loop. 
'until' offers a relatively simple solution that at worst adds a few words to a program; on average, could reduce a little bit of processing time; and at best, prevents an infinite for loop -- all in a simple, readable, and compact syntax that, as far as I know (I admit I am very new to python-ideas), would add little complexity to the interpreter. --Andy On Wed, Jul 9, 2008 at 4:59 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

Not everything should be a 1-line function or operation. And, in fact, your examples show that you really don't understand *current* Python semantics, so trying to introduce new syntax is a non-starter (never mind that it was killed earlier).

    [int(line) for line in file if evenP(int(line)) until int(line)==0 with open(filename) as file]

Could be replaced by...

    [int(line) for line in open(filename) if evenP(int(line)) until int(line)==0]

If you kept the until syntax. But even then, since Python files iterate lines with trailing newlines, it would raise an exception in the first pass.

    [int(line.rstrip('\r\n')) for line in open(filename) if evenP(int(line.rstrip('\r\n'))) until ...]

But now there's all this .rstrip() crap in there, an explicit evenP call, etc. At this point, it's probably *clearer* (and faster) to pull everything out and be explicit about your cases.

    out = []
    for line in open(filename):
        val = int(line.rstrip('\r\n'))
        if val == 0:
            break
        if not val & 1:
            out.append(val)

- Josiah
On Fri, Jul 11, 2008 at 2:10 AM, Andrew Akira Toulouse <andrew@atoulou.se> wrote:

Andrew Akira Toulouse wrote:
Since you don't like for loops, what's wrong with: from itertools import takewhile with open(filename) as f: intlines = (int(line) for line in f if evenP(int(line))) result = list(takewhile(lambda x: x, intlines)) This is perfectly clear, not too much longer than your suggestion, and doesn't require adding large amounts of extra syntax to the language (though using a lambda for the identity function is a bit more verbose than I'd prefer). Personally, I’d rather have this broken up than in one gigantic line that looks like: result = [int(line) for line in f if evenP(int(line)) until int(line) == 0 with open(filename) as f] As for:
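A runnable sketch of the takewhile version, assuming a list of strings in place of the open file and a hypothetical even() helper in place of evenP. The filter drops odd numbers first, then takewhile stops at the first zero.

```python
from itertools import takewhile


def even(n):
    # Stand-in for the evenP predicate used in the thread.
    return n % 2 == 0


# Stand-in for the lines of a file; int() tolerates the trailing newline.
lines = ["4\n", "7\n", "10\n", "0\n", "8\n"]

intlines = (int(line) for line in lines if even(int(line)))
# takewhile yields items while the predicate is true, so it stops at the 0
# and never reaches the 8 that follows it.
result = list(takewhile(lambda x: x, intlines))

print(result)
```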
tokenchecks = [token for regex,token in match_tok until regex.match(s)] return tokenchecks[-1]
This is much more readably expressed as: for regexp, token in match_tok: if regexp.match(s): return token Cheers, Jacob Rus

Jacob Rus wrote:
Andrew Akira Toulouse wrote:
tokenchecks = [token for regex,token in match_tok until regex.match(s)] return tokenchecks[-1]
This is missing either ';' or '\n' between the statements.
Besides which, creating an unbounded list one does not actually want strikes me as conceptually ugly ;-).

hi! Jacob Rus wrote:
Andrew Akira Toulouse wrote:
...
I have not followed the thread, but the line above looks cool and terrible at the same time. I can't think what other operator could be squeezed into an expression, maybe try-except, eh?

    result = [int(line) for line in f try if evenP(int(line)) until int(line) == 0 except ValueError("blabla") with open(filename) as f]

I am not sure if this kind of SQLness should enter Python syntax at all... Regards, Roman

Point taken -- it's a very valid point, but at the same time building an unnecessary list isn't really all that good-looking. Being able to do query-like things is very useful, and I'd prefer to do it as close to the Python core as possible, rather than (for example) creating an in-memory sqlite database or something. LINQ, I think, somewhat validates this point of view. I feel that an integrated way to query data would be a useful addition. List comprehensions seem to be the natural Pythonic way to do this, but if you can think of something better... On Tue, Jul 29, 2008 at 3:15 AM, Roman Susi <rnd@onego.ru> wrote:

Andrew Akira Toulouse wrote:
How about something lighter, like a function that searches containers in a way similar to how Yahoo searches are expressed? While this can be done with regular expressions, they aren't something you want to expose to a typical user. It would be a good candidate for a separately developed module that could be included in the library at some point in the future. Ron

Andrew Akira Toulouse wrote:
What I am not sure about is that Python needs special syntactic provisions for "query-like" things, even though it's tempting and I am sure a lot of use cases could be facilitated (not only SQL queries but Prolog-like, SPARQL-like, etc, etc), because it makes the language unnecessarily complex. Python already has dict/list literals + list comprehensions. And universal functional syntax could be used for the same things if only Python had easier "laziness", because all those proposed untils and whiles could then be simply realized as functions.

The below is not a syntax proposal, but illustrates what I mean: my_qry = select(` from(t1=`table1), ` where(`t1.somefield < 5), ...) The function select can eval the things after the backticks if needed. Similarly, there could be a triple backtick for lazy expressions as a whole: qry = ```select(from(t2=table), where(...), ... ``` which could then be evaluated according to some domain-specific language (Python syntax for a domain-specific language): res = sqlish(qry) One obvious usage is regular expressions. Right now ``` is just r""" for them. And sqlish() above is re.compile().

In short, I do not like ad hoc additions to the language (even inline if was imho a compromise); I prefer more generic features (syntactic support for embedded declarative/query languages) like simple lazy evaluation. (PEP 312 "Simple implicit lambda", co-authored by me, is just one example toward the same goal of simpler laziness.)

Please do not misunderstand me. I am not against adding features that you want/need and that many people who design ORMs and the like will be glad to see in Python. I am against making particular syntactic changes to Python to enable those features instead of thinking about a more generic feature which would cover many cases. And I think that feature boils down to something spiritually like Lisp's quote. The syntax above reuses the backtick dropped by Py3k to mean not just a string to evaluate but a closure.
It's possible to do those things with lambdas even now, but the readability will suffer to the point of unpythonic code. Also, it's possible to do the same (and there are a lot of examples, like PyParsing) with specially crafted classes which overload all operations to perform their logic in a lazy-evaluation fashion (that is, function calls are not really function calls any more but data structure specifiers/builders). Sorry for going too far from the original topic. I just wanted to point out that IMHO the while/until feature is not a good fit for Python as a very specific syntactic addition. I do not believe GvR will ever make list comprehensions into complete SQL with wheres, groupings, orderbys, havings, limits, even though for (SQL's from) and if (SQL's where) are there. Regards, Roman
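The claim that the proposed untils and whiles could simply be functions can be sketched with today's generators. This is only an illustration under stated assumptions: `until` is a hypothetical helper written for this example, not an existing built-in or itertools function (`itertools.takewhile` with a negated predicate does much the same job).

```python
import itertools


def until(predicate, iterable):
    """Yield items lazily; stop before the first item for which predicate is true."""
    for item in iterable:
        if predicate(item):
            return
        yield item


values = [4, 7, 10, 0, 8]

# Filter with an ordinary generator expression, then cut the stream off at 0.
evens = list(until(lambda v: v == 0, (v for v in values if v % 2 == 0)))
print(evens)  # [4, 10]

# Because evaluation is lazy, the same helper works on an infinite source.
first_five = list(until(lambda n: n > 5, itertools.count(1)))
print(first_five)  # [1, 2, 3, 4, 5]
```

Each stage is a plain callable, so stages compose without any new keywords.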

Roman Susi wrote:
Python already has dict/list literals + list comprehensions.
Better to think that Python has generator expressions which can be used in list, set, and dict comprehensions (the latter two in 3.0 and maybe 2.6).
Generator functions using loops and alternation are easy laziness for cases not easily abbreviated by generator expressions. I see no reason to complicate the language with new keywords. tjr
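A minimal sketch of that point: a generator function can mix a loop with branching (including try/except), which a single generator expression cannot express. The name `even_ints` and the sample lines are assumptions made for this illustration.

```python
def even_ints(lines):
    """Lazily yield even integers, skipping malformed lines and stopping at 0."""
    for line in lines:
        try:
            val = int(line)
        except ValueError:
            continue  # skip lines that are not integers
        if val == 0:
            return  # ends the generator, like 'until val == 0'
        if val % 2 == 0:
            yield val


lines = ["4\n", "oops\n", "7\n", "10\n", "0\n", "8\n"]
print(list(even_ints(lines)))  # [4, 10]
```

Nothing is computed until the generator is consumed, so the caller decides how much of the input is read.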

On Tue, Jul 29, 2008 at 3:19 PM, Terry Reedy <tjreedy@udel.edu> wrote:
You probably don't want to think about it that way - a list/set/dict comprehension does not actually create a generator. Instead, it basically just inlines the equivalent for loop. Note that there's no YIELD_VALUE opcode for the comprehensions:
Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy
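The disassembly referred to above did not survive in this archive. As a hedged sketch for checking the claim yourself, assuming CPython (opcode names are an implementation detail and can change between versions), the following collects the opcode names used by each form. The helper `opnames` is made up for this illustration.

```python
import dis
import types


def opnames(code):
    """Return the set of opcode names in a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


listcomp_ops = opnames(compile("[x * 2 for x in range(3)]", "<listcomp>", "eval"))
genexp_ops = opnames(compile("list(x * 2 for x in range(3))", "<genexp>", "eval"))

print("YIELD_VALUE" in listcomp_ops)  # False: the comprehension appends directly
print("YIELD_VALUE" in genexp_ops)    # True: the generator expression yields
print("LIST_APPEND" in listcomp_ops)  # True
```

Walking the nested code objects matters because, depending on the CPython version, the comprehension body may live in a separate code object or be inlined into the enclosing one.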

Steven Bethard wrote:
Yes I do. For normal purposes, in 3.0, [genexp] == list(genexp), and so on. That it does something else internally for efficiency is an implementation detail, not semantics.
Instead, it basically just inlines the equivalent for loop.
In 3.0, not quite. The manual says "Note that the comprehension is executed in a separate scope, so names assigned to in the target list don’t “leak” in the enclosing scope." The difference between this and the conceptual equivalence above is one function call versus many. Were you using 2.6? tjr

On Tue, Jul 29, 2008 at 8:55 PM, Terry Reedy <tjreedy@udel.edu> wrote:
But the semantics are pretty different. The semantics are basically:

    def _(seq):
        _1 = []
        for x in seq:
            _1.append(<expression with x>)
        return _1
    return _(seq)

    def _(seq):
        for x in seq:
            yield <expression with x>
    return list(_(seq))

Yes, they produce the same results, but semantically they're different. In one case, a list is created and items are appended to it and then that list is returned. In the other case, a generator object is created, and then that generator is repeatedly resumed to yield items by the list() constructor. I think it's fine to say they achieve the same result, but I think saying they're really the same thing is a mistake.

Steve

P.S. Yes, it's executed in a different scope - that's why the code I showed you before had to do ``dict_comp.__code__.co_consts[1]`` instead of just ``dict_comp.__code__``.

-- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy
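The two expansions can be made runnable by picking a concrete expression. This is a sketch only: `x * x` is an arbitrary choice for the placeholder expression, and the function names are invented for the example.

```python
import inspect


def listcomp_style(seq):
    """First expansion: build the list directly with append()."""
    result = []
    for x in seq:
        result.append(x * x)
    return result


def genexp_body(seq):
    """Second expansion, step one: a generator that yields each value."""
    for x in seq:
        yield x * x


def genexp_style(seq):
    """Second expansion, step two: list() resumes the generator until exhausted."""
    return list(genexp_body(seq))


print(listcomp_style(range(4)))                  # [0, 1, 4, 9]
print(genexp_style(range(4)))                    # [0, 1, 4, 9]
print(inspect.isgenerator(genexp_body([])))      # True
```

The results compare equal; only the second form creates an intermediate generator object, which is the distinction being argued over in this part of the thread.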

Steven Bethard wrote:
First of all, the difference in semantics is not that crucial for the point I made (no extra syntax for a Python built-in "query language"). Both list comprehensions and generators are ways to abstract iterative processing of the items of some data structure, and generators are much more generic. The problem is not on the result side. The syntax proposed for while/until and other possible additions (try-except, with) is basically filters in a pipeline, which can be done with a functional approach (whether it uses the generator protocol, and whether that is good "laziness", is a matter of implementation) without introducing obscure syntactic additions and new or reused keywords. As someone pointed out, maybe more advanced itertools / "gentools" can achieve the desired results without losing efficiency, while helping to build readable iterator/generator pipelines for modeling query languages. -Roman

Steven Bethard wrote:
I could have mentioned here that Python does not have dict or list *literals*. It only has number and string/bytes literals. It has set, dict, and list *displays*. One form of display content is a sequence of expressions, a tuple list (with a special form for dicts). Another form of display content is a genexp (with a special form for dict comprehensions).
Better to think that Python has generator expressions which can be used in list, set, and dict comprehensions (the latter two in 3.0 and maybe 2.6).
The above is a slightly better way to say this.
You probably don't want to think about it that way - a list/set/dict comprehension does not actually create a generator.
I never said it necessarily did. The fact that [tuple_list] == list((tuple_list)) does not mean that [tuple_list] actually creates a tuple object. An interpreter could, but CPython does not. Similarly, f(1,2,3) could create a tuple object, but CPython does not.
I never said that listcomp == genexp, I said that listcomp (setcomp,dictcomp) == bracketed genexp == list/set/dict(genexp) The semantics are basically:
The results of the expressions are their semantics. When you write 547 + 222, the semantics is the result 769, not the internal implementation of how a particular interpreter arrives at that. Anyway, in Python, '==' is value comparison, not algorithm comparison.
This is a CPython internal implementation choice. In each case, a list object is created and items appended to it. The internal question is how the objects are generated. The list(genexp) implementation was considered until the faster version was created. [genexp] == list(genexp) was the design goal. I said 'for normal purposes' because it turns out that the faster implementation allows that equality to be broken with somewhat bizarre and abusive code that should never appear in practical code. The developers tried to fix this minor glitch but could not without reverting to the slower list(genexp) version, so they decided to leave things alone. My point is that if one understands genexp, one really understands the comprehensions also. It is a matter of reducing cognitive load. I happen to be in favor of that. Terry Jan Reedy

On Wed, Jul 30, 2008 at 9:00 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I see we have a very different definition of semantics. For me, the semantics of ``547 + 222`` are different from ``1538 / 2``, which are different from ``769``, because the operations the interpreter goes through are different. Given that by "semantics" you mean "result of the expression", I can happily concede that the result of the expression list(<genexp>) is the same as [<listcomp>]. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy

Precisely: you're talking about two different accepted definitions of semantics: http://en.wikipedia.org/wiki/Formal_semantics_of_programming_languages I'm not going to say one type of formal semantics is better than another, but it's nice when everyone is at least on the same page :) On Wed, Jul 30, 2008 at 09:26, Steven Bethard <steven.bethard@gmail.com> wrote:
This is denotational semantics.
This is operational semantics.
-- It is better to be quotable than to be honest. -Tom Stoppard Borowitz

David Borowitz wrote:
Thank you for the reference. I see that denotational and operational are just the beginning of the semantics of semantics. In writing about and comparing algorithms, I need a term that is slightly in between but closer to operational than merely denotational. I am thinking of using 'operationally equivalent' to describe two algorithms that are denotationally (functionally) identical and which compute the same essential intermediate objects (values) in the same order, while not being 'operationally identical' (which would make them the same thing). Wikipedia refers 'operational equivalence' to 'observational equivalence'. Other hits from Google suggest that it has been used variably to mean things from operationally identical to merely denotationally (functionally) identical, but therefore a complete substitute in the user's operations. In the present case, we agree that [genexp] == list(genexp) in the Python meaning, denotational and operational, of '==', which is to compare the denotational meaning of two expressions (the resulting objects and their values). I also claim a bit more: [genexp] could be but does not have to be, and in current CPython is not, operationally identical to list(genexp). But I do expect that it is operationally equivalent in the sense above. The essential objects and operations are the list contents and the list that starts empty and grows one reference at a time via append(). Terry Jan Reedy
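One way to probe the 'operationally equivalent' claim is to record the order in which values are produced and transformed under each form. The sketch below is an illustration under assumptions, not a proof: the tracing helpers are invented for this example, and it observes only one simple case.

```python
def traced_source(n, trace):
    """Yield 0..n-1, recording each value as it is produced."""
    for i in range(n):
        trace.append(("produce", i))
        yield i


def traced_square(x, trace):
    """Square x, recording the call."""
    trace.append(("square", x))
    return x * x


def run_both(n):
    """Evaluate the comprehension and list(genexp) forms, each with its own trace."""
    trace_comp, trace_gen = [], []
    via_comp = [traced_square(x, trace_comp) for x in traced_source(n, trace_comp)]
    via_gen = list(traced_square(x, trace_gen) for x in traced_source(n, trace_gen))
    return via_comp, via_gen, trace_comp, trace_gen


via_comp, via_gen, trace_comp, trace_gen = run_both(3)
print(via_comp == via_gen)      # True: same resulting list
print(trace_comp == trace_gen)  # True: same intermediate values in the same order
```

Here both forms interleave "produce" and "square" one item at a time, even though only the second creates a generator object along the way.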
participants (28)
- Andrew Akira Toulouse
- Andrew Toulouse
- Arnaud Delobelle
- Bill Janssen
- BJörn Lindqvist
- Boris Borcic
- Brett Cannon
- Bruce Leban
- Chris Rebert
- David Borowitz
- Facundo Batista
- Fred Drake
- George Sakkis
- Greg Ewing
- Greg Falcon
- Guido van Rossum
- Jacob Rus
- Josiah Carlson
- Leif Walsh
- Raymond Hettinger
- Roman Susi
- Ron Adam
- Stavros Korokithakis
- Steven Bethard
- Talin
- Terry Reedy
- Thomas Lee
- Torsten Bronger