[Greg Ewing]
I like the general shape of this, but I have one or two reservations about the details.
That summarizes the feedback so far pretty well. I think we're on to something. And I'm not too proud to say that Ruby has led the way here to some extent (even if Python's implementation would be fundamentally different, since it's based on generators, which open up some different possibilities and preclude some Ruby patterns).
1) We're going to have to think carefully about the naming of functions designed for use with this statement. If 'with' is going to be in there as a keyword, then it really shouldn't be part of the function name as well.
Of course. I only used 'with_opened' because it's been the running example in this thread.
I would rather see something like
with f = opened(pathname): ...
This sort of convention (using a past participle as a function name) would work for some other cases as well:
with some_data.locked(): ...
with some_resource.allocated(): ...
Or how about

    with synchronized(some_resource): ...
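As a rough illustration of the past-participle naming convention, here is a minimal sketch in present-day Python. It assumes the 'as' spelling of the with-statement and uses contextlib.contextmanager to wrap a generator; neither is the mechanism under discussion in this thread, and the helper name opened is hypothetical.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def opened(pathname, mode="r"):
    """Hypothetical helper: keep a file open for the duration of a with-block."""
    f = open(pathname, mode)
    try:
        yield f
    finally:
        # Runs however the block is left, including via an exception.
        f.close()


# Usage: write a scratch file, then read it back.
fd, path = tempfile.mkstemp()
os.close(fd)

with opened(path, "w") as f:
    f.write("hello\n")
closed_after_block = f.closed

with opened(path) as f:
    contents = f.read()

os.remove(path)
```

The name reads as a state the resource is in while the block runs, which is the point of the convention.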
On the negative side, not having anything like 'with' in the function name means that the fact the function is designed for use in a with-statement could be somewhat non-obvious. Since there's not going to be much other use for such a function, this is a bad thing.
This seems a pretty mild problem; one could argue that every function is only useful in a context where its return type makes sense, and we seem to be getting along just fine with naming conventions (or just plain clear naming).
It could also lead people into subtle usage traps such as
with f = open(pathname): ...
which would fail in a somewhat obscure way.
Ouch. That one hurts. (I was going to say "but f doesn't have a next() method" when I realized it *does*. :-) It is *almost* equivalent to

    for f in open(pathname): ...

except if the "..." block raises an exception. Fortunately your proposal to use 'as' makes this mistake less likely.
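The reason the trap exists can be checked directly: a file object is its own iterator, so anything that drives it through next() receives lines, not the file. A small sketch, using io.StringIO as a stand-in for open(pathname) so that it runs without touching the filesystem:

```python
import io

# Stand-in for open(pathname); real file objects behave the same way here.
f = io.StringIO("first line\nsecond line\n")

# A file-like object returns itself from iter(), so it already has the
# iterator protocol that a looping with-statement would call.
is_own_iterator = iter(f) is f

# Under the proposed semantics, VAR would be bound to this value,
# not to the file object.
first = next(f)
```

So under the looping proposal, f in the body would name a line of text.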
So maybe the 'with' keyword should be dropped (again!) in favour of
with_opened(pathname) as f: ...
But that doesn't look so great for the case where there's no variable to be assigned to -- I wasn't totally clear about it, but I meant the syntax to be

    with [VAR =] EXPR: BLOCK

where VAR would have the same syntax as the left hand side of an assignment (or the variable in a for-statement).
2) I'm not sure about the '='. It makes it look rather deceptively like an ordinary assignment, and I'm sure many people are going to wonder what the difference is between
    with f = opened(pathname):
        do_stuff_to(f)

and simply

    f = opened(pathname)
    do_stuff_to(f)
or even just unconsciously read the first as the second without noticing that anything special is going on. Especially if they're coming from a language like Pascal which has a much less magical form of with-statement.
Right.
So maybe it would be better to make it look more different:
with opened(pathname) as f: ...
Fredrik said this too, and as long as we're going to add 'with' as a new keyword, we might as well promote 'as' to become a real keyword. So then the syntax would become

    with EXPR [as VAR]: BLOCK

I don't see a particular need for assignment to multiple VARs (but VAR can of course be a tuple of identifiers).
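For what it's worth, both variations of this syntax (no VAR at all, and VAR as a tuple of identifiers) can be tried in present-day Python. The sketch below assumes contextlib.contextmanager, which is not the mechanism proposed in this thread, and the helper name paired is made up for the illustration.

```python
import contextlib


@contextlib.contextmanager
def paired(left, right):
    """Hypothetical helper that hands two values to the block."""
    yield left, right


# VAR as a tuple of identifiers: unpacked like an assignment target.
with paired(1, 2) as (a, b):
    total = a + b

# The "as VAR" part left out entirely.
with paired(1, 2):
    ran_without_var = True
```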
* It seems to me that this same exception-handling mechanism would be just as useful in a regular for-loop, and that, once it becomes possible to put 'yield' in a try-statement, people are going to *expect* it to work in for-loops as well.
(You can already put a yield inside a try-except, just not inside a try-finally.)
Guido has expressed concern about imposing extra overhead on all for-loops. But would the extra overhead really be all that noticeable? For-loops already put a block on the block stack, so the necessary processing could be incorporated into the code for unwinding a for-block during an exception, and little if anything would need to change in the absence of an exception.
Probably.
However, if for-loops also gain this functionality, we end up with the rather embarrassing situation that there is *no difference* in semantics between a for-loop and a with-statement!
There would still be the difference that a for-loop invokes iter() and a with-block doesn't.

Also, for-loops that don't exhaust the iterator leave it available for later use. I believe there are even examples of this pattern, where one for-loop searches the iterable for some kind of marker value and the next for-loop iterates over the remaining items. For example:

    f = open(messagefile)
    # Process message headers
    for line in f:
        if not line.strip():
            break
        if line[0].isspace():
            addcontinuation(line)
        else:
            addheader(line)
    # Process message body
    for line in f:
        addbody(line)
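Both points can be seen side by side in a small sketch: looping over an explicit iterator resumes where the previous loop stopped, while looping over a plain list starts over, because the for-loop calls iter() on it each time. The data is invented for the illustration.

```python
items = ["a", "b", "", "c", "d"]  # the empty string plays the marker value

# Case 1: an explicit iterator. The first loop stops at the marker and
# the iterator stays available, so the next consumer sees only the rest.
it = iter(items)
for x in it:
    if not x:
        break
rest = list(it)

# Case 2: the list itself. Each for-loop calls iter() and gets a fresh
# iterator, so breaking out early has no effect on later loops.
for x in items:
    if not x:
        break
again = [y for y in items]
```

Here rest holds only the items after the marker, while again holds all five.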
This could be "fixed" by making the with-statement not loop, as has been suggested. That was my initial thought as well, but having thought more deeply, I'm starting to think that Guido was right in the first place, and that a with-statement should be capable of looping. I'll elaborate in another post.
So perhaps the short description of a with-statement that we give to newbies could be the following:

"""
The statement:

    for VAR in EXPR:
        BLOCK

does the same thing as:

    with iter(EXPR) as VAR:        # Note the iter() call
        BLOCK

except that:

- you can leave out the "as VAR" part from the with-statement;
- they work differently when an exception happens inside BLOCK;
- break and continue don't always work the same way.

The only time you should write a with-statement is when the documentation for the function you are calling says you should.
"""
So a block could return a value to the generator using a return statement; the generator can catch this by catching ReturnFlow. (Syntactic sugar could be "VAR = yield ..." like in Ruby.)
This is a very elegant idea, but I'm seriously worried by the possibility that a return statement could do something other than return from the function it's written in, especially if for-loops also gain this functionality.
But they wouldn't!
Intercepting break and continue isn't so bad, since they're already associated with the loop they're in, but return has always been an unconditional get-me-out-of-this-function. I'd feel uncomfortable if this were no longer true.
Me too. Let me explain the use cases that led me to throwing that in (I was running out of time and didn't properly explain it) and then let me propose an alternative. This is a bit long, but important!

*First*, in the non-looping use cases (like acquiring and releasing a lock), a return-statement should definitely be allowed when the with-statement is contained in a function. There's lots of code like this out there:

    def search(self, eligible, default=None):
        self.lock.acquire()
        try:
            for item in self.elements:
                if eligible(item):
                    return item
            # no eligible items
            return default
        finally:
            self.lock.release()

and this translates quite nicely to a with-statement:

    def search(self, eligible, default=None):
        with synchronized(self.lock):
            for item in self.elements:
                if eligible(item):
                    return item
            # no eligible items
            return default

*Second*, it might make sense if break and continue would be handled the same way; here's an example:

    def alt_search(self):
        for item in self.elements:
            with synchronized(item):
                if item.abandoned():
                    continue
                if item.eligible():
                    break
        else:
            item = self.default_item
        return item.post_process()

(I realize the case for continue isn't as strong as that for break, but I think we have to support both if we support one.)

*Third*, if there is a try-finally block around a yield in the generator, the finally clause absolutely must be executed when control leaves the body of the with-statement, whether it is through return, break, or continue. This pretty much means these have to be turned into some kind of exception.
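The requirement in the third point (the generator's finally clause runs no matter how the block is left) can be observed in present-day Python. This is only a sketch under assumptions: it uses contextlib.contextmanager and the non-looping with-statement, not the exception-based mechanism being proposed, and the events list and the standalone search function are additions for the demonstration.

```python
import contextlib
import threading

events = []  # records when the lock is taken and released


@contextlib.contextmanager
def synchronized(lock):
    lock.acquire()
    events.append("acquire")
    try:
        yield
    finally:
        # Must run whether the block ends normally or via
        # return, break, continue, or an exception.
        events.append("release")
        lock.release()


def search(elements, eligible, lock, default=None):
    with synchronized(lock):
        for item in elements:
            if eligible(item):
                return item  # leaves the with-block early
        return default


lock = threading.Lock()
found = search([1, 2, 3, 4], lambda n: n % 2 == 0, lock)
missing = search([1, 3], lambda n: n % 2 == 0, lock, default=-1)

# break and continue inside a with-block nested in a loop.
for n in [1, 2, 3]:
    with synchronized(lock):
        if n == 1:
            continue
        if n == 2:
            break
```

Every acquire in events is followed by a release, and the lock is free at the end.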
So the first example would first be transformed into this:

    def search(self, eligible, default=None):
        try:
            with synchronized(self.lock):
                for item in self.elements:
                    if eligible(item):
                        raise ReturnFlow(item)  # was "return item"
                # no eligible items
                raise ReturnFlow(default)  # was "return default"
        except ReturnFlow, exc:
            return exc.value

before applying the transformation of the with-statement, which I won't repeat here (look it up in my previous long post in this thread). (BTW I do agree that it should use __next__(), not next_ex().) I'm assuming the following definition of the ReturnFlow exception:

    class ReturnFlow(Exception):
        def __init__(self, value=None):
            self.value = value

The translation of break into raise BreakFlow() and continue into raise ContinueFlow() is now obvious. (BTW ReturnFlow etc. aren't great names. Suggestions?)

*Fourth*, and this is what makes Greg and me uncomfortable at the same time as making Phillip and other event-handling folks drool: from the previous three points it follows that an iterator may *intercept* any or all of ReturnFlow, BreakFlow and ContinueFlow, and use them to implement whatever cool or confusing magic they want. For example, a generator can decide that for the purposes of break and continue, the with-statement that calls it is a loop, and give them the usual semantics (or the opposite, if you're into that sort of thing :-). Or a generator can receive a value from the block via a return statement.

Notes:

- I think there's a better word than Flow, but I'll keep using it until we find something better.

- This is not limited to generators -- the with-statement uses an arbitrary "new-style" iterator (something with a __next__() method taking an optional exception argument).

- The new __next__() API can also (nay, *must*, to make all this work reliably) be used to define exception and cleanup semantics for generators, thereby rendering obsolete PEP 325 and the second half of PEP 288.
  When a generator is GC'ed (whether by reference counting or by the cyclical garbage collector), its __next__() method is called with a BreakFlow exception instance as argument (or perhaps some other special exception created for the purpose). If the generator catches the exception and yields another value, too bad -- I consider that broken behavior. (The alternative would be to keep calling __next__(BreakFlow()) until it doesn't return a value, but that feels uncomfortable in a finalization context.)

- Inside a with-statement, user code raising a Flow exception acts the same as the corresponding statement. This is slightly unfortunate, because it might lead one to assume that the same is true for example in a for-loop or while-loop, but I don't want to make that change. I don't think it's a big problem.

Given that 1, 2 and 3 combined make 4 inevitable, I think we might as well give in, and *always* syntactically accept return, break and continue in a with-statement, whether or not it is contained in a loop or function. When the iterator does not handle the Flow exceptions, and there is no outer context in which the statement is valid, the Flow exception is turned into an IllegalFlow exception, which is the run-time equivalent of SyntaxError: 'return' outside function (or 'break' outside loop, etc.).

Now there's one more twist, which you may or may not like. Presumably (barring obfuscations or bugs) the handling of BreakFlow and ContinueFlow by an iterator (or generator) is consistent for all uses of that particular iterator. For example synchronized(lock) and transactional(db) do not behave as loops, and forever() does. Ditto for handling ReturnFlow.
This is why I've been thinking of leaving out the 'with' keyword: in your mind, these calls would become new statement types, even though the compiler sees them all the same:

    synchronized(lock):
        BLOCK

    transactional(db):
        BLOCK

    forever():
        BLOCK

    opening(filename) as f:
        BLOCK

It does require the authors of such iterators to pick good names, and it doesn't look as good when the iterator is a method of some object:

    self.elements[0].locker.synchronized():
        BLOCK

You proposed this too (and I even commented on it, ages ago in this same endless message :-) and while I'm still on the fence, at least I now have a better motivational argument (i.e., that each iterator becomes a new statement type in your mind).

One last thing: if we need a special name for iterators and generators designed for use in a with-statement, how about calling them with-iterators and with-generators? The non-looping kind can be called resource management iterators / generators. I think whatever term we come up with should not be a totally new term but a combination of iterator or generator with some prefix, and it should work both for iterators and for generators.

That's all I can muster right now (I should've been in bed hours ago) but I'm feeling pretty good about this.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)