for line in input with open(path) as input...

    with open(path) as input:
        for line in input:
            do(line)

Using "with" to create a reference to the opened file returned by open(), so that it can temporarily be assigned to input for the sole purpose of iterating over its contents, has never sat very well with me.

* The context manager returned by open() exists only to create the context and return the reference "input";
* the context and code block created by the "with" exist only for the inner "for" loop's code block to execute in.

Now, given a generator function:

    def iterwith(cm):
        with cm as context:
            if context is None:
                context = cm
            for item in context:
                yield item

The previous code can be turned into:

    for line in iterwith(open(path)):
        do(line)

So, questions:

- Is there anything inherently wrong with the idea, or does it already exist?
- Is it a generally useful tool, or are the examples limited to files?
- Is it possible a more general mechanism could have value, such as:

    for line in file with open(path) as file:
        do(line)

The preceding could be leveraged to different effect:

    for line in file with locked(path):
        write(path + ".out", line)

Or,

    for line in input with nested(open(path),lock,open(opath)) as input,locked,output:
        output.write(line)

To revisit the original purpose of "with", this seems to cleanly address a very common scenario wherein:

    resource = create_resource()
    try:
        for item in resource:
            do_something(resource, item)
    except:
        raise
    finally:
        cleanup()

    # Standard with approach...
    with create_resource() as resource:
        for item in resource:
            do_something(resource, item)

    # With for loop as context...
    for item in resource with create_resource() as resource:
        do_something(resource, item)

And, given the translation into statements, maybe even crazy stuff...

    [line for line in file with open(path) as file]

J/K, I think.

On 02/02/13 22:46, Shane Green wrote:
It's not the *sole* purpose. If all you want is to iterate over the file, you can do this:

    for line in open(path):
        ...

and no context manager is created. The context manager is also responsible for closing the file immediately you exit the block, without waiting for the caller to manually close it, or the garbage collector to (eventually) close it. So it is not *solely* for iteration.

File context managers can also be used for more than just iteration:

    with open(path) as input:
        text = input.read()

    with open(path, 'r+') as output:
        output.write('ZZ')

and so forth.
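As a minimal sketch of the point about closing on block exit: the helper names and the sample data below are hypothetical, chosen only for illustration. The function returns the file object so the caller can check that it was already closed when the with block ended.

```python
import os
import tempfile


def make_sample_file():
    """Write a small temporary file (hypothetical sample data) and return its path."""
    handle, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(handle, "w") as out:
        out.write("alpha\nbeta\n")
    return path


def read_lines(path):
    """Iterate over a file inside a with block; return the lines and the file object."""
    with open(path) as fh:
        lines = [line.rstrip("\n") for line in fh]
    # The with block has ended, so fh is already closed at this point.
    return lines, fh


if __name__ == "__main__":
    sample = make_sample_file()
    lines, fh = read_lines(sample)
    print(lines, fh.closed)
    os.remove(sample)
```

The name fh stays bound after the block, but the underlying file is closed, which is the deterministic part that a bare "for line in open(path)" does not give you.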
I don't understand that objection. As I see it, that's a bit like saying "the len function exists only to get the length of objects". What did you expect the context manager to exist for if not to do the things you say? What am I missing? -- Steven

On Sat, Feb 2, 2013 at 11:46 PM, Steven D'Aprano <steve@pearwood.info> wrote:
If I understand the OP, the issue is that the 'with' creates a name binding and an indentation level for no purpose; it's like doing this:

    f = open(path)
    for line in f:
        ...

In that instance, it's possible to inline the function call and use its result directly; it would be nice to be able to do the same with a context manager. However, since 'with' isn't an expression, it's not possible to directly inline the two.

I think the utility function iterwith() is a good - and probably the best - method; it demands nothing special from the language, and works quite happily.

ChrisA

The with statement block is needed to define *when* cleanup happens (unconditionally at the end of the block).

The "iterwith" generator is currently pointless, as it results in nondeterministic cleanup of the context manager, so you may as well not bother and just rely on the underlying iterable's nondeterministic cleanup.

We're never going to add cleanup semantics directly to for loops because:

- separation of concerns is a good design principle
- Indentation levels are not a limited resource (anyone that thinks they are may be forgetting that factoring out context managers, iterators and subfunctions gives you more of them, and that judicious use of early returns and continue statements can avoid wasting them)
- we already considered it when initially designing the with statement and decided it was a bad idea.

I forget where that last part is written up. If it's not in PEP 343, 342, 346 or 340 (the full set of PEPs that led to the current with statement and contextlib.contextmanager designs), it should be in one of the threads they reference.

Cheers,
Nick.
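The nondeterministic cleanup can be seen in a small sketch. Tracked is a hypothetical context manager invented here to record when __enter__ and __exit__ run; iterwith has the same shape as the generator from the original post. The generator is bound to a name so that the result does not depend on how promptly a given interpreter finalises unreferenced generators.

```python
class Tracked:
    """Hypothetical context manager that records enter/exit and iterates over 1, 2, 3."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False

    def __iter__(self):
        return iter([1, 2, 3])


def iterwith(cm):
    # Same shape as the generator from the original post.
    with cm as context:
        if context is None:
            context = cm
        for item in context:
            yield item


def break_early():
    """Break out of the loop and report the recorded events before and after close()."""
    cm = Tracked()
    gen = iterwith(cm)  # keep a reference so the generator is not finalised yet
    for item in gen:
        break
    after_break = list(cm.events)  # the generator is suspended inside the with block
    gen.close()  # raises GeneratorExit at the yield, which finally runs __exit__
    after_close = list(cm.events)
    return after_break, after_close


if __name__ == "__main__":
    print(break_early())
```

After the break, only "enter" has been recorded; "exit" appears only once the generator is closed explicitly or finalised by the interpreter.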

Thanks Nick. I definitely see your point about iterwith(); have been thinking about that since someone asked where __exit__() would be invoked.

I meant the following

    for line in file with open(path) as file:
        process(line)

as a more compact way of expressing

    with open(path) as file:
        for line in file:
            process(line)

Not a change to the semantics of for-loops; a point my iterwith() function confuses greatly, I realize now.

I'm not seeing a loss of separation of concerns there. Indentation levels aren't limited, but flatter is better ;-)

I saw a bunch of back and forth regarding iteration and context management in the PEP, but didn't notice anything along these lines in particular. I'll have to go back and take a closer look.

Nick Coghlan wrote:

On Sat, Feb 2, 2013 at 5:16 PM, Shane Green <shane@umbrellacode.com> wrote:
This is an interesting idea, though a bit too dense for my taste.
Indentation levels aren't limited, but flatter is better ;-)
I really like Golang's solution (defer) which Nick sort of emulates with ExitStack.

http://docs.python.org/3/library/contextlib.html#contextlib.ExitStack

If ExitStack ever became a language feature, we could write stuff like:

    def f():
        fhand = local open(path)
        process(fhand)
        ghand = local open(path2)
        process(ghand)

    # which would be sort of equivalent to

    def g():
        try:
            fhand = None
            ghand = None
            fhand = open(path)
            process(fhand)
            ghand = open(path2)
            process(ghand)
        finally:
            if fhand is not None:
                fhand.close()
            if ghand is not None:
                ghand.close()

Yuval
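For comparison, here is a rough sketch of defer-style cleanup using what the standard library already offers, with no new syntax. The function name and log messages are made up for illustration; ExitStack.callback registers a cleanup that runs when the with block exits, most recently registered first.

```python
import contextlib


def run_with_deferred(log):
    """Register cleanups as we go; they run in reverse order when the block exits."""
    with contextlib.ExitStack() as stack:
        log.append("acquire first")
        stack.callback(log.append, "release first")   # deferred until block exit
        log.append("acquire second")
        stack.callback(log.append, "release second")  # runs before "release first"
        log.append("work")
    return log


if __name__ == "__main__":
    for entry in run_with_deferred([]):
        print(entry)
```

Unlike Go's defer, the cleanups are tied to the end of the with block rather than to the end of the enclosing function.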

None of these proposals have any merit. The last thing we need is more ways to spell the same thing that can already be spelled in several ways, all of which are just fine. Just because you save a line doesn't make your code more readable. -- --Guido van Rossum (python.org/~guido)

On Sun, Feb 3, 2013 at 1:47 AM, Yuval Greenfield <ubershmekel@gmail.com> wrote:
Why would it ever become a language feature? It works just fine as a context manager, and the need for it isn't frequent enough to justify special syntax.
Why would you leave fhand open longer than necessary? The above would be better written as:

    def f():
        with open(path) as fhand:
            process(fhand)
        with open(path2) as ghand:
            process(ghand)

If you need both files open at the same time, you can use a nested context manager:

    def f():
        with open(path) as fhand:
            with open(path2) as ghand:
                process(fhand, ghand)

Or the nesting behaviour built into with statements themselves:

    def f():
        with open(path) as fhand, open(path2) as ghand:
            process(fhand, ghand)

It's only when the number of paths you need to open is dynamic that ExitStack comes into play (this is actually very close to the example in ExitStack's docstring, as handling a variable number of simultaneously open files was the use case that highlighted the fatal flaw in the way the old contextlib.nested design handled context managers that acquired the resource in __init__ rather than __enter__):

    def f(*paths):
        with contextlib.ExitStack() as stack:
            files = [stack.enter_context(open(path)) for path in paths]
            process(files)

Function and class definitions control name scope (amongst other things), with statements control deterministic cleanup, loops control iteration. That's what I mean by "separation of concerns" in relation to these aspects of the language design and it's a *good* thing (and one of the key reasons with statements behave like PEP 343, rather than being closer to Guido's original looping idea that is described in PEP 340).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
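A runnable variant of the dynamic case might look like the sketch below. The helper make_files and the sample file contents are assumptions added so the example can run on its own; first_lines stands in for the unspecified process() from the message.

```python
import contextlib
import os
import tempfile


def make_files(directory, count):
    """Create `count` small text files (hypothetical sample data) and return their paths."""
    paths = []
    for index in range(count):
        path = os.path.join(directory, f"sample{index}.txt")
        with open(path, "w") as out:
            out.write(f"file {index}\n")
        paths.append(path)
    return paths


def first_lines(paths):
    """Open every path at once, read one line from each, return lines and file objects."""
    with contextlib.ExitStack() as stack:
        files = [stack.enter_context(open(path)) for path in paths]
        lines = [fh.readline().rstrip("\n") for fh in files]
    # Leaving the with block closes every file that was entered on the stack.
    return lines, files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        lines, files = first_lines(make_files(directory, 3))
        print(lines, [fh.closed for fh in files])
```

If one of the open() calls fails partway through the list comprehension, the files already entered on the stack are still closed when the exception leaves the with block.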

Okay, Nick, thanks for iterating exactly what separation of concerns you meant: it clearly identifies where my thinking was going a bit awry. My thinking was more along the lines of, "with" controls deterministic cleanup around a block of code, can't that block be an existing for loop? The "with" statement still controls deterministic cleanup; its __enter__() still necessarily precedes evaluation of the for loop, and its __exit__() still immediately follows evaluation of the for loop. But, there's not much to gain from the idea, at best, so it's a bit of a waste of time, I'm afraid... 02/02/2013 22:13:02

On Sun, Feb 3, 2013 at 6:26 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is indeed what I was looking for. The "with" statement does give a lot of control, though I do dislike the double indentation and also dislike opening 2 files on one line. Sucks to be me.
Yes, separation of concerns is indeed a good thing and as Guido mentioned, there are already too many ways to do this. Thanks for the enlightenment, Yuval

On Feb 2, 2013, at 7:16, Shane Green <shane@umbrellacode.com> wrote:
Thanks Nick. I definitely see your point about iterwith(); have been thinking about that since someone asked where __exit__() would be
If iterwith calls close when the iterator is exhausted, it's useful. In that case, close on __exit__ is only a fallback in cases where you don't exhaust it. If it only has close on __exit__, it's useless, because it's the same as not doing anything.

A few months ago I proposed and unproposed a similar extension to generator expression syntax. People suggested multiple versions of an iterwith style function, and I think everyone (including me) agreed that this was a better answer even if you ignore the obvious huge benefit that it doesn't require changing the language.

While your idea isn't identical to mine (in a generator expression, the with would have to come before the for, which doesn't feel as natural as your postfix, and it also breaks the lifting of the outermost iterable, which isn't an issue in the for statement), I think the same thing ends up being true in your case.

    for line in iterwith(open(path)):

    for line in f with open(path) as f:

I think the first one is more readable. It also doesn't make you repeat the name of an otherwise-unnecessary variable twice. And it doesn't expose that variable to the loop body. If you _want_ to use f, I think you want the explicit with statement for clarity.

The only real advantage to the second is that it immediately closes the file on break, instead of only doing so on normal exit or return. Sometimes that's important, but again, I think in most of those cases you'll want the explicit with scope for clarity.
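The exhaustion case can be checked with a short sketch. It reuses the iterwith generator from the original post; make_sample_file and its contents are hypothetical additions so the example is self-contained. The file object is created outside the generator only so that its closed flag can be inspected afterwards.

```python
import os
import tempfile


def iterwith(cm):
    # Same shape as the generator from the original post.
    with cm as context:
        if context is None:
            context = cm
        for item in context:
            yield item


def make_sample_file():
    """Write a small temporary file (hypothetical sample data) and return its path."""
    handle, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(handle, "w") as out:
        out.write("one\ntwo\nthree\n")
    return path


def exhaust(path):
    """Consume every line through iterwith and report whether the file got closed."""
    fh = open(path)
    lines = [line.rstrip("\n") for line in iterwith(fh)]
    # Running off the end of the inner for loop leaves the with block inside
    # the generator, so the file is closed as soon as the last line is consumed.
    return lines, fh.closed


if __name__ == "__main__":
    sample = make_sample_file()
    print(exhaust(sample))
    os.remove(sample)
```

So when the loop runs to completion the file is closed promptly; it is only the early-exit paths (break, or an exception in the loop body) where the close is left to generator finalisation.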

Sorry, I was definitely unclear: I understand that isn't their sole purpose, and what role they play, in both the iteration examples and others, and I don't mean to "object" to anything, nor was I suggesting the context manager's current implementation was unneeded, etc. My point is that, in the examples I listed, the sole purpose of the context manager is to provide a context the for loop will execute in.

    with manager as context:
        # No statements here...
        for items in context:
            # blah
        # no statements here...

I was suggesting this scenario, wherein the body of a for loop is, in fact, the block of code that acts upon some resource or set of resources being managed by a context manager, might be common enough to warrant a closer look, as the more "pythonic" approach might accept the for loop's body as the with statement's code block context.

    with open(file) as input:
        # why have input exist here, if it's not used, other than...
        for line in input:
            # do something...
            # why have input--the context--exist outside of me?
        # why have input exist here?

And finally, why define the outer with block, for the sole purpose of containing the inner "for in.." loop's block?

participants (8)
- Andrew Barnert
- Chris Angelico
- Guido van Rossum
- Nick Coghlan
- Serhiy Storchaka
- Shane Green
- Steven D'Aprano
- Yuval Greenfield