Using 'with' with extra brackets for nicer indentation
Hello everybody,

today I tried to open four files simultaneously by writing

```
with (
    open(fname1) as f1,
    open(fname2) as f2,
    open(fname3) as f3,
    open(fname4) as f4
):
    ...
```

However, this results in a SyntaxError, which is caused by the extra brackets. Is there a reason that brackets are not allowed in this place?
This has been thought about, asked for, and even agreed to be a nice thing before; however, it is blocked due to ambiguity in the syntax for some corner cases - and, I may be wrong on that, it would not be possible to do with the current parser (and Python is not shifting to a more complex parser for this feature alone).

So, anyway, the official recommendation for long `with` statements is to use the `\` line continuation character:

```
with \
        open(fname1) as f1, \
        open(fname2) as f2, \
        open(fname3) as f3, \
        open(fname4) as f4:
    ...
```

On Wed, 13 Nov 2019 at 15:30, <gabriel.kabbe@mail.de> wrote:
The syntax error is coming from finding "as" in a place it's unexpected. (Additionally, if you were to drop the `as fn`, you'd get an AttributeError, as tuple.__enter__ isn't defined.)

There's a contextlib helper that you might consider: https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack

Which you might use like:

```
from contextlib import ExitStack

with ExitStack() as stack:
    f1 = stack.enter_context(open(fname1))
    f2 = stack.enter_context(open(fname2))
    f3 = stack.enter_context(open(fname3))
    f4 = stack.enter_context(open(fname4))
    ...
```

On Wed, Nov 13, 2019 at 1:30 PM <gabriel.kabbe@mail.de> wrote:
I would not recommend ExitStack for this scenario -- it's meant for situations where the cleanup is *dynamic* (see examples in the docs: https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack). On Wed, Nov 13, 2019 at 10:53 AM James Edwards <jheiv@jheiv.com> wrote:
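For contrast, the kind of *dynamic* situation ExitStack is designed for might look like this (a sketch; `io.StringIO` stands in for real files, and the number of resources is only known at run time):

```python
import io
from contextlib import ExitStack

# Sketch: the number of "files" is only known at run time, so a fixed
# with-statement cannot express this. io.StringIO stands in for open().
payloads = ["one", "two", "three"]

streams = []
with ExitStack() as stack:
    for data in payloads:
        # enter_context() registers each stream for cleanup on exit
        streams.append(stack.enter_context(io.StringIO(data)))
    contents = [s.read() for s in streams]

# Every stream was closed when the with block ended, in reverse order.
assert all(s.closed for s in streams)
```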
--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Nov 13, 2019, at 10:26, gabriel.kabbe@mail.de wrote:
This has been discussed many times. You probably want to search the archives of -ideas and -dev for the detailed arguments, but from what I remember, it goes something like this:

`with (a, b):` would be ambiguous between two context managers, and a single tuple used as a context manager. Of course that’s ridiculous. Using a tuple display as a context manager will just raise an exception because tuples aren’t context managers, and it’s hard to imagine why we’d ever want to add the context manager protocol to tuples in the future, so it’s always obvious to any human that this means two context managers, so there’s no reason to consider it ambiguous.

But it does mean either we have to add an extra set of productions to the grammar so we have an “all kinds of expression except parenthesized tuple” to use there, or we have to add a post-grammar, non-declarative rule to disambiguate this case. Both of these are doable, but they do complicate the parser, so the question is: is it worth it?

Nick Coghlan has frequently argued that when you have more than two or three short context managers, that’s a sign you probably want a custom context manager that handles all those things together, or maybe you just want to use ExitStack.

People counter-argue that there are cases where you really do want three mid-length simple context managers, and that’s easily enough to push you over 80 columns; the backslash looks horrible and is even flagged by some tools, even though PEP 8 specifically recommends it here; and turning it into nested with statements makes the code less readable because it highlights the wrong things and indents too far. And in many such cases a custom umbrella manager would just overcomplicate things. If I’m just merging two CSV files into a third one, what meaning or behavior does an object that wraps all three files have? (But maybe ExitStack is fine here?)
So the question is whether people can come up with specific real-life examples that clearly look overcomplicated done one of the other ways (and ugly done with backslashes), so people can decide whether they’re sufficiently compelling that it’s worth complicating the parser for. And I think it always fizzles out at that point.

But this is all based on my memory, which is probably wrong or fuzzy on at least some points, so you really should dig up all of the old threads.
On Nov 13, 2019, at 10:26, gabriel.kabbe@mail.de wrote:
```
with (
    open(fname1) as f1,
    open(fname2) as f2,
    open(fname3) as f3,
    open(fname4) as f4
):
```
Maybe you should be able to do something like

```
with:
    open(fname1) as f1:
    open(fname2) as f2:
    open(fname3) as f3:
    open(fname4) as f4:
    ...
```

-- Greg
On Thu, 14 Nov 2019 at 06:45, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Maybe you should be able to do something like
```
with:
    open(fname1) as f1:
    open(fname2) as f2:
    open(fname3) as f3:
    open(fname4) as f4:
    ...
```
You can :-)

```
with \
        open(fname1) as f1, \
        open(fname2) as f2, \
        open(fname3) as f3, \
        open(fname4) as f4:
    ...
```

Seriously, what's so bad about backslashes?

Paul
On Thu, Nov 14, 2019, at 03:57, Paul Moore wrote:
So, uh... what if we didn't need backslashes for statements that begin with a keyword and end with a colon? There's no syntactic ambiguity there, right? Honestly, adding this would make me less annoyed with the error I get when I forget the colon, since it'd actually have a purpose other than grit on the screen.
On Thu, 14 Nov 2019 at 16:42, Random832 <random832@fastmail.com> wrote:
Not sure about ambiguity, but it would require a much more powerful parser than Python currently has (which only looks ahead one token). Guido is experimenting with PEG parsers, so maybe it will be a possibility in the future, but right now the current parser can't handle it (yes, there are hacks for some special constructs already, but this would need *arbitrary* lookahead - you could have many lines between the with and the colon).

Also, I suspect it would really screw up error reporting:

```
with open(fname1) as f1,
     open(fname2) as f2,
     open(fname3) as f3,
     open(fname4) as f4,  # Whoops, comma instead of colon

print(hello)
import xxx as bar
if some_var > 10:
    return
```

Computer parsers are far dumber than human brains, and if I looked at that without having written it, *I'd* have trouble working out what was wrong, so the poor computer has no chance!

Paul
On Thu, Nov 14, 2019, at 11:54, Paul Moore wrote:
Not sure about ambiguity, but it would require a much more powerful parser than Python currently has (which only looks ahead one token).
Would it? I was thinking it could be the same as parentheses (or inside list/dict/set displays) - it sees the keyword (with, if, for), and now it is in a mode where whitespace does not matter, until it reaches the colon.
Guido is experimenting with PEG parsers, so maybe it will be a possibility in the future, but right now the current parser can't handle it (yes, there are hacks for some special constructs already, but this would need *arbitrary* lookahead - you could have many lines between the with and the colon).
but there's no construct that begins with 'with' and *doesn't* end in a colon.
On Nov 14, 2019, at 09:05, Random832 <random832@fastmail.com> wrote:
but there's no construct that begins with 'with' and *doesn't* end in a colon.
Yeah, it seems like this should be doable in basically the same way bracketed multiline expressions are. I’m not sure how much of a change that would require. But it seems like it’s worth fiddling with the CPython parser to see if it can actually be done, rather than guessing. That would also let people test that it doesn’t have any unforeseen consequences.

It would mean free indentation between the first line and the colon, but that’s already true for backslash continuation, and editors and humans manage to make readable code out of that.

It might also make error handling a bit worse rather than better. With multiline expressions, what gets dumped with the SyntaxError isn’t always the most relevant part of the expression. (Everyone remembers the first time they left off a close paren and got a baffling error complaining about the perfectly good next line…)
On Nov 14, 2019, at 09:53, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Actually, as an intermediate proof of concept without getting into hacking the parser, you could hack the pure-Python version of the tokenizer in the tokenize module. IIRC, it has code in a couple places that decides whether to yield a NEWLINE (logical end of line) or an NL (physical end of line that’s just whitespace rather than logical end of line), and whether to check indent level and yield INDENT/DEDENT tokens, based on keeping track of the open bracket count and a backslash flag and probably something else for triple-quoted strings. You’d probably just need to add another flag for the head line of a compound statement to those two places, and the code to set and clear that flag in a couple other places, and that’s it.

And then you can run it on a whole mess of code and verify that it’s only different in the cases where you want it to be different (what used to be an ERRORTOKEN or NEWLINE is now an NL because we’re in the middle of a with compound statement header).
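The NEWLINE/NL distinction is easy to observe with today's tokenize module: inside brackets, a physical line break already comes out as NL rather than NEWLINE, which is exactly the behaviour the hack would extend to unbracketed statement headers (a quick illustration of the existing machinery, not the hack itself):

```python
import io
import tokenize

src = "a = (b +\n     c)\nd = 1\n"
kinds = [tokenize.tok_name[t.type]
         for t in tokenize.generate_tokens(io.StringIO(src).readline)]

# The line break inside the parentheses is an NL (ignored by the parser);
# the break that ends each statement is a NEWLINE, and the continuation
# line's leading whitespace produces no INDENT token.
assert kinds.count("NEWLINE") == 2   # one per logical statement
assert "NL" in kinds                 # the bracketed physical line break
assert "INDENT" not in kinds
```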
I'm glad to see so many responses to my initial post! But to be fair, I can also see that this is a not-so-important feature, and changing the parser for this is probably not worth it. Anyway, I will just keep following this discussion and see what comes out of it :-)

On 14-Nov-2019 19:13:16 +0100, python-ideas@python.org wrote:
On Nov 14, 2019, at 09:53, Andrew Barnert via Python-ideas wrote:
I would not be a fan of this approach (even though I agree it's technically feasible). The problem is that if a user simply forgets the colon at the end of a line (surely a common mistake), the modified parser would produce a much more confusing error on a subsequent line.

With the PEG parser we could support this:

```
with (
    open("file1") as f1,
    open("file2") as f2,
):
    <code>
```

But there would still be an ambiguity if it were to see

```
with (
    lock1.acquire(),
    lock2.acquire(),
):
    <code>
```

Is that a simple tuple or a pair of context managers that are not assigned to local variables? I guess we can make it the latter since a tuple currently fails at runtime, but the ice is definitely thin here.

On Thu, Nov 14, 2019 at 10:14 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
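The "tuple currently fails at runtime" point is easy to check: a tuple has no __enter__, so using one as a context manager raises immediately (the exact exception type varies between Python versions, hence the broad except clause in this sketch):

```python
# A tuple is not a context manager: there is no tuple.__enter__, so this
# fails at runtime (AttributeError or TypeError depending on the version).
raised = False
try:
    with (1, 2):  # looks like it could mean "two context managers"
        pass
except (AttributeError, TypeError):
    raised = True
assert raised
```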
On Nov 14, 2019, at 10:33, Guido van Rossum <guido@python.org> wrote:
I would not be a fan of this approach (even though I agree it's technically feasible). The problem is that if a user simply forgets the colon at the end of a line (surely a common mistake), the modified parser would produce a much more confusing error on a subsequent line.
Yeah, that’s what I meant about error reporting getting worse rather than better. Presumably it would be similar to this existing case:

```
a = (b+c
d = a*a
```

… where you get a SyntaxError on the second line, where the code looks perfectly valid (because out of context it would be perfectly valid).

This is something that every novice runs into and gets confused by (usually with a more complicated expression missing the close parens), but we all get used to it and it stops bothering us. Would the same thing be true for colons? I don’t know. And, even if it is, is the first-time hurdle a worse problem than the one we’re trying to solve? Maybe.
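The effect is easy to reproduce with compile(): the unclosed parenthesis means the parser only gives up while reading the otherwise perfectly valid second line (which line the error message blames varies between CPython versions, so this sketch only checks that it fails):

```python
# The first line is missing its closing paren, so the parser only
# discovers the problem after consuming the second, valid-looking line.
src = "a = (b+c\nd = a*a\n"
try:
    compile(src, "<example>", "exec")
    failed = False
except SyntaxError:
    failed = True
assert failed
```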
With the PEG parser we could support this:
```
with (
    open("file1") as f1,
    open("file2") as f2,
):
    <code>
```
But there would still be an ambiguity if it were to see
```
with (
    lock1.acquire(),
    lock2.acquire(),
):
    <code>
```
Is that a simple tuple or a pair of context managers that are not assigned to local variables? I guess we can make it the latter since a tuple currently fails at runtime, but the ice is definitely thin here.
Yeah, the tuple ambiguity is the first thing that comes up every time someone suggests parens for with. To a human it doesn’t look ambiguous, because how could you ever want to use a tuple literal as a context manager, but the question is whether you can make it just as unambiguous to the parser without making the parser too complicated to hold in your head.

I think it could be done with the current parser, but only by making with_item not use expression and instead use an expression-except-tuple production, but that would require a couple dozen parallel sub-productions to build up to that one, which sounds like a terrible idea. And I’m not even 100% sure it would work (because it still definitely needs to allow other parenthesized forms, just not parenthesized tuples).

I think it could also be done by letting the grammar parse it as a tuple and then having an extra (non-declarative) fixup step. But that obviously makes parsing more complicated to think through, and to document, etc.

If it’s easy to do with a PEG parser, and if there are independent reasons to switch to a PEG parser, maybe that’s ok? I don’t know. It is still more complicated conceptually to have a slot in the grammar that’s “any expression except a parenthesized tuple”.

That’s why I thought Random’s suggestion of just allowing compound statement headers to be multiline without requiring parens seemed potentially more promising than the original (and frequent) suggestion to add parens here.
Maybe. But in the past we've tweaked the syntax to be able to use parentheses (e.g. for long import statements). On Thu, Nov 14, 2019 at 12:08 PM Andrew Barnert <abarnert@yahoo.com> wrote:
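The precedent Guido mentions: PEP 328 added parentheses for long from-import lists precisely so they could wrap across lines without backslashes:

```python
# Since PEP 328, parentheses let a long from-import wrap across lines
# without backslashes -- the earlier syntax tweak Guido refers to.
from os.path import (
    join,
    split,
)

# The names work exactly as if imported on one line (platform-agnostic check).
assert split(join("a", "b"))[1] == "b"
```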
I think it could also be done by letting the grammar parse it as a tuple and then having an extra (non-declarative) fixup step. But that obviously makes parsing more complicated to think through, and to document, etc.
Just throwing this idea in: what about an approach _not touching_ the parser or compiler at all? Just add __enter__ and __exit__ to tuples themselves! Instead of asking "why would we ever do that?", we _do_ exactly that, to enter the contexts of all tuple elements, and leave them in order.

js
-><-

On Thu, 14 Nov 2019 at 17:08, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Fri, Nov 15, 2019, at 10:30, Joao S. O. Bueno wrote:
Just throwing this idea in: what about an approach _not touching_ the parser or compiler at all? Just add __enter__ and __exit__ to tuples themselves! Instead of asking "why would we ever do that?", we _do_ exactly that, to enter the contexts of all tuple elements, and leave them in order.
This may already have been explained in the thread, but possibly not clearly enough: this does not solve closing the first file if opening the second file fails (or whatever the equivalent is for doing something other than opening a file).

```
with (open(a), open(b)) as fa, fb:
```

would break down into the steps:

- open file a
- open file b
- build tuple
- call tuple.__enter__
- assign result to fa and fb

When "open file b" fails, the tuple does not exist yet, so it cannot have an __exit__ that will do cleanup for file a. The compiler would have to handle this case specially.

While it might indeed be useful to use "with" with a pre-existing safely constructed tuple [whose __enter__ could handle errors in the items' __enter__ calls], it would not solve the problem of cleaning up after an error in the expression initializing one of the items of the tuple. To do this safely, you could use an "opener" context manager that delays actually opening the file until __enter__ is called, but the ability to use it with a plain open call would be dangerously tempting.
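The step-by-step breakdown above can be demonstrated with a logging stand-in for open(): both "opens" happen while the tuple is being built, before any object that could carry an __exit__ even exists (fake_open and the log are invented for the illustration):

```python
log = []

def fake_open(name, fail=False):
    # Stand-in for open(): "acquires" a resource or fails immediately.
    if fail:
        raise OSError(f"cannot open {name}")
    log.append(f"opened {name}")
    return name

# Building the tuple evaluates both calls up front. When the second one
# fails, the first "file" is already open and nothing will close it:
# the tuple -- the only object that could carry an __exit__ -- was
# never created.
try:
    pair = (fake_open("a"), fake_open("b", fail=True))
except OSError:
    pass

assert log == ["opened a"]   # "a" was acquired, and nothing cleaned it up
```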
On 16/11/19 4:30 am, Joao S. O. Bueno wrote:
Just add __enter__ and __exit__ to tuples themselves!
That wouldn't give the same result, though. E.g. if you're using open(), a failure to open a later file in the list should trigger the __exit__ of earlier ones. But using a tuple, all the files have been opened before the with statement gets a look in. To get the right semantics, evaluation of the context manager expressions needs to be under the control of the with-statement. -- Greg
On Thu, Nov 14, 2019, at 13:12, Andrew Barnert wrote:
And then you can run it on a whole mess of code and verify that it’s only different in the cases where you want it to be different (what used to be an ERRORTOKEN or NEWLINE is now an NL because we’re in the middle of a with compound statement header).
Maybe any compound statement header? e.g. if/while with lots of and/or conditions, but more to the point it doesn't really make sense to make the rule work differently for different types of statements.
On Nov 14, 2019, at 11:21, Random832 <random832@fastmail.com> wrote:
On Thu, Nov 14, 2019, at 13:12, Andrew Barnert wrote: And then you can run it on a whole mess of code and verify that it’s only different in the cases where you want it to be different (what used to be an ERRORTOKEN or NEWLINE is now an NL because we’re in the middle of a with compound statement header).
Maybe any compound statement header? e.g. if/while with lots of and/or conditions, but more to the point it doesn't really make sense to make the rule work differently for different types of statements.
Sure. People don’t bring that up as much because with if and while you can already just use parens, but no reason they should be different. And that raises a point: the if keyword can appear in other places besides the start of a compound statement. Does tokenize.py have enough info to handle that properly? I don’t know, and the answer to that might be a good proxy to the question of whether it can be done in the real compiler without making parsing complicated, even if it won’t prove the answer either way.
On 2019-11-14 19:51, Andrew Barnert via Python-ideas wrote:
On Nov 14, 2019, at 11:21, Random832 <random832@fastmail.com> wrote:
On Thu, Nov 14, 2019, at 13:12, Andrew Barnert wrote: And then you can run it on a whole mess of code and verify that it’s only different in the cases where you want it to be different (what used to be an ERRORTOKEN or NEWLINE is now an NL because we’re in the middle of a with compound statement header).
Maybe any compound statement header? e.g. if/while with lots of and/or conditions, but more to the point it doesn't really make sense to make the rule work differently for different types of statements.
Sure. People don’t bring that up as much because with if and while you can already just use parens, but no reason they should be different.
And that raises a point: the if keyword can appear in other places besides the start of a compound statement. Does tokenize.py have enough info to handle that properly? I don’t know, and the answer to that might be a good proxy to the question of whether it can be done in the real compiler without making parsing complicated, even if it won’t prove the answer either way.
Keywords (reserved words) are special everywhere except in strings and comments.
On Thu, Nov 14, 2019, at 16:23, MRAB wrote:
On 2019-11-14 19:51, Andrew Barnert via Python-ideas wrote:
And that raises a point: the if keyword can appear in other places besides the start of a compound statement. Does tokenize.py have enough info to handle that properly? I don’t know, and the answer to that might be a good proxy to the question of whether it can be done in the real compiler without making parsing complicated, even if it won’t prove the answer either way.
Well, in principle, you could also allow NL within inline conditionals [i.e. in the part bracketed by if and else] as well. This probably isn't very desirable, though; I was kind of assuming that the parser knew when it was at the start of a line.
Keywords (reserved words) are special everywhere except in strings and comments.
I don't think that's directly relevant to Andrew's question: it could still be the "same kind of special", whereas the tokenizer needs to treat it differently in each context and therefore differentiate between the contexts.
Another idea -- use semicolons to separate "as" clauses in a with statement, and allow newlines after them.

```
with open(name1) as f1;
     open(name2) as f2;
     open(name3) as f3:
    ...
```

-- Greg
On Nov 14, 2019, at 13:23, MRAB <python@mrabarnett.plus.com> wrote:
On 2019-11-14 19:51, Andrew Barnert via Python-ideas wrote: On Nov 14, 2019, at 11:21, Random832 <random832@fastmail.com> wrote:
On Thu, Nov 14, 2019, at 13:12, Andrew Barnert wrote: And then you can run it on a whole mess of code and verify that it’s only different in the cases where you want it to be different (what used to be an ERRORTOKEN or NEWLINE is now an NL because we’re in the middle of a with compound statement header). Maybe any compound statement header? e.g. if/while with lots of and/or conditions, but more to the point it doesn't really make sense to make the rule work differently for different types of statements. Sure. People don’t bring that up as much because with it and while you can already just use parens, but no reason they should be different. And that raises a point: the if keyword can appear in other places besides the start of a compound statement. Does tokenize.py have enough info to handle that properly? I don’t know, and the answer to that might be a good proxy to the question of whether it can be done in the real compiler without making parsing complicated, even if it won’t prove the answer either way. Keywords (reserved words) are special everywhere except in strings and comments.
Of course, but special isn’t sufficient here. The with token can only appear in one place: the start of a with statement. So, the tokenizer rule should be as simple as “when you emit a with token, set the multiline until colon flag”. That’s even simpler than the rule for the open parens token (which has to increment a counter). The if token can appear in three places: the start of an if statement, the middle of a comprehension, and the middle of a conditional expression. So, is there a tokenizer rule “when ???, set the multiline until colon flag”? It’s not just “when you emit an if token”. Maybe “when you emit an if token as the first token on a logical line” or something? I’m not sure.
Hi All

The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except ... statement. The lambda construction also provides for deferred execution. Perhaps something like
```
with helper(lambda: open('a'), open('b'), open('c')) as a, b, c:
    pass
```
would do what the original poster wanted, for a suitable value of helper. It certainly gives the parentheses the OP wanted, without introducing new syntax.
However, map also provides for deferred execution. It returns an iterable (not a list).
```
>>> x = map(open, ['a', 'b', 'c'])
>>> x
<map object at 0x7f8cb5653b70>
>>> next(x)
FileNotFoundError: [Errno 2] No such file or directory: 'a'
```
So the OP could even write:
```
with helper(map(open, ['a', 'b', 'c'])) as a, b, c:
    pass
```

for a (different) suitable value of helper.
I hope this helps. -- Jonathan URL: jfine2358.github.io
On Fri, Nov 15, 2019 at 9:41 PM Jonathan Fine <jfine2358@gmail.com> wrote:
Hi All
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
The lambda construction also provides for deferred execution. Perhaps something like
```
with helper(lambda: open('a'), open('b'), open('c')) as a, b, c:
    pass
```
would do what the original poster wanted, for a suitable value of helper. It certainly gives the parentheses the OP wanted, without introducing new syntax.
However, map also provides for deferred execution. It returns an iterable (not a list).
```
>>> x = map(open, ['a', 'b', 'c'])
>>> x
<map object at 0x7f8cb5653b70>
>>> next(x)
FileNotFoundError: [Errno 2] No such file or directory: 'a'
```
So the OP could even write:
```
with helper(map(open, ['a', 'b', 'c'])) as a, b, c:
    pass
```

for a (different) suitable value of helper.
I hope this helps.
I don't understand your point about deferred execution here. But a vitally important part of multi-with semantics is that, if something goes wrong while opening "c", the files "b" and "a" need to be properly closed. Trying to write a helper() that is capable of doing this would be extremely difficult, possibly impossible (and would need to do its own mapping, otherwise it IS impossible). ChrisA
15.11.19 12:40, Jonathan Fine пише:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
The lambda construction also provides for deferred execution. Perhaps something like
```
with helper(lambda: open('a'), open('b'), open('c')) as a, b, c:
    pass
```

would do what the original poster wanted, for a suitable value of helper. It certainly gives the parentheses the OP wanted, without introducing new syntax.
It does not work. File 'a' will be leaked if opening file 'b' fails.
However, map also provides for deferred execution. It returns an iterable (not a list).
```
>>> x = map(open, ['a', 'b', 'c'])
>>> x
<map object at 0x7f8cb5653b70>
>>> next(x)
FileNotFoundError: [Errno 2] No such file or directory: 'a'
```
So the OP could even write:
```
with helper(map(open, ['a', 'b', 'c'])) as a, b, c:
    pass
```

for a (different) suitable value of helper.
It can work (the helper can be implemented using ExitStack), although it looks more clumsy to me than just using line continuations.
I thank Chris and Serhiy for their helpful comments. I agree with both of you that a lambda won't be able to give the desired result. However, although flawed, the idea was a useful stepping stone for me. Perhaps as "the simplest thing that could possibly work". I thank Serhiy for saying that
```
with helper(map(open, ['a', 'b', 'c'])) as a, b, c:
    pass
```

can work.
Serhiy suggests https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
```
with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
```
A simpler alternative, for the OP's original post, could be:
```
with ExitMap(open, ['a', 'b', 'c']) as a, b, c:
    pass
```

where ExitMap(*args) is equivalent to helper(map(*args)). Of course, using ExitStack is the easy way to code ExitMap.
In any case, I think that Chris, Serhiy and myself agree that the OP's problem is best solved using the capabilities that Python already has. -- Jonathan
On Nov 15, 2019, at 04:37, Jonathan Fine <jfine2358@gmail.com> wrote:
Serhiy suggests https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack
```
with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
```
A simpler alternative, for the OP's original post, could be:
```
with ExitMap(open, ['a', 'b', 'c']) as a, b, c:
    pass
```

where ExitMap(*args) is equivalent to helper(map(*args)). Of course, using ExitStack is the easy way to code ExitMap.
The advantage of your ExitMap over helper (besides being shorter, and conceptually simpler) is that it’s never misleading.

```
with helper(open(fn) for fn in (fna, fnb, fnc)) as a, b, c:
```

This is fine. If open(fnb) raises, helper has already stacked up open(fna) and can exit it. But change the genexpr to a listcomp:

```
with helper([open(fn) for fn in (fna, fnb, fnc)]) as a, b, c:
```

This looks the same, and seems fine if you don’t test error cases, but if open(fnb) fails, helper never gets called, so nothing gets stacked up, and open(fna) leaks. And calling it with an iterable you create out-of-line can make it even more confusing, because it’s not as clear where to look when you accidentally make the iterable not lazy. Not to mention that it could even be a properly lazy iterator, but one that needs to be fully consumed for your program logic (not an issue for a simple map over a tuple of strings), and that would be an even more subtle bug.

If you could clearly document and/or test for the requirements, maybe helper would be useful. But your ExitMap doesn’t rely on you correctly building the right iterator; it builds the iterator itself, and doesn’t expose parts that can be misused. So it’s trivial to document, and to learn and use.

It still seems like overkill for the case of “how do I get parens somewhere around a bunch of open calls so I can write a multiline with statement”, and I’m not sure how often it would be useful in other cases. But it seems like it’s at least worth building for your personal toolbox and keeping track of how often it comes up (and how much nicer it makes things), and maybe publishing it to PyPI or submitting it to contextlib2 so more people will do the same.
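The lazy-versus-eager difference here is easy to demonstrate with a stub standing in for open (fake_open is illustrative only):

```python
calls = []

def fake_open(name):
    # Stand-in for open(): record the attempt, fail on "b".
    calls.append(name)
    if name == "b":
        raise OSError(name)
    return name

# A genexpr defers the calls: nothing has run yet, so a helper that
# consumes it can unwind whatever it has already entered.
gen = (fake_open(n) for n in ("a", "b", "c"))
assert calls == []

# A listcomp runs eagerly: "a" is produced and then orphaned when
# "b" raises, before any helper ever sees the list.
try:
    [fake_open(n) for n in ("a", "b", "c")]
except OSError:
    pass
assert calls == ["a", "b"]
```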
Andrew Barnert wrote:
The advantage of your ExitMap over helper (besides being shorter, and conceptually simpler) is that it’s never misleading.
[snip] Thank you for this, particularly your example (snipped). I agree. But maybe I'm biased.
It still seems like overkill for the case of “how do I get parens somewhere around a bunch of open calls so I can write a multiline with statement”, and I’m not sure how often it would be useful in other cases.
It depends, I think, on how often it's used.
But it seems like it’s at least worth building for your personal toolbox and keeping track of how often it comes up (and how much nicer it makes things), and maybe publishing it to PyPI or submitting it to contextlib2 so more people will do the same.
A personal toolbox is, well, a personal matter. I'm all in favour of sharing. But I'd rather someone else, more motivated, would take on putting it up onto PyPI. I hope we're mostly all agreed, anyway, that we've given a satisfactory answer to the Original Poster's query. -- Jonathan
On Fri, 15 Nov 2019 at 12:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
15.11.19 12:40, Jonathan Fine пише:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
I've often thought that this was the root of various awkwardnesses with context managers. Ideally a well-behaved context manager would only use __exit__ to clean up after __enter__ but open doesn't do that. The nested context manager was designed for this type of well-behaved context manager but was then considered harmful because open (one of the most common context managers) misbehaves. Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided. -- Oscar
On Sat, Nov 16, 2019 at 9:44 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Fri, 15 Nov 2019 at 12:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
15.11.19 12:40, Jonathan Fine пише:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
I've often thought that this was the root of various awkwardnesses with context managers. Ideally a well-behaved context manager would only use __exit__ to clean up after __enter__ but open doesn't do that. The nested context manager was designed for this type of well-behaved context manager but was then considered harmful because open (one of the most common context managers) misbehaves.
Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided.
Hmm. What exactly is the object that you have prior to the file being opened? It can't simply be a File, because you need to specify parameters to the open() call. Is it a "file ready to be opened"? What's the identity of that? ChrisA
On Nov 15, 2019, at 14:52, Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Nov 16, 2019 at 9:44 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Fri, 15 Nov 2019 at 12:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
15.11.19 12:40, Jonathan Fine пише:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
I've often thought that this was the root of various awkwardnesses with context managers. Ideally a well-behaved context manager would only use __exit__ to clean up after __enter__ but open doesn't do that. The nested context manager was designed for this type of well-behaved context manager but was then considered harmful because open (one of the most common context managers) misbehaves.
Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided.
Hmm. What exactly is the object that you have prior to the file being opened? It can't simply be a File, because you need to specify parameters to the open() call. Is it a "file ready to be opened"? What's the identity of that?
The notion of a “file ready to be opened” makes some sense. I’d expect such a thing to have methods like “read_contents”, “write_contents” (plus maybe atomic write, append, etc. variants) that you often use, and you only use it as a context manager if you want to get an iterable of lines out of it or something else you can access iteratively rather than all at once. That other thing would probably be a separate ABC from the ready-to-be-opened thing, but a single concrete type could still satisfy both ABCs for simple cases like local disk files and StringIO, in which case they’d still just `return self` after doing the open (or, for StringIO, doing nothing) in `__enter__`.

The identity of the ready-to-open-file thing could be a value-based thing—two objects with equal filenames and flags, or dirfd plus filenames plus flags, or URLs, or underlying buffers, are equal; the fact that they give you distinct iterable things (with distinct file pointers) when you open them is no different from the fact that you get distinct iterable things from opening the same one twice.

But this doesn’t seem to be a popular design. If you look at the way Swift, C#, and other “modern mainstream languages” deal with files, all that not-opened-file stuff is done with static methods or methods of the string type, etc., not by having a not-opened-file type. The closest thing they have to a not-opened-file type is a fancy URL type (which, in Swift, can be pretty fancy—it can be a file: URL with an embedded access token that you got from opening a security scoped bookmark, for example).

Also, I think it would get in the way of some handy shortcuts that Python has—e.g., if you have a raw fd passed to you by a C API or over a Unix socket, the way to wrap it in a file object is just to call open and use it as the first argument; it seems like it would be weird to have a “file ready to be opened” that’s actually an open file but the wrapper hasn’t been built yet.
On Sat, Nov 16, 2019 at 10:21 AM Andrew Barnert <abarnert@yahoo.com> wrote:
On Nov 15, 2019, at 14:52, Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Nov 16, 2019 at 9:44 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Fri, 15 Nov 2019 at 12:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
15.11.19 12:40, Jonathan Fine пише:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except .... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
I've often thought that this was the root of various awkwardnesses with context managers. Ideally a well-behaved context manager would only use __exit__ to clean up after __enter__ but open doesn't do that. The nested context manager was designed for this type of well-behaved context manager but was then considered harmful because open (one of the most common context managers) misbehaves.
Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided.
Hmm. What exactly is the object that you have prior to the file being opened? It can't simply be a File, because you need to specify parameters to the open() call. Is it a "file ready to be opened"? What's the identity of that?
The notion of a “file ready to be opened” makes some sense. I’d expect such a thing to have methods like “read_contents”, “write_contents” (plus maybe atomic write, append, etc. variants) that you often use, and you only use it as a context manager if you want to get an iterable of lines out of it or something else you can access iteratively rather than all at once.
If __enter__ is the place where all resources are allocated, then the constructor has to record file name (and dirfd), whether to open for reading or writing, the encoding/errors, and everything else you need to know before you can open it. Basically it'd have to record all the arguments to open(), plus any additional state required (current directory, perhaps) in case it can change. The "file ready to be opened" would have to actually be "this file, to be read from in UTF-8 mode and not written to". I think it'd be at least as confusing as the current situation. ChrisA
Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided.
Hmm. What exactly is the object that you have prior to the file being opened? It can't simply be a File, because you need to specify parameters to the open() call. Is it a "file ready to be opened"? What's the identity of that?
The notion of a “file ready to be opened” makes some sense. I’d expect such a thing to have methods like “read_contents”, “write_contents” (plus maybe atomic write, append, etc. variants) that you often use, and you only use it as a context manager if you want to get an iterable of lines out of it or something else you can access iteratively rather than all at once.
If __enter__ is the place where all resources are allocated, then the constructor has to record file name (and dirfd), whether to open for reading or writing, the encoding/errors, and everything else you need to know before you can open it. Basically it'd have to record all the arguments to open(), plus any additional state required (current directory, perhaps) in case it can change. The "file ready to be opened" would have to actually be "this file, to be read from in UTF-8 mode and not written to".
I think it'd be at least as confusing as the current situation.
Real world example: isn't something like this what happens with click.File?
On 17/11/19 4:54 am, Ricky Teachey wrote:
I think it'd be at least as confusing as the current situation.
It might be better to keep it as purely a context manager, and not load it down with any other baggage. I wouldn't try to make it remember the current directory. Like a generator expression, the expectation is that it would be used while it's still fresh.

I seem to remember we had such a context manager for a brief time after the with-statement was invented, until someone had the bright idea to make open() do double duty.

The main source of confusion I foresee if we re-introduce it is that people are now used to doing 'with open()...', so we either make open() no longer a context manager and break existing code, or have two ways to do it, with the one that is currently the most widely used having a subtle trap hidden in it.

-- Greg
On Nov 16, 2019, at 13:13, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It might be better to keep it as purely a context manager, and not load it down with any other baggage.
Doesn’t `closing` already take care of that (and other things besides files, too)? I’m not really sure what problem we’re trying to solve here. Having a way to construct file objects without opening them, so you can do all the actual opens in a with statement, and there’s no easy way to get it wrong? (Which still wouldn’t really be true, because there’s always the trap of writing a generator function that uses a with that only cleans up if the generator is cleaned up…) Or breaking file objects up into as many orthogonal pieces as possible to see if there’s a better way to reassemble them?
On 17/11/19 10:34 am, Andrew Barnert wrote:
On Nov 16, 2019, at 13:13, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
It might be better to keep it as purely a context manager, and not load it down with any other baggage.
Doesn’t `closing` already take care of that (and other things besides files, too)?
No, because a tuple of closing(open(...)) calls still opens all the files before the with-statement starts. -- Greg
On Sat, Nov 16, 2019, at 16:13, Greg Ewing wrote:
On 17/11/19 4:54 am, Ricky Teachey wrote:
I think it'd be at least as confusing as the current situation.
It might be better to keep it as purely a context manager, and not load it down with any other baggage.
I wouldn't try to make it remember the current directory. Like a generator expression, the expectation is that it would be used while it's still fresh.
I seem to remember we had such a context manager for a brief time after the with-statement was invented, until someone had the bright idea to make open() do double duty.
The main source of confusion I foresee if we re-introduce it, is that people are now used to doing 'with open()...', so we either make open() no longer a context manager and break existing code, or have two ways to do it, with the one that is currently the most widely used one having a subtle trap hidden in it.
I wonder if a general mechanism for turning badly behaved context manager factories into nice ones would be a useful addition to contextlib. Something like this:

```
from contextlib import contextmanager

@contextmanager
def deferred_call(f, *args, **kwargs):
    cm = f(*args, **kwargs)
    res = cm.__enter__()
    try:
        yield res
    finally:
        cm.__exit__(None, None, None)

def deferred_caller(f, name=None):
    return lambda *a, **k: deferred_call(f, *a, **k)

opener = deferred_caller(open)
```
On Sun, 17 Nov 2019 at 06:28, Random832 <random832@fastmail.com> wrote:
On Sat, Nov 16, 2019, at 16:13, Greg Ewing wrote:
I seem to remember we had such a context manager for a brief time after the with-statement was invented, until someone had the bright idea to make open() do double duty.
The main source of confusion I foresee if we re-introduce it, is that people are now used to doing 'with open()...', so we either make open() no longer a context manager and break existing code, or have two ways to do it, with the one that is currently the most widely used one having a subtle trap hidden in it.
I wonder if a general mechanism for turning badly behaved context manager factories into nice ones would be a useful addition to contextlib. something like this:
That might be useful but it doesn't solve the problem from the perspective of someone writing context manager utilities like nested, because it still leaves a trap for anyone who uses open with those utilities. Also, I don't know of any other misbehaving context managers besides open, so I'm not sure that a general utility is needed rather than just a well-behaved alternative for open. PEP 343 gives the example of "opened", which doesn't have this problem (https://www.python.org/dev/peps/pep-0343/#examples), but apparently that didn't make it to the final release of 2.5 (I guess that's what Greg is referring to).

Ultimately the problem is that the requirements on a context manager are not clearly spelled out. The with statement gives context manager authors a strong guarantee that if __enter__ returns successfully then __exit__ will be called at some point later. There needs to be a reverse requirement on context manager authors to guarantee that it is not necessary to call __exit__ whenever __enter__ has not been called. With the protocol requirements specified in both directions it would be easy to make utilities like nested for combining context managers in different ways.

-- Oscar
On 18/11/19 8:17 am, Oscar Benjamin wrote:
PEP 343 gives the example of "opened" which doesn't have this problem https://www.python.org/dev/peps/pep-0343/#examples but apparently that didn't make it to the final release of 2.5 (I guess that's what Greg is referring to).
Probably. I may also have mixed it up with closing(), which does something different. -- Greg
On Sun, Nov 17, 2019, at 16:43, Greg Ewing wrote:
Probably. I may also have mixed it up with closing(), which does something different.
It occurs to me that almost any usage of closing likely has the same problem, in that the expression passed to closing may itself throw an exception. It may be worth incorporating closing into the mechanism I proposed in another reply (i.e. something similar to my deferred_caller, but having its cleanup code call .close on the non-context-manager object returned by the given callable).
On Sun, Nov 17, 2019, at 14:17, Oscar Benjamin wrote:
Also I don't know of any other misbehaving context managers besides open so I'm not sure that a general utility is needed rather than just a well-behaved alternative for open.
It's not in dbapi2, but most database connections are context managers, including sqlite3 in the stdlib. So is requests.Response (so is Session, but AIUI it can't return an error on construction, whereas Response is returned by functions like get which raise exceptions)
On Sun, 17 Nov 2019 at 19:18, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
That might be useful but it doesn't solve the problem from the perspective of someone writing context manager utilities like nested because it still leaves a trap for anyone who uses open with those utilities.
Also I don't know of any other misbehaving context managers besides open so I'm not sure that a general utility is needed rather than just a well-behaved alternative for open. PEP 343 gives the example of "opened" which doesn't have this problem https://www.python.org/dev/peps/pep-0343/#examples but apparently that didn't make it to the final release of 2.5 (I guess that's what Greg is referring to).
Ultimately the problem is that the requirements on a context manager are not clearly spelled out. The with statement gives context manager authors a strong guarantee that if __enter__ returns successfully then __exit__ will be called at some point later. There needs to be a reverse requirement on context manager authors to guarantee that it is not necessary to call __exit__ whenever __enter__ has not been called. With the protocol requirements specified in both directions it would be easy to make utilities like nested for combining context managers in different ways.
The context here has been lost - I've searched the thread and I can't find a proper explanation of how open() "misbehaves" in any way that seems to relate to this statement (I don't actually see any real explanation of any problem with open() to be honest). There's some stuff about what happens if open() itself fails, but I don't see how that results in a problem (as opposed to something like a subtle application error because the writer didn't realise this could happen).

Can someone restate the problem please?

Thanks,
Paul
On Mon, 18 Nov 2019 at 08:42, Paul Moore <p.f.moore@gmail.com> wrote:
On Sun, 17 Nov 2019 at 19:18, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Ultimately the problem is that the requirements on a context manager are not clearly spelled out. The with statement gives context manager authors a strong guarantee that if __enter__ returns successfully then __exit__ will be called at some point later. There needs to be a reverse requirement on context manager authors to guarantee that it is not necessary to call __exit__ whenever __enter__ has not been called. With the protocol requirements specified in both directions it would be easy to make utilities like nested for combining context managers in different ways.
The context here has been lost - I've searched the thread and I can't find a proper explanation of how open() "misbehaves" in any way that seems to relate to this statement (I don't actually see any real explanation of any problem with open() to be honest). There's some stuff about what happens if open() itself fails, but I don't see how that results in a problem (as opposed to something like a subtle application error because the writer didn't realise this could happen).
Can someone restate the problem please?
Sorry Paul! I think a small number of us were following a sub-thread here where we understood what we were talking about but it wasn't clearly spelt out anywhere. I introduced the word "misbehave" so I'll clarify what I meant. First I'll describe all of the background:

Python 2.5 introduced the with statement from PEP 343 and made file objects into context managers by adding __enter__ and __exit__ methods. This means that the object returned by open can be used in a with statement like

    with open(filename) as fin:
        ...

The contextlib module was also added in Python 2.5 and included a useful utility called nested: https://docs.python.org/2.7/library/contextlib.html#contextlib.nested

The idea with nested is that you could flatten nested with statements so

    with mgr1:
        with mgr2:
            ...

can be rewritten as

    with nested(mgr1, mgr2):
        ...

This means that you don't have so much indentation and since nested takes *args you can use an arbitrary number of context managers. This was deprecated essentially because it leads to this construction:

    with nested(open(file1), open(file2)) as (f1, f2):
        ...

Here, before nested is called, its arguments are prepared from left to right: first file1 is opened and then file2 is opened and then both are passed to nested. If an exception is raised while attempting to open file2 then the file object returned for file1 doesn't get passed to nested and doesn't get used in any with statement, so its __enter__ and __exit__ methods are never called. In this simple example the file object will probably be closed by __del__, but a significant part of the point of context managers is that we don't want to rely on __del__ in general. Also, forms that are otherwise equivalent won't necessarily lead to __del__ being called, e.g.:

    f1 = open(file1)
    f2 = open(file2)
    with nested(f1, f2):
        ...

Since this "deficiency" of nested is about an exception that is raised before nested is even called, it clearly wasn't possible to solve this problem by improving nested itself.
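To make the failure mode concrete, here is a rough re-creation of the removed contextlib.nested() (simplified, with error handling trimmed) together with a hypothetical Resource class that, like open(), acquires its resource in __init__ rather than __enter__:

```python
import sys
from contextlib import contextmanager

@contextmanager
def nested(*managers):
    # Roughly what contextlib.nested() did: enter each manager in turn,
    # yield the results, then exit in reverse order (error handling trimmed).
    exits, values = [], []
    try:
        for mgr in managers:
            values.append(mgr.__enter__())
            exits.append(mgr.__exit__)
        yield tuple(values)
    finally:
        while exits:
            exits.pop()(*sys.exc_info())

log = []

class Resource:
    # Like open(), this "misbehaves": it acquires in __init__, not __enter__.
    def __init__(self, name, fail=False):
        if fail:
            raise RuntimeError("cannot acquire " + name)
        self.name = name
        log.append("acquired " + name)
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        log.append("released " + self.name)
        return False

try:
    # Resource("a") succeeds but Resource("b") raises *before* nested()
    # is even called, so "a" is never entered and never released.
    with nested(Resource("a"), Resource("b", fail=True)) as (a, b):
        pass
except RuntimeError:
    pass

print(log)  # ['acquired a'] -- "a" leaked
```

The Resource class and log list are illustrative assumptions, not part of the thread; they just make the leak observable.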
So Python 2.6 introduced the multiple with statement:

    with open(file1) as f1, open(file2) as f2:
        ...

Since this is now built in to the with statement rather than using a function, it is possible to evaluate things in a different order: e.g. f1.__enter__ here is called before open(file2), which wouldn't be possible with a utility function like nested. Most importantly, f1.__exit__ will be called if open(file2) raises, which solves the main problem with nested. The nested function was then deprecated in Python 2.7 and at some point removed altogether.

The multiple with statement has problems as well though. One problem is the syntax limitation which is the subject of the OP in this thread. The other is the inability to take an arbitrary number of context managers as nested could with *args. Alternatives to nested cannot be used as cleanly, though, if they are expected to meet this requirement that they should do the right thing with exceptions raised while creating their arguments (before the function is called!). With that constraint in mind it isn't possible to have any utility for multiple with statements that receives more than one context manager at a time. Hence ExitStack can be used, as it creates an object that only receives context managers one at a time: https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack

The example given in the docs there explicitly includes open to show you the kind of problem it is designed to solve:

    with ExitStack() as stack:
        files = [stack.enter_context(open(fname)) for fname in filenames]
        # All opened files will automatically be closed at the end of
        # the with statement, even if attempts to open files later
        # in the list raise an exception

To me that seems clumsy and awkward compared to nested though:

    with nested(*map(open, filenames)) as files:
        ...

Ideally I would design nested to take an iterable rather than *args, and then it would be fine to do e.g.

    with nested(open(filename) for filename in filenames) as files:
        ...
Here nested could take advantage of the delayed evaluation in the generator expression to invoke the __enter__ methods and call __exit__ on the opened files if any of the open calls fails. This would also leave a "trap" though, since using a list comprehension would suffer the same problem as if nested took *args:

    with nested([open(filename) for filename in filenames]) as files:
        ...

That's the background, so what is it that we are discussing in this subthread? I am proposing that the root of the problem here is the fact that open acquires its resource (the opened file descriptor) before __enter__ is called. This is what I mean by a context manager that "misbehaves". If there was a requirement on context managers that __exit__ cleans up after __enter__, and that any resource which needs cleaning up should only be acquired in __enter__, then there would never have been a problem with nested. In particular PEP 343 gives an alternative to the current behaviour of open, which is:

    @contextmanager
    def opened(filename, mode="r"):
        f = open(filename, mode)
        try:
            yield f
        finally:
            f.close()

https://www.python.org/dev/peps/pep-0343/#examples

Because this uses the contextmanager decorator it may not immediately be obvious, but this function does not suffer any of the problems described above. That is because what this returns is not a file object but rather an object that can only be used as a context manager. It is the __enter__ method of this context manager that opens the file and returns a usable file object. Here is a simple demonstration:
    >>> from contextlib import contextmanager
    >>> @contextmanager
    ... def f():
    ...     print(1)  # Executed on __enter__
    ...     try:
    ...         yield 3
    ...     finally:
    ...         pass
    ...
    >>> f()
    <contextlib._GeneratorContextManager object at 0x10786be10>
    >>> f().__enter__()
    1
    3
That means that there is no problem with using

    with nested(opened(filename1), opened(filename2)) as (file1, file2):
        ...

or any of the variations on this above. For whatever reason this is not what was released in Python 2.5, which instead added the __enter__ and __exit__ methods to file objects themselves so that the existing open builtin could be used directly with the with statement.

What I am saying is that, conceived of as a context manager, the object returned by open misbehaves. I think that not just nested but a number of other convenient utilities and patterns could have been possible if opened had been used instead of open, and if context managers were expected to meet the constraint:

    """ There should be no need to call __exit__ if __enter__ has not been called. """

Of course a lot of time has passed since then and now there are probably many other misbehaving context managers, so it might be too late to do anything about that.

Oscar
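A sketch of the iterable-taking nested() described above (the name and error handling are simplified assumptions, not stdlib code). Because a generator expression delays each acquisition until nested pulls the next item, a failure part-way through can be unwound:

```python
import sys
from contextlib import contextmanager

@contextmanager
def nested(cms):
    # cms is an iterable (ideally a generator expression) of context
    # managers; each is entered as it is pulled, so a failure while
    # producing or entering a later one can unwind the earlier ones.
    entered, values = [], []
    try:
        for cm in cms:
            values.append(cm.__enter__())
            entered.append(cm)
        yield tuple(values)
    except BaseException:
        exc = sys.exc_info()
        while entered:
            entered.pop().__exit__(*exc)  # suppression semantics trimmed
        raise
    else:
        while entered:
            entered.pop().__exit__(None, None, None)

log = []

class Resource:
    # A well-behaved manager: acquires only in __enter__.
    def __init__(self, name, fail=False):
        self.name, self.fail = name, fail
    def __enter__(self):
        if self.fail:
            raise RuntimeError("cannot acquire " + self.name)
        log.append("acquired " + self.name)
        return self
    def __exit__(self, *exc):
        log.append("released " + self.name)
        return False

specs = [("a", False), ("b", True)]
try:
    with nested(Resource(n, f) for n, f in specs) as resources:
        pass
except RuntimeError:
    pass

print(log)  # ['acquired a', 'released a'] -- "a" was cleaned up
```

The Resource class is a stand-in for demonstration; with a real open() the same laziness would apply because the generator expression delays the open() calls themselves.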
On 2019-11-18 9:10 a.m., Oscar Benjamin wrote:
[snip]
To me that seems clumsy and awkward compared to nested though:
with nested(*map(open, filenames)) as files: ...
Ideally I would design nested to take an iterable rather than *args and then it would be fine to do e.g.
with nested(open(filename) for filename in filenames) as files: ...
Here nested could take advantage of the delayed evaluation in the generator expression to invoke the __enter__ methods and call __exit__ on the opened files if any of the open calls fails. This would also leave a "trap" though since using a list comprehension would suffer the same problem as if nested took *args:
with nested([open(filename) for filename in filenames]) as files: ...
If generator expressions (aka "(open(filename) for filename in filenames)") had __enter__ and __exit__ that deferred to inner __enter__ and __exit__, this "trap" wouldn't exist:

    with (open(filename) for filename in filenames) as files:
        ...  # fine

    with [open(filename) for filename in filenames] as files:
        ...  # raises because list doesn't __enter__

mainly because it wouldn't work with arbitrary iterators or iterables. (and if you need it to, "with (x for x in iterable)" would still be available)
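The proposed semantics can be approximated today with an explicit wrapper object (the `entering` name and its details are hypothetical, purely to illustrate the idea):

```python
import sys

class entering:
    # Wraps an iterable of context managers; __enter__ enters each one as
    # it is produced, and __exit__ exits them in reverse order.
    def __init__(self, cms):
        self._cms = cms
        self._entered = []

    def __enter__(self):
        values = []
        try:
            for cm in self._cms:
                values.append(cm.__enter__())
                self._entered.append(cm)
        except BaseException:
            self.__exit__(*sys.exc_info())
            raise
        return values

    def __exit__(self, *exc):
        while self._entered:
            self._entered.pop().__exit__(*exc)  # suppression trimmed
        return False

log = []

class CM:
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        log.append("enter " + self.name)
        return self
    def __exit__(self, *exc):
        log.append("exit " + self.name)
        return False

with entering(CM(n) for n in "ab") as cms:
    pass
print(log)  # ['enter a', 'enter b', 'exit b', 'exit a']
```

With a generator expression the acquisitions are delayed, so `entering(open(f) for f in filenames)` could unwind earlier files if a later open() fails.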
[snip]
Oscar
On Mon, 18 Nov 2019 at 13:12, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 9:10 a.m., Oscar Benjamin wrote:
[snip]
To me that seems clumsy and awkward compared to nested though:
with nested(*map(open, filenames)) as files: ...
Ideally I would design nested to take an iterable rather than *args and then it would be fine to do e.g.
with nested(open(filename) for filename in filenames) as files: ...
Here nested could take advantage of the delayed evaluation in the generator expression to invoke the __enter__ methods and call __exit__ on the opened files if any of the open calls fails. This would also leave a "trap" though since using a list comprehension would suffer the same problem as if nested took *args:
with nested([open(filename) for filename in filenames]) as files: ...
If generator expressions (aka "(open(filename) for filename in filenames)") had __enter__ and __exit__ that deferred to inner __enter__ and __exit__, this "trap" wouldn't exist:
with (open(filename) for filename in filenames) as files: ... # fine
with [open(filename) for filename in filenames] as files: ... # raises because list doesn't __enter__
mainly because it wouldn't work with arbitrary iterators or iterables.
(and if you need it to, "with (x for x in iterable)" would still be available)
Since generators already have a close method, the obvious thing for generator.__exit__ to do (if it existed) would be to call that. That would make it possible to use patterns like:

    def cat(filenames):
        for filename in filenames:
            with open(filename) as infile:
                yield from infile

    with cat(filenames) as lines:
        for line in lines:
            if key in line:
                return line

Note that the return there stops iterating over the generator while it is suspended. Making the generator a context manager whose __exit__ calls close ensures that the context manager inside the generator is finalised.

--
Oscar
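Something close to this pattern already works today via contextlib.closing, since generators have a close() method. Below is a runnable sketch (the `opened` list is only there so the demo can verify the inner file really was closed; it is not part of the pattern):

```python
import os
import tempfile
from contextlib import closing

def cat(filenames, opened):
    for filename in filenames:
        with open(filename) as infile:
            opened.append(infile)  # recorded purely for the demo
            yield from infile

def find(key, filenames, opened):
    with closing(cat(filenames, opened)) as lines:
        for line in lines:
            if key in line:
                # Returning here leaves the generator suspended inside its
                # "with open(...)" block; closing() then calls close(),
                # which throws GeneratorExit in and finalises that block.
                return line

opened = []
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "f1.txt")
    with open(path, "w") as f:
        f.write("hello\nneedle here\nworld\n")
    line = find("needle", [path], opened)

print(line)                           # 'needle here\n'
print(all(f.closed for f in opened))  # True: the inner with was finalised
```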
On 2019-11-18 11:32 a.m., Oscar Benjamin wrote:
On Mon, 18 Nov 2019 at 13:12, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 9:10 a.m., Oscar Benjamin wrote:
[snip]
To me that seems clumsy and awkward compared to nested though:
with nested(*map(open, filenames)) as files: ...
Ideally I would design nested to take an iterable rather than *args and then it would be fine to do e.g.
with nested(open(filename) for filename in filenames) as files: ...
Here nested could take advantage of the delayed evaluation in the generator expression to invoke the __enter__ methods and call __exit__ on the opened files if any of the open calls fails. This would also leave a "trap" though since using a list comprehension would suffer the same problem as if nested took *args:
with nested([open(filename) for filename in filenames]) as files: ...
If generator expressions (aka "(open(filename) for filename in filenames)") had __enter__ and __exit__ that deferred to inner __enter__ and __exit__, this "trap" wouldn't exist:
with (open(filename) for filename in filenames) as files: ... # fine
with [open(filename) for filename in filenames] as files: ... # raises because list doesn't __enter__
mainly because it wouldn't work with arbitrary iterators or iterables.
(and if you need it to, "with (x for x in iterable)" would still be available)
Since generators already have a close method the obvious thing for generator.__exit__ to do (if it existed) would be to call that. That would make it possible to use patterns like:
    def cat(filenames):
        for filename in filenames:
            with open(filename) as infile:
                yield from infile

    with cat(filenames) as lines:
        for line in lines:
            if key in line:
                return line
Note that the return there stops iterating over the generator while it is suspended. Making the generator a context manager whose __exit__ calls close ensures that the context manager inside the generator is finalised.
Sure. We can always split generator methods and generator expressions. After all, you can't use "with" in a generator expression, so it makes no sense to call close there.
-- Oscar
On Mon, 18 Nov 2019 at 14:07, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 11:32 a.m., Oscar Benjamin wrote:
On Mon, 18 Nov 2019 at 13:12, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 9:10 a.m., Oscar Benjamin wrote:
with nested(open(filename) for filename in filenames) as files: ...
Here nested could take advantage of the delayed evaluation in the generator expression to invoke the __enter__ methods and call __exit__ on the opened files if any of the open calls fails. This would also leave a "trap" though since using a list comprehension would suffer the same problem as if nested took *args:
with nested([open(filename) for filename in filenames]) as files: ...
If generator expressions (aka "(open(filename) for filename in filenames)") had __enter__ and __exit__ that deferred to inner __enter__ and __exit__, this "trap" wouldn't exist:
[snip]
Since generators already have a close method the obvious thing for generator.__exit__ to do (if it existed) would be to call that. That would make it possible to use patterns like:
[snip]
Sure. We can always split generator methods and generator expressions. After all, you can't use "with" in a generator expression, so it makes no sense to call close there.
I don't think that splitting these is a good idea. Currently both of these return the same type of object: a generator. The generator expression

    gen = (x for x in y if z)

is explicitly defined as being equivalent to

    def _tmp():
        for x in y:
            if z:
                yield x

    gen = _tmp()

The resulting objects are the same and have the same methods etc.

--
Oscar
On Mon, 18 Nov 2019 at 11:12, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I am proposing the root of the problem here is the fact that open acquires its resource (the opened file descriptor) before __enter__ is called. This is what I mean by a context manager that "misbehaves". If there was a requirement on context managers that __exit__ cleans up after __enter__ and any resource that needs cleaning up should only be acquired in __enter__ then there would never have been a problem with nested. [...] What I am saying is that conceived as a context manager the object returned by open misbehaves. I think that not just nested but a number of other convenient utilities and patterns could have been possible if opened has been used instead of open and if context managers were expected to meet the constraint: """ There should be no need to call __exit__ if __enter__ has not been called. """ Of course a lot of time has passed since then and now there are probably many other misbehaving context managers so it might be too late to do anything about that.
Hi Oscar,

Thanks for the explanation. I see what you mean now, and that *was* something I got from the previous discussion, it's just that I guess I'm so used to the current behaviour that I never really thought of it as "misbehaviour".

I'm not 100% convinced that there aren't edge cases where even your strengthened requirements on a context manager might not be enough. For example, if __enter__ is called, but raises an exception, is calling __exit__ required then? Consider

    @contextmanager
    def open_2_files():
        f = open("file1")
        g = open("file2")
        try:
            yield (f, g)
        finally:
            g.close()
            f.close()

That meets your criterion, but if open("file2") fails, you're still in a mess. Of course, that's a toy example, and could be written to fix that, and we could even close that loophole by saying "a context manager should only manage one resource", but we can probably carry on down that route for quite a while (and "should only manage one resource" is not actually correct - the whole *point* of something like nested() would be to manage multiple resources).

So thanks for your explanation, which I appreciate. But I'm not sure "tightening up" the requirements on context managers is really necessary - it will always be a balance between simplicity and catching all the edge cases, and I don't think the case has been proved that the current design got that balance wrong (it'll always be a matter of personal opinion, to an extent, though, so debates like this will probably continue happening).

Paul
On Mon, 18 Nov 2019 at 15:54, Paul Moore <p.f.moore@gmail.com> wrote:
On Mon, 18 Nov 2019 at 11:12, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
I am proposing the root of the problem here is the fact that open acquires its resource (the opened file descriptor) before __enter__ is called. This is what I mean by a context manager that "misbehaves". If there was a requirement on context managers that __exit__ cleans up after __enter__ and any resource that needs cleaning up should only be acquired in __enter__ then there would never have been a problem with nested. [...] What I am saying is that conceived as a context manager the object returned by open misbehaves. I think that not just nested but a number of other convenient utilities and patterns could have been possible if opened has been used instead of open and if context managers were expected to meet the constraint: """ There should be no need to call __exit__ if __enter__ has not been called. """ Of course a lot of time has passed since then and now there are probably many other misbehaving context managers so it might be too late to do anything about that.
Hi Oscar, Thanks for the explanation. I see what you mean now, and that *was* something I got from the previous discussion, it's just that I guess I'm so used to the current behaviour that I never really thought of it as "misbehaviour". I'm not 100% convinced that there aren't edge cases where even your strengthened requirements on a context manager might not be enough. For example, if __enter__ is called, but raises an exception, is calling __exit__ required then?
It has never been the case that __exit__ would be called if __enter__ does not exit successfully, even for the basic form of the with statement, e.g.:

    class ContextMgr:
        def __enter__(self):
            print('Entering...')
            raise ValueError('Bad stuff')
        def __exit__(self, *args):
            print('Exiting')

    with ContextMgr():
        pass

Gives

    $ python f.py
    Entering...
    Traceback (most recent call last):
      File "f.py", line 8, in <module>
        with ContextMgr():
      File "f.py", line 4, in __enter__
        raise ValueError('Bad stuff')
    ValueError: Bad stuff

You can also see this in the original specification of the with statement, since __enter__ is called outside the try suite: https://www.python.org/dev/peps/pep-0343/#specification-the-with-statement
Consider
    @contextmanager
    def open_2_files():
        f = open("file1")
        g = open("file2")
        try:
            yield (f, g)
        finally:
            g.close()
            f.close()
That meets your criterion, but if open("file2") fails, you're still in a mess. Of course, that's a toy example, and could be written to fix that,
That example is a poor context manager by anyone's definition and can easily be fixed:

    @contextmanager
    def open_2_files():
        with open('file1') as f:
            with open('file2') as g:
                yield (f, g)
and we could even close that loophole by saying "a context manager should only manage one resource", but we can probably carry on down that route for quite a while (and "should only manage one resource" is not actually correct - the whole *point* of something like nested() would be to manage multiple resources).
I don't see why you would say that managing multiple resources is a problem here. It's a question of who is responsible for what. The context manager itself is responsible for cleaning up anything if an exception is raised *inside* its __enter__ and __exit__ methods. Once the manager returns from __enter__ though, it hands over control. Then the with statement and other supporting utilities are responsible for ensuring that __exit__ is called at the appropriate later time. The problem with a misbehaving context manager is that it creates a future need to call __exit__ before it has been passed to a with statement or any other construct that can guarantee to do that.

--
Oscar
On Mon, 18 Nov 2019 at 17:17, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
The problem with a misbehaving context manager is that it creates a future need to call __exit__ before it has been passed to a with statement or any other construct that can guarantee to do that.
You seem to be focusing purely on the usage

    with open(filename) as f:
        # use f

But open() isn't designed *just* to be used in a with statement. It can be used independently as well. What about

    f = open(filename)
    header = f.readline()
    with f:
        # use f

The open doesn't "create a future need to call __exit__". It *does* require that the returned object gets closed at some stage, but you can do that manually (for example, if "header" in the above is "do not process", maybe you'd close and return early). The context manager behaviour of a file object lets you use a with statement to say "when this block completes, make sure the file is closed", but it does *not* require you to do so.

You can wrap open() in a context manager like opened() that *does* work like that, but it's not the only way to write context managers. Certainly, nested() can't be written to safely work with the full generality of context managers as we currently have them, but as I said that's a trade-off.

Maybe I should ask the question the other way round. If we had opened(), but not open(), how would you write open() using opened()? It is, after all, easy enough to write opened() in terms of open().

Anyway, I already said that where you choose to draw the line over what a context manager is (assuming you feel that the current definition is wrong) depends on your perspective. So I'm not trying to persuade you that I'm right over this. Unless this turns into a PEP to change the language (and I think it would need a PEP) it's just speculation and collecting opinions, so you have mine ;-)

Paul
On Mon, 18 Nov 2019 at 17:46, Paul Moore <p.f.moore@gmail.com> wrote:
On Mon, 18 Nov 2019 at 17:17, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
The problem with a misbehaving context manager is that it creates a future need to call __exit__ before it has been passed to a with statement or any other construct that can guarantee to do that.
You seem to be focusing purely on the usage
with open(filename) as f: # use f
But open() isn't designed *just* to be used in a with statement. It can be used independently as well.
That's the problem I think. The context manager for closing the file is conflated as an object with the file object and its methods when they don't need to be the same object. This was considered in PEP 343: """ The problem is that in PEP 310, the result of calling EXPR is assigned directly to VAR, and then VAR's __exit__() method is called upon exit from BLOCK1. But here, VAR clearly needs to receive the opened file, and that would mean that __exit__() would have to be a method on the file. """ The discussion there shows that part of the design of the with statement was precisely so that it would not be necessary for the file object itself to *be* the context manager because __enter__ can return a different object.
What about
    f = open(filename)
    header = f.readline()
    with f:
        # use f
I would naturally rewrite that as

    with open(filename) as f:
        header = f.readline()
        # use f

which would work just as well with opened instead of open. The opened function returns a context manager whose __enter__ method returns the file object, which then has the corresponding file object methods.

[snip]
You can wrap open() in a context manager like opened() that *does* work like that, but it's not the only way to write context managers. Certainly, nested() can't be written to safely work with the full generality of context managers as we currently have them, but as I said that's a trade-off.
I think that nested was fine but in combination with open it was prone to misuse. By the time with/contextlib etc had shipped in 2.5 it was easier to blame nested (which also had other flaws) so it took the fall for open.
Maybe I should ask the question the other way round. If we had opened(), but not open(), how would you write open() using opened()? It is, after all, easy enough to write opened() in terms of open().
The idea would be to have both, so I don't think it matters, but if you had opened and wanted to build open out of it then you could do:

    def open(*args):
        return opened(*args).__enter__()
Anyway, I already said that where you choose to draw the line over what a context manager is (assuming you feel that the current definition is wrong), depends on your perspective.
It's not so much that the definition is wrong. It just isn't really defined, and that makes it difficult to do anything fancy with multiple context managers. You need protocol constraints on both sides to be able to build useful utilities/patterns. The bar set for nested was that it should be able to recover from errors before it even gets called!
So I'm not trying to persuade you that I'm right over this. Unless this turns into a PEP to change the language (and I think it would need a PEP) it's just speculation and collecting opinions, so you have mine ;-)
A PEP seems premature as I'm not sure I have any clear solution... -- Oscar
On Mon, 18 Nov 2019 at 18:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Mon, 18 Nov 2019 at 17:46, Paul Moore <p.f.moore@gmail.com> wrote:
What about
    f = open(filename)
    header = f.readline()
    with f:
        # use f
I would naturally rewrite that as
    with open(filename) as f:
        header = f.readline()
        # use f
which would work just as well with opened instead of open. The opened function returns a context manager whose __enter__ method returns the file object which then has the corresponding file object methods.
My apologies. I'm giving examples that I intend to suggest the issue I'm concerned about, but don't actually demonstrate them very well (because they are "toy" examples, and I'm rushing to put something together and not taking the time to give a better crafted example). By doing so I'm wasting everyone's time, as they show corrections that work with my toy cases, but leave me feeling "but that wasn't the point" (when I clearly hadn't explained the point well enough). I should know better than to do this - sorry.
I think that nested was fine but in combination with open it was prone to misuse. By the time with/contextlib etc had shipped in 2.5 it was easier to blame nested (which also had other flaws) so it took the fall for open.
I think that nested made assumptions about how context managers worked that weren't always true in practice. That's acceptable (if a little risky) as long as it's easy to know its limitations and not use it with inappropriate CMs. Unfortunately, it was extremely tempting to use it with open() which didn't satisfy the additional requirements that nested() imposed, and that made it an attractive nuisance. Treating the problem as being caused by nested() not working properly with *all* context managers was relatively easy, as nested was new. Trying to rework open to fit the expectations of nested would have been far harder (because open has been round so long that backward compatibility is a major issue), as well as not helping in the case of any *other* CMs that didn't work like nested expected them to.
Anyway, I already said that where you choose to draw the line over what a context manager is (assuming you feel that the current definition is wrong), depends on your perspective.
It's not so much that the definition is wrong. It just isn't really defined and that makes it difficult to do anything fancy with multiple context managers. You need protocol contraints on both sides to be able to build useful utilities/patterns. The bar set for nested was that it should be able to recover from errors before it even gets called!
The definition is clearly and precisely implied by the PEP and the semantics of the with statement. A CM is anything that has __enter__ and __exit__. See https://www.python.org/dev/peps/pep-0343/#standard-terminology. Maybe it "makes it difficult to do anything fancy with multiple context managers" - I don't know about that. It feels like a fairly large generalisation to get from "nested doesn't work the way we'd like it to" to that statement. But I haven't really thought about the problem much. In my experience, I don't find much need for fancy context manager combinations. I mostly just do "with <something>: <do some work>". I might refactor a complex set of with statements, but I'd do that in an application specific function, not by trying to design a generalised CM combinator. I sympathise with the interest in writing clever combinators. I just don't know that people are hurting from the lack of them. It's probably worth reiterating that this whole sub-thread is for me nothing more than interesting but entirely theoretical speculation - I don't see a need for *any* solution to the original problem of writing multiple CMs in a single with statement "more readably" than the existing approach of using backslashes, so there's no actual problem being solved by the proposals here (unless other use cases come up in the discussion - and I don't miss nested(), so rehabilitating that isn't compelling for me either).
A PEP seems premature as I'm not sure I have any clear solution...
It's certainly premature to write one at this point. But all I was saying is that if you intend to redefine a standard term like "context manager" that was itself defined in a PEP, you'll need a new PEP to do so at some point. Paul
On Mon, 18 Nov 2019 at 22:00, Paul Moore <p.f.moore@gmail.com> wrote:
On Mon, 18 Nov 2019 at 18:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Mon, 18 Nov 2019 at 17:46, Paul Moore <p.f.moore@gmail.com> wrote:
I think that nested was fine but in combination with open it was prone to misuse. By the time with/contextlib etc had shipped in 2.5 it was easier to blame nested (which also had other flaws) so it took the fall for open.
I think that nested made assumptions about how context managers worked that weren't always true in practice. That's acceptable (if a little risky) as long as it's easy to know its limitations and not use it with inappropriate CMs. Unfortunately, it was extremely tempting to use it with open() which didn't satisfy the additional requirements that nested() imposed, and that made it an attractive nuisance.
It's not really fair to say that nested imposed additional requirements. It required to receive arguments and actually get called!
Treating the problem as being caused by nested() not working properly with *all* context managers was relatively easy, as nested was new. Trying to rework open to fit the expectations of nested would have been far harder (because open has been round so long that backward compatibility is a major issue), as well as not helping in the case of any *other* CMs that didn't work like nested expected them to.
The __enter__ and __exit__ methods of file objects were new in the same release as with/nested etc but I guess that it was all new at the time and people were just getting used to with so nested would have seemed like a more obscure case.
Anyway, I already said that where you choose to draw the line over what a context manager is (assuming you feel that the current definition is wrong), depends on your perspective.
It's not so much that the definition is wrong. It just isn't really defined and that makes it difficult to do anything fancy with multiple context managers. You need protocol contraints on both sides to be able to build useful utilities/patterns. The bar set for nested was that it should be able to recover from errors before it even gets called!
The definition is clearly and precisely implied by the PEP and the semantics of the with statement. A CM is anything that has __enter__ and __exit__. See https://www.python.org/dev/peps/pep-0343/#standard-terminology.
That definition is sufficient to clarify the terminology but it does nothing to define the expected behaviour of the context managers themselves - what should those methods do? Context managers and the with statement are all about assigning responsibility so there needs to be an understanding of what has responsibility for what at any given time.
Maybe it "makes it difficult to do anything fancy with multiple context managers" - I don't know about that. It feels like a fairly large generalisation to get from "nested doesn't work the way we'd like it to" to that statement.
The precise limitation on any CM utility is that it must only receive one context manager at a time in any given function or method call. If it receives two or more - whether as separate arguments or in a container - then in combination with open it will suffer the same problem as nested. Designing an API around that limitation is awkward: probably ExitStack can't be improved upon in that respect. -- Oscar
On Mon, 18 Nov 2019 at 23:42, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
The definition is clearly and precisely implied by the PEP and the semantics of the with statement. A CM is anything that has __enter__ and __exit__. See https://www.python.org/dev/peps/pep-0343/#standard-terminology.
That definition is sufficient to clarify the terminology but it does nothing to define the expected behaviour of the context managers themselves - what should those methods do?
My view is that they can do whatever you like. If they give useful behaviour when used in a with statement, then that's an acceptable definition. Basically, the context manager protocol and the with statement provide a mechanism, not a policy. Typically, that mechanism is used for managing resources, but not always.
Context managers and the with statement are all about assigning responsibility so there needs to be an understanding of what has responsibility for what at any given time.
Somewhere (I can't recall where) there's an example of context managers that allow you to do something like

```
with html():
    with body():
        with div(class_="main-panel"):
            ...
```

Whether that counts as a clever usage, or an abuse, is somewhat a matter of opinion. I'm also not clear whether the rules you are proposing would allow or prohibit such a use. I'm also not clear what nested(html(), body()) might mean. It's possible to work out the behaviours from just the definition of the mechanism. But I'm not clear how I'd interpret a rule about "what has responsibility for what" in this example.

Conceded, that's a general statement, not a specific rule. But the rule you gave earlier:
If there was a requirement on context managers that __exit__ cleans up after __enter__ and any resource that needs cleaning up should only be acquired in __enter__ then there would never have been a problem with nested.
doesn't really apply here, as I don't think of the HTML example in terms of "resources".

To look at this another way, using the example of open again, what open() returns isn't an object that manages a resource - it's the resource *itself* (the open file). The fact that the resource can be self-managing at all is a result of the fact that the CM protocol and the with statement are defined purely in terms of a mechanism. Open file objects aren't closed using __exit__(), they are closed using close(). But by making __enter__() do nothing and __exit__() call close(), you can make file objects work with the with statement. But you can't make a file object's __enter__ method "acquire any resource that needs cleaning up", because the __enter__ is a method on that resource in the first place.

So your proposal is really saying that self-managing resources are disallowed, and all resources need a *pair* of classes - one to represent the resource, and one to represent a "manager" for that resource. That's possible (that's what the opened() example in the with statement PEP did) but it's more restrictive than the current context manager protocol. The restrictions allow you to make more assumptions, and hence write certain functions that you otherwise couldn't, but do the benefits justify the costs?

On Tue, 19 Nov 2019 at 07:33, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think a better way to say it would be that the __enter__ method should be atomic -- it should either acquire all the resources it needs or none of them. Then it's clear that the with statement should call __exit__ if and only if __enter__ does not raise an exception.
I like that characterisation better, but it still makes an implicit assumption that all context managers own the resources they manage. Which disallows self-managing resources. Paul
On Tue, 19 Nov 2019 at 08:21, Paul Moore <p.f.moore@gmail.com> wrote:
On Mon, 18 Nov 2019 at 23:42, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Context managers and the with statement are all about assigning responsibility so there needs to be an understanding of what has responsibility for what at any given time.
Somewhere (I can't recall where) there's an example of context managers that allow you to do something like
```
with html():
    with body():
        with div(class_="main-panel"):
            ...
```
Whether that counts as a clever usage, or an abuse, is somewhat a matter of opinion.
I think that's fine as a use of context managers. I assume that the intention is that the CM returned by html() is responsible for adding both the opening and closing html tags. It isn't clear to me from the example how errors are expected to be handled, though - would you still want to add the closing tag, or would you just catch the error higher up and return a completely different error document to the user?
I'm also not clear whether the rules you are proposing would allow or prohibit such a use.
I'm also not clear what nested(html(), body()) might mean. It's possible to work out the behaviours from just the definition of the mechanism. But I'm not clear how I'd interpret a rule about "what has responsibility for what" in this example.
Conceded, that's a general statement not a specific rule. But the rule you gave earlier:
If there was a requirement on context managers that __exit__ cleans up after __enter__ and any resource that needs cleaning up should only be acquired in __enter__ then there would never have been a problem with nested.
doesn't really apply here, as I don't think of the HTML example in terms of "resources".
To look at this another way, using the example of open again, what open() returns isn't an object that manages a resource - it's the resource *itself* (the open file). The fact that the resource can be self-managing at all is a result of the fact that the CM protocol and the with statement are defined purely in terms of a mechanism. Open file objects aren't closed using __exit__(), they are closed using close(). But by making __enter__() do nothing and __exit__() call close, you can make file objects work with the with statement. But you can't make a file object's __enter__ method "acquire any resource that needs cleaning up", because the __enter__ is a method on that resource in the first place.
So your proposal is really saying that self-managing resources are disallowed, and all resources need a *pair* of classes - one to represent the resource, and one to represent a "manager" for that resource. That's possible (that's what the opened() example in the with statement PEP did) but it's more restrictive than the current context manager protocol. The restrictions allow you to make more assumptions, and hence write certain functions that you otherwise couldn't, but do the benefits justify the costs?
On Tue, 19 Nov 2019 at 07:33, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think a better way to say it would be that the __enter__ method should be atomic -- it should either acquire all the resources it needs or none of them. Then it's clear that the with statement should call __exit__ if and only if __enter__ does not raise an exception.
I like that characterisation better, but it still makes an implicit assumption that all context managers own the resources they manage. Which disallows self-managing resources.
Paul
Whoops I sent too soon. I'll try again... On Tue, 19 Nov 2019 at 11:01, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Tue, 19 Nov 2019 at 08:21, Paul Moore <p.f.moore@gmail.com> wrote:
On Mon, 18 Nov 2019 at 23:42, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Context managers and the with statement are all about assigning responsibility so there needs to be an understanding of what has responsibility for what at any given time.
Somewhere (I can't recall where) there's an example of context managers that allow you to do something like
```
with html():
    with body():
        with div(class_="main-panel"):
            ...
```
Whether that counts as a clever usage, or an abuse, is somewhat a matter of opinion.
I think that's fine as a use of context managers. I assume that the intention is that the CM returned by html() is responsible for adding both the opening and closing html tags. It isn't clear to me from the example how errors are expected to be handled, though - would you still want to add the closing tag, or would you just catch the error higher up and return a completely different error document to the user?
I'm also not clear whether the rules you are proposing would allow or prohibit such a use.
No one is proposing to prohibit such a use! I guess it's unclear what is being proposed because nothing specific has been proposed...
I'm also not clear what nested(html(), body()) might mean.
```
with nested(html(), body()):
    ...
```

is intended to be equivalent to

```
with html():
    with body():
        ...
```

The difference between them is what happens if body() raises before returning a CM. What I am saying is that it shouldn't be nested's responsibility to handle errors at the stage where the context managers are being created (before nested is called). The idea that nested should be able to handle that, or is otherwise flawed in design, comes from the fact that there are context managers like open() that are expected to fail before __enter__ is called.

If you (the author of the above) care about that then you should write your context managers so that the opening and closing tags are emitted in __enter__ and __exit__ rather than in the initial call to html(). Note that this happens automatically for you if you use the contextmanager decorator, which is the obvious way to implement the above:

```
@contextmanager
def html():
    print('<html>')
    yield
    print('</html>')
```
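Under that discipline, a well-behaved nested() is almost trivial to build on ExitStack. This is a hedged sketch, not the old contextlib.nested: it assumes each CM defers its real work to __enter__, and leaves errors raised while *creating* the CMs (before nested is called) as the caller's problem, exactly as argued above.

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def nested(*cms):
    """Enter the given context managers left to right; on any failure,
    the ExitStack unwinds the ones that were already entered."""
    with ExitStack() as stack:
        yield tuple(stack.enter_context(cm) for cm in cms)
```

The old failure mode stays outside nested's reach by design: in nested(open(a), open(b)), a failing open(b) leaks the first file before nested is ever called - which is the point about where responsibility should lie.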
Conceded, that's a general statement not a specific rule. But the rule you gave earlier:
If there was a requirement on context managers that __exit__ cleans up after __enter__ and any resource that needs cleaning up should only be acquired in __enter__ then there would never have been a problem with nested.
doesn't really apply here, as I don't think of the HTML example in terms of "resources".
Perhaps resources is not a generally applicable term here.
To look at this another way, using the example of open again, what open() returns isn't an object that manages a resource - it's the resource *itself* (the open file). The fact that the resource can be self-managing at all is a result of the fact that the CM protocol and the with statement are defined purely in terms of a mechanism. Open file objects aren't closed using __exit__(), they are closed using close(). But by making __enter__() do nothing and __exit__() call close, you can make file objects work with the with statement. But you can't make a file object's __enter__ method "acquire any resource that needs cleaning up", because the __enter__ is a method on that resource in the first place.
So your proposal is really saying that self-managing resources are disallowed, and all resources need a *pair* of classes - one to represent the resource, and one to represent a "manager" for that resource. That's possible (that's what the opened() example in the with statement PEP did) but it's more restrictive than the current context manager protocol. The restrictions allow you to make more assumptions, and hence write certain functions that you otherwise couldn't, but do the benefits justify the costs?
I think that at this stage it is not clear if the benefits justify the costs of any change, but at the beginning it would have been better to define expectations more clearly. If I were to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to:

1. Clearly define what a well-behaved context manager is.
2. Add convenient utilities for working with well-behaved context managers.
3. Add well-behaved alternatives for open and maybe others.
4. Add Random832's utility for adapting misbehaving context managers.

The point is not that anything should be disallowed but that we can have useful context manager utilities if we have a clear understanding of how to use them and easy ways to use them correctly. Then if someone hits up against a bug from using a misbehaving context manager where they shouldn't, the response can be "change the context manager or don't use it in that situation" rather than "that's a bug in the useful utilities so let's remove those". -- Oscar
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to:

1. Clearly define what a well-behaved context manager is.
2. Add convenient utilities for working with well-behaved context managers.
3. Add well-behaved alternatives for open and maybe others.
4. Add Random832's utility for adapting misbehaving context managers.
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable. A new name should be chosen, which *doesn't* have the implication that these are somehow "better" than traditional context managers. Maybe "resource containers", although as you point out above, resources probably aren't the key point here. Naming is hard, and I'll leave it up to you ;-)
The point is not that anything should be disallowed but that we can have useful context manager utilities if we have a clear understanding of how to use them and easy ways to use them correctly. Then if someone hits up against a bug from using a misbehaving context manager where they shouldn't the response can be "change the context manager or don't use it in that situation" rather than "that's a bug in the useful utilities so let's remove those".
Again, you're characterising things as "misbehaving". Rather I'd say that the *user* is at fault for attempting to use a context manager that doesn't satisfy the (additional) requirements imposed by the utility. The fix is to use something that *does* satisfy the utility's requirements (which may mean using a different object, or wrapping the context manager, or whatever). The tricky bit here is the human problem of making it clear to people what the additional constraints are, and how to tell what satisfies them - that was the original problem with nested, that it *didn't* do a good job of making it obvious that you needed to use opened() with it rather than open() (more accurately, I guess, that you had to use an "open file container" object, not a raw file object). Paul
On Tue, Nov 19, 2019, at 07:03, Paul Moore wrote:
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.
The problem is that if this model is perfectly okay, then *there's no reason for __enter__ to exist at all*. Why doesn't *every* context manager just do *everything* in __init__? I think it's clear that something was lost between the design and the implementation.
On Tue, 19 Nov 2019 at 16:05, Random832 <random832@fastmail.com> wrote:
On Tue, Nov 19, 2019, at 07:03, Paul Moore wrote:
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.
The problem is that if this model is perfectly okay, then *there's no reason for __enter__ to exist at all*. Why doesn't *every* context manager just do *everything* in __init__? I think it's clear that something was lost between the design and the implementation.
Because you don't have to create the context manager directly in the with statement - __init__ is called when the object is created, __enter__ is called when you enter the scope of the with statement. It's often the case that these two events happen at more or less the same time, but not always.

I'm not the one suggesting that behaviour that's been around since Python 2.5 should be changed, so I don't think it's down to me to come up with examples where that flexibility is required - if someone's proposing to abolish __enter__ then let them prove that it's OK to do so. And if no-one's suggesting that, then I fail to see your point.

Paul.
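A stdlib lock is perhaps the clearest existing case where creation and entry are necessarily separate events: the Lock object must outlive any single with block, while acquire/release happen on every entry and exit.

```python
import threading

lock = threading.Lock()   # __init__: the object exists long before any with
counter = [0]

def increment():
    with lock:            # __enter__ acquires, __exit__ releases, each time
        counter[0] += 1

threads = [threading.Thread(target=increment) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Collapsing everything into __init__ would make a reusable lock like this impossible to express as a context manager at all.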
@jayvdb on GitHub and I are working on a new version of one of my packages, stdio-mgr (https://github.com/bskinn/stdio-mgr), with a dramatically expanded API and capabilities. Context managers feature heavily in the planned design; part of the design plan is to allow instantiation of a StdioManager object prior to entering a context, so that the user can tweak settings on the resulting object beforehand:
```
cm = StdioManager()
cm.setting = True
cm.other_setting = False

with cm:
    ...  # do stuff with stdio managed
```
If this paradigm holds, we will *specifically* be exploiting the distinction between __init__ and __enter__. -42 to abolishing __enter__.
On 19/11/2019 17:12, Brian Skinn wrote:
@jayvdb on GitHub and I are working on a new version of one of my packages, stdio-mgr (https://github.com/bskinn/stdio-mgr), with a dramatically expanded API and capabilities.
Context managers feature heavily in the planned design; part of the design plan is to allow instantiation of a StdioManager object prior to entering a context, so that the user can tweak settings on the resulting object beforehand:
```
cm = StdioManager()
cm.setting = True
cm.other_setting = False

with cm:
    ...  # do stuff with stdio managed
```
If this paradigm holds, we will *specifically* be exploiting the distinction between __init__ and __enter__.
-42 to abolishing __enter__.
I concur. Logically the problem that people are complaining about is that we use open() as the context manager rather than a class that defers the actual file open to __enter__(). That's fixable by writing your own wrapper class, but having a builtin file context manager that deferred resource-consuming actions would be a Good Thing™ How about using Paths as file context managers? Just an idle thought. -- Rhodri James *-* Kynesim Ltd
On 2019-11-19 3:37 p.m., Rhodri James wrote:
On 19/11/2019 17:12, Brian Skinn wrote:
@jayvdb on GitHub and I are working on a new version of one of my packages, stdio-mgr (https://github.com/bskinn/stdio-mgr), with a dramatically expanded API and capabilities.
Context managers feature heavily in the planned design; part of the design plan is to allow instantiation of a StdioManager object prior to entering a context, so that the user can tweak settings on the resulting object beforehand:
```
cm = StdioManager()
cm.setting = True
cm.other_setting = False

with cm:
    ...  # do stuff with stdio managed
```
If this paradigm holds, we will *specifically* be exploiting the distinction between __init__ and __enter__.
-42 to abolishing __enter__.
I concur. Logically the problem that people are complaining about is that we use open() as the context manager rather than a class that defers the actual file open to __enter__(). That's fixable by writing your own wrapper class, but having a builtin file context manager that deferred resource-consuming actions would be a Good Thing™
How about using Paths as file context managers? Just an idle thought.
I still feel like opening the file but delaying the exception would be more beneficial and have better semantics. It allows cd, mv, rm, etc to happen without any surprising semantics, and fits the model you want it to fit.
On Nov 19, 2019, at 09:51, Soni L. <fakedme+py@gmail.com> wrote:
I still feel like opening the file but delaying the exception would be more beneficial and have better semantics. It allows cd, mv, rm, etc to happen without any surprising semantics, and fits the model you want it to fit.
Well, without any surprising semantics as long as you only care about POSIX or only care about Windows. Writing code that makes sense on both platforms (so rm will either work and leave my file handle open, or raise an exception so I have to remember to delete on close and that may possibly fail; the only thing I know is that it won't work and make my open fail, although my open might have failed anyway for different reasons…) is pretty hard.

Also, this leads to surprising semantics in other cases:

```
f1 = open(fn1)
f2 = open(fn2, 'w')
```

Any existing code that does this almost certainly expects that it's not going to create or truncate a file at fn2 if there's no file at fn1, but that would no longer be true. And you'd expect that putting f1 and f2 into a with statement would solve that, but it would actually make things worse: f1.__enter__ raises, so f2.__enter__ never gets called, so fn2 still gets created or truncated, and on top of that we leak a file handle. The only way to fix this is to move the open itself into a with statement - just like in Python 3.8, so we haven't gained anything.

Moving the actual open to __enter__, on the other hand, does solve the problem; no matter how you write things, the CM for fn2 will never be entered, so no file will be created or truncated and no fd will get leaked.
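That difference is easy to check with a small deferred-open wrapper built on contextlib (opening here is an illustrative name, not a stdlib function): when the first __enter__ fails, the second open never runs, so the output file is never created or truncated.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opening(name, mode="r"):
    f = open(name, mode)   # runs only when the with statement enters this CM
    try:
        yield f
    finally:
        f.close()

d = tempfile.mkdtemp()
fn1 = os.path.join(d, "missing.txt")   # does not exist
fn2 = os.path.join(d, "out.txt")
try:
    # fn1 fails in __enter__, so the fn2 expression is never even evaluated
    with opening(fn1) as f1, opening(fn2, "w") as f2:
        pass
except FileNotFoundError:
    pass
```

After this runs, fn2 does not exist - the behaviour you'd want, and the one the eager-open file object can't give you.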
On 19/11/2019 17:50, Soni L. wrote:
On 2019-11-19 3:37 p.m., Rhodri James wrote:
On 19/11/2019 17:12, Brian Skinn wrote:
@jayvdb on GitHub and I are working on a new version of one of my packages, stdio-mgr (https://github.com/bskinn/stdio-mgr), with a dramatically expanded API and capabilities.
Context managers feature heavily in the planned design; part of the design plan is to allow instantiation of a StdioManager object prior to entering a context, so that the user can tweak settings on the resulting object beforehand:
```
cm = StdioManager()
cm.setting = True
cm.other_setting = False

with cm:
    ...  # do stuff with stdio managed
```
If this paradigm holds, we will *specifically* be exploiting the distinction between __init__ and __enter__.
-42 to abolishing __enter__.
I concur. Logically the problem that people are complaining about is that we use open() as the context manager rather than a class that defers the actual file open to __enter__(). That's fixable by writing your own wrapper class, but having a builtin file context manager that deferred resource-consuming actions would be a Good Thing™
How about using Paths as file context managers? Just an idle thought.
I still feel like opening the file but delaying the exception would be more beneficial and have better semantics. It allows cd, mv, rm, etc to happen without any surprising semantics, and fits the model you want it to fit.
It really doesn't. In my experience, lying to the user (or the compiler, or whatever) is very rarely a good idea. Either the file is open or it isn't, and finding out half a mile of text away from where the error actually occurred that it isn't doesn't give you much opportunity to fix things. As to the expected semantics of cd, mv, rm etc, I wouldn't expect those to work on an open file at all, and I wouldn't be sure what happened if they did. -- Rhodri James *-* Kynesim Ltd
On Nov 19, 2019, at 09:40, Rhodri James <rhodri@kynesim.co.uk> wrote:
How about using Paths as file context managers? Just an idle thought.
Then how do you open a Path for writing, or in binary mode, etc.? Adding a Path.opening method that returns a file-on-enter context manager instead of a file would probably work. But is that better than an opening function that takes a Path (or any of the other valid arguments to open) as its first argument?
On 19/11/2019 17:57, Andrew Barnert wrote:
On Nov 19, 2019, at 09:40, Rhodri James <rhodri@kynesim.co.uk> wrote:
How about using Paths as file context managers? Just an idle thought.
Then how do you open a Path for writing, or in binary mode, etc.?
With additional parameters and/or methods, obviously.
Adding a Path.opening method that returns an file-on-enter context manager instead of a file would probably work.
Yes, that might work better. Something like:

```
p1 = pathlib.Path("foo/bar/wombat.baz")
with p1.opened_as("wb") as outfile:
    do_stuff(outfile)
```
But is that better than an opening function that takes a Path (or any of the other valid arguments to open) as its first argument?
To-may-to, to-mah-to. Personally I think it is better, because users won't be so quick to think "oh, that's a file" and try to use it directly. And no, I don't think that trying to open the file on the first attempt to read or write would be at all a good idea! -- Rhodri James *-* Kynesim Ltd
On Tue, Nov 19, 2019, at 11:26, Paul Moore wrote:
Because you don't have to create the context manager directly in the with statement
...what? that's *my* argument. that's *precisely* why I consider open to be 'misbehaving'. Because you *should* be able to create a context manager outside the with statement without causing these problems. Because calling open outside the with statement [absent other arrangements like an explicit try/finally] causes problems that wouldn't be there for a proper context manager that has __exit__ as the inverse of __enter__ rather than as the inverse of some other operation that happens before __enter__.
On Tue, 19 Nov 2019 at 19:03, Random832 <random832@fastmail.com> wrote:
On Tue, Nov 19, 2019, at 11:26, Paul Moore wrote:
Because you don't have to create the context manager directly in the with statement
...what? that's *my* argument. that's *precisely* why I consider open to be 'misbehaving'. Because you *should* be able to create a context manager outside the with statement without causing these problems. Because calling open outside the with statement [absent other arrangements like an explicit try/finally] causes problems that wouldn't be there for a proper context manager that has __exit__ as the inverse of __enter__ rather than as the inverse of some other operation that happens before __enter__.
We're just going round in circles here. You *can* call open outside the with statement. It's *perfectly* fine to do so. But if you do, it is *your* responsibility to manage the closing of the file object (which is done using close(), not __exit__()). That's been valid and acceptable for more years than I can recall (at least back to Python 1.4).

As a convenience, you can pass the file object to a with statement, to say "I want the file to be closed when control leaves this suite, regardless of how that happens - please handle the admin for me." That convenience ability is provided by file objects making __enter__ do nothing (because there's nothing you need to do on entering the with block to prepare for the tidying up) and making __exit__ act like close (because all you need to do on leaving the scope is close the file).

According to the PEP, and established usage since Python 2.5, an object that does this is called a context manager. It's a perfectly acceptable one. Even if it doesn't implement other behaviours that you would like it to have, that doesn't mean it's not a context manager. I'm fine with you coining a new term ("enhanced context manager", I don't know, I've already said naming is hard) for context managers with the additional properties you are interested in, but existing context managers are *not* somehow "misbehaving" or "broken", or "problematic" - it's your assumption that context managers have behaviour that isn't required of them by the definition of the term that's wrong.
Because you *should* be able to create a context manager outside the with statement without causing these problems.
Says who? The PEP doesn't say any such thing. If you are saying it, I disagree with you. If you are saying "it's obvious and everyone must surely agree", then sorry, I still disagree with you. Simply asserting the same thing repeatedly is not a particularly effective argument.

Demonstrate some benefits. I'll give you "we'd be able to create functions like nested() without worrying about users misusing them" for free. It's not worth a lot (IMO) but hey, you got it for free, what did you expect? ;-)

Examine the costs, and explain why they are worth accepting. Backward compatibility is a big one here, and you're going to need some serious benefits to offset it, or at least some really good ways to mitigate it - there's masses of code that passes file objects to with statements, as well as plenty of code that uses them independently.

I really don't see any new arguments being made here, to be honest.

Paul
On Tuesday, November 19, 2019 at 3:06:26 PM UTC-5, Paul Moore wrote:
On Tue, 19 Nov 2019 at 19:03, Random832 <random832@fastmail.com> wrote:
On Tue, Nov 19, 2019, at 11:26, Paul Moore wrote:
Because you don't have to create the context manager directly in the with statement
...what? that's *my* argument. that's *precisely* why I consider open to
be 'misbehaving'. Because you *should* be able to create a context manager outside the with statement without causing these problems. Because calling open outside the with statement [absent other arrangements like an explicit try/finally] causes problems that wouldn't be there for a proper context manager that has __exit__ as the inverse of __enter__ rather than as the inverse of some other operation that happens before __enter__.
We're just going round in circles here. You *can* call open outside the with statement. It's *perfectly* fine to do so. But if you do, it is *your* responsibility to manage the closing of the file object
So you're proposing that everyone who doesn't use open in a context manager should always write

```
f = open(...)
# Absolutely nothing between the above line and the line below.
try:
    ...
finally:
    f.close()
```

How is that any nicer than using a context manager? Because if you don't always write the above code, I think your code is way too fragile. If any line of code where the comment is raises, you have a resource leak.

(which is done using close(), not __exit__()). That's been valid and acceptable for more years than I can recall (at least back to Python 1.4).
It is valid, and I guess this is a matter of preference, but personally I don't think it's "acceptable" code. I think most code reviews would request changes to code that opens files without using a context manager.
As a convenience, you can pass the file object to a with statement, to say "I want the file to be closed when control leaves this suite, regardless of how that happens - please handle the admin for me." That convenience ability is provided by file objects making __enter__ do nothing (because there's nothing you need to do on entering the with block to prepare for the tidying up) and making __exit__ act like close (because all you need to do on leaving the scope is close the file).
According to the PEP, and established usage since Python 2.5, an object that does this is called a context manager. It's a perfectly acceptable one. Even if it doesn't implement other behaviours that you would like it to have, doesn't mean it's not a context manager. I'm fine with you coining a new term ("enhanced context manager", I don't know, I've already said naming is hard) for context managers with the additional properties you are interested in, but existing context managers are *not* somehow "misbehaving" or "broken", or "problematic" - it's your assumption that context managers have behaviour that isn't required of them by the definition of that term that's wrong.
I guess I'm in the camp of people that considers acquiring resources in the constructor as "misbehaving". Reason below…
Because you *should* be able to create a context manager outside the with statement without causing these problems.
Says who? The PEP doesn't say any such thing. If you are saying it, I disagree with you. If you are saying "it's obvious and everyone must surely agree", then sorry, I still disagree with you. Simply asserting the same thing repeatedly is not a particularly effective argument.
Demonstrate some benefits. I'll give you "we'd be able to create functions like nested() without worrying about users misusing them" for free. It's not worth a lot (IMO) but hey, you got it for free, what did you expect? ;-) Examine the costs, and explain why they are worth accepting. Backward compatibility is a big one here, and you're going to need some serious benefits to offset it, or at least some really good ways to mitigate it - there's masses of code that passes file objects to with statements, as well as plenty of code that uses them independently.

There's nothing preventing anyone from creating a context manager whenever they please. For example, an object could create a list of context managers and then acquire those later when the resources are needed. Are you proposing a system whereby objects that inherit from ContextManager can only come into existence in a with block? Right now, context managers can be created anywhere. Your best defense against leaking resources is acquiring them in the __enter__ method. Therefore, my personal opinion is that any object that acquires resources outside of an __enter__ method should probably be rewritten as a context manager.
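The rewrite being suggested (acquire in __enter__, not in the constructor) might look like this minimal sketch; `LazyFile` is a hypothetical name, not an existing API:

```python
class LazyFile:
    """Hypothetical sketch: nothing is acquired until __enter__ runs."""
    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode
        self._f = None          # no resource held yet
    def __enter__(self):
        self._f = open(self.path, self.mode)
        return self._f
    def __exit__(self, exc_type, exc, tb):
        self._f.close()
        self._f = None
        return False            # never suppress exceptions
```

Creating a LazyFile outside a with statement is harmless because construction commits to nothing; the leak window only opens once __enter__ has run.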
I really don't see any new arguments being made here, to be honest.
Paul
On Nov 19, 2019, at 08:04, Random832 <random832@fastmail.com> wrote:

On Tue, Nov 19, 2019, at 07:03, Paul Moore wrote:
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.
The problem is that if this model is perfectly okay, then *there's no reason for __enter__ to exist at all*. Why doesn't *every* context manager just do *everything* in __init__? I think it's clear that something was lost between the design and the implementation.
Forget about context managers for a second. A class can bind attributes in __new__ and return a fully initialized object. If that’s perfectly ok, why doesn’t every class do everything in __new__, in which case there’s no reason for __init__ to exist at all?

But in fact, it’s usually a good idea to bind your mutable attributes and attributes that you expect subclasses to override in __init__. This signals your intentions better, and makes it easier to use your class with a range of optional utilities. But it’s not mandatory, and sometimes there are good reasons to violate it. And yet, despite the split being entirely up to the class writers and there being no hard rules about it, it’s still useful.

So, why can’t much the same be true for context managers? It’s usually a good idea to do resource acquisition in __enter__, and also to have __init__ never raise. This signals your intentions better, and makes it easier to use your cm with a range of optional utilities, but it’s not mandatory, and sometimes there are good reasons not to.

A language could certainly live without the distinction, but it could also live without two-phase initialization. (C++ merges all of __new__, __init__, and __enter__ into one constructor, and __del__ and __exit__ into one destructor, and “resource acquisition is initialization” would work perfectly if not for half the stdlib and 80% of the third-party ecosystem being inherited without wrappers from C, and therefore not exception safe…). But that doesn’t mean a language can’t benefit from the distinction. For example, notice that Python doesn’t have C++‘s complicated member destructor rules or ObjC’s different kinds of attributes to manage ARC, and yet we can still get away with having resources with dynamic lifetimes (tied to an owning object rather than a lexical scope).
That works because resources can easily be used manually, rather than every resource being a context manager and only usable that way; otherwise we’d need language or library support for managing your attribute context. It isn’t perfect (you can’t screw up RAII in C++ if you only use RAII objects; you can easily screw up cleanup in Python even with objects that have cm support), but it mostly works. Arguably we’re getting half the benefit of an RAII system with only a quarter of the costs. (And part of the cost we’re skipping may be that it’s nearly impossible to add non-refcounting GC to an implementation of C++ or ObjC, but pretty easy for Python.)

I can imagine other designs that might have the same benefit and still not require __enter__. For example, make ExitStack syntactic and then eliminate the cm machinery:

```
scope:
    f1 = open(fn1)
    defer f1.close()
    f2 = open(fn2)
    defer f2.close()
```

More verbose, but simpler, and it means you don’t need to write anything to make an object manageable; the name “close” is just a convention rather than machinery. (For cleanup that’s not a single expression, you’d have to factor it out into a function or method—but that’s at worst equivalent to writing the __exit__ method today, and now you’d only need that for complicated resource managers rather than for all of them.)

Or just go back to Python 2.2 and mandate deterministic destruction (and if that means Jython can’t be simple and efficient, so be it) and then build from there. Now all you need is a way to create scopes without manually defining and calling nested functions and you’re done. But if we’re really going to rethink resource management from scratch, I don’t think we’re talking about Python anymore anyway.
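The hypothetical scope:/defer syntax can be approximated today with contextlib.ExitStack.callback; fn1 and fn2 stand in for real filenames:

```python
from contextlib import ExitStack
import os

fn1 = fn2 = os.devnull   # placeholder filenames for the sketch

with ExitStack() as scope:
    f1 = open(fn1)
    scope.callback(f1.close)   # plays the role of "defer f1.close()"
    f2 = open(fn2)
    scope.callback(f2.close)   # plays the role of "defer f2.close()"
    # ... use f1 and f2 ...

# callbacks run in reverse registration order, like deferred cleanups
```

Note the name "close" here really is just a convention: callback accepts any zero-argument callable.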
On 20/11/19 6:51 am, Andrew Barnert via Python-ideas wrote:
A class can bind attributes in __new__ and return a fully initialized object. If that’s perfectly ok, why doesn’t every class do everything in __new__, in which case there’s no reason for __init__ to exist at all?
If Python had been designed with the ability to subclass built-in immutable objects from the beginning, __init__ may well never have existed. I can't think of another language off the top of my head that splits the functionality of object creation in quite this way. C++ lets you override the 'new' operator, but that's strictly about allocating memory; it doesn't do any initialisation. -- Greg
On Nov 19, 2019, at 14:30, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 20/11/19 6:51 am, Andrew Barnert via Python-ideas wrote:
A class can bind attributes in __new__ and return a fully initialized object. If that’s perfectly ok, why doesn’t every class do everything in __new__, in which case there’s no reason for __init__ to exist at all?
If Python had been designed with the ability to subclass built-in immutable objects from the beginning, __init__ may well never have existed.
I doubt it. For simple classes, __init__ is just easier to use. No need to write a magical pseudo-classmethod, to explicitly super, etc. When you don’t need the flexibility of being able to return an arbitrary instance (maybe even of a different type than was requested), or to override the MRO, or to set up immutable values or something else that’s impossible or at least conceptually wrong to do in __init__, you don’t want the extra complexity of __new__, and I don’t think Guido would have forced it on us.
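As a concrete illustration (my example, not from the thread): subclassing an immutable type is the classic case where the work cannot wait for __init__:

```python
class Celsius(float):
    """float is immutable, so the numeric value must be fixed in __new__."""
    def __new__(cls, value):
        return super().__new__(cls, value)
    def __init__(self, value):
        # ordinary mutable attributes can still be bound here
        self.unit = "degC"

t = Celsius(36.6)
```

For a class that needed none of __new__'s flexibility, forcing everything through a magical pseudo-classmethod would just be extra complexity.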
I can't think of another language off the top of my head that splits the functionality of object creation in quite this way.
Off the top of my head, there’s Smalltalk, which I suspect is where Guido got the idea, and ObjC, which got it from Smalltalk, and Swift, which got it from ObjC, and Ruby, which got it from either Smalltalk or Python. And there are other languages outside the Smalltalk lineage that do two-phase construction. IIRC, Eiffel is sort of the reverse of Python, where the constructor implicitly calls new instead of vice versa, while Sather makes the constructor explicitly call new. C++ and its descendants don’t have two-phase initialization—but sometimes people hack it up manually; there’s a Design Pattern for it in Java. One use for it is needing to call virtual functions in the initializer (in C++11 and later, all of the cases where you actually couldn’t refactor your way out of that with no cost are solved, but it’s still sometimes convenient, or just familiar, to not do so). Sometimes it’s even about deferring exceptions, just like this discussion.
On 20/11/19 12:14 pm, Andrew Barnert wrote:
Off the top of my head, there’s Smalltalk, which I suspect is where Guido got the idea, and ObjC, which got it from Smalltalk, and Swift, which got it from ObjC,
I don't think Smalltalk/ObjC do quite the same thing. Sometimes you see a pattern like

```
(SomeClass new) initWithArgs: ...
```

but that's something you do explicitly -- it's not built into the language. If Python were like that, it would only have __new__, and if you wanted an __init__ you would have to call it yourself. -- Greg
On Nov 19, 2019, at 15:45, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 20/11/19 12:14 pm, Andrew Barnert wrote:
Off the top of my head, there’s Smalltalk, which I suspect is where Guido got the idea, and ObjC, which got it from Smalltalk, and Swift, which got it from ObjC,
I don't think Smalltalk/ObjC do quite the same thing. Sometimes you see a pattern like
(SomeClass new) initWithArgs: ...
but that's something you do explicitly -- it's not built into the language.
This is a minor difference. A lot of things in Smalltalk and ObjC are ubiquitous conventions instead of language-supported, and this is one of them. Calling new followed by init* is just a convention, but it’s a convention followed almost ubiquitously. In Python, the 99% case is automated—you have to opt out of it in your __new__ for the rare exceptions, instead of having to opt in to it everywhere except the rare exceptions. Which I think is an improvement, but it doesn’t really change anything fundamental. It’s still two-phase initialization where new/alloc/__new__ is a class method that can return anything it wants (like a cached object or an instance of a subclass) and init/__init__ is an instance method that finishes setting up whatever it returned. (In modern ObjC with ARC, it’s not quite “just a convention”, because the compiler assumes that you’re following it and you will break garbage collection, and get a warning but not an error, if you break the rules. But you don’t have to use ARC, and you can go around the conventions rather than breaking them, so I think “just a convention” is close enough.)
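In Python terms, the two-phase split being described looks like a __new__ that is free to return a cached instance, after which __init__ still runs on whatever was returned (an illustrative sketch):

```python
class Cached:
    """Sketch: __new__ may return an existing instance; __init__ then
    runs on whatever __new__ returned (so it re-runs on each call)."""
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
    def __init__(self):
        self.initialized = True

a = Cached()
b = Cached()
```

This mirrors the Smalltalk `new ... init` convention, except that Python calls the second phase for you automatically.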
On Tue, 19 Nov 2019 at 12:03, Paul Moore <p.f.moore@gmail.com> wrote:
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to: 1. Clearly define what a well-behaved context manager is. 2. Add convenient utilities for working with well behaved context managers. 3. Add well-behaved alternatives for open and maybe others. 4. Add Random832's utility for adapting misbehaving context managers.
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.
Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself. I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable. You say it's unreasonable to claim now that some long-existing context managers are "misbehaving". Perhaps there is a better word but an alternative that implies that all existing context managers are "perfectly acceptable" will not convey the useful point that some ways of doing things are better than others. -- Oscar
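The eager/lazy distinction can be made concrete with two toy classes (illustrative names; the `acquired` flag stands in for holding a real resource):

```python
class EagerCM:
    """Jumps the gun: committed as soon as it is constructed."""
    def __init__(self):
        self.acquired = True
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        self.acquired = False
        return False

class LazyCM:
    """Commits nothing until __enter__ is called."""
    def __init__(self):
        self.acquired = False
    def __enter__(self):
        self.acquired = True
        return self
    def __exit__(self, exc_type, exc, tb):
        self.acquired = False
        return False
```

An EagerCM that is constructed but never entered is already holding the resource; an abandoned LazyCM holds nothing, which is why the lazy form composes safely with helpers like nested().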
On 11/19/19 8:57 PM, Oscar Benjamin wrote:
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to: 1. Clearly define what a well-behaved context manager is. 2. Add convenient utilities for working with well behaved context managers. 3. Add well-behaved alternatives for open and maybe others. 4. Add Random832's utility for adapting misbehaving context managers.

On Tue, 19 Nov 2019 at 12:03, Paul Moore <p.f.moore@gmail.com> wrote:

That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.

Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself.
I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable. You say it's unreasonable to claim now that some long-existing context managers are "misbehaving". Perhaps there is a better word but an alternative that implies that all existing context managers are "perfectly acceptable" will not convey the useful point that some ways of doing things are better than others.
-- Oscar
If I understand it right, an eager context manager (like open currently is) allows itself to be used somewhat optionally, and not need to be in a with statement. A lazy context manager on the other hand, seems to be assuming that it will be used in a with statement (or something similar), as it is putting off some important work until it sees it is in there. The impetus for this seems to be to make a cleaner syntax for a with statement managing multiple resources through context managers, if we only need simple recovery (the built-in behaviour provided by the context manager). It may come at the cost of making more complex (and explicit) handling of conditions harder for current (or possible future) 'eager' managers. For example, if open doesn't actually open the file, but only prepares to open, then the use of a file outside a with becomes more complicated if we want somewhat precise control over the file. -- Richard Damon
On Wed, 20 Nov 2019 at 02:36, Richard Damon <Richard@damon-family.org> wrote:
On 11/19/19 8:57 PM, Oscar Benjamin wrote:
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to: 1. Clearly define what a well-behaved context manager is. 2. Add convenient utilities for working with well behaved context managers. 3. Add well-behaved alternatives for open and maybe others. 4. Add Random832's utility for adapting misbehaving context managers.

On Tue, 19 Nov 2019 at 12:03, Paul Moore <p.f.moore@gmail.com> wrote:

That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.

Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself.
I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable. You say it's unreasonable to claim now that some long-existing context managers are "misbehaving". Perhaps there is a better word but an alternative that implies that all existing context managers are "perfectly acceptable" will not convey the useful point that some ways of doing things are better than others.
-- Oscar
If I understand it right, an eager context manager (like open currently is) allows itself to be used somewhat optionally, and not need to be in a with statement. A lazy context manager on the other hand, seems to be assuming that it will be used in a with statement (or something similar), as it is putting off some important work until it sees it is in there.
The impetus for this seems to be to make a cleaner syntax for a with statement managing multiple resources through context managers, if we only need simple recovery (the built-in behaviour provided by the context manager). It may come at the cost of making more complex (and explicit) handling of conditions harder for current (or possible future) 'eager' managers. For example, if open doesn't actually open the file, but only prepares to open, then the use of a file outside a with becomes more complicated if we want somewhat precise control over the file.
The idea would be (or rather would have been) that open still does the same thing it did before and returns a file object with the usual methods. However that file object would *not* be a context manager. If you wanted a context manager for the file then you would have used opened (from PEP 343) rather than open and opened would give a context manager whose __enter__ would return the same file object that open otherwise returns. So it doesn't make other uses of the file object more complicated except that you choose whether to call open or opened depending on how you want to use the file. -- Oscar
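For reference, the `opened` helper from PEP 343 can be written with contextlib.contextmanager; the generator body only runs when __enter__ is called, so the file is opened lazily:

```python
from contextlib import contextmanager

@contextmanager
def opened(filename, mode="r"):
    # the body runs on __enter__, so nothing is acquired before then
    f = open(filename, mode)
    try:
        yield f
    finally:
        f.close()
```

With this, `with opened(fname) as f: ...` gives the lazy behaviour, while a plain `open(fname)` remains available for manual management.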
On Nov 19, 2019, at 18:53, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Wed, 20 Nov 2019 at 02:36, Richard Damon <Richard@damon-family.org> wrote:
If I understand it right, an eager context manager (like open currently is) allows itself to be used somewhat optionally, and not need to be in a with statement. A lazy context manager on the other hand, seems to be assuming that it will be used in a with statement (or something similar), as it is putting off some important work until it sees it is in there.
It impetus for this seems to be to make a cleaner syntax for a with statement managing multiple resources through context managers, if we only need simple recovery (the built in provided by the context manager). It may be at the cost of making more complex (and explicit) handling of conditions for current (or possible future) 'eager' managers. For example, if open doesn't actually open the file, but only prepares to open, then the use of a file outside a with becomes more complicated if we want somewhat precise control over the file.
The idea would be (or rather would have been) that open still does the same thing it did before and returns a file object with the usual methods. However that file object would *not* be a context manager. If you wanted a context manager for the file then you would have used opened (from PEP 343) rather than open and opened would give a context manager whose __enter__ would return the same file object that open otherwise returns. So it doesn't make other uses of the file object more complicated except that you choose whether to call open or opened depending on how you want to use the file.
Would you also have added sqlite3.connected, mmap.mmapped, socket.madefile, requests.got (this name scheme doesn’t extend as well as I thought it would…), etc.? At first it seems like a good idea, but ultimately it means you need two different functions or methods for every kind of resource.

Maybe a generic helper is a better idea. For the 80% case (where enter just has to call the function and stash and return the result, and exit just calls a close method on that result), instead of this:

```
with closing(urlopen(url)) as page:
    …
```

you do this:

```
with context(urlopen, url) as page:
    …
```

which gives you a lazy context manager that doesn’t call urlopen until enter. And for cases where the exit isn’t as simple as calling close, but is simple enough to just be a function call, and where the enter function doesn’t take an exit kwarg:

```
with context(FilesCache, exit=FilesCache.shutdown) as fc:
```

For anything more complicated you still have to write a custom context manager, but that’s not a huge deal considering how rare it is.

Now you don’t need opened and mmapped and so on; if you are calling one of them repeatedly, just define it locally:

```
opened = partial(context, open)
```

More importantly, you don’t need files, db connections, network requests, mmaps, and other resources to be context managers, you just need them to be compatible with context (which most of them already are, and even more would be if there were a concrete advantage to it). So there’s no point in eager context managers—sure, someone could go out of their way to write one, but then you probably could get away with saying that it’s “broken”, unlike today.

This is still more verbose than current Python, and passing a function and its args doesn’t look as nice as just calling the function—but remember that you can still trivially partial any resource constructor you plan to context manage repeatedly.
(And if that’s not good enough, you could elevate building and entering a context manager around a construction expression into syntactic sugar, but I doubt that would be needed.) But unless you have a time machine back to 2007, I still don’t think this is at all feasible, because of all of the code that would have to change, including half the existing third-party libraries that provide context managers and even more applications that rely on eager context manager behavior. So we’re still not talking about an idea for Python here. The idea in my other email, providing a wrapper to turn eager context managers lazy and an ABC to distinguish them and so on, may not be nearly as clean, but it seems a lot more feasible.
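A minimal sketch of the generic `context` helper described above (the helper and its `exit=` keyword are hypothetical, not an existing stdlib API; this builds on contextlib.contextmanager):

```python
from contextlib import contextmanager

@contextmanager
def context(factory, *args, exit=None, **kwargs):
    """Call factory lazily on __enter__; on the way out call exit(obj),
    defaulting to obj.close()."""
    obj = factory(*args, **kwargs)
    try:
        yield obj
    finally:
        if exit is None:
            obj.close()
        else:
            exit(obj)
```

Then `with context(open, fname) as f: ...` behaves like a lazy `opened`, while `exit=` covers resources whose cleanup method is not named close.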
On Nov 19, 2019, at 17:57, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself.
I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable.
What you’re calling “lazy” context managers are better in that they work with tools like nested. But “eager” context managers are better in that they can be used as resources when the context manager protocol doesn’t help, as when the releasing doesn’t happen in the same lexical scope as the acquiring. (Go back to my example with someone who opens a file, and then opens another file if that didn’t raise, to see why you can’t just use a lazy context manager anyway.) You can go back and forth arguing that the first one is more important or that the second one is more important, but neither argument rebuts the fact that the other kind actually does have benefits. So, where do you go from there?

Option 1 is to assume that context managers are, or can be, eager, and therefore tools like nested are bug magnets and can’t be put somewhere as prominent as contextlib. That’s where we are today, and you don’t like it.

Option 2 is to assume that context managers are lazy (maybe with some rare exceptions but those should be clearly called out), so file objects and SQLite connections and dozens of other context managers, stdlib and third party, all need to be fixed. That’s what you’re suggesting, but I doubt it’s ever going to happen.

Is there a third option? Maybe. If we accept that both exist, we just need some way to distinguish the two, and to make tools like nested raise if fed an eager one, right? That sounds like a job for an ABC—one that classes have to manually opt in to via inheritance or registration, but a bunch of the stuff in contextlib (including the wrapper around generator functions) that people use directly or use to create their own context managers is definitely lazy and would opt in. You can also create either a generic wrapper that wraps any eager context manager factory into a lazy one, or create specific lazy context managers or cm factories like opening, or both.
You can also do the reverse wrapper, and use that to try to convince people to always write their context managers lazy and wrap if they need an eager one. Submit all of this to contextlib2 (or, if Nick rejects it, create your own PyPI project instead) and see if people use it. If so, you can propose merging it all into the stdlib, and maybe even shortcuts like making tuples act like nested.

In the long run, you can start pressuring people to write their tutorials and blog posts and python-list and StackOverflow answers to favor creating and using lazy context managers whenever possible (“because they work with the tuple syntax” seems like a good argument…). And you can do all of this without needing to change the protocol.
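The opt-in ABC plus the wrapper that turns an eager context-manager factory into a lazy one might be sketched like this (all names hypothetical):

```python
from abc import ABC

class LazyContextManager(ABC):
    """Opt-in marker: conforming CMs acquire nothing before __enter__."""

class lazily(LazyContextManager):
    """Wrap a factory for an eager context manager so that nothing
    is acquired until __enter__ actually runs."""
    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        self._args = args
        self._kwargs = kwargs
        self._cm = None
    def __enter__(self):
        self._cm = self._factory(*self._args, **self._kwargs)
        return self._cm.__enter__()
    def __exit__(self, exc_type, exc, tb):
        return self._cm.__exit__(exc_type, exc, tb)
```

A tool like nested could then check isinstance(cm, LazyContextManager) and raise on anything that hasn't opted in.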
On 20/11/2019 01:57, Oscar Benjamin wrote:
On Tue, 19 Nov 2019 at 12:03, Paul Moore <p.f.moore@gmail.com> wrote:
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to: 1. Clearly define what a well-behaved context manager is. 2. Add convenient utilities for working with well behaved context managers. 3. Add well-behaved alternatives for open and maybe others. 4. Add Random832's utility for adapting misbehaving context managers.
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.
Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself.
I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable.
As context managers, yes, lazy managers make chaining them easier because there's no mess to clean up if the chain breaks while you are creating it. On the other hand, eager managers like open() can be used outside a "with" statement and still manage resources perfectly well for a lot of cases. It's a matter of fitness for different purposes, so even "preferable" is a relative term here. -- Rhodri James *-* Kynesim Ltd
On Nov 20, 2019, at 7:50 AM, Rhodri James <rhodri@kynesim.co.uk> wrote:
On 20/11/2019 01:57, Oscar Benjamin wrote:
On Tue, 19 Nov 2019 at 12:03, Paul Moore <p.f.moore@gmail.com> wrote:
On Tue, 19 Nov 2019 at 11:34, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
If I was to propose anything here it would not be to disallow anything that you can currently do with context managers. Rather the suggestion would be to: 1. Clearly define what a well-behaved context manager is. 2. Add convenient utilities for working with well behaved context managers. 3. Add well-behaved alternatives for open and maybe others. 4. Add Random832's utility for adapting misbehaving context managers.
That sounds reasonable, with one proviso. I would *strongly* object to calling context managers that conform to the new expectations "well behaved", and by contrast implying that those that don't are somehow "misbehaving". File objects have been considered as perfectly acceptable context managers since the first introduction of context managers (so have locks, and zipfile objects, which might also fall foul of the new requirements). Suddenly deeming them as "misbehaving" is unreasonable.

Perhaps a less emotive way of distinguishing these classes of context managers would be as "eager" vs "lazy". An eager context manager jumps the gun and does whatever needs undoing or following up before its __enter__ method is called. A lazy context manager waits until __enter__ is called before committing itself.

I don't really want to give a sense of equality between eager and lazy though. To me it is clear that lazy context managers are preferable.
As context managers, yes, lazy managers make chaining them easier because there's no mess to clean up if the chain breaks while you are creating it. On the other hand, eager managers like open() can be used outside a "with" statement and still manage resources perfectly well for a lot of cases. It's a matter of fitness for different purposes, so even "preferable" is a relative term here.
-- Rhodri James *-* Kynesim Ltd
To my mind, eager context managers nest just fine, if you put each into its own context. What the lazy managers seem to let you do is squish multiple context managers into a single context, but then you get the question of which of them is actually providing the context you are in.
On 20/11/2019 15:28, Richard Damon wrote:
On Nov 20, 2019, at 7:50 AM, Rhodri James <rhodri@kynesim.co.uk> wrote: As context managers, yes, lazy managers make chaining them easier because there's no mess to clean up if the chain breaks while you are creating it. On the other hand, eager managers like open() can be used outside a "with" statement and still manage resources perfectly well for a lot of cases. It's a matter of fitness for different purposes, so even "preferable" is a relative term here.
To my mind, eager context managers nest just fine, if you put each into their own context. What it seems the lazy managers let you do is squish multiple context managers into a single context, but then you get the question of which of them actually is providing the context that you are in?
This is just semantics. Other people have meant by "nest" what you and I meant by "squish" and "chain" respectively. Once squished, all of the context managers are providing/contributing to the context you are in, which is a new and different context all its own. We could write an explicit squisher class now, but syntactical help for an implicit squisher would be nice. -- Rhodri James *-* Kynesim Ltd
On Nov 18, 2019, at 09:47, Paul Moore <p.f.moore@gmail.com> wrote:
But open() isn't designed *just* to be used in a with statement. It can be used independently as well. What about
f = open(filename)
header = f.readline()
with f:
    # use f
The open doesn't "create a future need to call __exit__". It *does* require that the returned object gets closed at some stage, but you can do that manually (for example, if "header" in the above is "do not process", maybe you'd close and return early).
And if readline or the comparison raises an exception, you want to leak f? Unless there’s some case where you don’t want to close f at all, I don’t see why you don’t want a context manager. For example, your return early case:

with open(filename) as f:
    header = f.readline()
    if something(header):
        return
    # use f

… does exactly what you wanted, and also properly handles exceptions. Which is the whole point of context managers.

I suppose if you wanted to close f in some different way in some situations… but there is no different way to close files. And in fact, I think any object that has multiple different ways to close it shouldn’t be a context manager (if there’s not one obvious thing that “exit” means, how do you read it?); you’d want to have methods or functions to create multiple different context managers around that object. But for files, that isn’t an issue. Every call to open does create a future need to call close, and __exit__ obviously means close for files, so every call to open does create a future need to call __exit__ after all.

There are cases where that future close may not be lexically bound. Maybe for some cases you close the file at the end of the function, but for other cases you pass the file off to a daemon thread that will close the file when it finishes (or just leak the file and let the OS handle it if it doesn’t finish). That seems like a case for ExitStack rather than for using a file as a context manager.
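The hand-off case mentioned above can in fact be sketched with `contextlib.ExitStack.pop_all()`: acquire under the stack (so failures during validation still close the file), then detach so the cleanup becomes the recipient's responsibility. The function name `open_for_handoff` is made up for this sketch:

```python
import os
import tempfile
from contextlib import ExitStack


def open_for_handoff(path):
    # Acquire under an ExitStack; if anything below raises, the stack's
    # __exit__ closes the file. On success, pop_all() transfers the
    # pending cleanup to a new stack that the caller now owns.
    with ExitStack() as stack:
        f = stack.enter_context(open(path))
        # ... checks that may raise; on failure the stack closes f ...
        return f, stack.pop_all()


fd, path = tempfile.mkstemp()
os.close(fd)
f, owned = open_for_handoff(path)      # file is still open here ...
still_open = not f.closed
owned.close()                          # ... until the new owner closes it
os.unlink(path)
```

In the daemon-thread scenario, `owned` is what you would pass to the worker; calling `owned.close()` (or using `owned` in its own with statement) runs the deferred cleanup.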
On Tue, Nov 19, 2019 at 5:47 AM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
Unless there’s some case where you don’t want to close f at all, I don’t see why you don’t want a context manager. For example, your return early case:
with open(filename) as f:
    header = f.readline()
    if something(header):
        return
    # use f
… does exactly what you wanted, and also properly handles exceptions. Which is the whole point of context managers.
I suppose if you wanted to close f in some different way in some situations… but there is no different way to close files.
There is one other common way you might want to close a file, and that's "close it if I opened it, but otherwise don't". For example, if you're given a file name to output into, then write to that file; but otherwise, write to sys.stdout. I'm not sure what the best way to do that is, other than ExitStack, which feels clunky (like using a while loop to emulate a for loop - sure it works, but it feels wrong). ChrisA
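For what it's worth, since Python 3.7 `contextlib.nullcontext` covers exactly this "close it if I opened it, but otherwise don't" case without reaching for ExitStack. A minimal sketch (the function `write_report` is made up for illustration):

```python
import os
import sys
import tempfile
from contextlib import nullcontext


def write_report(lines, filename=None):
    # Open (and therefore own) the file only when a name was given;
    # otherwise wrap sys.stdout in a no-op manager so the with block
    # never closes it.
    cm = open(filename, "w") if filename else nullcontext(sys.stdout)
    with cm as out:
        for line in lines:
            out.write(line + "\n")


fd, path = tempfile.mkstemp()
os.close(fd)
write_report(["header", "body"], path)   # opened here, closed by the with
write_report(["to stdout"])              # sys.stdout left open afterwards
content = open(path).read()
os.unlink(path)
```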
On Mon, Nov 18, 2019, at 12:46, Paul Moore wrote:
But open() isn't designed *just* to be used in a with statement. It can be used independently as well. What about
f = open(filename)
header = f.readline()
with f:
    # use f
The open doesn't "create a future need to call __exit__". It *does* require that the returned object gets closed at some stage, but you can do that manually (for example, if "header" in the above is "do not process", maybe you'd close and return early).
sure, but "need to call close" is just a different spelling of "need to call __exit__".
Maybe I should ask the question the other way round. If we had opened(), but not open(), how would you write open() using opened()? It is, after all, easy enough to write opened() in terms of open().
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.

To that end, open() could look something like this:

def open(*a, **k):
    cm = opened(*a, **k)
    f = cm.__enter__()
    f.close = cm.__exit__
    return f
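The hypothetical `opened()` above is not in the stdlib, but a lazy version of it can be sketched today with `contextlib.contextmanager` (this is a sketch under that assumption, not the author's definitive design):

```python
import builtins
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opened(*args, **kwargs):
    # Lazy: nothing is acquired until __enter__ runs the generator body,
    # so merely constructing the manager cannot leak a handle.
    f = builtins.open(*args, **kwargs)
    try:
        yield f
    finally:
        f.close()


cm = opened("/no/such/file")           # no raise, nothing acquired yet
try:
    cm.__enter__()
    deferred = False
except OSError:
    deferred = True                    # the error waited for entry

fd, path = tempfile.mkstemp()
os.close(fd)
with opened(path) as f:                # handle exists only in the block
    data = f.read()
os.unlink(path)
```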
On Nov 18, 2019, at 10:51, Random832 <random832@fastmail.com> wrote:
On Mon, Nov 18, 2019, at 12:46, Paul Moore wrote:
But open() isn't designed *just* to be used in a with statement. It can be used independently as well. What about
f = open(filename)
header = f.readline()
with f:
    # use f
The open doesn't "create a future need to call __exit__". It *does* require that the returned object gets closed at some stage, but you can do that manually (for example, if "header" in the above is "do not process", maybe you'd close and return early).
sure, but "need to call close" is just a different spelling of "need to call __exit__".
Maybe I should ask the question the other way round. If we had opened(), but not open(), how would you write open() using opened()? It is, after all, easy enough to write opened() in terms of open().
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.
to that end, open() could look something like this:
def open(*a, **k):
    cm = opened(*a, **k)
    f = cm.__enter__()
    f.close = cm.__exit__
    return f
I think in that hypothetical language you might want a generic “releaser” function, or method on all cms, or even special syntax, to turn any cm into something that you’ll take care of closing later manually (usually, but not necessarily, by using the closing cm on it in a different lexical context that you end up passing the released cm to). (I think C++ smart pointers might be relevant here, or maybe something from Rust, although I haven’t thought it through in much detail.)
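The generic "releaser" idea above could be sketched as a plain function (the name `release` is hypothetical, purely for illustration): it enters the manager and hands back the object together with an explicit closer to be called in some other lexical context.

```python
import io


def release(cm):
    # Hypothetical releaser: enter the context manager now, and return
    # the managed object plus a zero-argument closer that runs the
    # deferred __exit__ later, wherever cleanup ends up living.
    obj = cm.__enter__()

    def close():
        cm.__exit__(None, None, None)

    return obj, close


obj, closer = release(io.StringIO("payload"))  # StringIO is its own cm
text = obj.read()      # use the object wherever, whenever ...
closer()               # ... then run the deferred __exit__ manually
```

This is roughly what `ExitStack.pop_all()` gives you today, minus the explicit ownership transfer that Rust-style moves would provide.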
On 2019-11-18 5:13 p.m., Andrew Barnert via Python-ideas wrote:
On Nov 18, 2019, at 10:51, Random832 <random832@fastmail.com> wrote:
On Mon, Nov 18, 2019, at 12:46, Paul Moore wrote:
But open() isn't designed *just* to be used in a with statement. It can be used independently as well. What about
f = open(filename)
header = f.readline()
with f:
    # use f
The open doesn't "create a future need to call __exit__". It *does* require that the returned object gets closed at some stage, but you can do that manually (for example, if "header" in the above is "do not process", maybe you'd close and return early).
sure, but "need to call close" is just a different spelling of "need to call __exit__".
Maybe I should ask the question the other way round. If we had opened(), but not open(), how would you write open() using opened()? It is, after all, easy enough to write opened() in terms of open().
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.
to that end, open() could look something like this:
def open(*a, **k):
    cm = opened(*a, **k)
    f = cm.__enter__()
    f.close = cm.__exit__
    return f
I think in that hypothetical language you might want a generic “releaser” function, or method on all cms, or even special syntax, to turn any cm into something that you’ll take care of closing later manually (usually, but not necessarily, by using the closing cm on it in a different lexical context that you end up passing the released cm to).
(I think C++ smart pointers might be relevant here, or maybe something from Rust, although I haven’t thought it through in much detail.)
could we tweak open() so it doesn't raise immediately? this would make it play nicer with __enter__ but would probably break some things. this would make open() itself "never" fail.
On 2019-11-18 5:22 p.m., Soni L. wrote:
On 2019-11-18 5:13 p.m., Andrew Barnert via Python-ideas wrote:
On Nov 18, 2019, at 10:51, Random832 <random832@fastmail.com> wrote:
I think in that hypothetical language you might want a generic “releaser” function, or method on all cms, or even special syntax, to turn any cm into something that you’ll take care of closing later manually (usually, but not necessarily, by using the closing cm on it in a different lexical context that you end up passing the released cm to).
(I think C++ smart pointers might be relevant here, or maybe something from Rust, although I haven’t thought it through in much detail.)
could we tweak open() so it doesn't raise immediately? this would make it play nicer with __enter__ but would probably break some things. this would make open() itself "never" fail.
let me ask again: can we make it so open() never fails, instead returning a file that can be either "open", "closed" or "errored"? operations on "errored" files would, well, raise. more specifically, __enter__ would raise. thus, `with (open("foo"), open("bar")) as (foo, bar):` would actually work.
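The proposed behaviour can be sketched as a wrapper class (purely illustrative, not today's builtin open): construction never raises, and any OSError is deferred to __enter__, i.e. an "errored" file object.

```python
class deferred_open:
    """Never raises at construction; an 'errored' instance raises the
    stored exception when entered (or, in the full proposal, on any
    method call)."""

    def __init__(self, *args, **kwargs):
        self._file = None
        self._error = None
        try:
            self._file = open(*args, **kwargs)
        except OSError as exc:
            self._error = exc              # state: "errored"

    def __enter__(self):
        if self._error is not None:
            raise self._error              # failure surfaces here
        return self._file

    def __exit__(self, *exc):
        if self._file is not None:
            self._file.close()
        return False


bad = deferred_open("/no/such/file")       # constructing does not raise
try:
    with bad:
        pass
    entered = True
except OSError:
    entered = False                        # the raise was deferred to here
```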
On Nov 18, 2019, at 16:35, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 5:22 p.m., Soni L. wrote:
On 2019-11-18 5:13 p.m., Andrew Barnert via Python-ideas wrote:
I think in that hypothetical language you might want a generic “releaser” function, or method on all cms, or even special syntax, to turn any cm into something that you’ll take care of closing later manually (usually, but not necessarily, by using the closing cm on it in a different lexical context that you end up passing the released cm to).
(I think C++ smart pointers might be relevant here, or maybe something from Rust, although I haven’t thought it through in much detail.)
could we tweak open() so it doesn't raise immediately? this would make it play nicer with __enter__ but would probably break some things. this would make open() itself "never" fail.
let me ask again: can we make it so open() never fails, instead returning a file that can be either "open", "closed" or "errored"?
operations on "errored" files would, well, raise.
more specifically, __enter__ would raise.
thus, `with (open("foo"), open("bar")) as (foo, bar):` would actually work.
Sure, that could work. After all, when POSIX open returns -1, if you don’t check it explicitly and just start calling functions, they all fail with EBADF. (Although, being C, you have to check _those_ failures explicitly—but that wouldn’t be an issue for Python.)

It seems like it would be just as much of a breaking change as everything else suggested here. Plenty of code expects open to fail immediately, and will do the wrong thing if it doesn’t, unless it’s all rewritten to use with immediately, and to use tuples as context managers rather than any other idiom when opening multiple files. Also, does every third-party (or maybe even stdlib?) file-like object that’s managing a resource have to change to not open the socket or whatever until you enter it?

But by the same token, I don’t think it breaks any _more_ code than the other ideas, and maybe it’s simpler than some of them?

But doesn’t that raise the same issue discussed in the other subthread of how you handle cases where you don’t want a context manager anywhere at all? Do you have to change them all to explicitly call __enter__, and then call __exit__ instead of close? Or do we add a really_open method paired with close for those cases? Or can we just ban all such cases and say that you’re always supposed to find a way to wrap them in a cm somehow?

One more thing: what happens when I, e.g., call open on an fd? Today we can expect that as soon as we do that, the file object owns the file handle. Either you’d change that so the file object doesn’t own the file handle until it’s entered, or you’d make open inconsistent about whether it owns the file handle before enter, which could be confusing.
On 2019-11-18 10:59 p.m., Andrew Barnert wrote:
On Nov 18, 2019, at 16:35, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 5:22 p.m., Soni L. wrote:
On 2019-11-18 5:13 p.m., Andrew Barnert via Python-ideas wrote:
I think in that hypothetical language you might want a generic “releaser” function, or method on all cms, or even special syntax, to turn any cm into something that you’ll take care of closing later manually (usually, but not necessarily, by using the closing cm on it in a different lexical context that you end up passing the released cm to).
(I think C++ smart pointers might be relevant here, or maybe something from Rust, although I haven’t thought it through in much detail.)
could we tweak open() so it doesn't raise immediately? this would make it play nicer with __enter__ but would probably break some things. this would make open() itself "never" fail.
let me ask again: can we make it so open() never fails, instead returning a file that can be either "open", "closed" or "errored"?
operations on "errored" files would, well, raise.
more specifically, __enter__ would raise.
thus, `with (open("foo"), open("bar")) as (foo, bar):` would actually work.
Sure, that could work. After all, when POSIX open returns -1, if you don’t check it explicitly and just start calling functions, they all fail with EBADFD. (Although, being C, you have to check _those_ failures explicitly—but that wouldn’t be an issue for Python.)
It seems like it would be just as much of a breaking change as everything else suggested here. Plenty of code expects open to fail immediately, and will do the wrong thing if it doesn’t, unless it’s all rewritten to use with immediately, and to use tuples as context managers rather than any other idiom when opening multiple files. Also, does every third-party (or maybe even stdlib?) file-like object that’s managing a resource have to change to not open the socket or whatever until you enter it?
No. They have to change to not *raise* until you either enter it, or call some other method on it.
But by the same token, I don’t think it breaks any _more_ code than the other ideas, and maybe it’s simpler than some of them?
But doesn’t that raise the same issue discussed in the other subthread of how you handle cases where you don’t want a context manager anywhere at all? Do you have to change them all to explicitly call __enter__, and then call __exit__ instead of close? Or do we add a really_open method paired with close for those cases? Or can we just ban all such cases and say that you’re always supposed to find a way to wrap them in a cm somehow?
No. If it's "errored" then close will raise, but otherwise it'll close. No need to explicitly call __enter__. Everything would work as it does today except open() itself wouldn't immediately raise but defer the raise to any method call - be that __enter__, read, close, whatever. If you don't want to use a cm, wrap the reads/writes/close in try/except.
One more thing: what happens when I, e.g., call open on an fd? Today we can expect that as soon as we do that, the file object owns the file handle. Either you’d change that so the file object doesn’t own the file handle until it’s entered, or you’d make open inconsistent about whether it owns a file object before enter, which could be confusing.
I don't know? Can open(fd) raise currently? If so, it'd stop raising, and would defer the raise to any method call, but otherwise open the fd immediately.
On Tue, 19 Nov 2019 at 00:35, Soni L. <fakedme+py@gmail.com> wrote:
On 2019-11-18 5:22 p.m., Soni L. wrote:
could we tweak open() so it doesn't raise immediately? this would make it play nicer with __enter__ but would probably break some things. this would make open() itself "never" fail.
let me ask again: can we make it so open() never fails, instead returning a file that can be either "open", "closed" or "errored"?
operations on "errored" files would, well, raise.
more specifically, __enter__ would raise.
thus, `with (open("foo"), open("bar")) as (foo, bar):` would actually work.
There would have to be a very strong reason for making this kind of change to open because it would break a lot of code. Nothing in this thread comes close to warranting such a change. However there could also be a new function with a different name that behaved in a different way so that people could choose to use that if they wanted. What you are suggesting is similar to the opened function discussed elsewhere in this thread. -- Oscar
On Mon, Nov 18, 2019, at 19:32, Soni L. wrote:
let me ask again: can we make it so open() never fails, instead returning a file that can be either "open", "closed" or "errored"?
For one thing, it'd have to *truly never* fail. Not just on I/O errors but on things like passing bad function arguments. That's hard to do with a function written in C, and it would also mean when you eventually get the error the traceback won't point at the line where you made a mistake.

And all "hybrid context manager" constructs would have to do this - not just open but requests.get and sqlite3.connect and whatever else is out there.

I think the best we can do is make new context managers and encourage their use, since the principle that context managers should not acquire resources that will need to be released until __enter__ is called was broken at the very beginning.
operations on "errored" files would, well, raise.
more specifically, __enter__ would raise.
thus, `with (open("foo"), open("bar")) as (foo, bar):` would actually work.
On 19/11/19 3:32 pm, Random832 wrote:
For one thing, it'd have to *truly never* fail.
Even if open() itself truly never failed, there would still be room to get yourself into trouble. E.g.

with (open(get_filename_1()), open(get_filename_2())) as (f1, f2):
    ...

If get_filename_2() fails, the first file would be left open.

-- Greg
On 2019-11-18 10:49, Random832 wrote:
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.
I agree. I think this discussion wouldn't be necessary if there were a context manager that was like open() but opened the file in __enter__ and closed it in __exit__. I would go so far as to say I think it would be a good idea to add such a context manager to Python and deprecate open().

When I teach Python, I don't really even explain open() in isolation to students, I just tell them to always use it with a with statement. If I see code that uses open() outside a with statement it's a red flag to me. These hypothetical cases where you might want to use open() outside a with statement seem, to me, to be fairly esoteric. The best policy is for every file-opening operation to occur as part of a context manager.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 18 Nov 2019, at 19:27, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2019-11-18 10:49, Random832 wrote:
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.
I agree. I think this discussion wouldn't be necessary if there were a context manager that was like open() but opened the file in __enter__ and closed it in __exit__. I would go so far as to say I think it would be a good idea to add such a context manager to Python and deprecate open().
When I teach Python, I don't really even explain open() in isolation to students, I just tell them to always use it with a with statement. If I see code that uses open() outside a with statement it's a red flag to me. These hypothetical cases where you might want to use open() outside a with statement seem, to me, to be fairly esoteric. The best policy is for every file-opening operation to occur as part of a context manager.
Esoteric? I work with files that are open beyond the scope of a single block of code all the time. Clearly I need to arrange to close the file eventually. Why would I want the ugliness of __exit__ rather than close()? I use with all the time when the open() is in linear code. Barry
On 18.11.2019 21:13, Barry wrote:
On 18 Nov 2019, at 19:27, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2019-11-18 10:49, Random832 wrote:
I would say open() is arguably a wart. Had context managers existed in the language all along, it should not be possible to do anything that creates an open-ended future requirement to do something (i.e. "require that the returned object gets closed at some stage") without either using a with statement or writing code to manually call __enter__ and __exit__. The likely most common case being a class that holds an open file, which should really have its own context manager that forwards to the file's.
I agree. I think this discussion wouldn't be necessary if there were a context manager that was like open() but opened the file in __enter__ and closed it in __exit__. I would go so far as to say I think it would be a good idea to add such a context manager to Python and deprecate open().
When I teach Python, I don't really even explain open() in isolation to students, I just tell them to always use it with a with statement. If I see code that uses open() outside a with statement it's a red flag to me. These hypothetical cases where you might want to use open() outside a with statement seem, to me, to be fairly esoteric. The best policy is for every file-opening operation to occur as part of a context manager,
Esoteric? I work with files that are open beyond the scope of a single block of code all the time. Clearly I need to arrange to close the file eventually. Why would I want the ugliness of __exit__ rather than close()? I use with all the time when the open() is in linear code.
Same here. In fact, I regularly have classes manage open files as part of their state, store them in instance attributes, have methods operate on them, and close them either as part of the instance cleanup or explicitly in a .close() or .commit() method. In fact, context managers themselves are often structured in this way, and the OP's use case could also be handled by such a context manager class which groups resources.

While context managers are nice for defining a block context, they are certainly not the only way to define the context of an operation.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Nov 18 2019)
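The pattern described above can be sketched as a small class (the name `LogWriter` is made up for this sketch): it owns an open file as part of its state, works with or without a with statement, and its __enter__/__exit__ simply forward to its own close().

```python
import os
import tempfile


class LogWriter:
    """Owns an open file as instance state; usable block-scoped via
    with, or longer-lived with an explicit close()."""

    def __init__(self, path):
        self._file = open(path, "a")

    def write(self, msg):
        self._file.write(msg + "\n")

    def close(self):
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


fd, path = tempfile.mkstemp()
os.close(fd)
with LogWriter(path) as log:   # block-scoped use ...
    log.write("started")
writer = LogWriter(path)       # ... or longer-lived, closed explicitly
writer.write("again")
writer.close()
content = open(path).read()
os.unlink(path)
```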
On Nov 18, 2019, at 12:34, Barry <barry@barrys-emacs.org> wrote:
Esoteric? I work with files that are open beyond the scope of a single block of code all the time. Clearly I need to arrange to close the file eventually. Why would I want the uglyness of __exit__ rather then close()? I use with all the time if the open() is with linear code.
I think his “esoteric” was a bit strong, but take a step back and think about this: If the normal way to open files gave you a context manager, and you had to “release” the file from it (with a function or method or special syntax or whatever, but something explicit), would that make your code substantially worse? If there were two builtins to open a file, one that gave you a cm and one that didn’t, would that raise the burden of learning and remembering Python too high? What if the second one wasn’t a builtin but had to be imported from io or something? Neither one requires (or even at all encourages) you to manually call __exit__ instead of close, but they solve the same problem (by just putting that __exit__ under the covers where you don’t have to see it, but that’s still a solution). Would either be terrible. And if it would allow us to make context managers easier to combine and reason about, maybe it would be worth it, even if it made some of your code a little more verbose (requiring a release, or an io.open, or whatever)? But I’m not sure we could get there from here in Python. Also, I suspect this wouldn’t really solve the problem. What you really want is something like Rust, where you don’t extract the raw object out of its ownership context, you move it from one ownership context to another—whether that’s a with statement in the caller, or an RAII wrapper around an attribute, or some threaded thingy, or whatever. (You _can_ get the raw file object or pointer to allocated memory or whatever, but you only do that in rare cases, mostly to build new move functions for new context managers.) And I don’t see how to fit that in the with statement design. It’s easy with the “everything is owned by default and every ownership is scoped by default” model that Rust has and C++ vaguely approximates, but with Python you’d need… I’m not sure, but I suspect it would be ugly.
On 2019-11-18 11:24, Brendan Barnwell wrote:
I agree. I think this discussion wouldn't be necessary if there were a context manager that was like open() but opened the file in __enter__ and closed it in __exit__. I would go so far as to say I think it would be a good idea to add such a context manager to Python and deprecate open().
We could bring back file() and improve one of them. 😂 -Mike
On 19/11/19 4:54 am, Paul Moore wrote:
"should only manage one resource" is not actually correct - the whole *point* of something like nested() would be to manage multiple resources
I think a better way to say it would be that the __enter__ method should be atomic -- it should either acquire all the resources it needs or none of them. Then it's clear that the with statement should call __exit__ if and only if __enter__ does not raise an exception. -- Greg
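The atomic-__enter__ rule above can be sketched with a multi-resource manager whose __enter__ either acquires every file or rolls back and re-raises. This is a hypothetical helper for illustration (the class name MultiOpen is not from the thread or the stdlib):

```python
# A hypothetical multi-file manager: __enter__ either opens every file
# or, on failure, closes the ones it managed to open and re-raises, so
# the caller never sees a half-acquired context.
class MultiOpen:
    def __init__(self, *filenames):
        self.filenames = filenames
        self.files = []

    def __enter__(self):
        try:
            for name in self.filenames:
                self.files.append(open(name))
        except Exception:
            # Roll back: close whatever we managed to open, then re-raise.
            for f in self.files:
                f.close()
            self.files = []
            raise  # __enter__ failed, so our __exit__ will never run
        return self.files

    def __exit__(self, *exc_info):
        for f in self.files:
            f.close()
        return False  # never suppress exceptions
```

With this shape, the with statement's contract holds: __exit__ runs if and only if __enter__ returned successfully, which is exactly the all-or-nothing behaviour described above.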
On Mon, Nov 18, 2019, at 03:42, Paul Moore wrote:
The context here has been lost - I've searched the thread and I can't find a proper explanation of how open() "misbehaves" in any way that seems to relate to this statement (I don't actually see any real explanation of any problem with open() to be honest). There's some stuff about what happens if open() itself fails, but I don't see how that results in a problem (as opposed to something like a subtle application error because the writer didn't realise this could happen).
Can someone restate the problem please?
This particular chain of discussion is regarding a proposal to solve the problem posed in the original topic by using a parenthesized tuple display, i.e. code that looks like the following:

```
with (open(filename1), open(filename2)) as (file1, file2):
    ...
```

If there is no special handling for this syntax, this would be equivalent to:

```
files = (open(filename1), open(filename2))
with files as (file1, file2):
    ...
```

i.e. the `open` calls are all finished (and throw any exceptions) before the tuple is constructed, and therefore its proposed __enter__ method cannot be called. This means that if open(filename2) fails, it is not protected by the with block as it would be normally if the error occurred in the file object's __enter__ method, and file1.__exit__() never gets called.

The concept of a callable that returns a context manager (whether that is open, requests.get, or any other context manager's constructor) and can throw an exception is therefore a problem for the tuple.__enter__ proposal which was being discussed in this subthread, and generally for any other construct like it.
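The failure mode described above is easy to demonstrate even today, without any tuple.__enter__. A small tracking wrapper (hypothetical, purely for illustration) shows the first file leaking when the second open raises:

```python
import os
import tempfile

# Hypothetical tracking wrapper so we can inspect what got opened.
opened = []

def tracking_open(name):
    f = open(name)
    opened.append(f)
    return f

d = tempfile.mkdtemp()
ok = os.path.join(d, "ok.txt")
with open(ok, "w") as f:
    f.write("hello")

try:
    # The tuple is built left to right; the second call raises before
    # the tuple (and any hypothetical tuple.__enter__) exists.
    pair = (tracking_open(ok), tracking_open(os.path.join(d, "missing.txt")))
except FileNotFoundError:
    pass

# The first file was opened, but nothing in the with machinery will
# ever close it.
print(len(opened), opened[0].closed)  # 1 False
```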
On Mon, 18 Nov 2019 at 17:55, Random832 <random832@fastmail.com> wrote:
This particular chain of discussion is regarding a proposal to solve the problem posed in the original topic by using a parenthesized tuple display, i.e. code that looks like the following:
with (open(filename1), open(filename2)) as (file1, file2):
Thanks. I'm still not convinced that's a "problem" that needs solving - backslashes are a little ugly but fine IMO. But I now understand the discussion in this subthread, so thanks for that. Paul
On Mon, Nov 18, 2019, at 12:59, Paul Moore wrote:
On Mon, 18 Nov 2019 at 17:55, Random832 <random832@fastmail.com> wrote:
This particular chain of discussion is regarding a proposal to solve the problem posed in the original topic by using a parenthesized tuple display, i.e. code that looks like the following:
with (open(filename1), open(filename2)) as (file1, file2):
Thanks. I'm still not convinced that's a "problem" that needs solving - backslashes are a little ugly but fine IMO. But I now understand the discussion in this subthread, so thanks for that. Paul
I think in a broader sense it does raise a question of "if it's okay for functions like open to be like this, why have __enter__ at all?" - but the barn door's been open nearly a decade on that.
On Fri, 15 Nov 2019 at 22:54, Chris Angelico <rosuav@gmail.com> wrote:
On Sat, Nov 16, 2019 at 9:44 AM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Fri, 15 Nov 2019 at 12:04, Serhiy Storchaka <storchaka@gmail.com> wrote:
15.11.19 12:40, Jonathan Fine wrote:
The original poster wanted, inside 'with' context management, to open several files. Context management is, roughly speaking, deferred execution wrapped in a try ... except ... statement.
In case of open() there is no deferred execution. The resource is acquired in open(), not in __enter__().
I've often thought that this was the root of various awkwardnesses with context managers. Ideally a well-behaved context manager would only use __exit__ to clean up after __enter__ but open doesn't do that. The nested context manager was designed for this type of well-behaved context manager but was then considered harmful because open (one of the most common context managers) misbehaves.
Maybe some of these things could be simpler if it was clarified that a context manager shouldn't acquire resource before __enter__ and a new version of open was provided.
Hmm. What exactly is the object that you have prior to the file being opened? It can't simply be a File, because you need to specify parameters to the open() call. Is it a "file ready to be opened"? What's the identity of that?
Good question :) Maybe it's an "opener":

```
class opener:
    def __init__(self, *args):
        self.args = args

    def __enter__(self):
        self.fileobj = open(*self.args)
        return self.fileobj

    def __exit__(self, *args):
        # Close the file; returning None lets any exception propagate.
        self.fileobj.close()

with opener('w.txt', 'w') as fout:
    fout.write('asd\n'*10)
```

-- Oscar
On 15/11/19 5:54 am, Paul Moore wrote:
On Thu, 14 Nov 2019 at 16:42, Random832 <random832@fastmail.com> wrote:
So, uh... what if we didn't need backslashes for statements that begin with a keyword and end with a colon?
Not sure about ambiguity, but it would require a much more powerful parser than Python currently has
I'm not convinced of that. The parser already handles ignoring newlines inside parenthesised expressions. Maybe the technique used for that could be adapted to ignore newlines between a statement-opening keyword and its matching colon? In any case, I really doubt that arbitrary lookahead is required to do this. -- Greg
On 2019-11-13 18:26, gabriel.kabbe@mail.de wrote:
Hello everybody,
today I tried to open four files simultaneously by writing
with ( open(fname1) as f1, open(fname2) as f2, open(fname3) as f3, open(fname4) as f4 ): ...
However, this results in a SyntaxError which is caused by the extra brackets. Is there a reason that brackets are not allowed in this place?
The syntax (slightly simplified) is:

```
"with" expression ["as" name] ":"
```

but the expression itself can start with a parenthesis, so if it saw a parenthesis after the "with" it would be ambiguous. For example, compare:

```
with (open(fname1) as f1, open(fname2) as f2):
```

with:

```
with (open(fname1)) as f1, open(fname2) as f2:
```
On Wed, Nov 13, 2019 at 11:26 AM MRAB <python@mrabarnett.plus.com> wrote:
"with" expression ["as" name] ":"
but the expression itself can start with a parenthesis, so if it saw a parenthesis after the "with" it would be ambiguous
I have used 'with' for so long that I was under the impression that the as-target was just a name as in MRAB's simplified syntax above, so imagine my surprise when I tried putting parentheses around the target and didn't get a syntax error straight away. I of course had to explore a bit, and came up with this. The ugly formatting of the with is simply to show that the parens behave as expected.

```
class opener:
    def __init__(self, *files):
        self.files = [open(file) for file in files]

    def __enter__(self):
        return [file.__enter__() for file in self.files]

    def __exit__(self, *exc_info):
        for file in self.files:
            file.__exit__(*exc_info)
        return True

with opener(
        'x', 'y', 'z'
    ) as (
        f1, f2, f3
    ):
    print(f1)
    print(f1.closed)
    print(f2)
    print(f3)

print(f1.closed)
```
On Thu, Nov 14, 2019 at 8:48 AM Eric Fahlgren <ericfahlgren@gmail.com> wrote:
I have used 'with' for so long that I was under the impression that the as-target was just a name as in MRAB's simplified syntax above, so imagine my surprise when I tried putting parentheses around the target and didn't get a syntax error straight away. I of course had to explore a bit, and came up with this. The ugly formatting of the with is simply to show that the parens behave as expected.

```
class opener:
    def __init__(self, *files):
        self.files = [open(file) for file in files]

    def __enter__(self):
        return [file.__enter__() for file in self.files]

    def __exit__(self, *exc_info):
        for file in self.files:
            file.__exit__(*exc_info)
        return True

with opener('x', 'y', 'z') as (f1, f2, f3):
    print(f1)
    print(f1.closed)
    print(f2)
    print(f3)

print(f1.closed)
```
Note that the semantics here are NOT the same as the semantics of either ExitStack or a single large 'with' statement. (And I'm not sure whether or not those two are the same.) With your opener, you first open each file, then enter each file; and only then are you considered to be inside the context. With a large 'with' statement, they should (I believe) be opened and entered individually. If one of the files fails to open, your constructor will fail, and self.files won't be set - you won't exit each of the files that you *did* open. ChrisA
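The individually-entered semantics described here are what contextlib.ExitStack gives you. A minimal sketch of an opener built on it (the helper name open_all is an assumption, not stdlib):

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def open_all(*filenames):
    # Each file is opened and entered one at a time. If any open() fails,
    # ExitStack closes the files already entered (in reverse order) before
    # the exception propagates, unlike a tuple of open() calls.
    with ExitStack() as stack:
        yield [stack.enter_context(open(name)) for name in filenames]
```

Usage would look like `with open_all('x', 'y', 'z') as (f1, f2, f3): ...`, and a failure opening 'z' would still close 'x' and 'y'.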
What Eric Fahlgren wants is basically the deprecated contextlib.nested function; that function should have the right semantics. See https://github.com/python/cpython/blob/2.7/Lib/contextlib.py#L88-L129

On Wed, Nov 13, 2019 at 6:55 PM Chris Angelico <rosuav@gmail.com> wrote:
On Thu, Nov 14, 2019 at 8:48 AM Eric Fahlgren <ericfahlgren@gmail.com> wrote:

I have used 'with' for so long that I was under the impression that the as-target was just a name as in MRAB's simplified syntax above, so imagine my surprise when I tried putting parentheses around the target and didn't get a syntax error straight away. I of course had to explore a bit, and came up with this. The ugly formatting of the with is simply to show that the parens behave as expected.

```
class opener:
    def __init__(self, *files):
        self.files = [open(file) for file in files]

    def __enter__(self):
        return [file.__enter__() for file in self.files]

    def __exit__(self, *exc_info):
        for file in self.files:
            file.__exit__(*exc_info)
        return True

with opener('x', 'y', 'z') as (f1, f2, f3):
    print(f1)
    print(f1.closed)
    print(f2)
    print(f3)

print(f1.closed)
```

Note that the semantics here are NOT the same as the semantics of either ExitStack or a single large 'with' statement. (And I'm not sure whether or not those two are the same.) With your opener, you first open each file, then enter each file; and only then are you considered to be inside the context. With a large 'with' statement, they should (I believe) be opened and entered individually. If one of the files fails to open, your constructor will fail, and self.files won't be set - you won't exit each of the files that you *did* open.
ChrisA
-- Sebastian Kreft
On Wed, Nov 13, 2019 at 2:09 PM Sebastian Kreft <skreft@gmail.com> wrote:
What Eric Fahlgren wants is basically the deprecated contextlib.nested function, that function should have the right semantics. See https://github.com/python/cpython/blob/2.7/Lib/contextlib.py#L88-L129
Oh, no, I don't want it. :) I just was killing time reading py-ideas while eating lunch and was somewhat surprised by the ability to return a tuple as the context manager result and have it unpack appropriately, nothing more. I absolutely would never use my example code in a production environment, and if not there, then why even bother using it in a hacked-up one-time script, where I'd just leave the files open and assume they'll be closed on exit...
participants (27)

- Andrew Barnert
- Barry
- Brendan Barnwell
- Brian Skinn
- Chris Angelico
- Eric Fahlgren
- Gabriel Kabbe
- gabriel.kabbe@mail.de
- Greg Ewing
- Guido van Rossum
- James Edwards
- Joao S. O. Bueno
- Jonathan Fine
- M.-A. Lemburg
- Mike Miller
- MRAB
- Neil Girdhar
- Oscar Benjamin
- Paul Moore
- Random832
- Rhodri James
- Richard Damon
- Richard Damon
- Ricky Teachey
- Sebastian Kreft
- Serhiy Storchaka
- Soni L.