[Python-ideas] Proposal: Allowing any variable to be used in a 'with... as...' expression

Yonatan Zunger zunger at humu.com
Sat May 18 20:13:49 EDT 2019


Hi everyone,

I'd like to bounce this proposal off everyone and see if it's worth
formulating as a PEP. I haven't found any prior discussion of it, but as we
all know, searches can easily miss things, so if this is old hat please LMK.

*Summary:* The construction

with expr1 as var1, expr2 as var2, ...:
    body

fails (with an AttributeError) unless each expression returns a value
satisfying the context manager protocol. Instead, we should permit any
expression to be used. If a value does not expose an __enter__ method, it
should behave as though its __enter__ method is return self; if it does not
have an __exit__ method, it should behave as though that method is return
False.
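
As a rough illustration only, the fallback described above can be emulated
in today's Python with a small wrapper. The name lenient is hypothetical and
not part of the proposal, which would build this behavior into the with
statement itself rather than require a wrapper.

```python
import io


class lenient:
    """Hypothetical wrapper emulating the proposed fallback semantics."""

    def __init__(self, obj):
        self._obj = obj

    def __enter__(self):
        # Look the method up on the type, as the with statement does.
        enter = getattr(type(self._obj), "__enter__", None)
        if enter is None:
            # No __enter__: behave as though it were "return self".
            return self._obj
        return enter(self._obj)

    def __exit__(self, exc_type, exc, tb):
        exit_ = getattr(type(self._obj), "__exit__", None)
        if exit_ is None:
            # No __exit__: behave as though it were "return False".
            return False
        return exit_(self._obj, exc_type, exc, tb)


# A plain value and a real context manager in one compound statement.
with lenient(None) as nothing, lenient(io.StringIO("data")) as stream:
    text = stream.read()
```

Here nothing is bound to None without error, while stream is still closed
on exit because io.StringIO supplies its own __exit__.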

*Rationale:* The with statement has proven to be a valued extension to
Python. In addition to providing improved readability for block scoping, it
has strongly encouraged the use of scoped cleanups for objects which
require them, such as files and mutexes, in the process eliminating a lot
of annoying bugs. I would argue that at present, whenever dealing with an
object which requires such cleanup at a known time, with should be the
default way of doing it, and *not* doing so is the sort of thing one should
be explaining in a code comment. However, the current syntax makes a few
common patterns harder to implement than they should be.

For example, this is a good pattern:

with functionReturningFile(...) as input:
   doSomething(input)

There are many cases where an Optional[file] makes sense as a parameter, as
well; for example, an optional debug output stream, or an input source
which may either be a file (if provided) or some non-file source (by
default). Likewise, there are many cases where a function may naturally
return an Optional[file], e.g. "open the file if the user has provided the
filename." However, the following is *not* valid Python:

with functionReturningOptionalFile(...) as input:
   doSomething(input)
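
A small sketch of the failure, assuming a hypothetical stand-in function
that returns None because no filename was supplied. The exception type is
the AttributeError mentioned in the summary on the interpreters current at
the time of writing; some later CPython releases report a TypeError instead,
so the sketch catches both.

```python
def function_returning_optional_file():
    # Hypothetical stand-in: the user supplied no filename.
    return None


try:
    with function_returning_optional_file() as stream:
        pass  # never reached: the with statement itself fails
except (AttributeError, TypeError) as error:
    # None has neither __enter__ nor __exit__.
    failure = type(error).__name__
```

The error is raised by the with statement before the body runs, which is
the behavior the rest of this proposal argues against.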

To handle this case, one has a few options. One may only use the 'with' in
the known safe cases:

inputFile = functionReturningOptionalFile(...)
if inputFile:
    with inputFile as input:
        doSomething(input)
else:
    doSomething(None)

(NB that this requires factoring the with statement body into its own
function, which may separately reduce readability and/or introduce
overhead); one may dispense with the 'with' clause and do it in the
pre-PEP343 way:

input = functionReturningOptionalFile(...)
try:
    doSomething(input)
finally:
    # Acquire outside the try so 'input' is always bound here.
    if input is not None:
        input.close()

(This sacrifices all the benefits of the with statement, and requires the
caller to explicitly call the cleanup methods, increasing error-proneness);
or one may construct an explicit 'dev-null' class and return it instead of
the file:

class DevNullFile(object):
    ...  # implement the entire File API, including a context manager

(This can only be described as god-awful, especially for complex APIs like
files.)

One obvious option would be to allow None to act as a context manager as
well. We might contrast this with PEP 336
<https://www.python.org/dev/peps/pep-0336/>, "Make None Callable." This was
rejected (rightly, I think) because "it is considered a feature that None
raises an error if called." For example, it means that if a function
variable has been nulled, attempting to call it later raises an error, as
this usually indicates a code mistake. In the case where that is not
correct, it is easy to assign a noop lambda to the function variable
instead of None, thus allowing the error-checking and the
function-deactivating behaviors to both persist, and in a clear and easily
understandable way.
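
A minimal sketch of that noop pattern, with hypothetical names (on_event,
notify) chosen only for illustration:

```python
# A do-nothing callable that accepts any arguments.
noop = lambda *args, **kwargs: None

# "Deactivated" handler: calling it is safe and does nothing.
on_event = noop


def notify(handler, message):
    # No None check is needed; an accidental None would still raise.
    return handler(message)


result = notify(on_event, "saved")
```

Passing None as the handler would still raise a TypeError at the call,
so the error-checking behavior is preserved alongside the noop.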

In this case, OTOH, the AttributeError raised if None is passed to a with
statement has significantly lower value. As the example above illustrates,
there are many cases where None is an entirely legitimate value to want to
pass, and unlike in the other situation, there is no equally easy way to
pass it. Furthermore, if the passing of None *is* an error in some case, it
is more useful to see that error at the site where the variable is actually
used in the with statement body -- the thing for which it does not make
sense to use None -- rather than at a structural declaration which
essentially defines a variable scope.

This is also the reason why such a change would impact relatively little
existing code: code already has to be structured to prevent this from
happening. If the assigned expression in the with statement could only
return None as a result of a code bug, and a piece of existing code is
relying on the with statement to catch it, the bug would instead fall through
and be caught by the body code itself, presumably giving a more coherent
error anyway. This is a nonzero change in behavior, but it's well within
the scope of behavior changes which normally occur from version to version.

One alternative to this proposal would be to have only None allowed to act
as a context manager. However, None is not particularly special in this
regard; the logic above applies to any function which might return a Union
type. Furthermore, allowing it for any type would permit the following
construction as well:

with expr1 as var1, expr2 as var2, ...:
    .... body ...

where the common factor between the variables is no longer their need for a
guaranteed cleanup operation, but simply that they are semantically all
tied to a single scope of the code. This improves code clarity, as it
allows the syntax to follow the intent more closely, and also eliminates
one other ugliness. In present Python, the required syntax for the above
would be

var1 = expr1
var3 = expr3
with expr2 as var2, expr4 as var4:
    ... body ...

where the variables in the 'with' statement are those which satisfy the
context manager protocol, and the ones above it are those which do not
satisfy the protocol. The split between the two is entirely tied to a
nonlocal fact about the code, namely the implementation of the return
values of each of the expressions, making it nonobvious which is which.
Worse, if the expressions depend on each other in sequence, this may have
to be broken up into

var1 = expr1
with expr2 as var2:
    var3 = expr3(var1, var2)
    with expr4(var3, ...) as var4:
        .... body ...

This seems to lose on every measure of clarity and maintainability relative
to the single compound 'with' statement above.

Finally, one may ask if an (effective) default implementation of a protocol
is ever a good idea. "Hidden defaults" are a great way to trigger
surprising behavior, after all. However, in this case I would argue that
the proposed default behavior is sufficiently obvious that there is no
risk. Someone seeing a compound 'with' statement of the above form would
naturally assume that its consequence is (a) to set each varN to the
corresponding exprN, and (b) to execute any scope-initializers tied to
exprN. Likewise, someone would naturally assume that nothing at all happens
at scope exit, which is exactly the behavior of __exit__ being 'return
False'. In fact, this *increases* local code clarity, since the
counter-case -- where the implementation of each defaults (effectively) to
raising an AttributeError -- is nonobvious and so requires that "nonlocal
knowledge" of the code to assemble with statements.

*Specific implementation proposal:* Actually defining __enter__ and
__exit__ methods for each object would be a lot of overhead for no good
value. Instead, we can easily implement this as a change to the specified
behavior of the 'with' statement, simply by changing the error-handling
behavior in the SETUP_WITH
<https://github.com/python/cpython/blob/d5d9e81ce9a7efc5bc14a5c21398d1ef6f626884/Python/ceval.c#L3119>
and WITH_CLEANUP_START
<https://github.com/python/cpython/blob/d5d9e81ce9a7efc5bc14a5c21398d1ef6f626884/Python/ceval.c#L3119>
cases in ceval.c. If this does proceed to the PEP stage, I'll put together
a changelist, but it's very straightforward. Null values for enter and exit
are no longer errors; if enter is null, then instead of decrementing the
refcount of mgr and calling enter, we leave the mgr refcount alone and push
it onto the stack in place of the result. If exit is null, we simply push
it onto the stack just like we would normally, and ignore it in
WITH_CLEANUP_START.
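
The intended semantics can be sketched in Python, loosely following the
with statement expansion given in PEP 343 but treating missing methods as
non-errors. This is only a model of the behavior, not the ceval.c change;
run_with and its body callable are hypothetical names used for illustration.

```python
import contextlib
import io
import sys


def run_with(mgr, body):
    """Hypothetical model: run body(value) under the proposed semantics."""
    # Look both methods up on the type; None stands in for a "null" slot.
    enter = getattr(type(mgr), "__enter__", None)
    exit_ = getattr(type(mgr), "__exit__", None)

    # Null enter: use the object itself in place of the enter result.
    value = mgr if enter is None else enter(mgr)
    try:
        result = body(value)
    except BaseException:
        # Null exit acts like "return False": the exception propagates.
        if exit_ is None or not exit_(mgr, *sys.exc_info()):
            raise
        return None
    else:
        # Null exit: nothing to call at scope exit.
        if exit_ is not None:
            exit_(mgr, None, None, None)
        return result


plain = run_with(None, lambda value: value)
read_back = run_with(io.StringIO("x"), lambda stream: stream.read())
suppressed = run_with(contextlib.suppress(KeyError), lambda _: {}["missing"])
```

Objects that already implement the protocol behave exactly as before; only
the two lookups that previously raised are changed.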