[Python-ideas] Make "yield" inside a with statement a SyntaxError

Ken Hilton kenlhilton at gmail.com
Wed Aug 8 02:14:47 EDT 2018


This mostly springs off of a comment I saw in some thread.

The point of a with statement is that it ensures that some resource will be
disposed of, yes? For example, this:

    with open(filename) as f:
        contents = f.read()

is better than this:

    contents = open(filename).read()

because the former definitely closes the file while the latter relies on
garbage collection?
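As a quick sanity check of that claim, here is a minimal sketch. The scratch file is a hypothetical one created with tempfile purely for the demonstration; the only thing it relies on is the standard "closed" attribute of file objects.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

with open(path) as f:
    contents = f.read()
# The file is closed as soon as the block exits, without waiting for the GC.
closed_after_with = f.closed

# Even if the block raises, the file is still closed on the way out.
try:
    with open(path) as g:
        raise RuntimeError("boom")
except RuntimeError:
    pass
closed_after_error = g.closed

os.remove(path)
```

Both flags end up True, which is the guarantee the one-liner version does not give.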

The point of a yield expression is to suspend execution. This is nice for
efficient looping because instead of having to hold all results in memory,
each result can be consumed immediately, yes? Therefore this:

    def five_to_one():
        for i in range(5):
            yield 5 - i

is better than this:

    def five_to_one():
        result = []
        for i in range(5):
            result.append(5 - i)
        return result

because the former suspends execution of "five_to_one" while the latter
holds all five results in memory?
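To make the "suspends execution" part concrete, here is a small sketch. The "log" list is an addition of mine, used only to record when the generator body actually runs, and the loop uses range(5) so that the generator counts from 5 down to 1.

```python
log = []  # records each time the generator body does some work

def five_to_one():
    for i in range(5):
        log.append(5 - i)
        yield 5 - i

gen = five_to_one()
log_before_first = list(log)   # calling the function runs none of its body

first = next(gen)              # runs up to the first yield, then suspends
log_after_first = list(log)    # only one value has been computed so far

remaining = list(gen)          # resumes the generator until it is exhausted
```

Only one value exists at a time until list() drains the rest, which is the memory saving described above.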

Now, let's take a look at the following scenario:

    def read_multiple(*filenames):
        for filename in filenames:
            with open(filename) as f:
                yield f.read()

Can you spot the problem? The "with open(filename)" statement is supposed
to ensure that the file object is disposed of properly. However, the "yield
f.read()" statement suspends execution within the with block, so if this
happened:

    for contents in read_multiple('chunk1', 'chunk2', 'chunk3'):
        if contents == 'hello':
            break

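and the contents of "chunk2" were "hello", then the loop would exit while
the generator is still suspended inside the with block, so "chunk2" would
not be closed at that point! It stays open until the generator itself is
closed or garbage-collected. Yielding inside a with block, therefore,
doesn't make sense and can only lead to obscure bugs.

The proper way to define the "read_multiple" function would be like so: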

    def read_multiple(*filenames):
        for filename in filenames:
            with open(filename) as f:
                contents = f.read()
            yield contents

Save the contents in a variable somewhere, then yield the variable, instead
of suspending execution within a context manager.
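To compare the two versions, here is a rough experiment, sketched under a few assumptions: the function names "read_multiple_inside" and "read_multiple_outside", the "opened" list, and the scratch files standing in for the chunks are all mine, added only so that the file objects can be inspected afterwards.

```python
import os
import shutil
import tempfile

opened = []  # every file object the generators open, kept for inspection

def read_multiple_inside(*filenames):
    # Yields while the with block is still active.
    for filename in filenames:
        with open(filename) as f:
            opened.append(f)
            yield f.read()

def read_multiple_outside(*filenames):
    # Leaves the with block before yielding.
    for filename in filenames:
        with open(filename) as f:
            opened.append(f)
            contents = f.read()
        yield contents

# Hypothetical scratch files standing in for 'chunk1' and 'chunk2'.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("chunk1", "spam"), ("chunk2", "hello")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as out:
        out.write(text)
    paths.append(path)

gen = read_multiple_inside(*paths)
for contents in gen:
    if contents == "hello":
        break
open_while_suspended = not opened[-1].closed  # chunk2 is still open here
gen.close()                                   # raises GeneratorExit at the yield
closed_after_close = opened[-1].closed        # the with block has now exited

opened.clear()
gen2 = read_multiple_outside(*paths)
for contents in gen2:
    if contents == "hello":
        break
closed_outside = opened[-1].closed            # closed before the yield ran

shutil.rmtree(tmpdir)
```

In the first version the file is released only once the generator is closed (explicitly here, otherwise whenever it is finalized); in the second it is already closed by the time the caller sees the contents.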

I believe all possible cases where one would yield inside a context manager
can be covered by saving anything required from the context manager and
then yielding the results outside. Therefore, I propose making a "yield"
inside a with block become a SyntaxError.

This means the first "read_multiple" definition I presented will become
illegal and fail *at compile-time*. However, it is still legal to define a
generator inside a with block:

    import functools

    def pass_file_chars(oldfunc):
        with open('secretfile') as f:
            contents = f.read()
            # functools.wraps needs the wrapped function as its argument
            @functools.wraps(oldfunc)
            def newfunc(*args, **kwargs):
                for char in contents:
                    yield oldfunc(char, *args, **kwargs)
        return newfunc

This is probably a bad example, but I hope it still explains why it should
be legal to define generators in context managers - as long as the with
block serves its purpose correctly, everything else should still work
normally.
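For anyone who wants to see how much existing code the rule would hit, the check can be approximated today with the ast module. This is only a sketch of the proposed rule, not how the compiler would implement it; the names "YieldInWithFinder" and "find_yield_in_with" are hypothetical, and edge cases such as a yield in a nested function's decorators or default arguments are not treated specially.

```python
import ast

class YieldInWithFinder(ast.NodeVisitor):
    """Collect line numbers of yields that sit directly inside a with block."""

    def __init__(self):
        self.with_depth = 0
        self.offenders = []

    def _visit_with(self, node):
        self.with_depth += 1
        self.generic_visit(node)
        self.with_depth -= 1

    visit_With = _visit_with
    visit_AsyncWith = _visit_with

    def _visit_scope(self, node):
        # A nested function or lambda starts a new scope, so a yield
        # inside it does not suspend the enclosing with block.
        saved, self.with_depth = self.with_depth, 0
        self.generic_visit(node)
        self.with_depth = saved

    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope
    visit_Lambda = _visit_scope

    def _visit_yield(self, node):
        if self.with_depth:
            self.offenders.append(node.lineno)
        self.generic_visit(node)

    visit_Yield = _visit_yield
    visit_YieldFrom = _visit_yield

def find_yield_in_with(source):
    finder = YieldInWithFinder()
    finder.visit(ast.parse(source))
    return finder.offenders

BAD = """
def read_multiple(*filenames):
    for filename in filenames:
        with open(filename) as f:
            yield f.read()
"""

NESTED = """
def pass_file_chars(oldfunc):
    with open('secretfile') as f:
        contents = f.read()
        def newfunc(*args, **kwargs):
            for char in contents:
                yield oldfunc(char, *args, **kwargs)
    return newfunc
"""

bad_lines = find_yield_in_with(BAD)        # flags the yield on line 5
nested_lines = find_yield_in_with(NESTED)  # nothing flagged
```

The first "read_multiple" is reported, while a generator that is merely defined inside a with block is left alone, matching the distinction drawn above.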

For those concerned about backwards compatibility: I believe that those who
attempt to yield inside a context manager will already discover that
results are undefined when doing so; this will simply make it more obvious
that suspending execution in a with block is not meant to happen, and
convert undefined behavior into a straight-up SyntaxError.

What are your thoughts?

Sharing,
Ken Hilton;