[Python-ideas] Make "yield" inside a with statement a SyntaxError

Ronald Oussoren ronaldoussoren at mac.com
Wed Aug 8 10:22:42 EDT 2018



> On 8 Aug 2018, at 14:35, Rhodri James <rhodri at kynesim.co.uk> wrote:
> 
> On 08/08/18 07:14, Ken Hilton wrote:
>> Now, let's take a look at the following scenario:
>>     def read_multiple(*filenames):
>>         for filename in filenames:
>>             with open(filename) as f:
>>                 yield f.read()
>> Can you spot the problem? The "with open(filename)" statement is supposed
>> to ensure that the file object is disposed of properly. However, the "yield
>> f.read()" statement suspends execution within the with block, so if this
>> happened:
>>     for contents in read_multiple('chunk1', 'chunk2', 'chunk3'):
>>         if contents == 'hello':
>>             break
>> and the contents of "chunk2" were "hello" then the loop would exit, and
>> "chunk2" would never be closed! Yielding inside a with block, therefore,
>> doesn't make sense and can only lead to obscure bugs.
> 
> An incomplete analysis and therefore an incorrect conclusion.  Until the garbage collector comes out to play, read_multiple() will keep the file open, keep the tuple of filenames in memory and of course keep the environment of the generator around.  That's all leakage caused by the _break_, and following your logic the obvious solution would be to ban breaks in loops that are reading from generators.  But that wouldn't be helpful, obviously.

This particular issue can also be fixed by wrapping the generator in another with statement, so that the generator is closed explicitly when the loop ends:

import contextlib

with contextlib.closing(read_multiple(…)) as chunks:
    for contents in chunks:
        …
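
To make the effect visible, here is a minimal runnable sketch of the same pattern. The tracked() helper and the closed list are hypothetical stand-ins for open() and are not part of the original example; they only record when cleanup runs, so no real files are needed.

```python
import contextlib

closed = []  # names of resources whose cleanup has run


@contextlib.contextmanager
def tracked(name):
    # Hypothetical stand-in for open(): records when the resource is released.
    try:
        yield name
    finally:
        closed.append(name)


def read_multiple(*names):
    for name in names:
        with tracked(name) as f:
            yield f


with contextlib.closing(read_multiple('chunk1', 'chunk2', 'chunk3')) as chunks:
    for contents in chunks:
        if contents == 'chunk2':
            break

# Leaving the outer with block calls chunks.close(), which raises
# GeneratorExit at the suspended yield and unwinds the inner with block.
print(closed)
```

After the outer with block exits, 'chunk2' has been released even though the loop was left early, and 'chunk3' was never opened.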

Automatically closing the generator at the end of the for loop would be nice, but getting the semantics right without breaking existing valid code is not trivial.

Ronald

