[Python-Dev] string_join overrides TypeError exception thrown in generator

Nick Coghlan ncoghlan at gmail.com
Mon Aug 15 14:28:09 CEST 2005

Stephen Thorne wrote:
> I can't see an obvious solution, but perhaps generators should get
> special treatment regardless. Reading over this code it looks like the
> generator is exhausted all at once, instead of incrementally.

Indeed - str.join uses a multipass approach to build the final string, so it 
needs to ensure it has a reiterable to play with. PySequence_Fast achieves 
that, at the cost of dumping a generator into a sequence rather than building 
a string from it directly.

Unicode.join uses PySequence_Fast too, and has the same problem with masking 
the TypeError from the generator.
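
As a rough illustration of why a multipass join wants a real sequence, here is a sketch in modern Python 3 spelling. The real code is C; multipass_join is a made-up name, and the isinstance test is only a stand-in for what PySequence_Fast does:

```python
import io


def multipass_join(sep, iterable):
    """Illustrative only: a join that walks its input more than once."""
    # Stand-in for PySequence_Fast: lists and tuples are used as-is,
    # anything else (e.g. a generator) is dumped into a list first.
    if isinstance(iterable, (list, tuple)):
        items = iterable
    else:
        items = list(iterable)

    # Pass 1: validate the items and work out the final size.
    total = len(sep) * max(len(items) - 1, 0)
    for item in items:
        if not isinstance(item, str):
            raise TypeError("expected str, got %s" % type(item).__name__)
        total += len(item)

    # Pass 2: copy the pieces into the output buffer.
    out = io.StringIO()
    for i, item in enumerate(items):
        if i:
            out.write(sep)
        out.write(item)
    result = out.getvalue()
    assert len(result) == total  # the size computed in pass 1 matches
    return result
```

A generator can only be walked once, so without the list() step the second pass would see nothing.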

The calling code simply can't tell whether the error behind the NULL return
was set directly by PySequence_Fast, or was relayed by PySequence_List (which
got it from _PyList_Extend, which got it from listextend, which got it from
iternext, etc).

This is the kind of problem that PEP 344 is designed to solve :)
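
To make that concrete, here is a small sketch in Python 3 syntax, where the chaining idea from PEP 344 eventually landed as PEP 3134. The helper names and the "sequence expected" message are made up for illustration, and the pure-Python functions only imitate the C-level behaviour:

```python
def broken_gen():
    # The error comes from inside the generator body, on first iteration.
    raise TypeError("I am a TypeError")
    yield 1  # never reached, but makes this a generator function


def to_sequence_masking(iterable):
    # Hypothetical stand-in for the masking behaviour: any TypeError raised
    # while building the list is replaced by a generic message.
    try:
        return list(iterable)
    except TypeError:
        raise TypeError("sequence expected") from None


def to_sequence_chained(iterable):
    # Same conversion, but the original exception is kept as __cause__.
    try:
        return list(iterable)
    except TypeError as exc:
        raise TypeError("sequence expected") from exc


def describe(func):
    # Return the visible message and whatever cause survived.
    try:
        func(broken_gen())
    except TypeError as err:
        return str(err), err.__cause__
```

With the chained version the caller still sees the generic message, but the generator's own TypeError stays reachable through __cause__ rather than being lost.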

This also shows that argument validation is one of the cases where using an
iterable class instead of a generator is a good thing, since errors get raised
where the iterable is created, instead of where it is first iterated over:

class gen(object):
   def __init__(self):
      raise TypeError, "I am a TypeError"
   def __iter__(self):
      yield 1

def one(): return ''.join( x for x in gen() )
def two(): return ''.join([x for x in gen()])

for x in one, two:
     try:
          x()
     except TypeError, e:
          print e
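
For contrast, a sketch (Python 3 spelling, with made-up names Checked and lazy) of how the same check behaves when it lives in a plain generator function. Nothing runs until the first next() call, so the error shows up later:

```python
class Checked(object):
    # Iterable class: validation happens in __init__, i.e. at creation.
    def __init__(self):
        raise TypeError("I am a TypeError")

    def __iter__(self):
        yield 1


def lazy():
    # Generator function: the body does not start until the first next().
    raise TypeError("I am a TypeError")
    yield 1


events = []

try:
    Checked()
except TypeError:
    events.append("class: error at creation")

g = lazy()  # no error yet, the body has not started running
events.append("generator: created without error")
try:
    next(g)
except TypeError:
    events.append("generator: error at first use")
```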

Hmm, makes me think of a neat little decorator:

def step_on_creation(gen):
     def start_gen(*args, **kwds):
         g = gen(*args, **kwds)
         g.next()  # run the setup code, up to the first yield
         return g
     start_gen.__name__ = gen.__name__
     start_gen.__doc__ = gen.__doc__
     start_gen.__dict__ = gen.__dict__
     return start_gen

@step_on_creation
def gen():
      # Setup executed at creation time
      raise TypeError, "I am a TypeError"
      yield None
      # The actual iteration steps
      yield 1
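
For anyone trying this on a current interpreter, a sketch of the same idea in Python 3 spelling. It assumes functools.wraps (which did not exist when this was written) in place of the manual attribute copying, and countdown is just a made-up example generator:

```python
import functools


def step_on_creation(gen):
    @functools.wraps(gen)  # copies __name__, __doc__ and __dict__
    def start_gen(*args, **kwds):
        g = gen(*args, **kwds)
        next(g)  # run the setup code, up to the first (dummy) yield
        return g
    return start_gen


@step_on_creation
def countdown(n):
    # Setup executed at creation time
    if not isinstance(n, int):
        raise TypeError("n must be an int")
    yield None  # marks the end of setup; swallowed by the decorator
    # The actual iteration steps
    while n > 0:
        yield n
        n -= 1
```

Calling countdown("3") then fails at the call site, before any iteration is attempted, while list(countdown(3)) gives the values 3, 2, 1. The cost is the dummy yield that every decorated generator has to include.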


