Please comment on PEP XXX: Enhanced Generators
Oren Tirosh
oren-py-l at hishome.net
Fri Feb 1 04:42:14 EST 2002
> 1. Four new built-in functions: xmap, xfilter, xzip, and indexed
I implemented these functions (except indexed) two months ago
in my dataflow library (http://www.tothink.com/python/dataflow).
My implementation has one very significant difference: just like the xrange
function, the resulting object is a restartable source, not a one-time
iterator. Naturally, the source is restartable only if all arguments to
the functions are restartable sources themselves.
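A minimal sketch of what such a restartable source might look like (the class name xmap matches the proposed built-in, but this is my illustration, not the dataflow library's actual code; written in today's Python):

```python
# Sketch of a restartable xmap-style source: like xrange, iterating it
# twice produces the data twice, provided the argument sources are
# themselves restartable (i.e. re-iterable).
class xmap:
    def __init__(self, func, *sources):
        self.func = func
        self.sources = sources

    def __iter__(self):
        # A fresh iterator is pulled from each source on every pass,
        # so the object restarts as long as its sources can.
        return map(self.func, *self.sources)
```

Iterating the same xmap object twice yields the same data twice, which a one-shot iterator cannot do.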
> 2. Generator comprehensions: [yield int(x) for x in floatgen()]
The only problem I see with this syntax is that the square brackets are
misleading - the resulting object is not a list. How about using parens
instead? They are not strictly required, but just like for tuples, they
help disambiguate the syntax.
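For illustration, here is how the parenthesized form would read, using today's generator-expression syntax as a stand-in for the proposal (floatgen is a hypothetical producer from the quoted example):

```python
# Hypothetical upstream producer of floats, as in the quoted example:
def floatgen():
    for x in (1.5, 2.7, 3.1):
        yield x

# With parentheses it is visually clear that the result is an
# iterator, not a list:
g = (int(x) for x in floatgen())
print(list(g))   # -> [1, 2, 3]
```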
> 3. Feeding data into a generator using an optional argument to .next()
Python currently has generators that produce data. Below are three other
significant dataflow configurations:
1. consumers
2. transformations - receive data from an upstream source, perform some
transformation on it and send it to their downstream consumer.
3. dialogues - like transformations, dialogues have an input and an
output, but both are connected to the same logical entity, not to an
upstream and a downstream.
Consumer functions are relatively simple. Like generators, they do not
require full-fledged multithreading, just the suspension of a single
function context. Like generator functions, consumer functions would allow
the state machines of the producer process and the consumer process to run
independently.
Dialogues can be implemented without threads, too, but in that case they
would require the producer process and the consumer process to run in
lockstep, producing and consuming an item of data for each step. Letting
them produce and consume data at different rates would require either
the use of null values, full threading support, or decoupling the rates
with a queue.
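A lockstep dialogue can be sketched in today's Python, using the send() mechanism as a stand-in for the proposed optional argument to .next(); the key property is one value in, one value out per step:

```python
# A dialogue generator: each value pushed in produces exactly one
# reply, so the two sides necessarily run in lockstep.
def echo_dialogue():
    reply = None
    while True:
        request = yield reply    # suspend; wait for the peer's next item
        reply = request.upper()  # produce exactly one reply per request

d = echo_dialogue()
next(d)                  # prime: advance to the first yield
print(d.send("ping"))    # -> PING
print(d.send("pong"))    # -> PONG
```

Letting either side get ahead of the other is exactly what this pattern cannot do without null values, threads, or a queue.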
Transformations can be implemented easily using a generator function that
takes an upstream iterator as an argument. Transformations may be chained
together, if necessary. The input and output rates of a transformation are
decoupled without any need for threads or queuing: the transformation may
call the upstream iterator's next() more or less than once per yield statement.
Transformations could also be implemented as a consumer function that
takes a downstream sink object as an argument, but the upstream view looks
more natural and can be implemented today using generator functions.
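A sketch of such a transformation (my illustration, written with today's next(it) spelling): it pulls two items from upstream per item it yields, so the input and output rates differ with no threads or queues involved.

```python
# Transformation as a generator taking the upstream iterator as an
# argument. It consumes two upstream items per yield; a trailing
# unpaired item is silently dropped.
def pairs(upstream):
    it = iter(upstream)
    while True:
        try:
            a = next(it)
            b = next(it)
        except StopIteration:
            return
        yield (a, b)

# Transformations chain naturally: the output of one feeds the next.
print(list(pairs(range(6))))   # -> [(0, 1), (2, 3), (4, 5)]
```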
My opinion:
We know enough about consumers to implement them now, if a good syntax is
found and they are deemed important enough by the BDFL. Transformations
are easy with the current generators. Dialogues are a form of dataflow
more often found in communication systems than as a method of data flow
between two parts of the same program. Communication systems have many
other issues to deal with beyond the two-way exchange of data, so two-way
generators are probably not the right way to support them.
dataflowingly yours,
Oren
Appendix: Emulating consumer functions with generator functions:
def consumer(args):
    <initialize>
    more = [None]
    yield more
    while more:
        <use data in more[0]>
        yield more
    <cleanup>
driving a consumer function:

for mailbox in consumer(args):
    if <more data>:
        mailbox[0] = <data>
    else:
        del mailbox[:]
the long version:

itr = consumer(args)
mailbox = itr.next()
while <more data>:
    mailbox[0] = <data>
    itr.next()
mailbox[:] = []
try:
    itr.next()
    raise "Shouldn't be here"
except StopIteration: pass
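A concrete, runnable version of the emulation above (a hypothetical consumer that collects the items pushed into it; today's next(it) spelling is used in place of itr.next()):

```python
# Consumer function emulated with a generator: the yielded one-slot
# list (the "mailbox") is the channel through which data is pushed in.
def collector(results):
    mailbox = [None]
    yield mailbox
    while mailbox:
        results.append(mailbox[0])
        yield mailbox
    # <cleanup> would go here

received = []
it = collector(received)
box = next(it)
for item in "abc":
    box[0] = item
    next(it)
del box[:]           # emptying the mailbox signals end of data
try:
    next(it)
except StopIteration:
    pass
print(received)      # -> ['a', 'b', 'c']
```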