Announce: dataflow toolkit v0.5

Oren Tirosh oren-py-l at hishome.net
Thu Jan 31 09:38:01 EST 2002


dataflow.py is a toolkit for dataflow-oriented programming.  It requires
Python 2.2 since it makes extensive use of iterators and generator 
functions.

  http://www.tothink.com/python/dataflow

The module defines the functions xmap, xzip and xfilter, which are
equivalent to map, zip and filter but return an iterable object rather
than a list and read lazily from their upstream sources.  Other
lazy functions and objects for constructing data flow processes are also
provided.
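
To give a feel for what "lazy" means here, the sketch below shows
generator-based versions of the three functions.  It is only an
illustration of the idea, written for current Python, and is not the
actual dataflow.py source; the real functions may differ in details such
as how inputs of unequal length are handled.

```python
import itertools

# Illustrative sketch only; not the dataflow.py implementation.

def xzip(*iterables):
    """Lazily yield tuples, stopping at the shortest input."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return
        yield tuple(row)

def xmap(func, *iterables):
    """Lazily yield func(a, b, ...) for items drawn in parallel."""
    for args in xzip(*iterables):
        yield func(*args)

def xfilter(pred, iterable):
    """Lazily yield the items for which pred(item) is true.

    As with the built-in filter, pred=None keeps the truthy items.
    """
    for item in iterable:
        if (item if pred is None else pred(item)):
            yield item

# Nothing is read from the infinite source until items are requested.
squares = xmap(lambda n: n * n, itertools.count(1))
evens = xfilter(lambda n: n % 2 == 0, squares)
first_three = list(itertools.islice(evens, 3))
```

Because each stage pulls one item at a time from the stage before it, the
pipeline works even though itertools.count(1) never ends.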

The flow type describes a recipe for a dataflow.  A complete flow includes
a source, optional transformations and a destination.  

* Sources
Sources may be Python containers or any other object with an __iter__ 
method.

* Transformations
A transformation is an object with a __tran__ method.  This method takes
an upstream iterator and returns an iterator.  It is usually implemented
as a generator function.  A function or any other callable object may also 
be used as a stateless transformation.  
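
A transformation along these lines might look like the sketch below.
The Dedup class and the apply_tran helper are hypothetical names made up
for this example, and treating a plain callable as "apply it to each
item" is an assumption about what a stateless transformation means; the
toolkit itself may dispatch differently.

```python
# Illustrative sketch only; Dedup and apply_tran are hypothetical names.

class Dedup:
    """Stateful transformation: drops consecutive duplicate items."""

    def __tran__(self, upstream):
        # Written as a generator function: takes the upstream iterator
        # and returns a new iterator.
        missing = object()
        last = missing
        for item in upstream:
            if last is missing or item != last:
                yield item
            last = item

def apply_tran(tran, upstream):
    """Attach one transformation to an upstream iterable."""
    if hasattr(tran, "__tran__"):
        return tran.__tran__(iter(upstream))
    # Assumption: a plain callable is applied to each item in turn.
    return (tran(item) for item in upstream)

deduped = list(apply_tran(Dedup(), [1, 1, 2, 2, 2, 3, 1]))
shouted = list(apply_tran(str.upper, ["a", "b"]))
```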

* Destinations
Destinations have a __sink__ method returning an object conforming to the
sink protocol, analogous to the iterator protocol.  The sink() function also 
supports the Python built-in container and file types that do not have a
__sink__ method.
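
The announcement does not spell out the sink protocol, so the following
is only a guess at its general shape.  The put() method name and the
helper classes are assumptions made for this sketch; the one thing taken
from the description above is that sink() uses __sink__ when it exists
and falls back to special handling for built-in containers and files.

```python
import io

# Illustrative sketch only.  The put() method is a hypothetical name;
# the real sink protocol may use different methods.

class _ListSink:
    def __init__(self, target):
        self.target = target

    def put(self, item):
        self.target.append(item)

class _FileSink:
    def __init__(self, target):
        self.target = target

    def put(self, item):
        self.target.write(str(item))

def sink(dest):
    """Return a sink object for dest, in the spirit of iter()."""
    if hasattr(dest, "__sink__"):
        return dest.__sink__()
    if isinstance(dest, list):
        return _ListSink(dest)
    if hasattr(dest, "write"):
        return _FileSink(dest)
    raise TypeError("object cannot be used as a destination")

collected = []
list_sink = sink(collected)
for ch in "abc":
    list_sink.put(ch)

buf = io.StringIO()
sink(buf).put("hello\n")
```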

Partial flows may be used for the construction of complete flows or used
by themselves. A flow consisting of a source and one or more transformations 
may be iterated using a for statement, for example.

One final feature is the ability to construct dataflows using shell-like
| and > operators.  This makes Python quite usable for one-liners.  The
resulting syntax looks familiar and alien at the same time.  I'm still
trying to figure out whether that's a good thing or not :-)
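
For readers who have not seen this kind of operator overloading, here is
a minimal sketch of how | and > can be made to build and run a flow.
The Flow class below is a hypothetical stand-in written for this
example, not the toolkit's flow type: it accepts only plain callables as
stages and only lists as destinations.

```python
# Illustrative sketch only; this Flow class is a hypothetical stand-in.

class Flow:
    def __init__(self, source, stages=()):
        self.source = source
        self.stages = tuple(stages)

    def __or__(self, stage):
        # flow | callable returns a new partial flow with one more stage.
        return Flow(self.source, self.stages + (stage,))

    def __iter__(self):
        # A partial flow can be iterated directly; stages run lazily.
        stream = iter(self.source)
        for stage in self.stages:
            stream = map(stage, stream)
        return stream

    def __gt__(self, dest):
        # flow > dest runs the flow and appends the results to dest.
        dest.extend(self)
        return dest

# A complete flow: source, two transformations, destination.
# | binds more tightly than >, so no parentheses are needed around it.
out = []
Flow(range(5)) | (lambda n: n * n) | str > out

# A partial flow used by itself in a for statement.
total = 0
for n in Flow([1, 2, 3]) | (lambda n: n * 10):
    total += n
```

The trick relies on Python's operator precedence: | is evaluated before
the comparison operators, so the whole pipeline is assembled first and
the > at the end triggers the run.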

	Oren




