[Tutor] [OT] ETL Tools

Bob Gailer bgailer at alum.rpi.edu
Fri Mar 30 23:41:39 CEST 2007

Stephen Nelson-Smith wrote:
> Hello all,
> Does anyone know of any ETL (Extraction, Transformation, Loading)
> tools in Python (or at any rate, !Java)?
I have a Python tool under development that is based on IBM's CMS
Pipelines. I think it would be suitable for ETL. It would help me to see
a sample of the raw data and a more detailed description of what you
mean by "process, aggregate, group-by". Perhaps your application is the
nudge I need to bring this tool to life.

Let me know.
> I have lots (and lots) of raw data in the form of log files which I
> need to process and aggregate and then do a whole bunch of group-by
> operations, before dumping them into text/relational database for a
> search engine to access.
> At present we have a bunch of scripts in Perl and Ruby, and Berkeley DB
> and MySQL databases for the grouping operations.  This is proving to be
> a little slow with the amount of data we now have, so I am looking
> into alternatives.
> Does anyone have any experience of this sort of thing?  Or know
> someone who does, that I could talk to?
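
As a rough illustration only, the process / aggregate / group-by flow
described above could be sketched as a chain of small stages in plain
Python. This is not the Pipelines-based tool mentioned earlier. The log
format (host, status, bytes separated by whitespace) and the names
parse, group_sum and load are made up for the example, since no sample
data has been posted yet.

```python
from collections import defaultdict


def parse(lines):
    """Extract stage: turn raw log lines into (host, status, nbytes) tuples.

    The three-field layout is a hypothetical format, not the real logs.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        host, status, nbytes = parts
        yield host, status, int(nbytes)


def group_sum(records):
    """Transform stage: group by host, counting hits and summing bytes."""
    totals = defaultdict(lambda: [0, 0])
    for host, _status, nbytes in records:
        entry = totals[host]
        entry[0] += 1        # number of hits for this host
        entry[1] += nbytes   # total bytes for this host
    return {host: tuple(values) for host, values in totals.items()}


def load(totals):
    """Load stage: format one tab-separated row per host, sorted by host."""
    return ["%s\t%d\t%d" % (host, hits, nbytes)
            for host, (hits, nbytes) in sorted(totals.items())]


# Made-up sample input standing in for the real log files.
sample = [
    "a.example 200 100",
    "b.example 404 10",
    "a.example 200 50",
    "garbage",
]

rows = load(group_sum(parse(sample)))
for row in rows:
    print(row)
```

Because parse is a generator, the log lines are read one at a time and
only the per-host totals are held in memory, which may matter with
lots (and lots) of data. The rows from load could just as well be
written to a text file or inserted into a database.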

Bob Gailer
