[Tutor] [OT] ETL Tools
bgailer at alum.rpi.edu
Fri Mar 30 23:41:39 CEST 2007
Stephen Nelson-Smith wrote:
> Hello all,
> Does anyone know of any ETL (Extraction, Transformation, Loading)
> tools in Python (or at any rate, !Java)?
I have under development a Python tool that is based on IBM's CMS
Pipelines. I think it would be suitable for ETL. It would help me to see
a sample of the raw data and a more detailed description of "process,
aggregate, group-by". Perhaps your application is the nudge I need to
bring this tool to life.
Let me know.
> I have lots (and lots) of raw data in the form of log files which I
> need to process and aggregate and then do a whole bunch of group-by
> operations, before dumping them into text/relational database for a
> search engine to access.
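
To make the "process, aggregate, group-by" step concrete, here is a minimal sketch in plain Python. It is not the pipeline tool mentioned above, only an illustration of chaining generator stages. The three-field log format (host, status, bytes) and the names parse and group_by are assumptions made up for the example, since no sample of the real data has been posted yet.

```python
from collections import defaultdict

def parse(lines):
    """Extract stage: turn each whitespace-separated log line into a dict.

    Assumes a hypothetical "host status bytes" layout; lines with any
    other number of fields are skipped.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed lines
        host, status, size = parts
        yield {"host": host, "status": status, "bytes": int(size)}

def group_by(records, key, value):
    """Aggregate stage: per distinct key, count records and sum a field."""
    totals = defaultdict(lambda: [0, 0])  # key -> [count, sum]
    for rec in records:
        entry = totals[rec[key]]
        entry[0] += 1
        entry[1] += rec[value]
    return {k: tuple(v) for k, v in totals.items()}

# Made-up sample input standing in for real log files.
sample = [
    "alpha 200 512",
    "beta 404 0",
    "alpha 200 1024",
    "garbage line",
]

# Stages are chained lazily, so a large file can be streamed line by line.
result = group_by(parse(sample), "host", "bytes")
print(result)
```

Because parse is a generator, an open file object can be passed in place of the sample list and only the per-key totals are held in memory. Whether that is enough depends on how many distinct keys the real data has.
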
> At present we have a bunch of scripts in Perl and Ruby, and a Berkeley DB
> and MySQL database for the grouping operations. This is proving to be
> a little slow with the amount of data we now have, so I am looking
> into alternatives.
> Does anyone have any experience of this sort of thing? Or know
> someone who does, that I could talk to?