[Tutor] [OT] ETL Tools
Bob Gailer
bgailer at alum.rpi.edu
Fri Mar 30 23:41:39 CEST 2007
Stephen Nelson-Smith wrote:
> Hello all,
>
> Does anyone know of any ETL (Extraction, Transformation, Loading)
> tools in Python (or at any rate, !Java)?
>
I am developing a Python tool that is based on IBM's CMS Pipelines, and
I think it would be suitable for ETL. It would help me to see a sample
of the raw data and a more detailed description of "process, aggregate,
group-by". Perhaps your application is the nudge I need to bring this
tool to life.
Let me know.
> I have lots (and lots) of raw data in the form of log files which I
> need to process and aggregate and then do a whole bunch of group-by
> operations, before dumping them into text/relational database for a
> search engine to access.
>
> At present we have a bunch of scripts in Perl and Ruby, and Berkeley DB
> and MySQL databases for the grouping operations. This is proving to be
> a little slow with the amount of data we now have, so I am looking
> into alternatives.
>
> Does anyone have any experience of this sort of thing? Or know
> someone who does, that I could talk to?
>
>
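As a rough illustration of the "process, aggregate, group-by" step
described above, here is a minimal sketch in plain Python. The log format
(one whitespace-separated key and integer count per line) and the names
parse, group_sum and load are assumptions made up for this example; they
are not taken from the actual data or from any existing tool. Each stage
is a generator or a small function, so records stream through one at a
time rather than being loaded all at once.

```python
from collections import defaultdict


def parse(lines):
    """Extract: turn each 'key count' line into a (key, int) pair.

    The two-field format is a hypothetical stand-in for the real logs.
    Malformed lines are skipped rather than raising.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        key, raw = parts
        try:
            yield key, int(raw)
        except ValueError:
            continue


def group_sum(records):
    """Transform: group records by key and sum the counts per key."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


def load(totals):
    """Load: format the totals as tab-separated rows, sorted by key."""
    return ["%s\t%d" % (key, totals[key]) for key in sorted(totals)]


# Hypothetical sample input; a real run would iterate over open log files.
sample = ["alpha 3", "beta 5", "alpha 4", "garbage line here", "beta x"]
rows = load(group_sum(parse(sample)))
```

Because parse is a generator, the same chain works on an open file object
in place of the sample list, and only the per-key totals are held in
memory. Whether that is fast enough depends on how many distinct keys the
real data has.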
--
Bob Gailer
510-978-4454