Straw poll on Python performance (was Re: Python is far from a top performer ...)

Ken kenfar42 at yahoo.com
Sat Jan 10 03:40:55 CET 2004


>   Do you spend a "significant" amount of time actually optimizing your 
>   Python applications?  (Significant is here defined as "more than five
>   percent of your time", which is for example two hours a week in a 
>   40-hour work week.)

Some of them.

I'm using Python for data transformations.  Some feeds are small and
easily handled by Python, but the large ones (10 million rows per
file) require some thought about performance.  However,
this isn't exactly Python optimization - more like shifting high-level
pieces around in the architecture: merge two files, or do a binary
lookup (nested-loop join) on one? etc...
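
To make the merge-versus-lookup choice concrete, here is a minimal
sketch of both joins.  The function names, the (key, value) row layout
and the assumption of unique keys are made up for illustration; this is
not the production code.

```python
from bisect import bisect_left

def merge_join(left, right):
    """Join two lists of (key, value) rows that are BOTH sorted by key.

    Assumes keys are unique within each input. One pass over each list.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:
            out.append((lk, lv, rv))
            i += 1
            j += 1
        elif lk < rk:
            i += 1      # left key has no partner, skip it
        else:
            j += 1      # right key has no partner, skip it
    return out

def lookup_join(left, right):
    """Join by binary-searching `right` (sorted by key) for each left row.

    `left` may be in any order; only `right` has to be sorted.
    """
    keys = [k for k, _ in right]
    out = []
    for lk, lv in left:
        pos = bisect_left(keys, lk)
        if pos < len(keys) and keys[pos] == lk:
            out.append((lk, lv, right[pos][1]))
    return out
```

Roughly speaking, the merge reads each file once but needs both sorted,
while the lookup costs a binary search per row but only needs the
lookup side sorted (and small enough to keep in memory).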

To make matters worse, we just implemented a metadata-driven
transformation engine written entirely in Python.  It'll work great on
the small files, but the large ones...

Luckily, the nature of this application lends itself towards
distributed processing - so my plan is to:
1.  check out Psyco for the metadata-driven tool
2.  partition the feeds across multiple servers
3.  rewrite performance-intensive functions in C
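
For step 2, one possible way to split a feed is to hash each row's key
so that the same key always lands on the same server.  This is only a
sketch: partition_for, split_rows and the key-in-first-column layout
are hypothetical, and CRC32 is just one convenient stable hash.

```python
import zlib

def partition_for(key, n_partitions):
    """Map a key to a partition number in range(n_partitions).

    zlib.crc32 is used because it gives the same result on every
    machine and every run, unlike the built-in hash() for strings.
    """
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions

def split_rows(rows, n_partitions, key_index=0):
    """Distribute rows into n_partitions buckets by their key column."""
    buckets = [[] for _ in range(n_partitions)]
    for row in rows:
        buckets[partition_for(row[key_index], n_partitions)].append(row)
    return buckets
```

Each bucket could then be written out as its own file and shipped to a
different server for transformation.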

But I think I'll get by with just options #1 and #2: we're using
Python and it's working well - precisely because it is so adaptable.
The cost in performance is inconsequential in this case compared to
the maintainability.


