effects on extended modules

Curtis Jensen cjensen at bioeng.ucsd.edu
Wed Jan 2 19:52:44 EST 2002


Pedro <pedro_rodriguez at club-internet.fr> wrote in message news:<pan.2001.12.28.11.17.47.971198.1713 at club-internet.fr>...
> "Curtis Jensen" <cjensen at bioeng.ucsd.edu> wrote:
> 
> > Pedro <pedro_rodriguez at club-internet.fr> wrote in message
> > news:<pan.2001.12.06.12.45.41.172197.2456 at club-internet.fr>...
> >> "Curtis Jensen" <cjensen at bioeng.ucsd.edu> wrote:
> >> 
> >> > Kragen Sitaker wrote:
> >> >> 
> >> >> Curtis Jensen <cjensen at bioeng.ucsd.edu> writes:
> >> >> > We have created a Python interface to some core libraries of our
> >> >> > own making.  We also have a C interface to these same libraries.
> >> >> > However, the Python interface seems to affect the speed of the
> >> >> > extended libraries.  I.e., some library routines have their own
> >> >> > benchmark code, and the time of execution from the start of the
> >> >> > library routine to the end of the library routine (not including
> >> >> > any Python code execution) takes longer than its C counterpart.
> >> >> 
> >> >> In the Python version, the code is in a Python extension module,
> >> >> right?  A .so or .dll file?  Is it also in the C counterpart?  (If
> >> >> that's not it, can you provide more details on how you compiled
> >> >> and linked the two?)
> >> >> 
> >> >> In general, referring to dynamically loaded things through symbols
> >> >> --- even from within the same file --- tends to be slower than
> >> >> referring to things that aren't dynamically loaded.
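
(As a quick illustration of the indirection Kragen describes -- a
minimal sketch, not our actual code; the library name "libwork.so" and
the symbol "work" are made up -- you can time calls that go through a
dlopen()/dlsym() pointer:)

    /* Hypothetical sketch: every call to a symbol resolved with dlsym()
     * goes through a function pointer, much like calls into a .so go
     * through the PLT.  Build a toy libwork.so exporting work() to try
     * it; link this with -ldl on Linux. */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        void *handle = dlopen("./libwork.so", RTLD_NOW);
        if (handle == NULL) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        double (*work)(void) = (double (*)(void)) dlsym(handle, "work");
        if (work == NULL) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return 1;
        }

        clock_t t0 = clock();
        for (long i = 0; i < 1000000L; i++)
            work();                /* indirect call on every iteration */
        clock_t t1 = clock();

        printf("indirect calls: %.3f s\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC);
        dlclose(handle);
        return 0;
    }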
> >> >> 
> >> >> What architecture are you on?  If you're on the x86, maybe Numeric
> >> >> is being stupid and allocating things that aren't maximally aligned.
> >> >>  But you'd probably notice a pretty drastic difference in that case.
> >> >> 
> >> >> ... or maybe Numeric is being stupid and allocating things in a way
> >> >> that causes cache-line contention.
> >> >> 
> >> >> Hope this helps.
> >> > 
> >> > Thanks for the response.  The C counterpart is directly linked
> >> > together into one large binary (yes, the Python side uses a
> >> > dynamically linked object file, a .so).  So that might be the
> >> > source of the problem.  I can try to make a dynamically linked
> >> > version of the C counterpart and see how that affects the speed.
> >> > We are running on IRIX 6.5 machines (MIPS).
> >> > Thanks.
> >> > 
> >> > 
> >> Don't know if this helps, but I had a similar problem on Linux.
> >> 
> >> The context was: a Python script was calling an external program and
> >> parsing its output (with popen) many times. I decided to optimize
> >> this by turning the external program into a dynamically linked
> >> library with Python bindings. I expected to save the system calls
> >> needed to fork and start a new process, but it turned out that this
> >> solution was slower.
> >> 
> >> The problem was caused by multithreading. When using the library
> >> straight from a C program, I didn't link with the multithreaded
> >> libraries, so the C library calls weren't protected (they didn't
> >> need to lock and unlock their resources).
> >> 
> >> Unfortunately, the library was reading files with fgetc (character
> >> by character :( ). Since the Python version I used was compiled with
> >> multithreading enabled, fgetc locked and unlocked the stream on
> >> every call, which caused the extra waste of time.
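
(For what it's worth, the usual way around that per-character locking,
short of switching to fgets, is the POSIX flockfile()/getc_unlocked()
pair -- a minimal sketch, assuming only one thread touches the stream:)

    #include <stdio.h>

    /* Count the characters in a stream.  In a threaded libc, plain
     * fgetc() locks and unlocks the FILE on every call; taking the
     * lock once with flockfile() and using getc_unlocked() inside the
     * loop avoids the per-character locking, as long as no other
     * thread uses the stream in the meantime. */
    long count_chars(FILE *fp)
    {
        long n = 0;
        flockfile(fp);
        while (getc_unlocked(fp) != EOF)
            n++;
        funlockfile(fp);
        return n;
    }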
> >> 
> >> To find this, I compiled my library with profiling (I think I needed to
> >> use some system call to activate profiling from the library, since I
> >> couldn't rebuild Python).
> >> 
> >> OT: in the end I fixed the library (fgetc replaced by fgets) and
> >> still didn't gain anything by turning the external program into a
> >> Python extension. Since the Linux disk cache seemed good enough, I
> >> removed the Python extension, keeping a pure Python program, and
> >> implemented a cache for the results of the external program. That
> >> was much simpler and more efficient in this case.
> > 
> > 
> > Is this a problem with I/O only?  The code sections that we
> > benchmarked have no I/O in them.
> > 
> > --
> > Curtis Jensen
> 
> In my case, it was only i/o related.
> 
> If your problem, as I understand it, is:
> + I've got a function f() written in C
> + f() execution is doing some benchmark telling how much time it took
>   to complete
> + calling f() from a C binary gives a (significantly) shorter duration
>   than calling (the same) f() from a Python extension
> 
> you may have to check what f() is doing, because, as I was saying, it
> may be affected by the Python environment:
> 
> - Are you doing extensive calls to an external library?
>   In my case, some glibc calls need to enforce reentrancy protection
>   when running in a multithreaded context. These protections wiped
>   out any gain.
> 
> - If you're doing calls to external libraries, are you linked against
>   the same versions?  (ldd on the binaries and libraries may help.)
> 
> - More basically, did you compile with the same options?
>   Could any differences point to a possible source of your problem?
>   (It may be worth checking optimization, debug, and conditional
>   compilation options.)
> 
> Regards,

Your summary of the problem is correct.  We do make calls to an
external library (NAG).  I don't know if this makes a difference, but
we are also calling Fortran.  In any case, we link with the same
libraries and with the same compiler options.  It seems to me that we
are not experiencing the same problem that you encountered.  Thanks
for your help.
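
For reference, the benchmarks I mentioned time the routine from inside
the library itself, so no Python execution is included -- roughly like
this (a minimal sketch; clock() and the empty body are stand-ins for
whatever the real routines do):

    #include <stdio.h>
    #include <time.h>

    /* Sketch: time only the body of the routine, so any overhead on
     * the Python side of the call is excluded from the measurement. */
    void solve_with_benchmark(void)
    {
        clock_t t0 = clock();

        /* ... the actual library work goes here ... */

        clock_t t1 = clock();
        fprintf(stderr, "solve: %.3f s\n",
                (double)(t1 - t0) / CLOCKS_PER_SEC);
    }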

--
Curtis Jensen


