Performance penalty for using python-ldap
epo001 at hotmail.com
Mon Jun 16 21:41:11 CEST 2003
I was tuning an LDAP directory for a client last week and had cause to run
some before and after benchmarks.
Basically, for a 3000-entry directory I wrote a Python script which did the
following:
- listed each entry using the filter (cn=*), using python-ldap and also
  invoking the shell to run the ldapsearch command. This was done twice:
  returning all attributes, and returning just the cn attribute
- did 3000 random lookups using (cn=exact-match), and then (cn=exact-match*),
  again using python-ldap and the ldapsearch command.
Each search was run twice on unloaded machines: the first time to populate
caches, the second time to get a rough best-performance figure.
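For anyone wanting to try something similar, here is a minimal sketch of how the whole-directory listing test might be structured. It is not the original script. The helper names (time_twice, python_ldap_job, ldapsearch_command, ldapsearch_job) are made up for illustration, and the anonymous simple bind and ldapsearch options (-x, -LLL, -H, -b) are assumptions about the setup.

```python
import subprocess
import time


def time_twice(job):
    """Run job twice: first to warm caches, then as the measured run."""
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        job()
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1]


def python_ldap_job(conn, base, attrs=None):
    """Return a job listing every entry over an already-bound connection.

    attrs=None asks for all attributes; ['cn'] asks for cn only.
    """
    import ldap  # python-ldap, imported lazily so time_twice works without it
    return lambda: conn.search_s(base, ldap.SCOPE_SUBTREE, '(cn=*)', attrs)


def ldapsearch_command(uri, base, attrs=()):
    """Build the argv for an equivalent ldapsearch call (assumed options)."""
    return ['ldapsearch', '-x', '-LLL', '-H', uri, '-b', base, '(cn=*)', *attrs]


def ldapsearch_job(uri, base, attrs=()):
    """Return a job that spawns ldapsearch, which binds on every call."""
    cmd = ldapsearch_command(uri, base, attrs)
    return lambda: subprocess.run(cmd, capture_output=True, check=True)
```

To use it, bind once up front (for example conn = ldap.initialize(uri) followed by conn.simple_bind_s()), then compare time_twice(python_ldap_job(conn, base, ['cn'])) with time_twice(ldapsearch_job(uri, base, ['cn'])). The second value of each pair is the warm-cache figure.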
The findings were somewhat surprising.
In the whole-directory listing, ldapsearch was generally and consistently at
least 30% faster than python-ldap; these figures apply both before and after
tuning the directory. Remember that the python-ldap searches are pre-bound,
while ldapsearch binds each time it is called.
In the random lookup test the performance figures were comparable, but this
compares calling python-ldap to do a search against spawning a shell,
running ldapsearch, binding and then doing the search, i.e. the command-line
search has a LOT more overhead.
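A rough sketch of the random lookup test, assuming the list of cn values is already known. The search callable is a placeholder: it could wrap a pre-bound python-ldap connection or spawn ldapsearch once per lookup, which is where the extra overhead comes in. All function names here are hypothetical, and the filter escaping is a small hand-rolled helper following RFC 4515 rather than anything from the original script.

```python
import random
import time


def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value."""
    out = []
    for ch in value:
        if ch in '\\*()\x00':
            out.append('\\%02x' % ord(ch))
        else:
            out.append(ch)
    return ''.join(out)


def lookup_filters(cn):
    """Return the exact-match and prefix-match filters for one cn value."""
    value = escape_filter_value(cn)
    return '(cn=%s)' % value, '(cn=%s*)' % value


def time_random_lookups(search, names, count=3000, seed=0):
    """Time `count` random lookups for each filter style.

    `search` is any callable taking a filter string; the same random
    picks are used for both styles so the two totals are comparable.
    """
    rng = random.Random(seed)
    picks = [rng.choice(names) for _ in range(count)]
    totals = {}
    for label, index in (('exact', 0), ('prefix', 1)):
        start = time.perf_counter()
        for cn in picks:
            search(lookup_filters(cn)[index])
        totals[label] = time.perf_counter() - start
    return totals
```

With python-ldap, search could be something like lambda f: conn.search_s(base, ldap.SCOPE_SUBTREE, f, ['cn']) on a connection bound beforehand. For the command-line case it would be a function that runs ldapsearch with the filter as an argument, so every lookup pays for the process spawn and the bind.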
I'm happy to run some tests to identify the cause and see if we can fix it.
Any suggestions on where to start?
General conclusions from my tests:
- python-ldap has a surprising performance penalty
- searching is helped by having ample cache (doh!)
- returning 1 attribute is much faster than returning all of them (doh!)
- searching on indexed attributes helps a lot (doh!)