[python-ldap] ldap.OPT_DESC, async ops and paged search controls

Mark R Bannister mark at proseconsulting.co.uk
Wed Jan 21 00:16:24 CET 2015


Hi,

I've been using the new ldap.OPT_DESC feature introduced in python-ldap 
2.4.17 and have a question concerning the use of it with asynchronous 
search operations and paged search controls.

I have a daemon (http://dbis.sf.net) that launches async searches with 
paged search controls (page size 1000), and then uses the ldap.OPT_DESC 
feature to get the file descriptor back.  It then adds the fd to a 
select() call, so that it is woken up when there is something to do.  
When it wakes up, it calls ldap.result().
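
To make that concrete, here is a stripped-down sketch of the pattern.  
It is not the actual dbis code: the function names, the subtree scope, 
the select timeout and the use of result3() with all=1 and timeout=0 
are assumptions chosen for illustration.  It also assumes that a poll 
with timeout=0 returns a tuple of Nones when no complete response is 
available yet, which matches the (None, None) I see from result().

```python
import select


def start_paged_search(conn, base, filterstr, page_size=1000, cookie=''):
    """Launch one async paged search and return (msgid, fd).

    Hypothetical helper. python-ldap is imported here rather than at
    module level so that the select() loop below has no hard
    dependency on it.
    """
    import ldap
    from ldap.controls import SimplePagedResultsControl

    ctrl = SimplePagedResultsControl(True, size=page_size, cookie=cookie)
    msgid = conn.search_ext(base, ldap.SCOPE_SUBTREE, filterstr,
                            serverctrls=[ctrl])
    # Ask for the descriptor only after the first operation, because
    # ldap.initialize() does not open the connection by itself.
    fd = conn.get_option(ldap.OPT_DESC)
    return msgid, fd


def wait_for_result(conn, msgid, fd, select_timeout=5.0):
    """Sleep in select() on fd until conn.result3() yields a response.

    Returns (wakeups, rtype, rdata, serverctrls), where wakeups is the
    number of times select() reported the descriptor as readable.
    """
    wakeups = 0
    while True:
        readable, _, _ = select.select([fd], [], [], select_timeout)
        if not readable:
            continue  # select() timed out, nothing to read yet
        wakeups += 1
        # all=1: ask for the complete response; timeout=0: poll only
        rtype, rdata, _rmsgid, serverctrls = conn.result3(
            msgid, all=1, timeout=0)
        if rtype is not None:
            return wakeups, rtype, rdata, serverctrls
```

The wakeups counter is the number I refer to below when I talk about 
how many times the loop goes round per page.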

So far this makes sense, right?

Now, the behaviour I'm seeing is not what I would have expected, so I 
want to check with others whether it is correct.  I have an example 
search operation that returns about 80,000 entries.  With a page size 
of 1000, would you not expect to go round the loop and call select() 
80 times?  Well, that's not what I'm getting.  It's looping round and 
calling select() 80,000 times, once per entry.  Each time select() 
wakes me up, I call ldap.result(), which returns (None, None) for the 
first 999 iterations.  Then, the 1,000th time, it returns 1,000 
entries.  I then switch to the next page and go round the loop again, 
and the same thing happens again.
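
By "switch to the next page" I mean roughly the following: pull the 
cookie out of the paged results control in the server's response and 
send it back with the next search.  This is only a sketch.  The helper 
name is made up, and it assumes the response controls expose 
controlType and cookie attributes, as python-ldap's 
SimplePagedResultsControl does.

```python
# OID of the simple paged results control (RFC 2696)
PAGED_RESULTS_OID = '1.2.840.113556.1.4.319'


def next_page_cookie(serverctrls):
    """Return the cookie for the next page, or None on the last page.

    Hypothetical helper: serverctrls is the list of response controls
    that came back with the search result.
    """
    for ctrl in serverctrls or []:
        if ctrl.controlType == PAGED_RESULTS_OID:
            # An empty cookie means the server has no more pages.
            return ctrl.cookie or None
    return None
```

If this returns a cookie, the search is issued again with a paged 
control carrying that cookie; if it returns None, the search is done.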

Thinking that there was something wrong with my code, I ran strace and 
confirmed that each time I called ldap.result() and got (None, None) 
back, some data had been transferred over the file descriptor.

Does this seem right to you, and is there any way to optimise this? All 
80,000 entries are taking about 15 seconds to read into Python using the 
python-ldap module, compared with 5 seconds for native C.

Thanks,
Mark.
