Problems with function ldap.explode_dn()

Tue Nov 25 07:35:36 CET 2003

OHURAMACRCUD at spammotel.com wrote:
> 
> >>> dn="uid=ascii,l=München_iso-8859-1"
> >>> ldap.explode_dn(dn)
> 
> ['uid=ascii']

Hmm, this looks wrong. Using ISO-8859-1 is wrong with LDAPv3 anyway.

> >>> ldap.explode_dn(dn.decode('latin'))
> [..]
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 13: ordinal not in range(128)

Of course you can't encode the raw string. dn is supposed to be a Unicode 
object.

> >>> ldap.explode_dn(dn.decode('latin').encode('utf'))
> 
> ['uid=ascii', 'l=M\\C3\\BCnchen_iso-8859-1']

Now this shows that the underlying OpenLDAP function ldap_explode_dn() is 
aware of DN strings having to be UTF-8 encoded Unicode.

Note that you SHOULD use 'utf-8' as the codec name for UTF-8.
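
For illustration, a minimal sketch of that transcoding step. It is written 
in Python 3 syntax (an assumption; the session quoted here is Python 2), and 
the variable names are made up for the example:

```python
# Sketch only: turn a Latin-1 byte string into the UTF-8 bytes LDAPv3 expects.
raw = b"uid=ascii,l=M\xfcnchen_iso-8859-1"  # bytes as produced in a Latin-1 environment
text = raw.decode("iso-8859-1")             # decode to a Unicode string first
dn_utf8 = text.encode("utf-8")              # then encode with the full codec name

print(dn_utf8)
```

The u-umlaut is a single byte (0xFC) in ISO-8859-1 and becomes two bytes 
(0xC3 0xBC) in UTF-8.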

> >>> vgl=ldap.explode_dn(dn.decode('latin').encode('utf'))[1]
> >>> vgl
> 
> 'l=M\\C3\\BCnchen_iso-8859-1'
> 
> >>> vgl=="l=München_iso-8859-1".decode('latin').encode('utf8')
> 
> False

The DN string normalization (with backslash escapes) is done inside the 
OpenLDAP libs. The behaviour changed from OpenLDAP 2.0 to 2.1. I already 
had a discussion with Kurt Z. about that. His argument was that LDAP 
applications should treat DNs as opaque. There's nothing we can do 
about it. This breaks things when accessing an OpenLDAP 2.0 server, since 2.0 
did not implement correct matching rules for DNs. Oh yeah...
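
If you really need to compare the escaped component with a plain UTF-8 
string on the client side, one possible workaround is to undo the hex 
escapes yourself. The following is a rough sketch in Python 3 syntax, not 
part of python-ldap; unescape_hex_pairs() is a hypothetical helper name:

```python
import re

# A backslash followed by exactly two hex digits, e.g. \C3
_HEX_PAIR = re.compile(rb"\\([0-9A-Fa-f]{2})")

def unescape_hex_pairs(rdn: bytes) -> bytes:
    """Replace each backslash + two hex digits with the byte it denotes."""
    return _HEX_PAIR.sub(lambda m: bytes([int(m.group(1), 16)]), rdn)

escaped = b"l=M\\C3\\BCnchen_iso-8859-1"         # as returned in the session above
plain = "l=München_iso-8859-1".encode("utf-8")  # what the comparison expected

print(unescape_hex_pairs(escaped) == plain)
```

This is only good enough for comparing values. It does not handle an escaped 
backslash followed by hex digits, and it also decodes escaped special 
characters such as the comma, so the result must not be fed back as a DN.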

> Being very RFC-strict I would have to use utf-8 encoding. But the result
> also should be utf-8 encoded, which isn't the case. Any ideas?

Input and output of ldap_explode_dn() both have to be valid DN string 
representations as defined in RFC 2253, which is the case. I guess you have 
to live with that.
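
To see that both forms represent the same value, here is a small sketch 
(Python 3 syntax, hypothetical helper name escape_non_ascii()) that writes 
every non-ASCII UTF-8 byte as a backslash plus two uppercase hex digits, 
which is the form seen in the output quoted above. It deliberately ignores 
the special characters RFC 2253 also requires to be escaped:

```python
def escape_non_ascii(rdn: str) -> str:
    """Write each non-ASCII UTF-8 byte of rdn as a backslash hex pair."""
    out = []
    for byte in rdn.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))          # plain ASCII is kept as-is
        else:
            out.append("\\%02X" % byte)    # e.g. 0xC3 becomes \C3
    return "".join(out)

print(escape_non_ascii("l=München_iso-8859-1"))
```

Escaping your own string this way and comparing it with the returned 
component is an alternative to decoding the returned component.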

Ciao, Michael.