michael at stroeder.com
Wed Jun 15 17:35:22 CEST 2011
Dusan Stefanik wrote:
> I'm working on python-ldap for Python 3, and my work is going well.
Do you think it will be feasible to have a common code base for Python 2.x and
Python 3.x at least for the C wrapper part?
> There is a fundamental question to which I don't know the answer.
> In my work I changed all strings to bytes - it works.
That's what I would do within the C code in Modules/.
> So maybe a better way would be to use Unicode strings internally.
> What do you think about it?
> Python on Linux uses the UCS2 scheme; on some platforms it can use the UCS4 scheme (according to the documentation).
> Which scheme do LDAP servers use, and does it matter?
> What would be the impact on functionality?
1. Attribute values can contain BLOBs like JPEG images, X.509 certs etc.
There's no way to automagically convert attribute values to Unicode strings
without looking in the relevant subschema or applying a-priori knowledge about
the attribute type syntax.
2. For LDAPv3, DNs, attribute types and similar have to be UTF-8 encoded
before being transmitted on the wire. For LDAPv2 it's T.61, but this varies
depending on the server implementation/configuration.
=> Even after thinking more about it, the best decision seems to be to simply
use bytes in the C code in Modules/.
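
To illustrate point 1, here is a minimal sketch (Python 3 syntax) of decoding
only those attributes known a priori to hold text, while leaving everything
else as bytes. The names decode_entry and TEXT_ATTRS and the sample entry are
made up for illustration and are not part of python-ldap; a real
implementation would more likely consult the subschema.

```python
# Hypothetical a-priori knowledge: attribute types assumed to hold text.
TEXT_ATTRS = frozenset(['cn', 'sn', 'mail'])

def decode_entry(entry, text_attrs=TEXT_ATTRS, charset='utf-8'):
    """Return a copy of entry with values of known text attributes decoded.

    entry maps attribute type names to lists of bytes values.
    Attributes not listed in text_attrs (e.g. BLOBs) stay as bytes.
    """
    result = {}
    for attr_type, values in entry.items():
        if attr_type.lower() in text_attrs:
            result[attr_type] = [v.decode(charset) for v in values]
        else:
            result[attr_type] = list(values)
    return result

# Illustrative entry: one text attribute, one binary attribute.
entry = {
    'cn': [b'Bj\xc3\xb6rn'],
    'jpegPhoto': [b'\xff\xd8\xff\xe0'],
}
decoded = decode_entry(entry)
```

The binary value would raise UnicodeDecodeError if it were decoded blindly,
which is why the C layer is better off handing out bytes.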
For 2. I was already thinking about automatic Unicode-to-bytes conversion in
Lib/ldap/ldapobject.py with a charset as class attribute. But I have some
doubts it's worth the effort. Can somebody point me to an *efficient* recipe
for deciding whether to encode a Unicode string or pass along a raw string,
which also works with Python 2.3? Yes, I know type(u'') but I'd like to avoid
cluttering all the methods with such if-statements.
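
One possible shape for such a recipe, sketched here in Python 3 syntax (so it
does not by itself answer the Python 2.3 requirement): a single helper does
the type check, and methods call it instead of carrying their own
if-statements. The names to_wire and WireEncoder are hypothetical and not part
of python-ldap; under Python 2 the check would be against unicode rather than
str.

```python
def to_wire(value, charset='utf-8'):
    """Encode a text string to bytes; pass bytes through untouched."""
    if isinstance(value, str):
        return value.encode(charset)
    return value

class WireEncoder(object):
    """Hypothetical mix-in holding the charset as a class attribute."""

    charset = 'utf-8'

    def _bytes(self, value):
        # The only place where the str/bytes decision is made.
        return to_wire(value, self.charset)

encoder = WireEncoder()
dn_bytes = encoder._bytes('cn=Bj\u00f6rn,dc=example,dc=com')
raw_bytes = encoder._bytes(b'\xff\xd8\xff\xe0')
```

Whether the extra function call per argument is cheap enough is exactly the
efficiency question raised above; this sketch only addresses the clutter.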