[Python-Dev] Support for "wide" Unicode characters
Mon, 02 Jul 2001 18:51:58 +0200
Guido van Rossum wrote:
> > Greg Ewing wrote:
> > >
> > > > It so happened that the Unicode support was written to make it very
> > > > easy to change the compile-time code unit size
> > >
> > > What about extension modules that deal with Unicode strings?
> > > Will they have to be recompiled too? If so, is there anything
> > > to detect an attempt to import an extension module with an
> > > incompatible Unicode character width?
> > That's a good question !
> > The answer is: yes, extensions which use Unicode will have to
> > be recompiled for narrow and wide builds of Python. The question
> > is however, how to detect cases where the user imports an
> > extension built for narrow Python into a wide build and
> > vice versa.
> > The standard way of looking at the API level won't help. We'd
> > need some form of introspection API at the C level... hmm,
> > perhaps looking at the sys module will do the trick for us ?!
> > In any case, this is certainly going to cause trouble one
> > of these days...
> Here are some alternative ways to deal with this:
> (1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
> appended to their name in wide mode. This makes any use of a
> Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
> fail with a link-time error. (Which should cause an ImportError
> for shared libraries.)
> (2) Ditto but only rename the PyModule_Init function. This is much
> less work but more coarse: a module that doesn't use any Unicode
> APIs (and I expect these will be a large majority) still would not
> be accepted.
> (3) Change the interpretation of PYTHON_API_VERSION so that a low bit
> of '1' means wide Unicode. Then you only get a warning (followed
> by a core dump when actually trying to use Unicode).
> I mentioned (1) and (3) in an earlier post.
(4) Add a feature flag to PyModule_Init() which then looks up the
features in the sys module and uses this as basis for
processing the import request.
In this case, I think that (5) would be the best solution,
since old code will notice the change in width too.
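A rough sketch of the check described in (4), written in Python for brevity rather than as C code inside the module initialisation function. Everything named here (FEATURE_WIDE_UNICODE, interpreter_features, check_extension) is hypothetical and not an existing API. It also assumes the build width is exposed as sys.maxunicode, which is how later Python releases report it; on Python 3.3 and newer that value is always 0x10FFFF, so the check only distinguishes anything on older narrow/wide builds.

```python
import sys

# Hypothetical feature name; nothing in this sketch is an existing CPython API.
FEATURE_WIDE_UNICODE = "wide-unicode"


def interpreter_features(maxunicode=sys.maxunicode):
    """Return the set of feature flags the running interpreter provides.

    Assumption: a wide build is recognised by sys.maxunicode being
    larger than 0xFFFF, i.e. one code unit can hold any code point.
    """
    features = set()
    if maxunicode > 0xFFFF:
        features.add(FEATURE_WIDE_UNICODE)
    return features


def check_extension(name, built_wide, available=None):
    """Refuse the import if the extension's Unicode width does not match.

    `built_wide` stands in for the feature flag an extension would pass
    to its init call; `available` defaults to the running interpreter.
    """
    if available is None:
        available = interpreter_features()
    runtime_wide = FEATURE_WIDE_UNICODE in available
    if built_wide != runtime_wide:
        raise ImportError(
            "%s was built for %s Unicode, but this interpreter is a %s build"
            % (name,
               "wide" if built_wide else "narrow",
               "wide" if runtime_wide else "narrow"))
```

An extension compiled for a narrow build would call something like check_extension("myext", False) at init time and get an ImportError on a wide interpreter, instead of a crash on first use of a Unicode API.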
Python Pages: http://www.lemburg.com/python/