PEP: Support for "wide" Unicode characters

Paul Prescod paulp at ActiveState.com
Sun Jul 1 21:46:39 CEST 2001


Marcin 'Qrczak' Kowalczyk wrote:
> 
> Thu, 28 Jun 2001 15:33:00 -0700, Paul Prescod <paulp at ActiveState.com> wrote:
> 
> >     In order to avoid imposing this cost on every
> >     user, Python 2.2 will allow 4-byte Unicode characters as a
> >     build-time option. Users can choose whether they care about
> >     wide characters or prefer to preserve memory.
> 
> I don't like it. Scripts will work under some builds of Python and
> not work in others.

Just as they do today, e.g. if you import win32api or if you use a
64-bit integer.
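
For what it's worth, here is a minimal sketch of how a script could
cope with either build. It assumes the interpreter exposes
sys.maxunicode (0xFFFF on a narrow build, 0x10FFFF on a wide one);
the helper name char_count is purely illustrative, and the narrow
branch assumes well-formed surrogate pairs.

```python
import sys

# sys.maxunicode is the largest code point one character can hold:
# 0xFFFF on a narrow (2-byte) build, 0x10FFFF on a wide (4-byte) build.
WIDE_BUILD = sys.maxunicode > 0xFFFF


def char_count(text):
    """Return the number of code points in text on either build.

    Illustrative helper, not part of any library.
    """
    if WIDE_BUILD:
        return len(text)
    # On a narrow build a character above U+FFFF occupies two code units
    # (a surrogate pair). Count each pair once by skipping low surrogates.
    # Assumption: the text contains no unpaired low surrogates.
    return sum(1 for ch in text if not 0xDC00 <= ord(ch) <= 0xDFFF)
```

A script that branches like this behaves the same on both builds, at
the price of doing the bookkeeping itself.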

> > Rejected Suggestions
> 
> >     The other class of solution is to use some efficient storage
> >     internally but present an abstraction of wide characters
> >     to the programmer. Any of these would require a much more complex
> >     implementation than the accepted solution.
> 
> But will work, as opposed to working only sometimes.
> 
> If memory consumption is really a problem, I would definitely hide
> varying character sizes as an implementation detail.

Here are two relevant paragraphs from the version of the PEP I will
check in today:


        Another class of solution is to use some efficient storage
        internally but present an abstraction of wide characters to
        the programmer. Any of these would require a much more complex
        implementation than the accepted solution. For instance consider
        the impact on the regular expression engine. In theory, we could
        move to this implementation in the future without breaking
        Python code. A future Python could "emulate" wide Python
        semantics on narrow Python. Guido is not willing to undertake
        the implementation right now.
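
(Not PEP text: a rough sketch of what "emulating" wide semantics over
narrow storage could look like, written for a current Python. The
class name WideView and its layout are hypothetical. It stores 16-bit
code units and presents code points to the caller, which shows where
the extra complexity comes from: indexing is no longer constant time.)

```python
import struct


class WideView:
    """Hypothetical wrapper: stores UTF-16 code units, exposes code points."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        # One 16-bit unit per BMP character, two per character above U+FFFF.
        self.units = struct.unpack("<%dH" % (len(data) // 2), data)

    def code_points(self):
        """Yield one integer code point per character."""
        units = self.units
        i = 0
        while i < len(units):
            u = units[i]
            # A high surrogate followed by a low surrogate is one character.
            if (0xD800 <= u <= 0xDBFF and i + 1 < len(units)
                    and 0xDC00 <= units[i + 1] <= 0xDFFF):
                low = units[i + 1]
                yield 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
                i += 2
            else:
                yield u
                i += 1

    def __len__(self):
        # Length in characters, not in storage units.
        return sum(1 for _ in self.code_points())

    def __getitem__(self, index):
        # Linear scan: O(n) indexing is one cost of this design.
        # Negative indices are not supported in this sketch.
        for pos, cp in enumerate(self.code_points()):
            if pos == index:
                return chr(cp)
        raise IndexError(index)
```

Every consumer of string internals, the regular expression engine
included, would need the same pair-aware logic.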

....

    This PEP represents the least-effort solution. Over the next
    several years, 32-bit Unicode characters will become more common
    and that may either convince us that we need a more sophisticated 
    solution or (on the other hand) convince us that simply 
    mandating wide Unicode characters is an appropriate solution.
    Right now the two options on the table are do nothing or do
    this.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



