[Python-Dev] Unicode Proposal: Version 0.4

Nov. 12, 1999

I've uploaded a new version of the proposal that incorporates
much of what has been discussed on the list.

Thanks to everybody who has helped so far. Note that I have extended
the list of references for those who want to join in but need
more background information.

The latest version of the proposal is available at:

	http://starship.skyport.net/~lemburg/unicode-proposal.txt

Older versions are available as:

	http://starship.skyport.net/~lemburg/unicode-proposal-X.X.txt

Some points of discussion (POD) that are still open:

    · support for line breaks (see
      http://www.unicode.org/unicode/reports/tr13/ )

    · support for case conversion: 

      Problems: string lengths can change when multiple characters
      are mapped to a single new one; capital letters starting a word
      can differ from ones occurring in the middle; and there are
      locale-dependent deviations from the standard mappings.

    · support for numbers, digits, whitespace, etc.

    · support (or no support) for private code point areas

    · should Unicode objects support %-formatting?

      One possibility would be to emulate this via strings and
      <default encoding>:

      s = '%s %i abcäöü' # a Latin-1 encoded string
      t = (u,3)          # u is a Unicode object

      # Convert Latin-1 s to a <default encoding> string
      s1 = unicode(s,'latin-1').encode()

      # The '%s' will now add u in <default encoding>
      s2 = s1 % t

      # Finally, convert the <default encoding> encoded string to Unicode
      u1 = unicode(s2)

    · specifying file wrappers:

      Open issues: what to do with Python strings fed to the .write()
      method (the wrapper may need to know the encoding of the
      strings), and when/if to return Python strings through the
      .read() method.

      Perhaps we need more than one type of wrapper here.
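
To make the file wrapper question more concrete, here is a minimal
sketch of one possible answer, written in present-day Python 3 rather
than in the API of the proposal. The class name RecodingWriter and its
parameters file_encoding and input_encoding are hypothetical and only
serve as illustration; codecs.getreader() is the standard library call.
The sketch assumes that encoded strings passed to .write() come with a
declared encoding, and that .read() always returns Unicode text.

```python
import codecs
import io


class RecodingWriter:
    """Hypothetical wrapper that writes text to a byte stream.

    Text is encoded to `file_encoding`. Byte strings are assumed to be
    encoded in `input_encoding` and are recoded before being written.
    """

    def __init__(self, stream, file_encoding, input_encoding="latin-1"):
        self.stream = stream
        self.file_encoding = file_encoding
        self.input_encoding = input_encoding

    def write(self, data):
        if isinstance(data, bytes):
            # Encoded input: decode it using the declared input encoding.
            data = data.decode(self.input_encoding)
        self.stream.write(data.encode(self.file_encoding))


raw = io.BytesIO()
writer = RecodingWriter(raw, "utf-8", input_encoding="latin-1")
writer.write("abc\u00e4")   # Unicode text
writer.write(b"\xf6\xfc")   # Latin-1 encoded bytes for "öü"

# Reading side: a standard codecs reader that always returns Unicode text.
reader = codecs.getreader("utf-8")(io.BytesIO(raw.getvalue()))
text = reader.read()
```

Without the declared input encoding the wrapper could not interpret
the bytes in the second .write() call, which is exactly the open issue
noted above. A wrapper that returns encoded strings from .read() would
be a second, separate type.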

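The case conversion problems listed above can be illustrated with a
short sketch. It uses the str type of present-day Python 3, which
postdates this proposal, so it shows the behaviour of the Unicode
mappings and not anything the proposal specifies. The first example
shows a length change in the one-to-many direction, the reverse of the
many-to-one case mentioned above. The variable names are arbitrary.

```python
# 1. String lengths can change: the German sharp s has no single
#    upper case character in the default mapping.
sharp_s = "\u00df"                 # LATIN SMALL LETTER SHARP S
upper_sharp_s = sharp_s.upper()    # two characters after conversion

# 2. A capital starting a word can differ from one in the middle:
#    the digraph U+01C6 has distinct upper case and title case forms.
dz = "\u01c6"                      # LATIN SMALL LETTER DZ WITH CARON
dz_upper = dz.upper()              # U+01C4, both letters capitalized
dz_title = dz.title()              # U+01C5, only the first capitalized

# 3. Locale-dependent deviations: Turkish upper-cases "i" to U+0130
#    (capital I with dot above), but the built-in mapping is
#    locale-independent and returns a plain "I".
turkish_expected = "\u0130"
default_upper = "i".upper()
```

A locale-aware conversion would therefore need an interface beyond a
plain character-to-character table.
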
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    49 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
