Python usage numbers

Roy Smith roy at panix.com
Mon Feb 13 04:57:01 CET 2012


In article <mailman.5752.1329102603.27778.python-list at python.org>,
 Terry Reedy <tjreedy at udel.edu> wrote:

> On 2/12/2012 5:14 PM, Chris Angelico wrote:
> > On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy<tjreedy at udel.edu>  wrote:
> >> The situation before ascii is like where we ended up *before* unicode.
> >> Unicode aims to replace all those byte encoding and character sets with
> >> *one* byte encoding for *one* character set, which will be a great
> >> simplification. It is the idea of ascii applied on a global rather than
> >> local basis.
> >
> > Unicode doesn't deal with byte encodings; UTF-8 is an encoding,
> 
> The Unicode Standard specifies 3 UTF storage formats* and 8 UTF 
> byte-oriented transmission formats. UTF-8 is the most common of all 
> encodings for web pages. (And ascii pages are utf-8 also.) It is the 
> only one of the 8 that most of us need to bother with much. Look here for the 
> list
> http://www.unicode.org/glossary/#U
> and for details look in various places in
> http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf
> 
> > but so are UTF-16, UTF-32.
> > and as many more as you could hope for.
> 
> All the non-UTF 'as many more as you could hope for' encodings are not 
> part of Unicode.
> 
> * The new internal unicode scheme for 3.3 is pretty much a mixture of 
> the 3 storage formats (I am, of course, skipping some details) by using 
> the widest one needed for each string. The advantage is avoiding 
> problems with each of the three. The disadvantage is greater internal 
> complexity, but that should be hidden from users. They will not need to 
> care about the internals. They will be able to forget about 'narrow' 
> versus 'wide' builds and the possible requirement to code differently 
> for each. There will only be one scheme that works the same on all 
> platforms. Most apps should require less space and about the same time.
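
For anyone who wants to poke at the "widest one needed for each string"
idea from the footnote, here is a rough sketch. It assumes CPython 3.3 or
later; other implementations may store strings differently. The helper
name bytes_per_char is made up for illustration, and sys.getsizeof reports
an implementation detail, not anything the language promises.

```python
import sys

def bytes_per_char(ch):
    """Estimate storage per code point by growing a string by one character.

    Hypothetical helper for illustration; it relies on CPython's
    sys.getsizeof, which is an implementation detail.
    """
    return sys.getsizeof(ch * 101) - sys.getsizeof(ch * 100)

samples = [
    ("ascii", "a"),
    ("latin-1", "\xe9"),
    ("BMP", "\u20ac"),
    ("astral", "\U0001F40D"),
]

for label, ch in samples:
    # len(ch) is 1 for every sample on 3.3+, with no narrow/wide build split
    print(label, len(ch), bytes_per_char(ch))
```

On a CPython build with the new scheme, the last column should grow with
the widest character in the string, while len() stays at 1 for each sample.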

All that is just fine, but what the heck are we going to do about ascii 
art, that's what I want to know.  Python just won't be the same in UTF-8.



                    /^\/^\
                  _|__|  O|
         \/     /~     \_/ \
          \____|__________/  \
                 \_______      \
                         `\     \                 \
                           |     |                  \
                          /      /                    \
                         /     /                       \\
                       /      /                         \ \
                      /     /                            \  \
                    /     /             _----_            \   \
                   /     /           _-~      ~-_         |   |
                  (      (        _-~    _--_    ~-_     _/   |
                   \      ~-____-~    _-~    ~-_    ~-_-~    /
                     ~-_           _-~          ~-_       _-~   - jurcy -
                        ~--______-~                ~-___-~
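
In fairness to the snake, the quoted remark that ascii pages are utf-8
also can be checked directly. A minimal sketch, assuming Python 3; the
variable names are only for illustration, and the art is trimmed to two
lines to keep it short.

```python
# A small piece of ASCII art; raw string so the backslashes stay literal.
art = r"""
    /^\/^\
  _|__|  O|
"""

as_ascii = art.encode("ascii")
as_utf8 = art.encode("utf-8")

# Pure ASCII text produces the same bytes under both encodings.
print(as_ascii == as_utf8)

# A non-ASCII character (the euro sign) needs a different number of bytes
# in each of the UTF encoding forms.
sizes = {enc: len("\u20ac".encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)
```

So the art itself comes through byte for byte; only characters outside
ASCII take more room.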


