Python usage numbers
Roy Smith
roy at panix.com
Sun Feb 12 22:57:01 EST 2012
In article <mailman.5752.1329102603.27778.python-list at python.org>,
Terry Reedy <tjreedy at udel.edu> wrote:
> On 2/12/2012 5:14 PM, Chris Angelico wrote:
> > On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy<tjreedy at udel.edu> wrote:
> >> The situation before ascii is like where we ended up *before* unicode.
> >> Unicode aims to replace all those byte encodings and character sets with
> >> *one* byte encoding for *one* character set, which will be a great
> >> simplification. It is the idea of ascii applied on a global rather than
> >> a local basis.
> >
> > Unicode doesn't deal with byte encodings; UTF-8 is an encoding,
>
> The Unicode Standard specifies 3 UTF storage formats* and 8 UTF
> byte-oriented transmission formats. UTF-8 is the most common of all
> encodings for web pages. (And ascii pages are utf-8 also.) It is the
> only one of the 8 most of us need to bother with much. Look here for the
> list
> http://www.unicode.org/glossary/#U
> and for details look in various places in
> http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf
>
> > but so are UTF-16, UTF-32.
> > and as many more as you could hope for.
>
> All the non-UTF 'as many more as you could hope for' encodings are not
> part of Unicode.
>
> * The new internal unicode scheme for 3.3 is pretty much a mixture of
> the 3 storage formats (I am, of course, skipping some details) by using
> the widest one needed for each string. The advantage is avoiding
> problems with each of the three. The disadvantage is greater internal
> complexity, but that should be hidden from users. They will not need to
> care about the internals. They will be able to forget about 'narrow'
> versus 'wide' builds and the possible requirement to code differently
> for each. There will only be one scheme that works the same on all
> platforms. Most apps should require less space and about the same time.
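
For anyone who wants to poke at the points quoted above, here is a minimal sketch, assuming CPython 3.3 or later. It checks that pure-ascii text encodes to the same bytes in UTF-8, compares how many bytes a string takes in UTF-8, UTF-16 and UTF-32, and prints the in-memory size of strings that need different internal widths. The helper name `encoded_sizes` is made up for illustration, and the `sys.getsizeof` figures are an implementation detail that will vary between versions and platforms.

```python
import sys


def encoded_sizes(text):
    """Return the byte length of `text` in three UTF encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Pure-ascii text produces identical bytes under ascii and UTF-8.
ascii_text = "Python"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)  # True

print(encoded_sizes("Python"))      # {'utf-8': 6, 'utf-16': 12, 'utf-32': 24}
print(encoded_sizes("\u20ac"))      # {'utf-8': 3, 'utf-16': 2, 'utf-32': 4}
print(encoded_sizes("\U0001F40D"))  # {'utf-8': 4, 'utf-16': 4, 'utf-32': 4}

# On CPython 3.3+, a string is stored using the narrowest width that fits
# its widest character, so these three grow in size despite equal length.
samples = ["a" * 100, "\u20ac" * 100, "\U0001F40D" * 100]
sizes = [sys.getsizeof(s) for s in samples]
for s, size in zip(samples, sizes):
    print(len(s), size)  # exact sizes depend on the interpreter build
```

The `-le` codec variants are used only so that the byte counts exclude the byte order mark that plain `utf-16` and `utf-32` would prepend.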
All that is just fine, but what the heck are we going to do about ascii
art, that's what I want to know. Python just won't be the same in UTF-8.
/^\/^\
_|__| O|
\/ /~ \_/ \
\____|__________/ \
\_______ \
`\ \ \
| | \
/ / \
/ / \\
/ / \ \
/ / \ \
/ / _----_ \ \
/ / _-~ ~-_ | |
( ( _-~ _--_ ~-_ _/ |
\ ~-____-~ _-~ ~-_ ~-_-~ /
~-_ _-~ ~-_ _-~ - jurcy -
~--______-~ ~-___-~