How do I display unicode value stored in a string variable using ord()

wxjmfauth at gmail.com
Sat Aug 18 10:09:26 CEST 2012


>>> import sys, timeit
>>> sys.version
'3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]'
>>> timeit.timeit("('ab…' * 1000).replace('…', '……')")
37.32762490493721
>>> timeit.timeit("('ab…' * 10).replace('…', 'œ…')")
0.8158757139801764

>>> sys.version
'3.3.0b2 (v3.3.0b2:4972a8f1b2aa, Aug 12 2012, 15:02:36) [MSC v.1600 32 bit (Intel)]'
>>> timeit.timeit("('ab…' * 1000).replace('…', '……')")
61.919225272152346
>>> timeit.timeit("('ab…' * 10).replace('…', 'œ…')")
1.2918679017971044

>>> timeit.timeit("('ab…' * 10).replace('…', '€…')")
1.2484133226156757
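
For anyone who wants to repeat the comparison, the sessions above can be
wrapped in a small script and run under each interpreter in turn. This is
only a sketch: the names STATEMENTS and run_benchmarks are made up for the
illustration, and the default number of repetitions is lower than timeit's
own default (1000000), so the absolute figures will not match the ones
posted above.

```python
import sys
import timeit

# The three statements timed in the sessions above, unchanged.
STATEMENTS = [
    "('ab…' * 1000).replace('…', '……')",
    "('ab…' * 10).replace('…', 'œ…')",
    "('ab…' * 10).replace('…', '€…')",
]


def run_benchmarks(number=10000):
    """Time each statement `number` times; return (statement, seconds) pairs."""
    results = []
    for stmt in STATEMENTS:
        seconds = timeit.timeit(stmt, number=number)
        results.append((stmt, seconds))
    return results


if __name__ == "__main__":
    # Print the interpreter version first so runs can be told apart.
    print(sys.version)
    for stmt, seconds in run_benchmarks():
        print("%-40s %.4f s" % (stmt, seconds))
```

Only the ratio between two interpreters on the same machine is meaningful
here, not the raw seconds.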

* I noticed, both intuitively and empirically, that this happens
for cp1252 or mac-roman characters, but not for characters which
belong to the latin-1 coding scheme.

* Bad luck: such characters are common in French text
(and in some other European languages).

* I do not recall the most extreme cases I found. Believe me, when
I speak of a few hundred percent, I am not lying.
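
The distinction in the first point can be checked with ord() and the codecs
themselves. The sketch below assumes nothing beyond the standard library;
the helper names encodable and describe are invented for the illustration.
It shows that '…', 'œ' and '€' exist in cp1252 but have code points above
U+00FF, so they cannot be encoded in latin-1, whereas 'é' fits in both.
Whether that threshold is the actual cause of the timings above is a guess
on my part; the script only demonstrates the difference between the
character sets.

```python
import unicodedata


def encodable(ch, codec):
    """Return True if the character can be encoded with the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def describe(ch):
    """Return (code point, in latin-1?, in cp1252?) for one character."""
    return (ord(ch), encodable(ch, "latin-1"), encodable(ch, "cp1252"))


if __name__ == "__main__":
    for ch in "é…œ€":
        code, in_latin1, in_cp1252 = describe(ch)
        print("U+%04X %-32s latin-1=%s cp1252=%s"
              % (code, unicodedata.name(ch), in_latin1, in_cp1252))
```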

My take on the subject.

This is a typical Python disease: do not solve the problem, but
find a workaround which is expected to solve it and which
finally solves nothing. As far as I know, the tools to break
the "BMP limit" already exist. They are called utf-8 or
ucs-4/utf-32.
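
To make the "BMP limit" concrete, here is a small sketch that encodes one
character from outside the BMP with the three usual encoding forms. The
function name encoded_sizes and the choice of U+1D11E (MUSICAL SYMBOL G
CLEF) are mine, picked only for the illustration. utf-8 and utf-32 encode
the code point directly, while utf-16 has to fall back on a surrogate pair.

```python
def encoded_sizes(ch):
    """Return the encoded length in bytes of `ch` for each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The -le variants are used so that no byte order mark is counted.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


if __name__ == "__main__":
    clef = "\U0001D11E"  # a code point above U+FFFF, i.e. outside the BMP
    print("U+%04X" % ord(clef))
    for name, size in sorted(encoded_sizes(clef).items()):
        print("%-7s %d bytes" % (name, size))
```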

One day, I came across a very, very old mail message, dating
from the time the unicode type was introduced in Python 2.
If I recall correctly, it was from Victor Stinner. He wrote
something like "Let's go with ucs-4, and the problems
are solved forever". He was so right.

I have been following the dev list for years, and my feeling
is that there is a latent and permanent conflict between
"ascii users" and "non ascii users" (see the reintroduction
of the unicode literal).

Please do not get me wrong. As a non-computer scientist,
I'm very happy with Python. But when I try to take a step
back, I become more and more sceptical.

PS: Py3.3b2 still crashes, exiting silently, with
cp65001.

jmf


