How do I display unicode value stored in a string variable using ord()
wxjmfauth at gmail.com
Fri Aug 17 14:45:02 EDT 2012
On Friday, 17 August 2012 20:21:34 UTC+2, Jerry Hill wrote:
> On Fri, Aug 17, 2012 at 1:49 PM, <wxjmfauth at gmail.com> wrote:
> > The character '…', Unicode name 'HORIZONTAL ELLIPSIS',
> > is one of these characters existing in the cp1252, mac-roman
> > coding schemes and not in iso-8859-1 (latin-1) and obviously
> > not in ascii. It causes Py3.3 to work a few 100% slower
> > than Py<3.3 versions due to the flexible string representation
> > (ascii/latin-1/ucs-2/ucs-4) (I found cases up to 1000%).
> >
> > >>> '…'.encode('cp1252')
> > b'\x85'
> > >>> '…'.encode('mac-roman')
> > b'\xc9'
> > >>> '…'.encode('iso-8859-1') # latin-1
> > Traceback (most recent call last):
> >   File "<eta last command>", line 1, in <module>
> > UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026'
> > in position 0: ordinal not in range(256)
> >
> > If one could neglect this (typographically important) glyph, what
> > to say about the characters of the European scripts (languages)
> > present in cp1252 or in mac-roman but not in latin-1 (eg. the
> > French script/language)?
>
> So... python should change the longstanding definition of the latin-1
> character set? This isn't some sort of python limitation, it's just
> the reality of legacy encodings that actually exist in the real world.
>
> > Very nice. Python 2 was built for ascii user, now Python 3 is
> > *optimized* for, let say, ascii user!
> >
> > The future is bright for Python. French users are better
> > served with Apple or MS products, simply because these
> > corporates know you can not write French with iso-8859-1.
> >
> > PS When "TeX" moved from the ascii encoding to iso-8859-1
> > and the so called Cork encoding, "they" know this and provided
> > all the complementary packages to circumvent this. It was
> > in 199? (Python was not even born).
> >
> > Ditto for the foundries (Adobe, Linotype, ...)
>
> I don't understand what any of this has to do with Python. Just
> output your text in UTF-8 like any civilized person in the 21st
> century, and none of that is a problem at all. Python make that easy.
> It also makes it easy to interoperate with older encodings if you
> have to.
>
Sorry, you missed the point.
My comment had nothing to do with the encoding of the source code,
the encoding of a Python "string" literal in the source code, or
the display of a Python 3 <str>.
I wrote about the *internal* Python "coding", that is, the
way Python keeps "strings" in memory. See PEP 393.
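
To make the point about internal storage concrete, here is a minimal
sketch. It assumes CPython 3.3 or later, where PEP 393 applies; the
helper name per_char_bytes is made up for this illustration, and
sys.getsizeof reports implementation-specific sizes, so other
interpreters may behave differently. It estimates how many bytes each
character occupies in memory by comparing two strings of different
lengths, so the fixed object header cancels out.

```python
import sys

def per_char_bytes(ch, n=1000):
    """Estimate in-memory bytes per character (CPython-specific).

    Differencing two sizes cancels the fixed string header, leaving
    only the per-character storage cost.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

# One sample character per storage width discussed in PEP 393.
samples = {
    "ascii   U+0061": "a",
    "latin-1 U+00E9": "\u00e9",
    "ucs-2   U+2026": "\u2026",      # HORIZONTAL ELLIPSIS
    "ucs-4   U+1F600": "\U0001F600",
}

for label, ch in samples.items():
    print(label, "ord =", ord(ch), "bytes/char =", per_char_bytes(ch))
```

The ellipsis has ord() 8230 (0x2026), which is above 255, so a string
containing it cannot use the one-byte-per-character form and is stored
with two bytes per character instead.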
jmf