[Tutor] String encoding

Prasad, Ramit ramit.prasad at jpmorgan.com
Fri Aug 26 17:49:22 CEST 2011


>In this case, the encoding is almost certainly "latin-1".  I know that
>from playing around at the interactive interpreter, like this:
>
> >>> s = 'M\xc9XICO'
> >>> print s.decode('latin-1')
> MÉXICO
>
>If you want to see charts of various encodings, Wikipedia has a bunch.
>For instance, the Latin-1 encoding is here:
>http://en.wikipedia.org/wiki/ISO/IEC_8859-1 and UTF-8 is here:
>http://en.wikipedia.org/wiki/Utf-8

Yep, it is. Thanks, those charts are exactly what I wanted! Now I have another question: what is the difference between what print shows and what the interpreter shows when I just type an expression?

>>> print s.decode('latin-1')
MÉXICO
>>> s.decode('latin-1')
u'M\xc9XICO'
>>> print repr(s)
'M\xc9XICO'
>>> repr(s)
"'M\\xc9XICO'"
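
My working guess is that print writes the str() form of a value, while the bare prompt echoes its repr() form, so calling repr() at the prompt quotes the result a second time. Here is a minimal sketch of that idea. Note the assumptions: it is written in Python 3 syntax (the session above is Python 2, where the echoed text string carries a u prefix and escapes), and the variable names are only illustrative.

```python
# Sketch in Python 3 syntax; the session above is Python 2.
raw = b'M\xc9XICO'             # the Latin-1 encoded bytes from the example
text = raw.decode('latin-1')   # decode the bytes into a text string

as_print_shows = str(text)     # what print() writes: the characters themselves
as_prompt_shows = repr(text)   # what the prompt echoes: a quoted literal
escaped = ascii(text)          # repr with non-ASCII escaped: 'M\xc9XICO'

# repr() of a repr adds a second layer of quotes and backslashes,
# which would explain the doubled backslash seen for repr(s).
once = repr(raw)               # b'M\xc9XICO'
twice = repr(once)             # "b'M\\xc9XICO'"
```

If that guess is right, the extra quotes and doubled backslashes come from the prompt applying repr() to something that is already a repr string, not from any change to the data.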


Ramit




