[Tutor] String encoding

Fri Aug 26 17:49:22 CEST 2011

>In this case, the encoding is almost certainly "latin-1".  I know that
>from playing around at the interactive interpreter, like this:
>
> >>> s = 'M\xc9XICO'
> >>> print s.decode('latin-1')
> MÉXICO
>
>If you want to see charts of various encodings, Wikipedia has a bunch.
>For instance, the Latin-1 encoding is here:
>http://en.wikipedia.org/wiki/ISO/IEC_8859-1
>and UTF-8 is here:
>http://en.wikipedia.org/wiki/Utf-8
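
The "playing around" described above can be sketched as a small helper that tries a few candidate encodings on the raw bytes. This is only an illustration: the function name try_decodings and the list of candidates are made up for this example, and it is written to run on Python 3 (it should also work on Python 2.6+).

```python
# -*- coding: utf-8 -*-
# The same bytes as in the quoted session, written as a bytes literal.
raw = b'M\xc9XICO'

def try_decodings(data, candidates=('utf-8', 'latin-1')):
    """Hypothetical helper: map each candidate encoding to the decoded
    text, or to None if the bytes are not valid in that encoding."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

results = try_decodings(raw)
for name in sorted(results):
    # repr() keeps the output unambiguous whatever the terminal encoding is.
    print(name, repr(results[name]))
```

One caveat: every byte value is valid in latin-1, so a successful latin-1 decode does not prove the guess is right. You still have to look at the result and judge whether it reads sensibly, which is what the quoted session does.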

Yep, it is. Thanks, those charts are exactly what I wanted! Now I have another question: what is the difference between what print shows and what the interpreter shows?

>>> print s.decode('latin-1')
MÉXICO
>>> s.decode('latin-1')
u'M\xc9XICO'
>>> print repr(s)
'M\xc9XICO'
>>> repr(s)
"'M\\xc9XICO'"
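
A rough sketch of the distinction being asked about, assuming the usual behaviour: print writes str() of an object, while the interactive prompt echoes repr() of a bare expression. The class Demo and the tab example are invented for illustration and are not part of the session above; a plain ASCII string with a tab is used so the result is the same on Python 2 and Python 3.

```python
class Demo(object):
    """Hypothetical class whose two text forms are easy to tell apart."""
    def __str__(self):
        return 'from __str__'
    def __repr__(self):
        return 'from __repr__'

d = Demo()
via_print = str(d)     # the text that print d / print(d) would write
via_prompt = repr(d)   # the text the prompt shows after typing just: d

s = 'a\tb'             # three characters: a, TAB, b
once = repr(s)         # six characters: quote, a, backslash, t, b, quote
twice = repr(once)     # repr applied again: the backslash gets doubled

print(via_print)
print(via_prompt)
print(once)
print(twice)
```

This would explain the last two results in the session: print repr(s) writes the repr string as is, whereas typing repr(s) at the prompt shows the repr of that string, hence the extra quotes and the doubled backslash.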

Ramit

Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
