[Tutor] Declaring encoding question
Kent Johnson
kent37 at tds.net
Mon Aug 22 18:12:00 CEST 2005
mailing list wrote:
> Hi all,
>
> I've got some source code which will be handling non-ASCII chars like
> umlauts and what not, and I've got a testing portion stored in the
> code.
>
> I get this deprecation warning when I run my code -
> __main__:1: DeprecationWarning: Non-ASCII character '\xfc' in file
> C:\Python24\testit.py on line 733, but no encoding declared; see
> http://www.python.org/peps/pep-0263.html for details
>
> I'm reading this - http://www.python.org/peps/pep-0263.html
>
> Now, the non-ASCII character is in the test data, so it's not actually
> part of my code.
> Will Python be able to handle \xfc and company in data without my
> telling it to use a different form of encoding?
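You should tell Python what the encoding is. The non-ASCII character is in your test data, but the test data is part of the source file, so Python has to decode it along with everything else. Just include the line
# -*- coding: cp1252 -*-
as the first or second line of the file (PEP 263 only looks there). Use cp1252 if that is the encoding your editor saves the file in; otherwise name the encoding it actually uses.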
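To see the declaration at work, here is a minimal sketch. It builds a hypothetical two-line module in memory as cp1252 bytes and compiles it; the file name testit_demo.py and the variable names are made up for illustration, and it is written for a current Python 3 interpreter rather than the 2.4 in the question.

```python
# Hypothetical two-line module, built in memory as cp1252 bytes.
source_text = u'# -*- coding: cp1252 -*-\nname = u"J\u00e4ger"\n'
source_bytes = source_text.encode("cp1252")  # a-umlaut becomes the single byte 0xe4

# When compile() is given bytes, it reads the coding line to decode them.
code = compile(source_bytes, "testit_demo.py", "exec")
namespace = {}
exec(code, namespace)

# The byte 0xe4 was decoded back to the a-umlaut character.
decoded_ok = namespace["name"] == u"J\u00e4ger"
print(decoded_ok)
```

The coding line is what lets the compiler turn the byte 0xe4 back into the character it stands for.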
> When I run the code, and get my returned data, it looks like this in
> Pythonwin -
>
>
> >>> print j["landunits"].keys()
>
> ['"J\xe4ger"', '"Deutschmeister"', '"Army of Bohemia"',
> '"Gardegrenadiere"', '"K.u.K Armee"', '"Erzherzog"', '"Army of
> Italy"', '"Army of Silesia"', '"Army of Hungary"']
>
> So J\xe4ger is actually Jäger. When I run it slightly differently -
>
> >>> for item in j["landunits"].keys():
> ...     print item
> ...
> "Jäger"
> "Deutschmeister"
> "Army of Bohemia"
> "Gardegrenadiere"
> "K.u.K Armee"
> "Erzherzog"
> "Army of Italy"
> "Army of Silesia"
> "Army of Hungary"
>
> It prints the umlauted 'a' fine and dandy.
You are seeing the difference between printing a string and printing its repr().
When you print a list (which is what j["landunits"].keys() is), Python prints the repr() of each element of the list. repr() of a string shows non-ASCII characters as \x escapes; that's why you get J\xe4ger. When you print the string itself, the non-ASCII characters are sent to the terminal unchanged, and the terminal displays them.
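A small sketch of the same difference, under some assumptions: it uses Python 3 bytes objects to stand in for the Python 2.4 strings in your session, it assumes the data is cp1252, and the names landunits, list_text and item_text are only for illustration.

```python
# Byte strings standing in for Python 2 str values; cp1252 is assumed.
landunits = [b'"J\xe4ger"', b'"Deutschmeister"']

# The text form of a list is built from repr() of each element, so the
# byte 0xe4 shows up as the four characters \xe4.
list_text = repr(landunits)
print(list_text)

# Decoding a single element gives the real character back.
item_text = landunits[0].decode("cp1252")
print(item_text == u'"J\u00e4ger"')
```

The data is the same in both cases; only the way it is displayed changes.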
Kent