Python encoding question

Dave Angel davea at ieee.org
Fri Feb 25 07:50:44 EST 2011


On 01/-10/-28163 02:59 PM, Marc Muehlfeld wrote:
> Hi,
>
>   <snip>
> TEST = cursor.fetchone()
> print TEST[0]
> print TEST
>
>
> When I run this script, it prints:
> München
> ('M\xc3\xbcnchen',)
>
> Why is the Umlaut of TEST[0] printed and not from TEST?
>

When you print a string, print simply writes out its contents: control 
characters, international characters, and all.  Your terminal then shows 
the UTF-8 bytes \xc3\xbc as ü.

When you print a more complex object, it's up to that object to decide 
how to print itself.  In the case of the tuple above, the tuple logic 
displays the parentheses and the comma, but calls repr() on each object 
it contains.  Tuple doesn't make a special case for strings or for 
numbers; it just always calls repr(), which in turn invokes the object's 
__repr__() method.

A list does the same thing, though it'll use square brackets at the ends.
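
Here's a small sketch of that behavior.  The Tagged class is made up 
purely for illustration; it returns different text from __str__() and 
__repr__() so you can see which one the container picks.  It should run 
the same on Python 2 and Python 3.

```python
# Hypothetical class, used only to make str() and repr() distinguishable.
class Tagged(object):
    def __str__(self):
        return "str-form"

    def __repr__(self):
        return "repr-form"


item = Tagged()

as_str = str(item)          # the object on its own uses __str__()
in_tuple = str((item,))     # the tuple formats its element with repr()
in_list = str([item])       # the list does the same, with square brackets

print(as_str)      # str-form
print(in_tuple)    # (repr-form,)
print(in_list)     # [repr-form]
```

That is the same split you saw between print TEST[0] and print TEST.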

So the question boils down to what repr() does.  It attempts to create a 
representation that could be used to recreate the specific object.  So if 
there's a newline, it uses \n.  And if a byte string contains non-ASCII 
bytes, it uses hex escape sequences such as \xc3\xbc.  And of course it 
adds the quote marks.
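
A rough way to check this yourself is shown below.  The sample data is my 
own stand-in for your database value, and I build it by encoding to UTF-8 
so that it is a byte string on both Python 2 and Python 3 (on Python 3 the 
repr() also carries a leading b).  ast.literal_eval() is just one 
convenient way to turn the printed form back into an object.

```python
import ast

# UTF-8 bytes for "München" plus a trailing newline (assumed sample data).
raw = u"M\u00fcnchen\n".encode("utf-8")

shown = repr(raw)                   # escaped, quoted form of the bytes
rebuilt = ast.literal_eval(shown)   # parse that literal back into an object

print(shown)            # contains \xc3\xbc and \n as escape sequences
print(rebuilt == raw)   # True
```

The escaped text is pure ASCII, so it survives any terminal or log file, 
and pasting it back into Python gives you the original bytes.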

DaveA


