<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 10/11/13 4:16 AM, Stephen Tucker wrote:<br>
<blockquote
cite="mid:CAP=-cKU6Xkt_rm4VYh7NfXLSfUDG7FZQXHz+U12gP6fbkdhxbQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>I am using IDLE, Python 2.7.2 on Windows 7,
64-bit.<br>
</div>
<div><br>
I have four questions:<br>
</div>
<div><br>
1. Why is it that<br>
print unicode_object<br>
</div>
displays non-ASCII characters in the unicode
object correctly, whereas<br>
</div>
print (unicode_object, another_unicode_object)<br>
</div>
displays non-ASCII characters in the unicode objects
as escape sequences (as repr() does)?<br>
<br>
</div>
2. Given that this is actually <i>deliberately </i>the
case (which I, at the moment, am finding difficult to
accept), what is the neatest (that is, the most
Pythonic) way to get non-ASCII characters in unicode
objects in tuples displayed correctly?<br>
<br>
</div>
3. A similar thing happens when I write such objects and
tuples to a file opened by<br>
codecs.open ( ..., "utf-8")<br>
</div>
I have also found that, even though I use write to send
the text to the file, unicode objects not in tuples get
their non-ASCII characters sent to the file correctly,
whereas, unicode objects in tuples get their characters sent
to the file as escape sequences. Why is this the case?<br>
<br>
</div>
4. As for question 1 above, I ask here also: What is the
neatest way to get round this?<br>
<br>
</div>
Stephen Tucker.<br>
<br>
</div>
</blockquote>
<br>
Python 3 does handle Unicode better than Python 2, as the others
have said, but the most important point is one that you hit upon
yourself: the tuple shows its elements the way repr() does.<br>
<br>
When you print an object x, you are actually printing str(x). The
str() of a tuple is an opening paren, followed by the repr() of each
of its elements, separated by commas, then a closing paren. Tuples and
lists use the repr() of their elements when producing either their
own str() or their own repr().<br>
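<br>
A minimal sketch of that rule, written to run unchanged on Python 2.7
and on Python 3.3 or later; the variable names and sample words are
only illustrative:<br>
<pre>
```python
# Two unicode strings containing non-ASCII characters (written with escapes
# so the source file itself stays pure ASCII).
first = u"caf\xe9"
second = u"na\xefve"
pair = (first, second)

# Rebuild by hand what str() does for a tuple: parens and commas around
# the repr() of each element.
manual = "(" + ", ".join(repr(item) for item in pair) + ")"

# The hand-built text matches str(pair) on both Python 2 and Python 3.
assert str(pair) == manual
```
</pre>
On Python 2 the repr() of each element contains escape sequences, which
is why they show up when the whole tuple is printed.<br>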
<br>
Python 3 does better at this because repr() in Python 3 will gladly
include non-ASCII characters in its output, while Python 2 will only
include ASCII characters, and so must resort to escape sequences.
(BTW: if you like the ASCII-only idea from Python 2, Python 3 has
the ascii() function and the %a string formatting directive for that
very purpose.)<br>
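<br>
As a rough illustration of the difference (Python 3 only, since ascii()
and %a do not exist in Python 2; the sample word is made up):<br>
<pre>
```python
# Python 3 only: ascii() and %a are not available in Python 2.
word = "caf\xe9"  # the word "cafe" with an acute accent on the final e

readable = repr(word)   # keeps the accented character as-is
escaped = ascii(word)   # escapes every non-ASCII character

assert readable == "'caf\xe9'"
assert escaped == "'caf\\xe9'"
assert "%a" % word == escaped   # %a formats its argument with ascii()
```
</pre>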
<br>
The two string representation alternatives str() and repr() can be
confusing. Think of it as: str() is for customers, repr() is for
developers, or: str() is for humans, repr() is for geeks. The
reason tuples use the repr() of their elements is that the
parens+commas representation of a tuple is geeky to begin with, so
it uses repr() of its elements, even for str(tuple).<br>
<br>
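The way to avoid repr() for the elements is to format the tuple
yourself, for example by joining the elements into a single unicode
string and then printing or writing that string.<br>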
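<br>
One possible way to do that, as a sketch only: format_items is a
hypothetical helper name, and the code assumes every element is either
a unicode string or something that converts cleanly to one. It should
run on Python 2.7 and on Python 3.3 or later:<br>
<pre>
```python
def format_items(items):
    """Return a tuple-like display built from the text of each item.

    Hypothetical helper: it uses u"%s" formatting, so each element
    contributes its own text instead of its repr().
    """
    return u"(" + u", ".join(u"%s" % item for item in items) + u")"


names = (u"caf\xe9", u"na\xefve")
line = format_items(names)

# The result holds the real characters, not escape sequences.
assert line == u"(caf\xe9, na\xefve)"
```
</pre>
The resulting unicode string can then be passed to print, or to the
write method of a file opened with codecs.open, in place of the tuple.<br>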
<br>
--Ned.<br>
</body>
</html>