<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 10/11/13 4:16 AM, Stephen Tucker wrote:<br>

    <blockquote

cite="mid:CAP=-cKU6Xkt_rm4VYh7NfXLSfUDG7FZQXHz+U12gP6fbkdhxbQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <div>

            <div>

              <div>

                <div>

                  <div>

                    <div>

                      <div>I am using IDLE, Python 2.7.2 on Windows 7,

                        64-bit.<br>

                      </div>

                      <div><br>

                        I have four questions:<br>

                      </div>

                      <div><br>

                        1. Why is it that<br>

                             print unicode_object<br>

                      </div>

                      displays non-ASCII characters in the unicode

                      object correctly, whereas<br>

                    </div>

                         print (unicode_object, another_unicode_object)<br>

                  </div>

                  displays non-ASCII characters in the unicode objects

                  as escape sequences (as repr() does)?<br>

                  <br>

                </div>

                2. Given that this is actually <i>deliberately </i>the

                case (which I, at the moment, am finding difficult to

                accept), what is the neatest (that is, the most

                Pythonic) way to get non-ASCII characters in unicode

                objects in tuples displayed correctly?<br>

                <br>

              </div>

              3. A similar thing happens when I write such objects and

              tuples to a file opened by<br>

                   codecs.open ( ..., "utf-8")<br>

            </div>

            I have also found that, even though I use  write  to send

            the text to the file, unicode objects not in tuples get

            their non-ASCII characters sent to the file correctly,

            whereas, unicode objects in tuples get their characters sent

            to the file as escape sequences. Why is this the case?<br>

            <br>

          </div>

          4. As for question 1 above, I ask here also: What is the

          neatest way to get round this?<br>

          <br>

        </div>

        Stephen Tucker.<br>

        <br>

      </div>

    </blockquote>

    <br>

    Although Python 3 is better than Python 2 at Unicode, as the others

    have said, the most important point is one that you hit upon

    yourself.<br>

    <br>

    When you print an object x, you are actually printing str(x).  The
    str() of a tuple is an opening paren, followed by the repr() of each
    of its elements, separated by commas, then a closing paren.  Tuples
    and lists use the repr() of their elements when producing either
    their own str() or their own repr().<br>

    <br>
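    To make that concrete, here is a minimal sketch (the names pair and
    by_hand are made up for illustration) that rebuilds a tuple's str()
    by hand from the repr() of each element.  It assumes a tuple of two
    or more elements, since a one-element tuple also gets a trailing
    comma.  It should behave the same way on Python 2.7 and Python 3:<br>
<pre>
```python
# Two unicode strings with non-ASCII characters, written with escapes
# so the source file itself stays pure ASCII.
pair = (u"caf\xe9", u"na\xefve")

# What str() of a tuple amounts to: parens around the comma-separated
# repr() of each element.
by_hand = "(" + ", ".join(repr(item) for item in pair) + ")"

same_as_str = (str(pair) == by_hand)
same_as_repr = (repr(pair) == by_hand)
```
</pre>
    Both flags come out True on either version; only the text that
    repr() produces for each element differs.<br>
    <br>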

    Python 3 does better at this because repr() in Python 3 will gladly

    include non-ASCII characters in its output, while Python 2 will only

    include ASCII characters, and so must resort to escape sequences. 

    (BTW: if you like the ASCII-only idea from Python 2, Python 3 has

    the ascii() function and the %a string formatting directive for that

    very purpose.)<br>

    <br>
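    A small Python 3 only sketch of that difference, with made-up
    variable names (ascii() and %a do not exist in Python 2):<br>
<pre>
```python
word = "caf\xe9"  # the word "cafe" with an accented final e

as_repr = repr(word)        # Python 3 keeps the non-ASCII character
as_ascii = ascii(word)      # escapes it, much as Python 2's repr() does
as_percent_a = "%a" % word  # the same escaping via the %a directive
```
</pre>
    Here as_repr still contains the accented character, while as_ascii
    and as_percent_a both contain the four-character escape \xe9
    instead.<br>
    <br>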

    The two string representation alternatives str() and repr() can be

    confusing.  Think of it as: str() is for customers, repr() is for

    developers, or: str() is for humans, repr() is for geeks.   The

    reason tuples use the repr() of their elements is that the

    parens+commas representation of a tuple is geeky to begin with, so

    it uses repr() of its elements, even for str(tuple).<br>

    <br>
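    If it helps, here is a toy illustration of the two faces of an
    object.  The Price class is hypothetical, invented only to show
    which method a tuple picks:<br>
<pre>
```python
class Price(object):
    """Hypothetical class, used only to contrast str() and repr()."""

    def __init__(self, amount):
        self.amount = amount

    def __str__(self):
        # For humans: a friendly display form.
        return "$%.2f" % self.amount

    def __repr__(self):
        # For developers: an unambiguous form.
        return "Price(%r)" % self.amount


p = Price(3.5)
alone = str(p)          # uses __str__
in_tuple = str((p, p))  # the tuple uses each element's __repr__
```
</pre>
    On its own the object displays as $3.50, but inside the tuple it
    shows up as Price(3.5), even though str() was called on the
    tuple.<br>
    <br>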

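    The way to avoid repr() for the elements is to format the tuple
    yourself, for example by joining its elements into a single unicode
    string, rather than printing or writing the tuple directly.<br>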

    <br>
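    One possible way to do that, as a sketch rather than the one true
    answer: the helper name format_tuple is made up, it assumes every
    element is already a unicode string, and it uses io.open with a
    temporary file as a stand-in for your codecs.open call.  It should
    work on both Python 2.7 and Python 3:<br>
<pre>
```python
import io
import os
import tempfile


def format_tuple(items):
    """Join unicode items directly, so repr() is never applied to them."""
    return u"(" + u", ".join(items) + u")"


names = (u"caf\xe9", u"na\xefve")
line = format_tuple(names)  # holds the real characters, no escapes

# The same string can be written to a UTF-8 file unchanged.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(line + u"\n")
    with io.open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()
finally:
    os.remove(path)
```
</pre>
    Printing line in IDLE should then show the accented characters
    themselves, and the file receives them as UTF-8 rather than as
    escape sequences.<br>
    <br>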

    --Ned.<br>

  </body>

</html>