[Tutor] Why difference between printing string & typing its object reference at the prompt?

Steven D'Aprano steve at pearwood.info
Wed Oct 3 08:38:43 CEST 2012


On Tue, Oct 02, 2012 at 10:15:02PM -0500, boB Stepp wrote:

> I am puzzled by the results of the following:
> 
> >>> x = "Test"
> >>> x
> 'Test'
> >>> print(x)
> Test
> 
> I understand that 'Test' is the stored value in memory where the
> single quotes designate the value as being a string data type.

[...]
> But why does the print() strip
> the quotes off? Is just as simple as normally people when performing a
> print just want the unadorned text, so that is the behavior built into
> the print function?

The short answer is, yes.

The long answer is a bit more subtle, and rather long.

I believe that you're thinking at too low a level. Or at least writing 
at too low a level. Forget about "values in memory" and "object 
reference", and just think about "objects".

An object is a blob of data and code that operates on that data. It's a 
thing, much like the things in real life (cups, chairs, dogs) which 
carry state (data: the cup is full, or empty) and behaviour (dogs bark).

In this case, the type of object you have is a string, and that leads 
you to your other bit of confusion: if sentences are made up of words, 
how do we distinguish words that are part of the sentence structure from 
words being used as the object or subject of the sentence?

We write the word in quotation marks! E.g.:

    The above line contains the word "write".

In this case, "write" is not part of the structure of the sentence, it 
is the subject of the sentence. Or possibly the object. My knowledge of 
English grammatical terms is a bit lacking, sorry. In either case, it is 
the *data* that the rest of the sentence operates on.

Python is no different: words, text if you will, that are part of the 
code are written as normal:

# source code
class Test:
    pass

x = Test  # Test here refers to the variable Test, a class

But to create a string object, you use quotation marks to tell Python 
that this is data, not code, please create a string object:

x = "Test"  # Test here refers to a string, which is data

Notice that the quotation marks are *delimiters*: they mark the start 
and end of the string, but aren't part of the string in any way. Python 
knows that the object is a string because you wrote it between string 
delimiters.
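
If you want to convince yourself of that, here is a small sketch (my 
own variable names, nothing special about them) that checks the length 
and the first character of the string:

```python
# The quotes delimit the literal; they are not stored in the string.
s = "Test"

length = len(s)              # counts T, e, s, t only
first = s[0]                 # the first character is a letter, not a quote
same = ("Test" == 'Test')    # either quote style builds an equal string

print(length, first, same)   # 4 T True
```

Single or double quotes make no difference to the resulting object, 
which is another hint that they belong to the source code and not to 
the data.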

Now, take a step back and consider objects in general. There are two 
things we might like to do to an arbitrary object:

* display the object, which implicitly means turning it into a 
  string, or at least getting some representation of that object
  as a string;

* convert the object into a string.

Python has two built-in functions for that:

* repr, which takes any object and returns a string that represents
  that object;

* str, which tries to convert an object into a string, if that makes 
  sense.

Often those will do the same thing. For example:

py> str(42) == repr(42) == "42"
True

But not always. For example:

py> from decimal import Decimal as D
py> x = D("1.23")
py> print(str(x))
1.23
py> print(repr(x))
Decimal('1.23')


Unfortunately, the difference between str() and repr() is kind of 
arbitrary and depends on the object. str() is supposed to return a 
"human-readable" version of the object, for display, while repr() is 
supposed to return a string which, ideally, would work as code to 
recreate the object, but those are more guidelines than hard rules.
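
To make the guideline concrete, here is a minimal sketch of a class 
that follows it. The Temperature class is hypothetical, invented purely 
for illustration; the __str__ and __repr__ special methods are the 
hooks that str() and repr() call.

```python
class Temperature:
    """Hypothetical example class, invented for illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # Called by str() and print(): aimed at a human reader.
        return f"{self.degrees} degrees"

    def __repr__(self):
        # Called by repr(): ideally looks like code that rebuilds the object.
        return f"Temperature({self.degrees!r})"


t = Temperature(21.5)
as_str = str(t)     # '21.5 degrees'
as_repr = repr(t)   # 'Temperature(21.5)'

# The repr can be evaluated as code to get an equivalent object back.
# The namespace is passed explicitly so eval() knows what Temperature is.
rebuilt = eval(as_repr, {"Temperature": Temperature})

print(as_str)
print(as_repr)
print(rebuilt.degrees)
```

Plenty of objects do not follow the "works as code" convention (the 
default repr of a plain class instance is something like 
<Temperature object at 0x...>), which is why it is only a guideline.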

So we have two different ways of converting an object to a string. But 
strings themselves are objects too. What happens there?

py> s = "Hello world"  # remember the quotes are delimiters, not part of the string
py> print(str(s))
Hello world
py> print(repr(s))
'Hello world'

str() of a string is unchanged (and why shouldn't it be? it's already a 
string, there's nothing to convert).

But repr() of a string creates a new string showing the representation 
of the original string, that is, what you would need to type in source 
code to make that string. That means:

1) wrapping the whole thing in delimiters (quotation marks);
2) escaping special characters like tabs, newlines, and other 
   non-printable characters.
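
Both steps are easy to see with a string that contains a tab and a 
newline. This is just a sketch with made-up sample text:

```python
s = "tab\there\n"      # contains a real tab and a real newline
r = repr(s)            # a new string: quotes added, specials escaped

print(len(s))          # 9
print(len(r))          # 13: two quotes, plus two characters per escape
print(r)               # 'tab\there\n'

# repr() also picks delimiters that avoid needless escaping:
q = repr("it's")
print(q)               # "it's"
```

In the output of print(r), the backslash-t and backslash-n are 
two-character escape sequences, exactly what you would type in source 
code, not the tab and newline themselves.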

Notice that the string returned by repr() includes quote marks as part 
of the new string. Given the s above:

py> t = repr(s)
py> print(t)
'Hello world'
py> t
"'Hello world'"

This tells us that the new string t includes single quote marks as its 
first and last characters, so when you print it, the single quote marks 
are included in the output. But when you just display t interactively 
(see below), you see the repr() of t, which adds an outer pair of 
delimiters (double quotes here) around those single quote marks.
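
You can check that the quote marks really are characters of t, and not 
of s, with a few comparisons. A small sketch using the same s and t:

```python
s = "Hello world"
t = repr(s)

print(len(s), len(t))    # 11 13
print(t[0], t[-1])       # ' '
print(t[1:-1] == s)      # True: strip the two quote characters and you get s back

# Taking repr() again wraps t in its own delimiters, which is what
# the interactive prompt shows when you type t on its own.
u = repr(t)
print(u)                 # "'Hello world'"
```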

Now, at the interactive interpreter, evaluating an expression on its 
own without saving the result anywhere displays the repr() of the 
result on the screen. Why repr()? Well, why not? The decision was 
somewhat arbitrary.

print, on the other hand, displays the str() of the object directly to 
the screen. For strings, that means the delimiters are not shown, 
because they are not part of the string itself. Why str() rather than 
repr()? Because that's what people mostly want, and if you want the 
other, you can just say print(repr(obj)).
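
If you would like to see the two behaviours side by side in a script 
rather than at the prompt, here is a rough sketch. It relies on 
sys.__displayhook__, which is the standard hook the interactive 
interpreter calls to show a result; some environments such as IPython 
install their own hook, so take this as an illustration of the default 
behaviour only.

```python
import io
import sys
from contextlib import redirect_stdout

x = "Test"

# Capture what each mechanism writes to standard output.
buf = io.StringIO()
with redirect_stdout(buf):
    sys.__displayhook__(x)   # what typing  x  at the prompt does
    print(x)                 # what print(x) does

shown, printed = buf.getvalue().splitlines()

print(shown)     # 'Test'   (the repr, with delimiters)
print(printed)   # Test     (the str, without)
```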



Does this help, or are you more confused than ever?

:-)



-- 
Steven

