Re: Python Mystery Theatre -- Episode 2: Así Fue
creedy at mitretek.org
Tue Jul 15 16:27:00 CEST 2003
Raymond Hettinger wrote:
> Here are four more mini-mysteries for your amusement
> and edification.
> In this episode, the program output is not shown.
> Your goal is to predict the output and, if anything
> mysterious occurs, then explain what happened
> (again, in blindingly obvious terms).
> There's extra credit for giving a design insight as to
> why things are as they are.
> Try to solve these without looking at the other posts.
> Let me know if you learned something new along the way.
> To challenge those who thought the last episode
> was too easy, I've included one undocumented wrinkle
> known only to those who have read the code.
I thought this one was much tougher than Episode 1. I ended up doing a
lot of research on this one. I haven't read the other answers yet; I've
been holding off until I finished this. (Having reread my response, I
apologize for the length. I don't think I scored so well on "blindingly
obvious".) Here goes ...
> ACT I -----------------------------------------------
> print '*%*r*' % (10, 'guido')
> print '*%.*f*' % ((42,) * 2)
This one wasn't hard. I've used this feature before. The stars at the
front and back tend to act as visual confusion. The stars in the middle
indicate an option to the format that is provided as a parameter. Thus
the first one prints the representation (%r) of the string 'guido' in a
ten-character-wide field. When I tried it, the only thing I missed was
that the representation of 'guido' is "'guido'" not "guido". So the
first one prints out:
*   'guido'*
with the quotes included, rather than the unquoted form, which would
have been my first guess.
The second one takes just a little more thought. It is equivalent to:
print '*%.42f*' % 42
That is, 42 printed as a fixed-point number with 42 digits after the
decimal point:
*42.000000000000000000000000000000000000000000*
(Yes, I did copy that from Idle rather than counting zeros.)
Aside: I have to admit that the ((42,) * 2) did confuse me at first. I'm
so used to doing 2 * (42,) when I want to repeat a sequence that I
hadn't thought about the reversed form.
Having used this feature before, I have to say that I think the
documentation for how to do this is quite comprehensible.
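For anyone who wants to check the star forms without Idle, here is a minimal sketch. It assumes Python 3 syntax (print as a function) rather than the 2003 interpreter; the format strings are the ones from the puzzle, and the variable names are only illustrative.

```python
# A '*' inside a format spec means "take this width or precision
# from the argument tuple" instead of writing it in the string.
width_demo = '*%*r*' % (10, 'guido')       # width 10, then the value for %r
precision_demo = '*%.*f*' % ((42,) * 2)    # precision 42, then the value 42

print(width_demo)      # repr includes the quotes: 7 characters, right-justified in 10
print(precision_demo)  # 42 with 42 digits after the decimal point

# The second form is the same as spelling the precision out by hand.
assert precision_demo == '*%.42f*' % 42
```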
> ACT II -----------------------------------------------
> s = '0100'
> print int(s)
> for b in (16, 10, 8, 2, 0, -909, -1000, None):
>     print b, int(s, b)
Boy! This one sent me to the documentation, and finally to the code.
According to the documentation the legal values for the parameter b are
b = 0 or 2 <= b <= 36. So the first print yields 100 (the default base
for a string is 10 if not specified). The next few lines of output are:
16 256
10 100
8 64
2 4
0 64
The only one that deserves an additional comment is the last line.
According to the documentation, a base of 0 means that the number is
interpreted as if it appeared in program text, in this case, since the
string begins with a '0', it's interpreted as base 8.
Let's skip -909 for a moment. -1000 raises an exception. None would also
raise an exception if we ever got there. I also find that one a little
non-intuitive, more about that later.
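The documented cases can be checked with a short sketch. It assumes Python 3, where two details differ from the interpreter discussed here: base 0 follows Python 3 literal syntax (octal needs the 0o prefix, so a bare leading zero is rejected), and print is a function. The helper try_int is a hypothetical name, not part of any library.

```python
s = '0100'

def try_int(text, base):
    """Return int(text, base), or the name of the exception it raises."""
    try:
        return int(text, base)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__

# The documented bases: 2 <= b <= 36.
for b in (16, 10, 8, 2):
    print(b, try_int(s, b))

# An out-of-range base and a non-integer base are both rejected.
print(-1000, try_int(s, -1000))
print(None, try_int(s, None))

# Python 3 assumption: with base 0 the string must look like a
# Python 3 literal, so '0o100' is octal and '0100' is an error.
print(0, try_int('0o100', 0))
print(0, try_int(s, 0))
```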
For no immediately apparent reason (Raymond's undocumented wrinkle!),
the next line of the output, for b = -909, is:
-909 100
The only reason I found that was to try it. After hunting through the
code (Yes, I have no problem with C. No, I'm not familiar with the
organization of the Python source.) I eventually (see int_new in
intobject.c) found out that the int function (actually new for the int
type) looks like it was defined as:
def int(x, b=-909):
That is, the default value for b is -909. So, int('0100', -909) has the
same behavior as int('0100'). This explains the result.
Having read the code, I now understand _all_ about how this function
works. I understand why there is a default value. For example:
int(100L) yields 100, but there is no documented value for b such that
int(100L, b) yields anything except a TypeError. However, using b=-909
is the same as not specifying b. This allows me to write code like:
if type(x) is str:
    b = 16
else:
    b = -909
return int(x, b)
I'm not really sure whether that's better than, for example
if type(x) is str:
    return int(x, 16)
else:
    return int(x)
or not. However, I find the use of the constant -909 definitely
"magic". If it were up to me, I would use a default value of b = None, so
that int(x) and int(x, None) are equivalent. It seems to me that that
could be documented and would not be subject to misinterpretation.
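The None-as-default idea can be sketched as a thin wrapper. This is only an illustration under Python 3, not how int is actually implemented; to_int and parse are hypothetical names.

```python
def to_int(x, base=None):
    """Hypothetical wrapper: base=None means "no base was given"."""
    if base is None:
        return int(x)          # plain conversion; works for numbers too
    return int(x, base)        # explicit base; x must be a string

def parse(x):
    # Hex for strings, plain conversion for everything else,
    # without needing a magic sentinel number.
    return to_int(x, 16 if isinstance(x, str) else None)

print(parse('0100'))
print(parse(100))
print(to_int('0100'))
```

With this shape, to_int(x) and to_int(x, None) are equivalent, which is the behavior suggested above.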
> ACT III ----------------------------------------------------
> def once(x): return x
> def twice(x): return 2*x
> def thrice(x): return 3*x
> funcs = [once, twice, thrice]
> flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
> flam = [lambda x:f(x) for f in funcs]
> print flim[0](1), flim[1](1), flim[2](1)
> print flam[0](1), flam[1](1), flam[2](1)
This one was ugly. I guessed the right answer but then had to do some
more research to understand exactly what was going wrong.
The first line prints 1, 2, 3 just like you expect.
First reaction, the second line also prints 1, 2, 3. But, Raymond
wouldn't have asked the question if it was that easy. So, guessing that
something funny happens I guessed 3, 3, 3. I tried it. Good guessing.
After a bunch of screwing around (including wondering about the details
of how the interpreter implements lambda expressions), at one point I
tried the following (in Idle):
for f in flam: print f(1)
and wondered why I got an exception for exceeding the maximum recursion
limit. What I finally realized was that the definition of flam
repeatedly binds the variable f to each of the functions in funcs. The
lambda expression defines a function that calls whatever function f
references at the time of the call. At the end of the execution of that
statement, f is thrice, so all three of the defined lambdas call thrice.
That also explains why I hit the maximum recursion limit: my loop
rebinds the same f to each lambda in turn, so the lambda ends up
calling itself.
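Both effects can be reproduced in a small sketch. It assumes Python 3, where a list comprehension's loop variable no longer leaks into the surrounding scope, so an explicit for loop stands in for the comprehension to keep f shared. The function demo is a hypothetical wrapper added only to keep the example self-contained.

```python
def once(x): return x
def twice(x): return 2 * x
def thrice(x): return 3 * x

def demo():
    funcs = [once, twice, thrice]
    flam = []
    for f in funcs:                  # f is one variable, rebound each time
        flam.append(lambda x: f(x))  # the lambda looks f up when it is called

    results = [g(1) for g in flam]   # f is thrice by now, for all three

    try:
        for f in flam:               # rebinds the same f to the lambda itself
            f(1)                     # so the lambda calls itself forever
        outcome = 'no error'
    except RecursionError:
        outcome = 'RecursionError'
    return results, outcome

results, outcome = demo()
print(results, outcome)
```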
At this point I felt like I had egg on my face. I've been burned by this
one in the past, and I spent a while figuring it out then. The fix is easy:
flam = [lambda x, fn=f: fn(x) for f in funcs]
which creates a new local binding (the default argument fn) that
captures the correct value at each iteration. This is the kind of
problem which makes me wonder whether we ought to rethink how loop
variables are bound.
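A side-by-side sketch of the late-binding version and the default-argument fix, assuming Python 3; the names late and early are only illustrative.

```python
def once(x): return x
def twice(x): return 2 * x
def thrice(x): return 3 * x

funcs = [once, twice, thrice]

# Late binding: every lambda looks up f when called, after the loop ended.
late = [lambda x: f(x) for f in funcs]

# Early binding: the default argument is evaluated when each lambda is
# created, so each one keeps its own function.
early = [lambda x, fn=f: fn(x) for f in funcs]

late_results = [g(1) for g in late]
early_results = [g(1) for g in early]
print(late_results)
print(early_results)
```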
> ACT IV ----------------------------------------------------
> import os
> os.environ['one'] = 'Now there are'
> os.putenv('two', 'three')
> print os.getenv('one'), os.getenv('two')
Obviously, this one is trying to trick you into thinking it will print
'Now there are three'. I ended up trying it and getting 'Now there are
None'. Then I went back and read the documentation. What I got confused
about was that os.putenv updates the external environment without
changing the contents of os.environ. Updating os.environ will change the
external environment as a side effect. I had read about this before but
had gotten the two behaviors reversed in my head.
Now, why is it this way? It makes sense that you may have a use case for
changing the external environment without changing the contents of
os.environ and so need a mechanism for doing so. However, on reflection,
I'm not sure whether I think the implemented mechanism is
counter-intuitive or not.
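The asymmetry is easy to see from inside the process. A minimal sketch, assuming Python 3; the variable names MYSTERY_ONE and MYSTERY_TWO are hypothetical, chosen so they are unlikely to exist already.

```python
import os

# Assigning through os.environ updates the mapping and, as a side
# effect, the real process environment.
os.environ['MYSTERY_ONE'] = 'Now there are'

# os.putenv updates the real process environment only; the os.environ
# mapping is not told about it.
os.putenv('MYSTERY_TWO', 'three')

# os.getenv reads from the os.environ mapping, so it never sees the
# value set by os.putenv.
one = os.getenv('MYSTERY_ONE')
two = os.getenv('MYSTERY_TWO')
in_mapping = 'MYSTERY_TWO' in os.environ

print(one, two)
print(in_mapping)
```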