[Python-Dev] Re: PEP 285: Adding a bool type

2 Apr 2002

      Dear Guido:

I would first like to thank you for giving us an opportunity to respond.
I have spent most of the weekend thinking about and writing a reply to
this, and I think that this has made me a better teacher.  For this I am
grateful.  I realise that this is rather long, but I have condensed
it substantially from my efforts earlier this weekend.  Thank you
for taking the time to read this.

Laura Creighton

-----------------------------

I am opposed to the addition of the new proposed type on the grounds
that it will make Python harder to teach both to people who have never
programmed before and to people who _have_.  If they have no
preconceived idea of Booleans, then I do not want to have to teach the
concept to them in an introductory lesson.  There is a time to learn
symbolic logic, but it is not while learning to program for the first
time.  If, on the other hand, they _do_ have some preconceived idea of
Booleans, then Python will not have what they want.  They will want
stricter booleans that 'behave properly'.

Maybe some day we should give these people such a type.  People who
do symbolic logic and make push-down automata all day long will love
it.  If we implement this, not out of objects, but out of bits, and have
something really sparse in memory consumption, we will cause jubilation
in the NumPy community as well.  But I don't want to discuss this here;
I only want to discuss the new type which is proposed in this PEP.  And
I believe that this half-way thing, this int-in-a-hat that is called bool
but which does not implement what a mathematician would call truth values,
will make it much harder for me to teach the Python language and programming
in general.  Not only do I not need this change, I actively do not want
it: it gives us the integer 0 and the integer 1 with some odd printing
properties.

        1) Should this PEP be accepted at all.

No.

        2) Should str(True) return "True" or "1": "1" might reduce
           backwards compatibility problems, but looks strange to me.
           (repr(True) would always return "True".)

Now we have a teaching problem.  If str(True) returns anything but
'True' then I am going to have to explain this to newbies, really early
on. I can't see myself claiming that 1 is the string representation of
True.  I can see myself explaining that int(True) is 1, or that bool(1)
is True.  If I say that the string representation of True is 1, then I
must assert that True is just a fancier, prettier way of writing 1.

But this breaks the more common practice where str is used for the
prettier way of writing things and repr is used for the uglier one.  And
I guarantee my students will notice this, especially if they have heard
me explain why their floating point numbers are not printed the way that
they expect. I don't want to have to explain that eval(repr(object)) is
supposed to generate the object whenever possible, for the last thing I
want newbies to be thinking about is eval().  I guess I am going to have
to say that it is a wart on the language, and that we have it this way
so as not to break too much existing code.

I think this wart is far uglier than the lack of a
half-way-but-not-quite Boolean Truth value.  But I am all for this wart
rather than breaking the existing code.
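
For reference, the conversions discussed in this answer can be checked
directly.  This is only a sketch, and it assumes a Python 3 interpreter,
where str(True) ended up returning 'True'; the variable names are
illustrative and not part of the PEP.

```python
# How the conversions between bool, int and str behave in Python 3.
as_int = int(True)      # the integer underneath the bool
as_bool = bool(1)       # a non-zero int converts to True
as_str = str(True)      # the "pretty" form
as_repr = repr(True)    # the form meant to be eval-able

print(as_int, as_bool, as_str, as_repr)
```

With str and repr agreeing, there is no difference between the two forms
left to explain for this type.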

        3) Should the constants be called 'True' and 'False'
           (corresponding to None) or 'true' and 'false' (as in C++, Java
           and C99).

I would rather the constants, if we have them, be called anything other
than True or False.  It is not the things that you don't know that hurt
you the most in learning a language -- it is 'the things that you know
that ain't so'.  Learning C, from PL/1, my first problem was 'how do I
write a procedure'?  You see, I already _knew_ that _by definition_
functions returned values and procedures didn't, and so, since I didn't
want to return a value, I didn't want to write a function.  Learning
that your basic definitions are wrong generally requires a poke from the
outside.  The same thing happened to me when learning Python.  I knew,
by definition, that attributes were not callable.  Using getattr() to test
whether a class had a method did not occur to me, despite poring over
the Python docs for 2 days looking for such a thing.  Finally I went after
it out of the __class__ -- and immediately posted something to
python-list, saying 'This is ugly as sin.  What do real people do?'  I
never got around to questioning whether 'attributes, by definition, may
not be callable' in precisely the same way I never got around to
questioning whether 'you could write a function that did not return a
value'.

The problem is that _everybody_ has some conceptual understanding of
True and False.  And whatever that understanding is, it is unlikely to
be the one used by Python.  Python does not distinguish between True and
False -- Python makes the distinction between something and nothing.
And I think that this is one of the reasons why well-written Python
programs are so elegant.  And this is what I am trying to teach people.

So I out-and-out tell people this.  {} is a dictionary-shaped nothing.
[] is a list-shaped nothing. 0 is an integer-shaped nothing.
0.0 is a float-shaped nothing.
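
A minimal sketch of the Something/Nothing distinction, assuming a
Python 3 interpreter.  The helper takes_branch is hypothetical, and the
empty string, empty tuple and None are added here as further examples
beyond the four values named above.

```python
# Each of these is a "nothing" of a different shape.
nothings = [{}, [], 0, 0.0, "", (), None]
# Each of these is a "something", even when its contents are zero.
somethings = [{"a": 1}, [0], 1, 0.5, "x", (0,)]


def takes_branch(value):
    """Classify a value using only a plain if-test, with no comparison."""
    if value:
        return "something"
    return "nothing"


for value in nothings + somethings:
    print(repr(value), "->", takes_branch(value))
```

Note that [0] and (0,) count as something: the test looks at whether the
container is empty, not at what it holds.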

I want to save them from the error of writing

if (bool(myDict) == True):

and if they start out believing that python only distinguishes between
Something and Nothing, they mostly are ok.  And to rub this point in you
can do this:
>>> False = 1
>>> True = 0
>>> if False:
...     print 'Surprise!'
...
Surprise!
>>> if True:
...     print 'I am True'
... else:
...     print 'Surprise Again!'
...
Surprise Again!

This is a very nice eye-opener.  It is a true joy.  Watch the minds go
*pop*! and the preconceived notions disappear. (You then let them know 
exactly what you will think of them if they ever do this 'for real', of
course.)

        4) Should we strive to eliminate non-Boolean operations on bools
           in the future, through suitable warnings, so that e.g. True+1
           would eventually (e.g. in Python 3000) be illegal.  Personally,
           I think we shouldn't; 28+isleap(y) seems totally reasonable to
           me.

This is not an argument for allowing non-Boolean operations on bools;
this is an argument for not writing functions that return Booleans.  Make
them return numbers instead, so that you can use them as you did.  Last
month we discussed why in 1712 February had 30 days in Sweden.  (See:

http://groups.google.com/groups?q=leap+year+Sweden+group:comp.lang.python.*&hl=en&selm=Xns91D08815184B2svenaxelssonbokochwe%40212.37.1.234&rnum=1

if you care, and you missed it.)

I live in Sweden.  Assigning students the problem of calculating whether
or not a given year is a leap year in Sweden appeals to me. 

But I know students.  I guarantee that I will get an isleap that returns 
a True or a False under the proposed new regime.  And this is precisely
what I do not want, which I will try to teach by assigning, next week,
a program that calculates how many days are in a given year.  I predict
that I will get a lot of answers like this:

if year == 1712:
    days = 367
elif isSwedishLeap(year):
    days = 366
else:
    days = 365

There are many things wrong with this code.  The bizarre special case of
1712 belongs in the isSwedishLeap function, with the rest of the
weirdnesses.  Thus I will have to convince my students that it is better
to write a function that does not return True or False.  And this is
despite the fact that I originally asked for 'is Y a leap year or not'.

This will be a necessary exercise.  If True and False are in the
language, I am going to have to work especially hard to teach my
students that you (mostly) shouldn't use the fool things.  There is
almost always some value that you would like to return instead.
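
A rough sketch of such a function, assuming a Python 3 interpreter.  The
names swedish_leap_days and days_in_year are hypothetical, and the
calendar rule is deliberately simplified: a plain divisible-by-4 test
plus the 1712 special case mentioned above, not a complete model of the
Swedish calendar.

```python
def swedish_leap_days(year):
    """Return the number of leap days in `year`: 0, 1 or 2.

    Simplified for illustration: divisible-by-4 rule plus the 1712
    special case.  Other historical irregularities are ignored.
    """
    if year == 1712:
        return 2      # February had 30 days that year
    if year % 4 == 0:
        return 1
    return 0


def days_in_year(year):
    """Use the count directly as a number; no special cases needed here."""
    return 365 + swedish_leap_days(year)


for year in (1711, 1708, 1712):
    print(year, days_in_year(year))
```

The special case now lives in one place, and the caller needs neither an
if-chain nor a truth value.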

Having renamed isSwedishLeap to SwedishLeap and fixed it to return the
number of leap days, instead of a bool, I have now made a different
problem for myself and my students.

The new improved solution for 'is this year a leap year' will be:

if bool(SwedishLeap(year)) == True:  
    # the better students will say 'is' instead of '=='
    print 'yes.'
else:
    print 'no.'

Aaargh!
I already see too much code like this.  It's mostly written by people
who come from other languages.  They define their own True and False so
they can do this.  (And they mostly have an extra set of () as well).
Right now I have the perfect fix for this.  I just say 'Python does not
care about True and False, only nothing or something'.  You have just
stolen my great weapon.

What am I going to say?

attempt 1.

Python pretends to have bools, but they are just ints in fancy hats.  So
you are making more comparisons than are necessary.

smart student:

But you said it is better to be explicit than implicit!  And here I am
explicitly performing the type coercion rather than let it happen
implicitly! (or PyChecker _warned_ me that I was making an implicit 
conversion!) I put in the cast so that people will know exactly what
is happening!

attempt 2.

But they are _really_ ints 'under the hood'.  I was not kidding about the
fancy hats!  if SwedishLeap(year): is precisely what you want.  You
don't want to test against True at all!

smart student:

Comparisons yield boolean values.  Therefore they _want_ a Boolean
value.  You are just being lazy because it saves typing.  In the bad old
days before we had Booleans this was ok, but now that we have them we
should use them!  Otherwise what good are they?  What should I be using
them for if not for this?

attempt 3.

Got an hour? I'd like to explain signature-based polymorphism to you ...

smart student:

ha! ha! ha!

        5) Should operator.truth(x) return an int or a bool.  Tim Peters
           believes it should return an int because it's been documented
           as such.  I think it should return a bool; most other standard
           predicates (e.g. issubtype()) have also been documented as
           returning 0 or 1, and it's obvious that we want to change those
           to return a bool.

I think that operator.truth(x) should return an int because about the
only thing I use it for is operator.truth(myDict).  I really want the
integer value.  How do you propose I get it if you change things?  Why
make me go through the gyrations of int(bool(myDict))?
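
For comparison, a short sketch of how this turned out, assuming a
Python 3 interpreter, where operator.truth returns a bool.  The
dictionary names are illustrative only.

```python
import operator

my_dict = {"spam": 1}
empty_dict = {}

# operator.truth gives back a bool in Python 3 ...
truth_full = operator.truth(my_dict)
truth_empty = operator.truth(empty_dict)

# ... so obtaining the integer takes one explicit conversion.
int_full = int(bool(my_dict))
int_empty = int(bool(empty_dict))

print(truth_full, truth_empty, int_full, int_empty)
```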

By the way, look at the list of those things that would be changed
to return a bool.  Most of them are python implementations of ANCIENT
C functions.  They date from the time before we invented exceptions!
The primitive old days when we had to test for every possible time
you wouldn't want to run your code before you actually got to run it.

I don't want to go back to the days of

if aflag or (bflag and (cflag or dflag)):

either.

And all Truth testing (unless you are doing symbolic logic) reeks
this way to me, of the old style I am trying to stamp out.  Truth
testing is just another form of type testing, and just as ugly.

    Rationale

        Most languages eventually grow a Boolean type; even C99 (the new
        and improved C standard, not yet widely adopted) has one.

        Many programmers apparently feel the need for a Boolean type; most
        Python documentation contains a bit of an apology for the absence
        of a Boolean type.

So fix the docs, don't change the code! <wink>.  I think the fact that
in python control flow structures distinguish between Something and
Nothing is one of the beauties and glories of the language, and you
should delete any documentation that says otherwise.

Under the proposed new scheme you will have to trade apologies for the
lack of bools, for apologies for not producing real bools, only this
int-in-a-new-hat hack that pretends to be a bool.  This is hardly
progress.

        I've seen lots of modules that defined
        constants "False=0" and "True=1" (or similar) at the top and used
        those.  The problem with this is that everybody does it
        differently.  For example, should you use "FALSE", "false",
        "False", "F" or even "f"?  And should false be the value zero or
        None, or perhaps a truth value of a different type that will print
        as "true" or "false"?  Adding a standard bool type to the language
        resolves those issues.

So would adding True and False to the __builtins__, and probably
operator.truth as well, and then modifying PEP 8 to say that you should
use them only if you actually have a need for True and False.  Then you
could also get a much needed word in edgewise discouraging

if bool(x) == True:

or actually using True and False much, because there is usually a better,
more pythonic way to do what people used to other languages are accustomed
to doing with booleans.  This is precisely what some people have said
here: 'When I started using Python, I made True and False, but once I
stopped trying to program in some other language using python, I
stopped needing these things'.  (See the recent post by Don Garrett
in this thread for an example.)

This is what I have observed as well.  And I fear that if you add these new
types to the language, people will never take this step.  The existence
of the types in the language will discourage them from thinking that using
True and False all the time is not pythonic.  It is nice that people are
puzzled, wondering how all those python programmers live without a
boolean type.  Eventually they puzzle it out.  This is not a bug, but
a feature <wink>.

        Some external libraries (like databases and RPC packages) need to
        be able to distinguish between Boolean and integral values, and
        while it's usually possible to craft a solution, it would be
        easier if the language offered a standard Boolean type.

I'm one of the people who build interfaces to databases that need to
distinguish this.  For what it's worth, can you please not add this
feature to the language?  Don't do it for me ....

        The standard bool type can also serve as a way to force a value to
        be interpreted as a Boolean, which can be used to normalize
        Boolean values.  Writing bool(x) is much clearer than "not not x"
        and much more concise than

            if x:
                return 1
            else:
                return 0

Conciseness for its own sake is no virtue.

        Here are some arguments derived from teaching Python.  When
        showing people comparison operators etc. in the interactive shell,
        I think this is a bit ugly:

            >>> a = 13
            >>> b = 12
            >>> a > b
            1

        If this was:

            >>> a > b
            True

        it would require one millisecond less thinking each time a 0 or 1
        was printed.

This is the basis of our disagreement.  I think that it is very, very
important that much more than a millisecond be spent on this.  This is a
fundamental python concept, which I want to teach.  This is precisely
where you learn that python distinguishes between Something and Nothing,
and if you have a problem seeing why this implies  a > b printing as 1,
then you probably have a problem with the whole concept.  And making it
return True is precisely what I never, ever, ever want a python learner
to see.

People who come from statically declared languages have a terrible burden
to overcome when they meet python.  They are not used to the fact that
everything is an object.  They have rigid barriers in their heads
between 'control statements' and 'data'.  This is precisely where such
conceptual barriers first begin to crumble.  This is precisely the
experience I _want_ my students to have.  It is precisely how I train
people to give up their old ideas.  You have just made my teaching job
harder, not easier.

I've never had any trouble teaching anybody that if 1: means do it, and
if 0: means don't.  Ever.  Who is it that has had such trouble?  I fear
they may be new to teaching and are confusing 'this is taking the
students a while to learn because it is something new that they have
never seen before' with 'this is taking the students a while to learn
because the language is broken'.  Learning that python distinguishes
between Something and Nothing is a completely new idea that you are
giving these people, outside of their experience.  This is going to take
a while to get used to.  But it is going to take _longer_ to get used
to, if there is all this confusing stuff about symbolic logic, truth
values, George Boole, and why the math majors in the class are all
snickering and saying 'Python is a sucky language.  Its implementation
of booleans is so, so, lame...' thrown in as well.

If you already know what a boolean is, then chances are Python's bools
are not going to behave the way you expect them to.  If you don't know
what a bool is, then I feel morally obligated to teach you the
difference between real truth values and this int-in-a-hat kludge.  In
either case, I now have to spend a bunch of time teaching about
booleans, something I had no desire to do before today. I _want_ to teach 
that if 1: means do it and if 0: means don't.  I want to teach that python
makes a distinction between Something and Nothing. And I can teach that a
_lot_ faster than An Introduction to Boolean Algebra ...

        There's also the issue (which I've seen puzzling even experienced
        Pythonistas who had been away from the language for a while) 
        that if you see:

            >>> cmp(a, b)
            1
            >>> cmp(a, a)
            0

        you might be tempted to believe that cmp() also returned a truth
        value.  If ints are not (normally) used for Boolean results, this
        would stand out much more clearly as something completely
        different.

I don't think that people are confused about this because they think 
that cmp is returning True or False.  I've had people go on believing 
that cmp should return one of 2 values even as I was telling them that
it returned one of 3.  The desire for cmp to be a two-valued comparison 
runs deep in many souls, and is tied to a passionate belief that comparisons 
are binary, not that they return True and False.
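
A small sketch of the three-valued result, assuming a Python 3
interpreter.  Python 3 has no built-in cmp(), so the function below is a
stand-in written for this example; it happens to rely on comparison
results behaving as the integers 0 and 1.

```python
def cmp(a, b):
    """Three-way comparison: 1 if a > b, 0 if equal, -1 if a < b."""
    return (a > b) - (a < b)


greater = cmp(13, 12)
equal = cmp(12, 12)
less = cmp(12, 13)

print(greater, equal, less)
```

Only the middle result is a Nothing; both 1 and -1 are Something, which
is why the result cannot be read as a truth value.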

        I don't see this as a problem, and I don't want to evolve the
        language in this direction either; I don't believe that a stricter
        interpretation of "Booleanness" makes the language any clearer.

I think that a really strict boolean might be nice to have.  And if you
ever write one, you will curse the day you let this hack be named bool.
Replacing this int-with-a-hat with real bools will break so much
code. 

        Other languages (C99, C++, Java) name the constants "false" and
        "true", in all lowercase.  In Python, I prefer to stick with the
        example set by the existing built-in constants, which all use
        CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
        built-in exceptions).  Python's built-in module uses all lowercase
        for functions and types only.  But I'm willing to consider the
        lowercase alternatives if enough people think it looks better.

I'd really like something other than True and False altogether.
Something/Nothing or Empty/Full or Yin/Yang, __anything__ which people
will not believe they already understand.  It is so much easier to teach
something which people know is brand new rather than something which is
brand new but looks like something subtly different that people already
believe they know.

I have this problem a lot.  It is hard to teach people what floating
point numbers are because their grade school teachers have taught them
what fixed point decimals are so well (except for the name).  If
floating point numbers were traditionally printed xxx#yyy instead of
xxx.yyy I do not think that I would have such difficulties.  In any case
the people who do not understand floating point would be aware that
there is something that they do not understand, rather than blindly
going off and using them to represent money.
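
The money problem can be seen in a few lines.  This is a sketch assuming
a Python 3 interpreter; the decimal module from the standard library is
brought in here only for contrast and is not mentioned in the discussion
above.

```python
from decimal import Decimal

# Binary floating point cannot represent these amounts exactly.
float_total = 0.10 + 0.20
floats_agree = (float_total == 0.30)

# Decimal arithmetic on the same amounts behaves like grade-school decimals.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimals_agree = (decimal_total == Decimal("0.30"))

print(float_total, floats_agree)
print(decimal_total, decimals_agree)
```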

I don't want to have to teach the pythonic meaning of True and False to
people who already believe they know what True and False are.  This is
going to be hell.  It is about as hard a thing as there is in teaching,
getting around people's previous conceptions.

        It has been suggested that, in order to satisfy user expectations,
        for every x that is considered true in a Boolean context, the
        expression x == True should be true, and likewise if x is
        considered false, x == False should be true.  This is of course
        impossible; it would mean that e.g. 6 == True and 7 == True, from
        which one could infer 6 == 7.  Similarly, [] == False == None
        would be true, and one could infer [] == None, which is not the
        case.  I'm not sure where this suggestion came from; it was made
        several times during the first review period.  For truth testing
        of a value, one should use "if", e.g. "if x: print 'Yes'", not
        comparison to a truth value; "if x == True: print 'Yes'" is not
        only wrong, it is also strangely redundant.

This suggestion came from somebody's preconceived idea of 'what is True'
and 'what does it mean for something to be True'.  Now that you have
posted this to python-list, you have found a whole lot more.  Have you
ever had such a response to a PEP?  Everybody thinks that they know what
True and False mean, and that Python should do it his or her own way.
This is what everybody's classroom looks like as well.  And this is
why sane teachers do not want to discuss the meaning of True if they
can help it.
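
The difference between a plain if-test and a comparison against True,
as described in the quoted PEP passage above, can be checked directly.
This is a sketch assuming a Python 3 interpreter; both helper names are
hypothetical.

```python
def truthy_by_if(x):
    """Truth test with a plain if, the form the PEP recommends."""
    if x:
        return True
    return False


def truthy_by_comparison(x):
    """Truth test by comparing to True, the form the PEP calls wrong."""
    if x == True:  # deliberately the discouraged form
        return True
    return False


for value in (6, 7, 1, 0, [], None):
    print(repr(value), truthy_by_if(value), truthy_by_comparison(value))
```

The two tests agree for 1 and 0 but disagree for 6 and 7, which are
Something without being equal to True.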

I want to keep python out of the True and False business.  Python cares
about whether a value is Something or Nothing.  This is beautiful, and
_better_ than what the other languages do.

Once again, thank you for reading this and giving me a chance to write up my
objections.  You have made me a better teacher because of this.

Laura Creighton