[Python-bugs-list] [ python-Bugs-795791 ] bool() violates a numeric invariant
SourceForge.net
noreply at sourceforge.net
Thu Aug 28 14:46:28 EDT 2003
Bugs item #795791, was opened at 2003-08-26 23:56
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=795791&group_id=5470
Category: Python Interpreter Core
Group: Python 2.3
Status: Open
Resolution: None
Priority: 1
Submitted By: David Albert Torpey (dtorp)
Assigned to: Nobody/Anonymous (nobody)
Summary: bool() violates a numeric invariant
Initial Comment:
The Liskov Substitution Principle states: If for each
object o1 of type S there is an object o2 of type T such
that for all programs P defined in terms of T, the
behavior of P is unchanged when o1 is substituted for o2
then S is a subtype of T.
The current implementation of bool() violates this rule
by breaking an invariant that holds for its parent, int(), and
for the related numeric types float(), complex(), and long():
>>> for typ in (int, long, complex, float, bool):
...     print typ(0), typ(str(typ(0)))
0 0
0 0
0j 0j
0.0 0.0
False True
In theory, polymorphism requires that a subclass object
be substitutable into any code that would accept an
instance of a parent class. Here, the False instance is
clearly not substitutable for zero in serializer code.
This is not just a theoretical problem, it arises in
practice and caused problems for a comp.lang.python
participant who was writing an XML serializer that
depended on numeric types being able to reconstruct
themselves from their string representations.
Another poster pointed out that the programmer could
have used eval(repr(typ(0))), but that is not
satisfactory because it is unsafe to run eval() on
serialized data from an external source.
One possible fix is to special-case bool("False") so that it
returns False. While this is a bit strange in that all other
non-empty strings evaluate to True, it has the advantage of
producing the expected behavior and only being damaging to the
highly unlikely and probably erroneous case where someone relied
on bool("False") evaluating to True.
A more conservative fix is to issue a warning whenever
encountering bool("False"). This will at least keep a
probable error from passing silently into the good night.
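For illustration only, the serializer-side alternative to changing
bool() can be sketched as below. This is a minimal sketch in current
Python 3 syntax rather than 2.3; parse_bool, READERS and roundtrip are
hypothetical names that do not come from the report or from the
standard library.

```python
def parse_bool(text):
    """Hypothetical helper: invert str(True) / str(False) without eval()."""
    if text == "True":
        return True
    if text == "False":
        return False
    raise ValueError("not a bool literal: %r" % (text,))


# Hypothetical dispatch table keyed by type name, as a serializer might keep.
READERS = {"int": int, "float": float, "complex": complex, "bool": parse_bool}


def roundtrip(value):
    """Write value with str() and read it back with the matching reader."""
    reader = READERS[type(value).__name__]
    return reader(str(value))


if __name__ == "__main__":
    for value in (0, 0.0, 0j, False, True):
        print(repr(value), "->", repr(roundtrip(value)))
```

The lookup uses the exact type name, so a bool is never routed to the
int reader even though bool is a subclass of int.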
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one)
Date: 2003-08-28 16:46
Message:
Logged In: YES
user_id=31435
Armin, in 2.3 a simple, safe and fast way to "un repr" a string
(or unicode) S is via
S.decode('string-escape')
IIRC, MvL added the string-escape codec.
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2003-08-28 09:53
Message:
Logged In: YES
user_id=38388
Please take these discussions to comp.lang.python; the SF
bug tracker is not the right place for this.
Thanks.
----------------------------------------------------------------------
Comment By: Armin Rigo (arigo)
Date: 2003-08-28 09:44
Message:
Logged In: YES
user_id=4771
Wouldn't it make sense to have a safe version of 'eval()' ?
One that can be used to read back any repr() of a built-in type.
The functionality would be similar to that of the parser
reading a literal from Python source, so maybe it is already
available. (However I think that the parser doesn't do
anything special with False and True.)
In any case I have sometimes wished I would have a simple
way to unquote a string quoted by repr() without all the
overhead and dangers of eval().
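For readers on later Python versions (2.6 onward), ast.literal_eval
offers something close to this: it accepts only literal syntax and
rejects calls and other expressions. The sketch below is written for
current Python 3 and is not available in 2.3; unrepr is a hypothetical
wrapper name used only for this example.

```python
import ast


def unrepr(text):
    """Hypothetical wrapper: read back the repr() of a built-in literal.

    ast.literal_eval raises ValueError for anything that is not a literal,
    such as a function call.
    """
    return ast.literal_eval(text)


if __name__ == "__main__":
    for value in (False, 0, 0.1, "it's\na string", [1, (2, 3)]):
        assert unrepr(repr(value)) == value
    try:
        unrepr("__import__('os').getcwd()")
    except ValueError:
        print("non-literal input rejected")
```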
----------------------------------------------------------------------
Comment By: Steven Taschuk (staschuk)
Date: 2003-08-27 18:08
Message:
Logged In: YES
user_id=666873
I concur with Christos that this is desirable behaviour, not a bug.
It seems very important to me that
    if x:
be equivalent to
    if bool(x):
for all objects x. Thus if we adopt the proposal that
bool('False') == False, it seems to me that we'd also have to
have
    if 'False':
*not* run the if-block. This would break existing code which
expects strings to be false only when empty. That's lots of
code.
Roundtripping through str() strikes me as a very minor
consideration. str() is expected to lose information in general
and is not in any way intended to be used for serialization. If
one insists on using it for serializing ints, then
    if type(x) is int:
        write(str(x))
    else:
        write(serialize_sensibly(x))
will avoid the problem with bools. The violation of Liskov
(which prevents us from using isinstance(x, int) here) is
unfortunate but not imho nearly a big enough deal to merit the
change.
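The difference between the exact type check and isinstance() can be
seen in a small runnable sketch (current Python 3 syntax). The body of
serialize_sensibly is a placeholder assumption, since the comment above
does not define it, and serialize is a hypothetical name.

```python
def serialize_sensibly(x):
    """Placeholder fallback; the real behaviour is up to the serializer."""
    return repr(x)


def serialize(x):
    # Exact type check: isinstance(x, int) would also accept True and False,
    # because bool is a subclass of int.
    if type(x) is int:
        return str(x)
    return serialize_sensibly(x)


if __name__ == "__main__":
    print(isinstance(False, int), type(False) is int)
    print(serialize(0), serialize(False))
    # Truth testing and bool() agree for every object, including 'False'.
    for obj in ("False", "", 0, 1, [], [0]):
        assert bool(obj) == (True if obj else False)
```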
----------------------------------------------------------------------
Comment By: Christos Georgiou (tzot)
Date: 2003-08-27 10:56
Message:
Logged In: YES
user_id=539787
Addendum, correcting myself: yes, if reading the LSP in math
lingo context, I can see that one can read it backwards (if S
is subtype of T, then S() can be used in place of T()
everywhere, a la <=>); I was interpreting it
non-mathematically, so David is correct about context.
Does it matter, though, if I create a subclass whose
instances are not substitutable for its parent class
instances? Python (and most other "OO" computer languages
I know of) allows me to do that; should I stop calling
this 'subclassing'?
----------------------------------------------------------------------
Comment By: Christos Georgiou (tzot)
Date: 2003-08-27 10:41
Message:
Logged In: YES
user_id=539787
The comments might not be helpful to your request, but they
are on topic (apart from the last paragraph, which was
intended to be humorous).
I still can't swap the if-then parts of the LSP, even in
context; that might be a problem of my way of thinking,
though.
You seem to confuse the purpose of str() (saying "There is a
reason that all of these..."); its intended use is not
serialising. repr() and eval() combined are intended for this
purpose (even if repr() doesn't/can't always do that
practically), and in this context you already know that bool
works fine. You can't expect str() to behave correctly for
bools when it doesn't for floats:
>>> x = 5**.5
>>> x == float(str(x))
False
>>> x == eval(repr(x))
True
Is float broken too?
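The session above depends on str() of a float keeping fewer digits
than repr() does. In current Python 3 the two give the same text, so
the sketch below imitates the difference with explicit formats. The
12 and 17 significant digits are stated here as an assumption about
what the 2.x str() and repr() used, and short_text and full_text are
illustrative names.

```python
x = 5 ** 0.5

# Assumed stand-ins for the 2.x behaviour: 12 significant digits for str(),
# 17 significant digits for repr().
short_text = "%.12g" % x
full_text = "%.17g" % x

# Twelve digits drop information, so the value does not survive the trip.
print(short_text, float(short_text) == x)

# Seventeen significant digits are enough to recover a double exactly.
print(full_text, float(full_text) == x)
```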
A hammer is heavy and can be used as a paperweight, but
people 99% of the time use hammers for hammering and
paperweights to keep papers down. __repr__ is intended for
unambiguous representation (consequently, can be used for
serialisation too), and __str__ is intended to coerce to str
(just like __unicode__, and __float__, and __int__ etc do).
The burden of security (ie not passing arbitrary arguments to
eval()) is what the programmer must carry in this case.
repr() and eval() are the way to go, IMHO.
I will comply and won't answer your rhetorical question.
I would suggest you bring up this subject in the newsgroup
(or mailing list), which you might have done already without
my noticing it. You can calculate percentages counting the
responses, and if only 1% is against it, I am sure the BDFL will
take this into account; the issue brought up in bug 795791 is
not a bug, but a request for partial redesign.
----------------------------------------------------------------------
Comment By: David Albert Torpey (dtorp)
Date: 2003-08-27 10:05
Message:
Logged In: YES
user_id=681258
While quite funny, the comments are unhelpful. There is a
pickle protocol but not for bool. The original post also pointed
out that bool('False')==False would break code; however,
bool() is brand new in Py2.3 and the *only* code affected
would be in the extremely improbable case of someone
expecting that bool('False')==True. It is a nit of strict
reading that the LSP was originally stated as an inverted
if/then because it was being presented in the context of
mathematical proofs -- the correct reading in the current
context is that child instances *must* be fully substitutable
for parent instances.
Leaving bool() as it stands breaks an important invariant for
int, long, float, and complex. There is a reason that all of
these have a provision for being able to coerce their str()
forms back into the original -- that is a key feature and its
application is not limited to pickling. The PEP clearly states
that bools should behave like ints -- for instance, it is
intentional that True + 3 == 4.
If integer addition worked in every case except for 5+6, would
you fix the C code for integer addition or would you have
every user of addition insert a "simple if" to create a custom
work-around? Don't answer, it is a rhetorical question.
I realize that there is no clean solution as there is a tension
between the expected invariant and the usual meaning of all
non-empty strings as True. The correct balance comes from
looking at the use cases and asking what is the
programmer's expected outcome when they type:
bool("False"). IMHO, the answer falls 99% towards False and 1%
towards "True".
----------------------------------------------------------------------
Comment By: Christos Georgiou (tzot)
Date: 2003-08-27 07:29
Message:
Logged In: YES
user_id=539787
There is the pickle protocol with its __getstate__,
__setstate__ methods, which seems more appropriate than
str() and eval(); however, these methods are not defined in the
base types for XML serialisation.
Changing bool('False') to evaluate as False instead of True for
all Python programs would introduce breakage; why do that
instead of a simple special-cased 'if' in a program that does
XML serialisation?
BTW I don't see any violation of the Liskov Substitution
Principle; as stated, it works one way (ie in this case it fails,
so based on the principle you *cannot* prove that bool is a
subtype of int, but we know that bool *is* a subtype of int --
it's in the source). The LSP is a method to understand if S is
a subtype of T, not a necessary step for the definition of S.
And a little tongue in cheek: the Christos Similarity Principle.
If for each person P there is another person B who has an
astonishing similarity in appearance, has almost exactly the
same age and was born in the same location as person P,
then person B is a sibling of person P. Does that mean that
my brother is not my sibling just because he doesn't look like
me, is 14 years younger and wasn't born in the same hospital
as me? :)
----------------------------------------------------------------------