Coding style
Bruno Desthuilliers
bdesth.quelquechose at free.quelquepart.fr
Tue Jul 18 18:53:48 EDT 2006
Volker Grabsch wrote:
> Bruno Desthuilliers <onurb at xiludom.gro> wrote:
>
>>Carl Banks wrote:
>>
>>>Bruno Desthuilliers wrote:
>>>
>>>I'm well aware of Python's semantics, and it's irrelevant to my
>>>argument.
>
> [...]
>
>>>If the language
>>>were designed differently, then the rules would be different.
>>
>>Totally true - and totally irrelevant IMHO.
>
>
> I strongly advise not to treat each other's thoughts as irrelevant.
> Assuming the opposite is a basis of every public discussion forum.
"Irrelevant" may not be the best expression of my thought here - it's
just that Carl's assertion is kind of a tautology and doesn't add
anything to the discussion. If Python had been designed as statically
typed (with declarative typing), the rules would be different. Yeah,
great. And now ?
> I assume here is a flaw in Python. To explain this, I'd like to
> make Bruno's point
Actually Carl's point, not mine.
> clearer. As usual, code says more than a
> thousand words (and vice versa :-)).
>
> Suppose you have two functions which somehow depend on the emptiness
> of a sequence. This is a stupid example, but it demonstrates at
> least the two proposed programming styles:
>
> ------------------------------------------------------
>
> >>> def test1(x):
> ...     if x:
> ...         print "Non-Empty"
> ...     else:
> ...         print "Empty"
> ...
>
> >>> def test2(x):
> ...     if len(x) > 0:
> ...         print "Non-Empty"
> ...     else:
> ...         print "Empty"
> ------------------------------------------------------
>
> Bruno
Carl
> pointed out a subtle difference in the behaviour of those
> functions:
>
> ------------------------------------------------------
>
> >>> a = []
> >>> test1(a)
> Empty
> >>> test1(iter(a))
> Non-Empty
> >>> test2(a)
> Empty
> >>> test2(iter(a))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<stdin>", line 2, in test2
> TypeError: len() of unsized object
> ------------------------------------------------------
>
>
> While test1() returns a wrong/random result when called with an
> iterator, the test2() function breaks when being called wrongly.
Have you tried these functions with a numpy array ?
> So if you accidentally call test1() with an iterator, the program
> will do something unintended, and the source of that bug will be
> hard to find. So Bruno is IMHO right in calling that the source
> of a subtle bug.
Actually it's Carl who makes that point - MHO being that it's a
programmer error to call a function with a param of the wrong type.
> However, if you call test2() with an iterator, the program will
> cleanly break early enough with an exception. That is generally
> wanted in Python. You can see this all over the language, e.g.
> with dictionaries:
>
> ------------------------------------------------------
>
> >>> d = { 'one': 1 }
> >>> print d['one']
> 1
> >>> print d['two']
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: 'two'
> ------------------------------------------------------
>
> Python could have been designed to return None when d['two'] has been
> called, as some other (bad) programming languages would. This would
> mean that the problem will occur later in the program, making it easy
> to produce a subtle bug. It would be some effort to figure out the
> real cause, i.e. that d had no entry for 'two'.
I don't think the comparison is right. The equivalent situation would be
to have a function trying to access d['two'] on a dict-like type that
would return a default value instead of raising a KeyError.
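
A rough sketch of that equivalent situation, in Python 3 syntax. LenientDict and lookup_two are hypothetical names invented for this illustration. The only language feature relied on is that dict subclasses may define __missing__, which dict's item lookup calls for absent keys.

```python
class LenientDict(dict):
    """Hypothetical dict subclass: missing keys yield None, not KeyError."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return None


def lookup_two(mapping):
    # Written with a plain dict in mind: a missing key should raise KeyError.
    return mapping['two']


strict = {'one': 1}
lenient = LenientDict(one=1)

try:
    lookup_two(strict)
except KeyError as exc:
    print("KeyError:", exc)

print(lookup_two(lenient))  # None: the missing key goes unnoticed
```

The function is the same in both calls; only the type of the argument decides whether the mistake surfaces immediately.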
> Luckily, Python throws an exception (KeyError) just at the original
> place where the initial mistake occurred. If you *want* to get None in
> case of a missing key, you'll have to say this explicitly:
>
> ------------------------------------------------------
>
> >>> print d.get('two', None)
> None
> ------------------------------------------------------
>
> So maybe "bool()" should also break with an exception if an object
> has neither a __nonzero__ nor a __len__ method, instead of defaulting
> to True.
FWIW, Carl's main example is with numpy arrays, that have *both* methods
- __nonzero__ raising an exception.
> Or a more strict variant of bool() called nonempty() should
> exist.
>
> Iterators don't have a meaningful Boolean representation,
> because
> phrases like "is zero" or "is empty" don't make sense for them.
If so, almost no type actually has a "meaningful" boolean value. I'd
rather say that iterators being unsized, the mere concept of an "empty"
iterator has no meaning.
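
The stricter nonempty() suggested in the quoted text could be sketched roughly as below, in Python 3 syntax. This is only an illustration of the idea: no such builtin exists, and the decision to accept exactly those objects whose type defines __len__ is an assumption made for this sketch.

```python
def nonempty(obj):
    """Hypothetical strict emptiness test: only sized objects are accepted."""
    if not hasattr(type(obj), "__len__"):
        # Unsized objects (iterators, numbers, ...) have no notion of "empty".
        raise TypeError(
            "%s object has no notion of emptiness" % type(obj).__name__
        )
    return len(obj) > 0


print(nonempty([]))       # False
print(nonempty("abc"))    # True

try:
    nonempty(iter([]))
except TypeError as exc:
    print("TypeError:", exc)
```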
> So
> instead of answering "false", an iterator should throw an exception
> when being asked whether it's empty.
> If a function expects an object to have a certain protocol (e.g.
> sequence), and the given object doesn't support that protocol,
> an exception should be raised.
So you advocate static typing ? Note that numpy arrays actually have
both __len__ and __nonzero__ defined, the second being defined to
forbid boolean coercion...
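
To make that combination concrete, here is a minimal sketch in Python 3 syntax, where __nonzero__ is spelled __bool__. StrictArray is a hypothetical class loosely modelled on the behaviour described above. It is not numpy's implementation, and the error message is invented.

```python
class StrictArray:
    """Hypothetical sized container that refuses truth-value testing."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __bool__(self):  # spelled __nonzero__ in Python 2
        # Takes precedence over __len__ when the object is truth-tested.
        raise ValueError("truth value of a StrictArray is ambiguous")


arr = StrictArray([0, 0])
print(len(arr) > 0)  # True: the length test is still available

try:
    if arr:
        print("never reached")
except ValueError as exc:
    print("ValueError:", exc)
```

With such a type, the "if a" style fails loudly while the "len(a) > 0" style silently succeeds, the reverse of what happens with iterators.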
> This usually happens automatically
> when the function calls a non-existing method, and it plays very
> well with duck typing.
>
> test2() behaves that way, but test1() doesn't. The reason is a
> sloppiness of Python. Python should handle that problem as strictly
> as it handles a missing key in a dictionary. Unfortunately, it
> doesn't.
Then proceed to write a PEP proposing that evaluating the truth value of
an iterator would raise a TypeError. Just like numpy arrays do - as a
decision of its authors.
> I don't agree with Bruno
s/bruno/Carl/
> that it's more natural to write
> if len(a) > 0:
> ...
> instead of
> if a:
> ...
>
> But I think that this is a necessary kludge you need to write
> clean code. Otherwise you risk to create subtle bugs.
s/you risk to create/careless programmers will have to face/
And FWIW, this is clearly not the opinion of numpy authors, who state
that having len > 0 doesn't mean the array is "not empty"...
> This advice,
> however, only applies when your function wants a sequence, because
> only in that case can you expect "len(a)" to work.
Since sequence types are defined as having a False value when empty,
this test is redundant *and* "will create subtle bugs" when applied to a
numpy array.
> I also agree with Carl that "if len(a) > 0" is less universal than
> "if a", because the latter also works with container-like objects
> that have a concept of emptiness,
s/emptiness/boolean value/
> but not of length.
> However, this case is less likely to happen than shooting yourself
> in the foot by accidentally passing an iterator to the function
> without getting an exception. I think, this flaw in Python is deep
> enough to justify the "len() > 0" kludge.
It surely justifies some thinking on the boolean value of iterators. Since
the common idiom for testing non-None objects is an explicit identity
test against None - which makes sense since empty sequences and zero
numerics eval to False in a boolean context - the least inappropriate
solution would be to have iterators implementing __nonzero__ like numpy
arrays do.
>
> IMHO, that flaw of Python should be documented in a PEP as it violates
> Python's principle of being explicit.
Here again, while I agree that there's room for improvement, I don't
agree on this behaviour being a "flaw" - "minor wart" would better
describe the situation IMHO.