How to guard against bugs like this one?

Tue Feb 2 10:12:02 EST 2010

On Tue, Feb 2, 2010 at 6:13 AM, kj <no.email at please.post> wrote:

> In <pan.2010.02.02.03.28.54 at REMOVE.THIS.cybersource.com.au> Steven
> D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:
>
> >As for fixing it, unfortunately it's not quite so simple to fix without
> >breaking backwards-compatibility. The opportunity to do so for Python 3.0
> >was missed.
>
> This last point is to me the most befuddling of all.  Does anyone
> know why this opportunity was missed for 3.0?  Anyone out there
> with the inside scoop on this?  Was the fixing of this problem
> discussed in some PEP or some mailing list thread?  (I've tried
> Googling this but did not hit on the right keywords to bring up
> the deliberations I'm looking for.)
>

The problem with "fixing" the import system is that touching it in any way
potentially breaks *vast* volumes of code that works just right, right now
in any version of Python. It's not even something you can easily test:
there's a lot of edge-cases supporting various features that have been added
over the years. A pure-Python import mechanism with a set of test cases to
"prove" it reliably mimicked the existing mechanism wasn't even complete by
the time Python3 came out. Without that, any changes to "fix" Python3+'s
import system would just be a shot in the dark.

And there'd be no telling just /what/ it would break if you start changing
these very old import semantics.

> P.S. Yes, I see the backwards-compatibility problem, but that's
> what rolling out a whole new versions is good for; it's a bit of
> a fresh start.  I remember hearing GvR's Google Talk on the coming
> Python 3, which was still in the works then, and being struck by
> the sheer *modesty* of the proposed changes (while the developers
> of the mythical Perl6 seemed to be on a quest for transcendence to
> a Higher Plane of Programming, as they still are).

Yes, it was a relatively modest upgrade, thankfully so. We actually have our
fixed language, while Perl6 is still somewhere off on the horizon. Python3
allowed backwards-incompatible changes, but they still needed to have a
justification: and more importantly, they usually needed to change
from one well-defined state to a new, 'cleaner' well-defined state, so they
could be automatically corrected or at least easily found. The Bytes/Unicode
situation is one case where there wasn't a well-defined state you could
convert from, and so with Python3 you really need to look at every single
string and decide, "Should this have been bytes, or characters?".
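To make that question concrete, here's a minimal Python 3 sketch. The sample
string and the helper name are mine, purely for illustration:

```python
# Python 3 keeps text (str) and raw data (bytes) as distinct types.
text = "caf\u00e9"             # four characters
data = text.encode("utf-8")    # five bytes: the accented letter takes two

def mixing_fails():
    """Return True if combining str and bytes raises TypeError."""
    try:
        text + data
    except TypeError:
        return True
    return False
```

Nothing in the old source tells a tool which of the two a given literal was
meant to be, which is why a human has to look at each one.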

Having to ask that question for every string is a significant burden;
expecting someone to ask the same sort of question for every single
import is asking -way- too much of people porting to Python 3.

Py3k's adoption is a slow, uphill process; no one expected (from my reading
at least) it to take less than years and multiple 3.x iterations before
people would /really/ start using it in significant numbers. There are too
many third-party modules people depend on, and they have to be converted
slowly. They're being converted, and IMHO there's steadily growing momentum
on this (albeit not huge momentum), but had they messed with the import
semantics, this would be /way/ more difficult, and you might end up with a
DOA project.

> In particular
> the business with print -> print() seemed truly bizarre to me: this
> is a change that will break a *huge* volume of code, and yet,
> judging by the rationale given for it, the change solves what are,
> IMHO, a relatively minor annoyances.  Python's old print statement
> is, I think, at most a tiny little zit invisible to all but those
> obsessed with absolute perfection.

You have a very strange perception of huge. For one thing, every single use
of the print statement can be converted -automatically-: the rules of this
change are very clear. The print statement was a bizarre sort of wart (look
at >>!) and fixing it is easy. Further, a *huge* volume of code? /Really/?!
In my experience, print isn't really all /that/ common after you get out of
the tutorial and start writing real code. If you're doing lots of shell
scripting maybe that's not the case: but for any sort of
server-daemon-process sort of apps, print is utterly unacceptable. I use the
logging module. For GUI apps, same.

But for those times when you do use print, it's /really easy/ to fix.
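As a rough sketch of how mechanical the change is: the Python 2 form is shown
in a comment and the Python 3 form is what runs. The function name and message
are made up for the example, and the 2to3 output should match this modulo
formatting:

```python
import io
import sys

def report(message, stream=sys.stdout):
    # Python 2 statement form:
    #     print >>stream, "status:", message
    # Python 3 function form; the >> target becomes the file= keyword:
    print("status:", message, file=stream)

buffer = io.StringIO()
report("ok", stream=buffer)
```

The rewrite is purely local to the line: no knowledge of the rest of the
program or its environment is needed to get it right.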

> And I can't imagine that whatever
> would be required to fix Python's import system could break more
> code than redefining the rules for a workhorse like print.
>

Your imagination is lacking here then: virtually every single bit of Python
code uses import, and messing with it can have wide implications that you
can't even /define/ ahead of time. As such you can't write a guide, a 2to3
fixer, or a HOWTO to tell people: hey, this is how you used to do things in
Python2, this is how you do them in Python3. Having them mess with the
import machinery would be throwing every Py3 developer under the bus,
saying, "Well, just run it. Fix whatever errors happen."

> In contrast, the Python import problem is a ticking bomb potentially
> affecting all code that imports other modules.  All that needs to
> happen is that, in a future release of Python, some new standard
> module emerges (like numbers.py emerged in 2.6), and this module
> is imported by some module your code imports.  Boom!  Note that it
> was only coincidental that the bug I reported in this thread occurred
> in a script I wrote recently.  I could have written both scripts
> before 2.6 was released, and the new numbers.py along with it;
> barring the uncanny clairvoyance of some responders, there would
> have been, at the time, absolutely no plausible reason for not
> naming one of the two scripts numbers.py.
>

This is drifting into hyperbole; yes, Python's flat import namespace isn't
ideal, yes, it can mean that when a new version is released you may shadow
something you didn't before, yes, that can cause code to break. It's really,
really, really not anything like as bad as you're making it out to be.
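For reference, the shadowing itself comes down to search order: the first
matching file on the path wins. Here's a small sketch of that, using two
throwaway directories that stand in for "where your script lives" and "where
the library lives" (both hypothetical; only PathFinder is consulted, so
built-in and frozen modules are out of scope):

```python
import importlib.machinery
import os
import tempfile

def first_match(name, search_path):
    """Return the file a path-based import of `name` would pick."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec.origin if spec is not None else None

def demo():
    with tempfile.TemporaryDirectory() as script_dir, \
         tempfile.TemporaryDirectory() as lib_dir:
        # Both directories contain a module that happens to be called numbers.
        for directory in (script_dir, lib_dir):
            with open(os.path.join(directory, "numbers.py"), "w") as f:
                f.write("# placeholder\n")
        winner = first_match("numbers", [script_dir, lib_dir])
        # The earlier path entry wins, so the "local" file shadows the other.
        return os.path.samefile(os.path.dirname(winner), script_dir)

shadowed = demo()
```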

Your bomb can be averted via:

    - Don't use relative imports.

    - Don't structure your code and environment such that your libs and/or
      scripts end up living in the global namespace if you want to be
      future-proof.

    - If you didn't future-proof your code, then check What's New on new
      versions of Python.

There, time bomb disarmed.

> To the argument that the import system can't be easily fixed because
> it breaks existing code, one can reply that the *current* import
> system already breaks existing code, as illustrated by the example
> I've given in this thread: this could have easily been old pre-2.6
> code that got broken just because Python decided to add numbers.py
> to the distribution.

Yes, one can make that argument, but it would be specious. Messing with the
import system -would- have broken code, and done so in a way which cannot
be easily or clearly identified by looking at code directly, as you wouldn't
know what would "break" until you somehow took the /entire/ layout of the
system into account, which is infeasible.

The current import system -may- break code, if someone writes their code in
such a way, structures their code in such a way, and Python happens to add a
new item to its standard library which causes that code to now shadow a
standard module.

The former case, 'change the import system', requires someone to look at
every single import, evaluate that import in the full context of the
environment, and determine if it's working as intended or not.

The latter case, 'leave the non-ideal import system in place', requires
someone to look at imports of top-level modules, and ensure that they don't
shadow standard library modules. You don't need to know where all the
sys.path items are, you don't need to know if you're importing from a zip
file, or anything like that. You just need to see if your top-level /
relative import is shadowing a new standard library module or not.
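That check is simple enough to script. A minimal sketch, assuming Python 3.10
or later for sys.stdlib_module_names (it did not exist when this thread was
written); the function name and its stdlib_names parameter are my own:

```python
import os
import sys

def shadowing_modules(directory, stdlib_names=None):
    """List top-level modules/packages in `directory` named like stdlib ones."""
    if stdlib_names is None:
        # Assumption: Python 3.10+, where this attribute is available.
        stdlib_names = sys.stdlib_module_names
    clashes = []
    for entry in sorted(os.listdir(directory)):
        full = os.path.join(directory, entry)
        if entry.endswith(".py") and os.path.isfile(full):
            name = entry[:-3]          # plain module: strip the .py suffix
        elif os.path.isfile(os.path.join(full, "__init__.py")):
            name = entry               # package directory
        else:
            continue
        if name in stdlib_names:
            clashes.append(entry)
    return clashes
```

Running shadowing_modules(".") in a project directory after each new Python
release would flag a local numbers.py the moment a standard numbers module
shows up.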

Yes, the latter case means that if you are supporting new Python versions
as they come out, you'll have to make that evaluation with each one if you
decide not to write and structure your code in such a way that it is
future-proof.

> (Yes, Python can't guarantee that the names
> of new standard modules won't clash with the names of existing
> local modules, but this is true for Perl as well, and due to Perl's
> module import scheme (and naming conventions), a scenario like the
> one I presented in this thread would have been astronomically
> improbable.  The Perl example shows that the design of the module
> import scheme and naming conventions for standard modules can go
> a long way to minimize the consequences of this unavoidable potential
> for future name clashes.)
>

Again: It is very, very avoidable.

I know, cuz I've sorta been avoiding it easily and successfully for years
now :P

--S