Provide a Python wrapper for any new C extension

One of the arguments put forward against renaming the existing time module to _time (as part of incorporating a pure-Python strptime function) is that it could break some builds. Therefore I'd suggest that it could be a useful principle for any C extension added in the future to the standard library to have an accompanying pure-Python wrapper that would be the one that client code would usually import. Hamish Lawson

Hamish Lawson wrote:
Sounds like a plan :-) BTW, this reminds me of the old idea to move that standard lib into a package, eg. 'python'... from python import time. We should at least reserve such a name RSN so that we don't run into problems later on. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/

[Hamish Lawson]
I am for that, but then again I am biased in this situation. =) But it seems reasonable. I would think everyone who makes any major contribution of code to Python would much rather code it up in Python than C. It would probably help to get more code accepted since I know I felt a little daunted having to write that callout for strptime. The only obvious objection I can see to this is a performance hit for having to go through the Python stub to call the C extension. But I just did a very simple test of calling strftime('%c') 25,000 times from time directly and using a Python stub and it was .470 and .490 secs total respectively according to profile.run(). The other objection I can see is that this would promote coding everything in Python when possible and that might not always be the best solution. Some things should just be coded in C, period. But I think for such situations that the person writing the code would most likely recognize that fact. Or maybe I am wrong in all of this. I don't know the exact process of how a C extension file gets accepted or what currently leads to an extension file getting a stub. I would (and I am sure anyone else new to the list would) really appreciate someone explaining it to me since I would like to know. -Brett C.
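A rough way to repeat this kind of measurement is sketched below. It assumes Python 3 and uses the timeit module rather than the profile.run() call mentioned in the message. The name strftime_stub is hypothetical and stands in for a pure-Python wrapper function. Absolute timings depend on the machine and will not match the figures quoted in the message.

```python
import time
import timeit


def strftime_stub(fmt):
    # Hypothetical pure-Python stub: adds one Python-level call per use.
    return time.strftime(fmt)


def compare(calls=25_000):
    """Time `calls` direct calls and `calls` calls through the stub."""
    direct = timeit.timeit(lambda: time.strftime("%c"), number=calls)
    via_stub = timeit.timeit(lambda: strftime_stub("%c"), number=calls)
    return direct, via_stub


if __name__ == "__main__":
    direct, via_stub = compare()
    print(f"direct: {direct:.3f}s  via stub: {via_stub:.3f}s")
```

Both measurements go through a lambda, so the difference between them reflects only the extra function call made by the stub.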

If the Python module does "from _Cmodule import *", there should be *no* difference in performance, since you get the same object in either case. --Guido van Rossum (home page: http://www.python.org/~guido/)
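The point about getting the same object can be illustrated with a small sketch, assuming Python 3. The module names _cmodule and cmodule and the function fast_func are made up; an ordinary module object stands in for a compiled extension so that the example runs without a C compiler.

```python
import sys
import types

# Stand-in for a compiled extension module (hypothetical name and contents).
_cmodule = types.ModuleType("_cmodule")
exec("def fast_func(x):\n    return x * 2\n", _cmodule.__dict__)
sys.modules["_cmodule"] = _cmodule

# Stand-in for the pure-Python wrapper that client code would import.
wrapper = types.ModuleType("cmodule")
exec("from _cmodule import *", wrapper.__dict__)

# The wrapper's name is bound to the very same function object,
# so calling it involves no extra layer.
same_object = wrapper.fast_func is _cmodule.fast_func
print(same_object)
```

Note that "import *" skips names beginning with an underscore unless the source module lists them in __all__, so a wrapper written this way re-exports only the public names.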

[Hamish Lawson]
There are too many distinct use cases to make this a hard and fast rule. The problem with maintaining many builds is best served by keeping the number of extensions small, period. [Marc-Andre Lemburg]
Maybe in Python 3000. In 2.x, I think rearranging the standard library will just cause more upheaval without much benefits.
We should at least reserve such a name RSN so that we don't run into problems later on.
I can guarantee you that that name won't be used as a standard Python module or package name any time soon. If someone creates a 3rd party package or module named 'python' I'd question their sanity. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
How about adding
python.py:
    __path__ = ['.']
This would not only reserve the name in the global namespace, but also enable applications to start using 'from python import x' now without much fuss. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/

Then I have to ask the question I originally wanted to ask: what problem would that solve? And is this the right solution? Also, it would make *all* standard modules accessible through the python package -- surely this isn't what we want (not if we use the Java example at least). Also, for some modules (that keep some global state) it's a bad idea if they are imported twice, since their initialization code would be run twice, and there would be two separate instances of the module. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
It solves the namespace issue. Every time we add a module or package to the standard lib, there is a chance that we break someone's code out there by overriding his/her own module/package (e.g. take the addition of the email package -- such generic names tend to be used a lot). Whether it's the right solution depends on how you see it. IMHO it would be ideal to move the complete std lib under a single package. You might want to use a more diverse hierarchy but I don't think that is really needed for the existing code base. Using a single package also makes the transition from non-package imports to python-package imports a lot easier.
Are you sure that you want to make things complicated ? (see above)
That's true for the trick I proposed above since the modules are reachable in two ways with the standard way of writing 'import <stdmod>' being used in tons of code. Now there is also a different way to approach this problem, though: that of directing Python to the right package by providing stubs for all current standard lib modules. I have used such a stub for my mx stuff when I moved everything from top-level to under the 'mx' umbrella:
    # Redirect all imports to the corresponding mx package
    def _redirect(mx_subpackage):
        global __path__
        import os, mx
        __path__ = [os.path.join(mx.__path__[0], mx_subpackage)]
    _redirect('DateTime')
    # Now load all important symbols
    from mx.DateTime import *
This works great -- it even lets you load pickles which store the old import names and automagically converts them to the new names. -- Marc-Andre Lemburg

I have thought some more about the idea of moving the entire stdlib into a package named "python" and I reject the idea. Think of the impact the change would have on the tutorial. Think of the amount of needless changes to perfectly working code it would entail. If you want to avoid 3rd party module/package names to be invalidated by additions to the standard library, you might just as well introduce a "nonstd" package into which all 3rd party extensions must be placed. This at least doesn't require people who don't use 3rd party code to change their programs. Maybe we should create a standard package hierarchy; Eric Raymond once started working on such a proposal but I have discouraged him because I think it would cause too much upheaval. But for Python 3 I would consider it. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Uhm, the point I was trying to make was to provide a long running upgrade path from the current situation (everything is top-level) to the single package structure. It is fairly easy to move from 'import os' to 'from python import os', but I understand that people will not want to do this until Python 3. I was not suggesting to start breaking code by enforcing this strategy in some way, I just thought it would be a good idea to start providing means to work with the single python package approach now to make the transition less painful in Python 3.
That's what I was targeting :-) -- Marc-Andre Lemburg

[MAL]
And my suggestion of a "nonstd" toplevel package had the same goal. :-)
Two problems. First, your proposal has lots of practical warts that I already pointed out; your suggestion to fix one of them by making all the old names stubs would require a massive set of changes to the CVS repository. Second, I don't think a 'python' toplevel package is the right solution.
Then please think about a proper solution rather than proposing something whose only virtue seems to be that you can implement a poor approximation of it in two lines. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
With the exception that we have control over the Python core code while we don't over third party extensions, so providing means to simplify the transition for the standard lib is easier than trying to enforce your proposed 'nonstd' package.
Just testing the waters here... there's no point in trying to find a solution to something which is not regarded as a problem anyway. -- Marc-Andre Lemburg

I think you could get a long way with minor changes along the lines of making site-packages a package itself.
You started by claiming that there's a problem: expansion of the stdlib could conflict with 3rd party module/package names. I don't regard it as a problem that's so bad that we need to make big changes to solve it. If you still think a solution is desired, you could start by proposing a new standard package hierarchy. Then new standard modules could be placed in that new hierarchy rather than at the top level. I'm rejecting the proposal of a single top-level package named "python". --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 12 July 2002 01:54 pm, Guido van Rossum wrote:
I've read the entire thread and still do not understand why you are suggesting the new standard package hierarchy should be named "new". The contents will eventually grow old and they will still be in something called "new". Why not use a name like "std", "misc", "core", or "sph" for the top of the standard package hierarchy? It doesn't matter what the name will be, but I hope it will be something that isn't confusing.

[me]
[Michael]
Uh? Who is proposing to name it "new"? Not me! Maybe you should read the entire thread again? :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 12 July 2002 02:42 pm, Guido van Rossum wrote:
Ok, I guess I'm just a bit more confused than usual today. I had also read the following message and made the unfortunate assumption that you were proposing "new" as the name of a new top level module to contain all the standard python modules. Oops, I merged the threads in my head. On Friday 12 July 2002 11:51 am, Guido van Rossum wrote:
Now back to the issue of moving all the top level names in the standard distribution into a "python" namespace. For the remainder of the 2.X release cycle it is important to not remove the existing names from the top level namespace. However, it might be reasonable to move all standard distribution names into a single top level namespace and grandfather the existing top level names into the top level namespace for the remainder of the 2.x series. The existing set of names would be available from either namespace. All new names for the standard distribution would only be placed in the new top level standard package namespace. With this approach all old names would still be accessible to the existing code base as top level names and introducing new names to the standard distribution will not clobber third party modules and packages. For the remainder of 2.X the rules will be messy because some standard names will be accessible from either the top level namespace or from the standard "python" namespace. Then for Python 3.0 the grandfathered names would be removed from the top level namespace. This approach should enable a smoother transition in the documentation and coding practices. The preferred coding style guide, the tutorial, and other documentation would be used to explain the transition plan. The new guidelines would promote the use of the new namespace for all cases, but it would not preclude the use of the older coding style. I'm not keen on the use of the name "python" for the top level namespace. Perhaps the name "std" would be more desirable (and shorter to type).

On 12 Jul 2002 at 16:46, Michael McLay wrote:
Getting from <toplevelname> import urllib and import urllib to return the same (is, not equals) object will require very delicate surgery on some very difficult code. And without it, most non-trivial scripts will break in very mysterious ways. -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Not really. The following code does all it takes to make this work for e.g. having 'import DateTime' and 'from mx import DateTime' provide the same symbols:
    # Redirect all imports to the corresponding mx package
    def _redirect(mx_subpackage):
        global __path__
        import os, mx
        __path__ = [os.path.join(mx.__path__[0], mx_subpackage)]
    _redirect('DateTime')
    # Now load all important symbols
    from mx.DateTime import *
    from mx.DateTime import __version__, _DT, _DTD
The module objects would be different, but that's just about it. -- Marc-Andre Lemburg

On 13 Jul 2002 at 19:58, M.-A. Lemburg wrote:
Gordon McMillan wrote:
[snip hackery]
The module objects would be different, but that's just about it.
Which was exactly my point. Much code that does *not* use "from ... import ..." in fact relies on having the same module object. -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
You mean for e.g. hacking the module's globals ? To solve that, you'd probably need to manipulate sys.modules as well... I'm just not sure whether this is possible from within the module implementing the redirection. Hmm, running this:
testmodload.py:
    import sys, os
    sys.modules['testmodload'] = os
    print 'worked'
Python 2.1.3 (#1, May 16 2002, 18:59:26)
Looks like this is possible, so you probably don't even need the 'from mx.DateTime import *' in the code I posted. A simple "sys.modules['DateTime'] = mx.DateTime" would give you an even better solution. -- Marc-Andre Lemburg
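The sys.modules aliasing idea can be tried out with a minimal sketch, assuming Python 3. The standard json module stands in for mx.DateTime, and old_json_name is a made-up legacy name used only for illustration.

```python
import json
import sys

# Register the existing module object under a second, legacy name.
# "old_json_name" is hypothetical; it plays the role of 'DateTime'.
sys.modules["old_json_name"] = json

# The import statement consults sys.modules first, so no file named
# old_json_name.py is needed for this to succeed.
import old_json_name

print(old_json_name is json)      # one module object, two names
print(old_json_name.__name__)     # still reports the original name
```

Because both names refer to one object, module-level state is shared. The module's __name__ no longer matches the key it was imported under, which is the property that tools analysing imports have to cope with.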

[MAL]
The module objects would be different, but that's just about it.
[Gordon]
Which was exactly my point. Much code that does *not* use "from ... import ..." in fact relies on having the same module object.
[MAL]
You mean for e.g. hacking the module's globals ?
If you consider a module maintaining pieces of its own state in its own globals as an instance of hacking the module's globals, yes, that's the main problem. For example (there are many, this isn't stretching), if the user ends up with two distinct copies of the tempfile module, its "global" _tempdir_lock becomes two distinct locks, and the truly global mutual exclusion _tempdir_lock was supposed to supply is lost. Ditto for the lock used internally by tempfile's global _counter object. The system-wide uniqueness of some globals is crucial to some modules' correct functioning.
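The failure mode described here can be reproduced in a few lines. The sketch below assumes Python 3 and uses a made-up module source with a lock and a counter, loosely modelled on the tempfile example; the names statemod, pkg.statemod and bump are hypothetical.

```python
import types

# Source of a small module that keeps state in its own globals.
SOURCE = """
import threading

_lock = threading.Lock()
_counter = 0

def bump():
    global _counter
    with _lock:
        _counter += 1
        return _counter
"""


def load_copy(name):
    """Execute SOURCE in a fresh module object, as a second import would."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


# The same code loaded under two names yields two independent modules.
first = load_copy("statemod")
second = load_copy("pkg.statemod")

first.bump()
first.bump()

print(first._counter, second._counter)
print(first._lock is second._lock)
```

Each copy has its own counter and its own lock, so code holding one lock is not excluded from code holding the other, even though both came from the same source file.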

Tim Peters wrote:
Very true and that's why there is only one module containing the actual code. Globals referenced by the code live in that module. The other module only imports the symbols in the first solution I posted. The second even avoids this extra step -- there's only one module (the packaged one) left in sys.modules which is referenced under two names. Pickles will gladly unpickle using this scheme while a pickle operation automagically starts using the new packaged name. -- Marc-Andre Lemburg

Marc-Andre, In this thread you have posted:
python.py: __path__ = ['.']
and
and
None of these will freeze successfully. Two of them appear to rely on an implementation detail - that __path__ (only defined for imp.PKG_DIRECTORY's) will be followed even in a plain module. The third is exactly what _xmlplus does, and consensus appears to be that that was a mistake. "Clever" does not mean "good". -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Hmm, then how do you freeze _xmlplus ?
AFAIK, that's not an implementation detail, but a documented way of finding out whether a module is a package or not.
But it works (tm) :-) -- Marc-Andre Lemburg

On 14 Jul 2002 at 16:32, M.-A. Lemburg wrote:
Gordon McMillan wrote:
[various cute hacks]
None of these will freeze successfully.
Hmm, then how do you freeze _xmlplus ?
Most people whine publicly until someone comes up with a workaround. Installer has a way of hooking modules & packages that play games like that, but if you're using tools/freeze, you'll probably be told to overlay xml with _xmlplus. If the package uses lots of nasty tricks (eg, pyopengl), the answer is "you don't".
Correct. But stuffing a __path__ attribute into a module does *not* make the module a package. '''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.''' and '''Once loaded, the difference between a package and a module is minimal.'''
But it works (tm) :-)
For a sufficiently short-sighted definition of "work". -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Hmm, I know that Python itself uses __path__ to tell whether it has a package or not, so I don't see why a module can't be regarded as a package. Moving the module into a directory of the same name and then renaming it to __init__.py has the same effect. And in that case, hacking __path__ is perfectly legal.
But it works (tm) :-)
For a sufficiently short-sighted definition of "work".
You haven't commented on the sys.modules trick yet. This one doesn't even use the __path__ hackery :-)
DateTime.py:
    import sys
    import mx.DateTime
    sys.modules[__name__] = mx.DateTime
Python 2.1.3 (#1, May 16 2002, 18:59:26)
See: it's the same module :-) -- Marc-Andre Lemburg

On 14 Jul 2002 at 19:53, M.-A. Lemburg wrote:
Gordon McMillan wrote:
... But stuffing a __path__ attribute into a module does *not* make the module a package.
If you put on a Richard M. Nixon mask, you might be mistaken for ("regarded as") Richard M. Nixon. That doesn't make you Richard M. Nixon. Stuffing __path__ into a module means that *most* of Python's runtime will regard your module as a package. It doesn't make it a package. In particular, most introspection tools and most programmers will not recognize your module as a package.
Yes, it now *is* a package. One which violates recommended practice, which is to keep __init__.py simple, but still a package.
Anytime x != sys.modules[x].__name__, someone, sometime will suffer. Installer and (I believe) py2exe have hooks so that this gets analyzed properly. The hook is keyed by "DateTime". If you really find it intolerable to stick your users with making a one line change in their code, you might consider contributing hooks to Installer (or patches to py2exe). Particularly for your non-free packages, since I'm not going to download those and reverse-engineer them. Or perhaps you could do like Pmw, and include a "bundle" script. -- Gordon http://www.mcmillan-inc.com/
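The observation that a __path__ attribute is enough for the import machinery to treat a plain module as a package can be checked with a short sketch. It assumes Python 3; the names fakepkg_demo and sub are made up, and the behaviour shown is that of the current CPython import system rather than of the 2.1 interpreter discussed in the thread.

```python
import importlib
import os
import sys
import tempfile
import types


def import_through_fake_package():
    """Import a submodule via a plain module that merely carries __path__."""
    with tempfile.TemporaryDirectory() as directory:
        # An ordinary source file; there is no __init__.py anywhere.
        with open(os.path.join(directory, "sub.py"), "w") as handle:
            handle.write("VALUE = 42\n")

        # A bare module object with a __path__ stuffed into it.
        fake = types.ModuleType("fakepkg_demo")
        fake.__path__ = [directory]
        sys.modules["fakepkg_demo"] = fake
        try:
            sub = importlib.import_module("fakepkg_demo.sub")
            return sub.VALUE, sub.__name__
        finally:
            # Leave sys.modules as it was found.
            sys.modules.pop("fakepkg_demo.sub", None)
            sys.modules.pop("fakepkg_demo", None)


print(import_through_fake_package())
```

The import succeeds, yet the object has no __init__.py and no __file__ behind it, which is why tools that inspect the file system, such as freezers, may not recognise it as a package.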

Gordon McMillan wrote:
I don't. I'm just using my package series as an example of how moving a set of top-level modules/packages to a single package can be accomplished. That's all. I told my users to upgrade their applications from 1.x to 2.0 by switching from 'import DateTime' to 'from mx import DateTime' when I made the move and indeed, only one user complained -- which is why I provided him with a backwards compatibility package along the lines of what I've posted here. He only needed it to be able to read back pickled data, BTW.
Hmm, I don't understand this comment.
Or perhaps you could do like Pmw, and include a "bundle" script.
py2exe works just fine with the mx stuff. I suppose your installer does too. -- Marc-Andre Lemburg

[Michael McLay]
Note that "new" is already the name of a top-level module, and has been for years. That other thread was about drawing useless distinctions between the already-existing "new" and "types" modules with respect to where to house new type names that nobody needs <0.9 wink>.

Guido:
Maybe he's getting it mixed up with the thread about the "new" module? -- Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand, greg@cosc.canterbury.ac.nz ("A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.")

Guido van Rossum wrote:
This wouldn't work since in that case you'd have the problem of having to fix class names in e.g. pickles for objects which you don't know anything about. We do know about objects in the Python standard lib, so we could take care to have mechanisms like pickle deal with them properly.
I believe that the more Python grows (not only the core, but the complete set of available modules and packages in the Python universe), the less likely we are going to hit a problem.
You've written that before, but you still haven't given any explanation of why a single package would be worse than a multi-level hierarchy of modules (e.g. grouped by application space). I think that simply moving to one package would cause less breakage and make the whole transition process much easier than having to tweak code into using some complicated multi-package structure. FWIW, I've been through all this with the mx packages and using a single new package caused the least amount of work. Even better: it turned out to be easy to provide backwards compatibility code so that applications still using the old layout continue to run, but start using the new structure in their pickles. No need to get heated, though. I just thought that it would be a good time to start thinking about this option again. -- Marc-Andre Lemburg

IOW you're suggesting we do a near-infinite amount of work to the core just so that others can be sloppy in their choice of names for their modules. Bah.
I would say, OK, so it will go away by itself, but I guess you made a typo there, and really meant "the more likely...". :-) But making the core go away doesn't reduce the problem enough: the more likely problem is two 3rd parties unaware of each other each picking the same name.
Because a single package doesn't have any other benefits besides getting out of the way from 3rd party developers. At least a proper hierarchy would have the other benefits of grouping. (But better make it a shallow hierarchy! remember "Flat is better than nested.")
Given that you now want us to add special counter-measure to pickle, I doubt that very much.
So it's no big deal for 3rd party developers to do what they should do to deal with this problem. Good to hear. Given that when we change the standard library, *every* Python user (and developer) is affected, I prefer the status quo.
No need to get heated, though. I just thought that it would be a good time to start thinking about this option again.
And this would be a good time to end this thread. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido van Rossum]
And this would be a good time to end this thread. :-)
Agreed. Yet, allow me a tiny suggestion that could solve the stated problem at a small cost. It suffices to choose, then announce, a convention about a set of names which the Python distribution agrees never to use. It could be anything. For example, Python could guarantee that it will never install a standard module with a name starting with capital `W'. If a user wants to make absolutely sure his/her module does not and will not conflict with a standard module, just prepend a `W' to its name. It is likely that people will rarely resort to this convention, but it will be there for the paranoid, and should be easy to support. Yet, it will not solve the paranoia of users against each other's package names. Had this come up many years ago, the convention I would have preferred is that Python never uses a capital letter as the first letter of a module name, but it seems to be a little late for this, and I'm not so sure of the benefit. :-) The most Python could say with some `from python import ...' or a `W' convention is that it gets itself out of the name fight between users and does not participate in it. It does not really solve the problem, anyway. I guess you are right, in that whatever the direction taken, this thread is probably doomed to fall into various dead-ends. -- François Pinard http://www.iro.umontreal.ca/~pinard

[M.-A. Lemburg]
There is something to be solved here. Anecdote: I sucked an early version of Greg's textwrap.py module into my build directory. After he checked it in, I changed regrtest.py to use textwrap. This kept failing with baffling errors, until I realized I was still picking up an incompatible textwrap.py from the build directory. So I got rid of the latter. Somewhere in between, I synched my desktop and laptop machines and so got another copy on my laptop that way, which I didn't notice. When I got home and synched the laptop back to the desktop, it then restored the deleted textwrap.py to the desktop machine, and I got the same round of impossible errors all over again. I deleted it from the home machine again, but the next time I used my laptop to run the test suite I got the impossible errors yet another time -- and had synched the machines again in the meantime so that it once again showed up on the desktop disk. So there's one use case <wink>.

This just shows that having the current directory on sys.path (especially at the front) causes problems. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido]
This just shows that having the current directory on sys.path (especially at the front) causes problems.
I thought it showed I shouldn't be so careless when synching machines, but I'll take an excuse to blame Python instead <wink>. Still, it's something that would not have happened had I needed to prefix the import of the standard textwrap with a "standard" name -- or of my private textwrap with a "non-standard" name. Putting the current directory in sys.path is just too useful to give up. I suspect that putting it specifically at the front is only "a feature" for Python library developers, though, and "a bug" for others -- end users stumble into this a lot by unhappy accident, like when creating a random.py to hold their initial experiments with Python's random-number facilities.
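How the position of a directory on sys.path decides which file wins can be shown without actually importing anything. The sketch below assumes Python 3 and uses importlib.machinery.PathFinder to ask which file a path-based import of textwrap would pick; the helper names first_match and demo and the scratch directory are made up for illustration.

```python
import os
import sys
import tempfile
from importlib.machinery import PathFinder


def first_match(name, search_path):
    """Return the file a path-based import of `name` would load."""
    spec = PathFinder.find_spec(name, search_path)
    return spec.origin if spec else None


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # A private scratch file that happens to share a stdlib name.
        shadow = os.path.join(workdir, "textwrap.py")
        with open(shadow, "w") as handle:
            handle.write("# private experiments\n")

        def is_shadow(path):
            return os.path.realpath(path) == os.path.realpath(shadow)

        # Working directory searched first: the private file wins.
        front = first_match("textwrap", [workdir] + sys.path)
        # Working directory searched last: the standard module wins.
        back = first_match("textwrap", sys.path + [workdir])
        return is_shadow(front), is_shadow(back)


print(demo())
```

With the scratch directory at the front of the search path the private file is found first; with it at the end the standard library copy is found, which matches the distinction drawn in the message between having the current directory on sys.path at all and having it at the front.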

Hamish Lawson wrote:
Sounds like a plan :-) BTW, this reminds me of the old idea to move that standard lib into a package, eg. 'python'... from python import time. We should at least reserve such a name RSN so that we don't run into problems later on. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ Meet us at EuroPython 2002: http://www.europython.org/

[Hamish Lawson]
I am for that, but then again I am biased in this situation. =) But it seems reasonable. I would think everyeone who makes any major contribution of code to Python would much rather code it up in Python then C. It would probably help to get more code accepted since I know I felt a little daunted having to write that callout for strptime. The only obvious objection I can see to this is a performance hit for having to go through the Python stub to call the C extension. But I just did a very simple test of calling strftime('%c') 25,000 times from time directly and using a Python stub and it was .470 and .490 secs total respectively according to profile.run(). The oher objection I can see is that this would promote coding everything in Python when possible and that might not always be the best solution. Some things should just be coded in C, period. But I think for such situations that the person writing the code would most likely recognize that fact. Or maybe I am wrong in all of this. I don't know the exact process of how a C extension file gets accepted or what currently leads to an extension file getting a stub is. I would (and I am sure anyone else new to the list) really appreciate someone possibly explaining it to me since I would like to know. -Brett C.

If the Python module does "from _Cmodule import *", there should be *no* difference in performance, since you get the same object in either case. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Hamish Lawson]
There are too many distinct use cases to make this a hard and fast rule. The problem with maintaining many builds is best served by keeping the number of extensions small, period. [Marc-Andre Lemburg]
Maybe in Python 3000. In 2.x, I think rearranging the standard library will just cause more upheaval without much benefits.
We should at least reserve such a name RSN so that we don't run into problems later on.
I can guarantee you that that name won't be used as a standard Python module or package name any time soon. If someone creates a 3rd party package or module named 'python' I'd question their sanity. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
How about adding python.py: __path__ = ['.'] This would not only reserve the name in the global namespace, but also enable applications to start using 'from python import x' now without much fuzz. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/

Then I have to ask the question I originally wanted to ask: what problem would that solve? And is this the right solution? Also, it would make *all* standard modules accessible through the python package -- surely this isn't what we want (not if we use the Java example at least). Also, for some modules (that keep some global state) it's a bad idea if they are imported twice, since their initialization code would be run twice, and there would be two separate instances of the module. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
It solves the namespace issue. Every time we add a module or package to the standard lib, there is a chance that we break someone's code out there by overriding his/her own module/package (e.g. take the addition of the email package -- such generic names tend to be used a lot). Whether it's the right solution depends on how you see it. IMHO it would be ideal to move the complete std lib under a single package. You might want to use a more diverse hierarchy but I don't think that is really needed for the existing code base. Using a single package also makes the transition from non-package imports to python-package imports a lot easier.
Are you sure that you want to make things complicated ? (see above)
That's true for the trick I proposed above since the modules are reachable in two ways, with the standard way of writing 'import <stdmod>' being used in tons of code. Now there is also a different way to approach this problem, though: that of directing Python to the right package by providing stubs for all current standard lib modules. I have used such a stub for my mx stuff when I moved everything from top-level to under the 'mx' umbrella:

    # Redirect all imports to the corresponding mx package
    def _redirect(mx_subpackage):
        global __path__
        import os,mx
        __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
    _redirect('DateTime')

    # Now load all important symbols
    from mx.DateTime import *

This works great -- it even lets you load pickles which store the old import names and automagically converts them to the new names. -- Marc-Andre Lemburg

I have thought some more about the idea of moving the entire stdlib into a package named "python" and I reject the idea. Think of the impact the change would have on the tutorial. Think of the amount of needless changes to perfectly working code it would entail. If you want to avoid 3rd party module/package names to be invalidated by additions to the standard library, you might just as well introduce a "nonstd" package into which all 3rd party extensions must be placed. This at least doesn't require people who don't use 3rd party code to change their programs. Maybe we should create a standard package hierarchy; Eric Raymond once started working on such a proposal but I have discouraged him because I think it would cause too much upheaval. But for Python 3 I would consider it. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Uhm, the point I was trying to make was to provide a long running upgrade path from the current situation (everything is top-level) to the single package structure. It is fairly easy to move from 'import os' to 'from python import os', but I understand that people will not want to do this until Python 3. I was not suggesting to start breaking code by enforcing this strategy in some way, I just thought it would be a good idea to start providing means to work with the single python package approach now to make the transition less painful in Python 3.
That's what I was targeting :-) -- Marc-Andre Lemburg

[MAL]
And my suggestion of a "nonstd" toplevel package had the same goal. :-)
Two problems. First, your proposal has lots of practical warts that I already pointed out; your suggestion to fix one of them by making all the old names stubs would require a massive set of changes to the CVS repository. Second, I don't think a 'python' toplevel package is the right solution.
Then please think about a proper solution rather than proposing something whose only virtue seems to be that you can implement a poor approximation of it in two lines. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
With the exception that we have control over the Python core code while we don't over third party extensions, so providing means to simplify the transition for the standard lib is easier than trying to enforce your proposed 'nonstd' package.
Just testing the waters here... there's no point in trying to find a solution to something which is not regarded as a problem anyway. -- Marc-Andre Lemburg

I think you could get a long way with minor changes along the lines of making site-packages a package itself.
You started by claiming that there's a problem: expansion of the stdlib could conflict with 3rd party module/package names. I don't regard it as a problem that's so bad that we need to make big changes to solve it. If you still think a solution is desired, you could start by proposing a new standard package hierarchy. Then new standard modules could be placed in that new hierarchy rather than at the top level. I'm rejecting the proposal of a single top-level package named "python". --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 12 July 2002 01:54 pm, Guido van Rossum wrote:
I've read the entire thread and still do not understand why you are suggesting the new standard package hierarchy should be named "new". The contents will eventually grow old and they will still be in something called "new". Why not use a name like "std", "misc", "core", or "sph" for the top of the standard package hierarchy? It doesn't matter what the name will be, but I hope it will be something that isn't confusing.

[me]
[Michael]
Uh? Who is proposing to name it "new"? Not me! Maybe you should read the entire thread again? :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

On Friday 12 July 2002 02:42 pm, Guido van Rossum wrote:
Ok, I guess I'm just a bit more confused than usual today. I had also read the following message and made the unfortunate assumption that you were proposing "new" as the name of a new top level module to contain all the standard python modules. Oops, I merged the threads in my head. On Friday 12 July 2002 11:51 am, Guido van Rossum wrote:
Now back to the issue of moving all the top level names in the standard distribution into a "python" namespace. For the remainder of the 2.X release cycle it is important to not remove the existing names from the top level namespace. However, it might be reasonable to move all standard distribution names into a single top level namespace and grandfather the existing top level names into the top level namespace for the remainder of the 2.x series. The existing set of names would be available from either namespace. All new names for the standard distribution would only be placed in the new top level standard package namespace. With this approach all old names would still be accessible to the existing code base as top level names and introducing new names to the standard distribution will not clobber third party modules and packages. For the remainder of 2.X the rules will be messy because some standard names will be accessible from either the top level namespace or from the standard "python" namespace. Then for Python 3.0 the grandfathered names would be removed from the top level namespace. This approach should enable a smoother transition in the documentation and coding practices. The preferred coding style guide, the tutorial, and other documentation would be used to explain the transition plan. The new guidelines would promote the use of the new namespace for all cases, but it would not preclude the use of the older coding style. I'm not keen on the use of the name "python" for the top level namespace. Perhaps the name "std" would be more desirable (and shorter to type).

On 12 Jul 2002 at 16:46, Michael McLay wrote:
Getting from <toplevelname> import urllib and import urllib to return the same (is, not equals) object will require very delicate surgery on some very difficult code. And without it, most non-trivial scripts will break in very mysterious ways. -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Not really. The following code does all it takes to make this work for e.g. having 'import DateTime' and 'from mx import DateTime' provide the same symbols:

    # Redirect all imports to the corresponding mx package
    def _redirect(mx_subpackage):
        global __path__
        import os,mx
        __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
    _redirect('DateTime')

    # Now load all important symbols
    from mx.DateTime import *
    from mx.DateTime import __version__,_DT,_DTD

The module objects would be different, but that's just about it. -- Marc-Andre Lemburg

On 13 Jul 2002 at 19:58, M.-A. Lemburg wrote:
Gordon McMillan wrote:
[snip hackery]
The module objects would be different, but that's just about it.
Which was exactly my point. Much code that does *not* use "from ... import ..." in fact relies on having the same module object. -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
You mean for e.g. hacking the module's globals ? To solve that, you'd probably need to manipulate sys.modules as well... I'm just not sure whether this is possible from within the module implementing the redirection. Hmm, running this:

testmodload.py:

    import sys, os
    sys.modules['testmodload'] = os
    print 'worked'

Python 2.1.3 (#1, May 16 2002, 18:59:26)
Looks like this is possible, so you probably don't even need the 'from mx.DateTime import *' in the code I posted. A simple 'sys.modules['DateTime'] = mx.DateTime' would give you an even better solution. -- Marc-Andre Lemburg
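For readers on a current Python 3 interpreter, the sys.modules redirection can be tried with a throwaway stub. This is a sketch under stated assumptions: the stub name legacy_json is made up for illustration, the standard json module stands in for mx.DateTime, and the stub is written to a temporary directory rather than installed anywhere.

```python
import importlib
import json
import os
import sys
import tempfile

# Source of a hypothetical top-level stub that replaces itself in
# sys.modules with the real module it redirects to.
STUB_SOURCE = (
    "import sys\n"
    "import json\n"
    "sys.modules[__name__] = json\n"
)

with tempfile.TemporaryDirectory() as stub_dir:
    with open(os.path.join(stub_dir, "legacy_json.py"), "w") as handle:
        handle.write(STUB_SOURCE)
    sys.path.insert(0, stub_dir)
    try:
        importlib.invalidate_caches()
        # The import system hands back whatever sys.modules holds for the
        # name once the stub has finished executing.
        legacy = importlib.import_module("legacy_json")
    finally:
        sys.path.remove(stub_dir)

print(legacy is json)    # one module object, reachable under two names
print(legacy.__name__)   # the module still reports its real name
```

Note that the old name and the module's own __name__ now disagree, which is the situation the later messages about freezing tools take issue with.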

[MAL]
The module objects would be different, but that's just about it.
[Gordon]
Which was exactly my point. Much code that does *not* use "from ... import ..." in fact relies on having the same module object.
[MAL]
You mean for e.g. hacking the module's globals ?
If you consider a module maintaining pieces of its own state in its own globals as an instance of hacking the module's globals, yes, that's the main problem. For example (there are many, this isn't stretching), if the user ends up with two distinct copies of the tempfile module, its "global" _tempdir_lock becomes two distinct locks, and the truly global mutual exclusion _tempdir_lock was supposed to supply is lost. Ditto for the lock used internally by tempfile's global _counter object. The system-wide uniqueness of some globals is crucial to some modules' correct functioning.
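The failure mode described here is easy to reproduce in miniature. The sketch below does not touch tempfile itself; it uses a made-up module source with a lock and a counter, executed twice under two hypothetical names, to show that each copy gets its own "global" state.

```python
import types

# Hypothetical module source with state that is meant to be process-wide.
SOURCE = '''
import threading
_lock = threading.Lock()
_counter = 0

def next_id():
    global _counter
    with _lock:
        _counter += 1
        return _counter
'''

def load_copy(name):
    """Execute SOURCE in a fresh module object, as a second import path would."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

first = load_copy("statemod")          # e.g. reached via 'import statemod'
second = load_copy("python.statemod")  # e.g. reached via 'from python import statemod'

first.next_id()
first.next_id()
ids = (first.next_id(), second.next_id())

print(ids)                            # the two counters have diverged
print(first._lock is second._lock)    # two locks, so no mutual exclusion
```

Code holding first._lock does nothing to stop code that acquires second._lock, so the exclusion the lock was written to provide is silently lost.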

Tim Peters wrote:
Very true and that's why there is only one module containing the actual code. Globals referenced by the code live in that module. The other module only imports the symbols in the first solution I posted. The second even avoids this extra step -- there's only one module (the packaged one) left in sys.modules which is referenced under two names. Pickles will gladly unpickle using this scheme while a pickle operation automagically starts using the new packaged name. -- Marc-Andre Lemburg

Marc-Andre, In this thread you have posted:
python.py: __path__ = ['.']
and
and
None of these will freeze successfully. Two of them appear to rely on an implementation detail - that __path__ (only defined for imp.PKG_DIRECTORY's) will be followed even in a plain module. The third is exactly what _xmlplus does, and consensus appears to be that that was a mistake. "Clever" does not mean "good". -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Hmm, then how do you freeze _xmlplus ?
AFAIK, that's not an implementation detail, but a documented way of finding out whether a module is a package or not.
But it works (tm) :-) -- Marc-Andre Lemburg

On 14 Jul 2002 at 16:32, M.-A. Lemburg wrote:
Gordon McMillan wrote:
[various cute hacks]
None of these will freeze successfully.
Hmm, then how do you freeze _xmlplus ?
Most people whine publicly until someone comes up with a workaround. Installer has a way of hooking modules & packages that play games like that, but if you're using tools/freeze, you'll probably be told to overlay xml with _xmlplus. If the package uses lots of nasty tricks (eg, pyopengl), the answer is "you don't".
Correct. But stuffing a __path__ attribute into a module does *not* make the module a package. '''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.''' and '''Once loaded, the difference between a package and a module is minimal.'''
But it works (tm) :-)
For a sufficiently short-sighted definition of "work". -- Gordon http://www.mcmillan-inc.com/

Gordon McMillan wrote:
Hmm, I know that Python itself uses __path__ to tell whether it has a package or not, so I don't see why a module can't be regarded as package. Moving the module into a directory of the same name and then renaming it to __init__.py has the same effect. And in that case, hacking __path__ is perfectly legal.
But it works (tm) :-)
For a sufficiently short-sighted definition of "work".
You haven't commented on the sys.modules trick yet. This one doesn't even use the __path__ hackery :-)

DateTime.py:

    import sys
    import mx.DateTime
    sys.modules[__name__] = mx.DateTime

Python 2.1.3 (#1, May 16 2002, 18:59:26)
See: it's the same module :-) -- Marc-Andre Lemburg

On 14 Jul 2002 at 19:53, M.-A. Lemburg wrote:
Gordon McMillan wrote:
... But stuffing a __path__ attribute into a module does *not* make the module a package.
If you put on a Richard M. Nixon mask, you might be mistaken for ("regarded as") Richard M. Nixon. That doesn't make you Richard M. Nixon. Stuffing __path__ into a module means that *most* of Python's runtime will regard your module as a package. It doesn't make it a package. In particular, most introspection tools and most programmers will not recognize your module as a package.
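The distinction can be illustrated with the attribute check both sides are referring to. This is a small sketch, not how any particular tool decides; the helper name looks_like_package and the module name plain_demo are made up for the example.

```python
import json
import string
import types

def looks_like_package(module):
    """The runtime-style test: anything carrying __path__ is treated as a package."""
    return hasattr(module, "__path__")

real_package = looks_like_package(json)      # json is a directory with __init__.py
plain_module = looks_like_package(string)    # string is a single .py file

# Stuff __path__ into an ordinary module object.
masked = types.ModuleType("plain_demo")
before = looks_like_package(masked)
masked.__path__ = []
after = looks_like_package(masked)

print(real_package, plain_module, before, after)
```

The check flips to true for the masked module even though there is no directory and no __init__.py behind it, which is why tools that look at the file layout can reach a different verdict than the runtime does.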
Yes, it now *is* a package. One which violates recommended practice, which is to keep __init__.py simple, but still a package.
Anytime x != sys.modules[x].__name__, someone, sometime will suffer. Installer and (I believe) py2exe have hooks so that this gets analyzed properly. The hook is keyed by "DateTime". If you really find it intolerable to stick your users with making a one line change in their code, you might consider contributing hooks to Installer (or patches to py2exe). Particularly for your non-free packages, since I'm not going to download those and reverse-engineer them. Or perhaps you could do like Pmw, and include a "bundle" script. -- Gordon http://www.mcmillan-inc.com/
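The condition x != sys.modules[x].__name__ can be scanned for in a running interpreter. The helper below is a hypothetical diagnostic written for this note, not part of Installer or py2exe.

```python
import os
import sys

def aliased_module_names():
    """Return (key, real name) pairs where a sys.modules key and __name__ differ."""
    mismatched = []
    for key, module in list(sys.modules.items()):
        real_name = getattr(module, "__name__", None)
        if real_name is not None and real_name != key:
            mismatched.append((key, real_name))
    return sorted(mismatched)

aliases = dict(aliased_module_names())

# The standard library has a long-standing instance of this: os.path is
# registered under 'os.path' but is really posixpath or ntpath.
print(aliases.get("os.path"))
```

Any stub that rebinds its own sys.modules entry to another module would show up in the same list, which gives a packaging tool something concrete to key a hook on.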

Gordon McMillan wrote:
I don't. I'm just using my package series as example of how moving a set of top-level modules/packages to a single package can be accomplished. That's all. I told my users to upgrade their applications from 1.x to 2.0 by switching from 'import DateTime' to 'from mx import DateTime' when I made the move and indeed, only one user complained -- which is why I provided him with a backwards compatibility package along the lines of what I've posted here. He only needed it to be able to read back pickled data, BTW.
Hmm, I don't understand this comment.
Or perhaps you could do like Pmw, and include a "bundle" script.
py2exe works just fine with the mx stuff. I suppose your installer does too. -- Marc-Andre Lemburg

[Michael McLay]
Note that "new" is already the name of a top-level module, and has been for years. That other thread was about drawing useless distinctions between the already-existing "new" and "types" modules with respect to where to house new type names that nobody needs <0.9 wink>.

Guido:
Maybe he's getting it mixed up with the thread about the "new" module?
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg@cosc.canterbury.ac.nz
A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc.

Guido van Rossum wrote:
This wouldn't work since in that case you'd have the problem of having to fix class names in e.g. pickles for objects which you don't know anything about. We do know about objects in the Python standard lib, so we could take care to have mechanisms like pickle deal with them properly.
I believe that the more Python grows (not only the core, but the complete set of available modules and packages in the Python universe), the less likely we are going to hit a problem.
You've written that before, but you still haven't given any explanation of why a single package would be worse than a multi-level hierarchy of modules (e.g. grouped by application space). I think that simply moving to one package would cause less breakage and make the whole transition process much easier than having to tweak code into using some complicated multi-package structure. FWIW, I've been through all this with the mx packages and using a single new package caused the least amount of work. Even better: it turned out to be easy to provide backwards compatibility code so that applications still using the old layout continue to run, but start using the new structure in their pickles. No need to get heated, though. I just thought that it would be a good time to start thinking about this option again. -- Marc-Andre Lemburg

IOW you're suggesting we do a near-infinite amount of work to the core just so that others can be sloppy in their choice of names for their modules. Bah.
I would say, OK, so it will go away by itself, but I guess you made a typo there, and really meant "the more likely...". :-) But making the core go away doesn't reduce the problem enough: the more likely problem is two 3rd parties unaware of each other each picking the same name.
Because a single package doesn't have any other benefits besides getting out of the way from 3rd party developers. At least a proper hierarchy would have the other benefits of grouping. (But better make it a shallow hierarchy! remember "Flat is better than nested.")
Given that you now want us to add special counter-measure to pickle, I doubt that very much.
So it's no big deal for 3rd party developers to do what they should do to deal with this problem. Good to hear. Given that when we change the standard library, *every* Python user (and developer) is affected, I prefer the status quo.
No need to get heated, though. I just thought that it would be a good time to start thinking about this option again.
And this would be a good time to end this thread. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido van Rossum]
And this would be a good time to end this thread. :-)
Agreed. Yet, allow me a tiny suggestion that could solve the stated problem at a small cost. It suffices to choose, then announce, a convention about a set of names which the Python distribution agrees never to use. It could be anything. Like, Python could guarantee that it will never ever install a standard module with a name starting with capital `W', say. If a user wants to make absolutely sure his/her module does not and will not conflict with a standard module, just prepend a `W' to its name. It is likely that people will rarely resort to this convention, but it will be there for the paranoid, and should be easy to support. Yet, it will not solve the paranoia of users against each other's package names. Had this been many years ago, the convention I would have preferred is that Python never uses any capital letter as the first letter of a module, but it seems to be a little late for this, and I'm not so sure of the benefit. :-) The most Python could say with some `from python import ...' or a `W' convention is that it gets itself out of the name fight between users; it does not participate in it. It does not really solve the problem, anyway. I guess you are right, in that whatever the direction taken, this thread is probably doomed to fall into various dead-ends. -- François Pinard http://www.iro.umontreal.ca/~pinard

[M.-A. Lemburg]
There is something to be solved here. Anecdote: I sucked an early version of Greg's textwrap.py module into my build directory. After he checked it in, I changed regrtest.py to use textwrap. This kept failing with baffling errors, until I realized I was still picking up an incompatible textwrap.py from the build directory. So I got rid of the latter. Somewhere in between, I synched my desktop and laptop machines and so got another copy on my laptop that way, which I didn't notice. When I got home and synched the laptop back to the desktop, it then restored the deleted textwrap.py to the desktop machine, and I got the same round of impossible errors all over again. I deleted it from the home machine again, but the next time I used my laptop to run the test suite got the impossible errors yet another time -- and had synched the machines again in the meantime so that it once again showed up on the desktop disk. So there's one use case <wink>.

This just shows that having the current directory on sys.path (especially at the front) causes problems. --Guido van Rossum (home page: http://www.python.org/~guido/)

[Guido]
This just shows that having the current directory on sys.path (especially at the front) causes problems.
I thought it showed I shouldn't be so careless when synching machines, but I'll take an excuse to blame Python instead <wink>. Still, it's something that would not have happened had I needed to prefix the import of the standard textwrap with a "standard" name -- or of my private textwrap with a "non-standard" name. Putting the current directory in sys.path is just too useful to give up. I suspect that putting it specifically at the front is only "a feature" for Python library developers, though, and "a bug" for others -- end users stumble into this a lot by unhappy accident, like when creating a random.py to hold their initial experiments with Python's random-number facilities.
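The "first directory on the path wins" behaviour behind both the textwrap.py and random.py stories can be shown without touching sys.path. This is a sketch for a current Python 3 interpreter; the module name textwrap_demo and the two temporary directories (one playing the current directory, one playing the library directory) are made up for the example.

```python
import os
import tempfile
from importlib.machinery import PathFinder

def origin_of(name, search_path):
    """Return the file that would be imported for name given this search order."""
    spec = PathFinder.find_spec(name, search_path)
    return None if spec is None else spec.origin

with tempfile.TemporaryDirectory() as current_dir, \
        tempfile.TemporaryDirectory() as library_dir:
    # The same module name exists in both places, as with a stray textwrap.py.
    for directory in (current_dir, library_dir):
        with open(os.path.join(directory, "textwrap_demo.py"), "w") as handle:
            handle.write("ORIGIN = %r\n" % directory)

    found_front = origin_of("textwrap_demo", [current_dir, library_dir])
    found_back = origin_of("textwrap_demo", [library_dir, current_dir])

    # Whichever directory is listed first supplies the module.
    front_wins = os.path.samefile(os.path.dirname(found_front), current_dir)
    back_wins = os.path.samefile(os.path.dirname(found_back), library_dir)

print(front_wins, back_wins)
```

With the current directory listed first, the local file shadows the library copy; listing it last reverses the outcome, which is the trade-off under discussion.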
participants (9)
- Brett Cannon
- Gordon McMillan
- Greg Ewing
- Guido van Rossum
- Hamish Lawson
- M.-A. Lemburg
- Michael McLay
- pinard@iro.umontreal.ca
- Tim Peters