Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages
At 04:23 PM 1/6/2008 -0800, Guido van Rossum wrote:
Regarding using common words, either the stdlib grabs these, or *nobody* gets to use them (for fear of conflicting with some other 3rd party package grabbing the same).
This isn't quite true; a standalone Python application that isn't extensible doesn't need to worry about this. And it's standalone apps that are most likely to claim these common words. (For example, until recently, Chandler's database library packages were in 'repository.*'.)
But of course this is still a pretty minor point overall. If the stdlib does go for deeper nestings, I have a slight preference for seeing them under std.* or some such rather than top level. But I don't really see a whole lot of point to doing a major re-org of the stdlib space to begin with, for the simple reason that package names are not really categories -- they're *names*. IMO 'from databases import sqlite' doesn't add any value over 'import pysqlite3' to begin with.
Worse, it will likely be an attractive nuisance for people saying "why don't we have databases.Oracle?" and suchlike. And you still have to remember the names, only now they're longer. And was it database or databases? Greater uniqueness of names is just another reason flat is better than nested. :)
On Jan 6, 2008 7:23 PM, Phillip J. Eby <pje@telecommunity.com> wrote:
This isn't quite true; a standalone Python application that isn't extensible doesn't need to worry about this. And it's standalone apps that are most likely to claim these common words. (For example, until recently, Chandler's database library packages were in 'repository.*'.)
But of course this is still a pretty minor point overall. If the stdlib does go for deeper nestings, I have a slight preference for seeing them under std.* or some such rather than top level. But I don't really see a whole lot of point to doing a major re-org of the stdlib space to begin with, for the simple reason that package names are not really categories -- they're *names*. IMO 'from databases import sqlite' doesn't add any value over 'import pysqlite3' to begin with.
Worse, it will likely be an attractive nuisance for people saying "why don't we have databases.Oracle?" and suchlike. And you still have to remember the names, only now they're longer. And was it database or databases? Greater uniqueness of names is just another reason flat is better than nested. :)
Right. Packages are useful if it helps make the sub-names shorter. The email package is a good example: it uses lots of generic names like errors, charset, encoders, message, parser, utils, iterators.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Jan 6, 2008 8:28 PM, Guido van Rossum <guido@python.org> wrote:
Right. Packages are useful if it helps make the sub-names shorter. The email package is a good example: it uses lots of generic names like errors, charset, encoders, message, parser, utils, iterators.
So only do the 'databases' package if we can change the module names to make it worth it? So whichdb becomes databases.which, dumbdbm becomes databases.dumb, etc.?
And then extend this to any other package that we consider creating? Otherwise leave it out? How would that follow for sqlite since that is not going to get any shorter thanks to a package? Should it still go into the package for organizational purposes?
In other words, should the stdlib reorg only introduce new packages if the majority of modules that go into the package end up with a shorter name?
-Brett
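The rule of thumb Brett asks about above (introduce a package only if most of its modules end up with a shorter name) can be written down as a small check. This is only an illustrative sketch: the helper names `tail` and `majority_shorter` are made up here, and the `sqlite3` entry is a hypothetical rename standing in for the sqlite case, not something proposed in the thread.

```python
def tail(dotted_name):
    """Return the last component of a dotted module name."""
    return dotted_name.rsplit(".", 1)[-1]


def majority_shorter(renames):
    """Return True if more than half of the renames shorten the tail name.

    `renames` maps an old flat module name to a proposed dotted name.
    """
    shorter = sum(
        1 for old, new in renames.items() if len(tail(new)) < len(old)
    )
    return shorter * 2 > len(renames)


# Renames floated in the thread (old flat name -> proposed dotted name).
dbm_renames = {
    "whichdb": "databases.which",
    "dumbdbm": "databases.dumb",
}

# Hypothetical rename for the sqlite case: the tail name does not shrink.
sqlite_rename = {"sqlite3": "databases.sqlite3"}

dbm_ok = majority_shorter(dbm_renames)
sqlite_ok = majority_shorter(sqlite_rename)
```

Under this reading the two dbm-style renames pass the test, while a package holding only sqlite would not, which is the distinction Brett is asking about.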
On Jan 7, 2008 12:19 PM, Brett Cannon <brett@python.org> wrote:
So only do the 'databases' package if we can change the module names to make it worth it? So whichdb becomes databases.which, dumbdbm becomes databases.dumb, etc.?
Bad example IMO; these are all about relatively simple "dict-on-disk" APIs, not about (relational) databases. I'd be +0 on things like dbm.gnu, dbm.any, dbm.dumb, dbm.which.
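For readers unfamiliar with the "dict-on-disk" style of API mentioned above, here is a minimal sketch. It assumes a Python 3 interpreter, where the standard library ships a `dbm` package with a `dbm.dumb` submodule and a `dbm.whichdb()` function; the thread itself only floats names such as dbm.dumb and dbm.which, so the exact spellings used here are not taken from the discussion. The file name `example` is arbitrary.

```python
import dbm
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example")

    # Create a database with the pure-Python backend and use it like a dict.
    db = dbm.dumb.open(path, "c")
    db[b"greeting"] = b"hello"
    db.close()

    # Ask which backend produced the files on disk.
    detected = dbm.whichdb(path)

    # The generic opener picks the matching backend for an existing database.
    with dbm.open(path, "r") as reopened:
        value = reopened[b"greeting"]
```

The short submodule name only makes sense because the package name already says "dbm", which is the argument being made for this particular grouping.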
And then extend this to any other package that we consider creating? Otherwise leave it out? How would that follow for sqlite since that is not going to get any shorter thanks to a package? Should it still go into the package for organizational purposes?
If you're asking me, the "organizational purpose" is 100% misguided.
In other words, should the stdlib reorg only introduce new packages if the majority of modules that go into the package end up with a shorter name?
See what others say.
Another reason to have a top-level package would be if there are a lot of subpackages/submodules. This applies to the xml package for example.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Jan 7, 2008 12:40 PM, Guido van Rossum <guido@python.org> wrote:
On Jan 7, 2008 12:19 PM, Brett Cannon <brett@python.org> wrote:
So only do the 'databases' package if we can change the module names to make it worth it? So whichdb becomes databases.which, dumbdbm becomes databases.dumb, etc.?
Bad example IMO; these are all about relatively simple "dict-on-disk" APIs, not about (relational) databases. I'd be +0 on things like dbm.gnu, dbm.any, dbm.dumb, dbm.which.
OK. So an html package could have htmllib for its __init__ (or html.lib), and then have html.entities and html.parser for htmlentitydefs and HTMLParser, respectively. Another example is http: http.lib, http.server.cgi, http.server.base, http.server.simple. Both examples are grouped because they make sense, but primarily to make the tail module name simpler.
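As a sketch of what the shorter tail names look like in use: the snippet below assumes a Python 3 interpreter, where `html.entities` and `html.parser` exist as standard library modules. The message above only proposes such names, so treat the spellings as an assumption about the interpreter rather than as part of the proposal. The `TagCollector` class is a made-up helper for illustration.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed("<p>Fish &amp; <b>chips</b></p>")
collector.close()

# Look up a named entity in the table that used to live in htmlentitydefs.
amp = chr(name2codepoint["amp"])
```

The tail names `entities` and `parser` are generic on their own and only become meaningful under the `html` prefix, which is the same pattern Guido pointed out for the email package.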
And then extend this to any other package that we consider creating? Otherwise leave it out? How would that follow for sqlite since that is not going to get any shorter thanks to a package? Should it still go into the package for organizational purposes?
If you're asking me, the "organizational purpose" is 100% misguided.
Well that will make the reorg simpler. =)
In other words, should the stdlib reorg only introduce new packages if the majority of modules that go into the package end up with a shorter name?
See what others say.
This will be interesting.
Another reason to have a top-level package would be if there are a lot of subpackages/submodules. This applies to the xml package for example.
The only place I can see that coming into play is if there is any desire to group OS-specific modules together. Otherwise I think Tk-specific stuff is the only place where this has not been done before.
-Brett
On Jan 7, 2008, at 3:56 PM, Brett Cannon wrote:
OK. So an html package could have htmllib for its __init__ (or html.lib), and then have html.entities and html.parser for htmlentitydefs and HTMLParser, respectively.
Actually, I'd be inclined not to have both HTMLParser and htmllib (regardless of names); a single capable interface should be provided. But that's a separate discussion.
-Fred
--
Fred Drake <fdrake at acm.org>
On Jan 7, 2008 12:56 PM, Brett Cannon <brett@python.org> wrote:
OK. So an html package could have htmllib for its __init__ (or html.lib), and then have html.entities and html.parser for htmlentitydefs and HTMLParser, respectively.
I'd be very reluctant to have more "asymmetric" packages like os where the package contains functionality at the top level as well as submodules, because it means that anyone using one of the submodules will pay the price of importing the top-level package. In this example, I can easily see someone using htmlentitydefs without needing htmllib.
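The cost described above follows from how Python imports submodules: importing `pkg.sub` first imports `pkg`, which runs the package's `__init__`. The following sketch demonstrates this with a throwaway package written to a temporary directory. The package name `asym_demo` and its contents are hypothetical, chosen only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "asym_demo")  # hypothetical package name
    os.mkdir(pkg_dir)

    # The top level stands in for a package with real functionality in __init__.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("LOADED = True  # placeholder for heavyweight top-level code\n")

    # A small submodule that does not need anything from the top level.
    with open(os.path.join(pkg_dir, "entities.py"), "w") as f:
        f.write("TABLE = {'amp': 38}\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        entities = importlib.import_module("asym_demo.entities")

        # Only the submodule was requested, yet the parent package ran too.
        parent_loaded = "asym_demo" in sys.modules
        parent_flag = sys.modules["asym_demo"].LOADED
    finally:
        sys.path.remove(tmp)
```

If `__init__` held the equivalent of htmllib, every user of the entity table would pay for loading it, which is why an empty top level with separate submodules avoids the problem.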
Another example is http: http.lib, http.server.cgi, http.server.base, http.server.simple.
That sounds like a good one.
Both examples are grouped because they make sense, but primarily to make the tail module name simpler.
[...]
Another reason to have a top-level package would be if there are a lot of subpackages/submodules. This applies to the xml package for example.
The only place I can see that coming into play is if there is any desire to group OS-specific modules together. Otherwise I think Tk-specific stuff is the only place where this has not been done before.
Well, that's a little different -- plat-* and lib-tk are not subpackages but subdirectories. For the plat-* subdirs, this is done so that the same logical module name can have different implementations per platform. For lib-tk it was done to make it easier to create a distribution that didn't include any Tk stuff. But the test package structure doesn't follow this lead, and I'm not sure if it still makes sense for lib-tk. OTOH maybe lib-tk could be promoted to being a real package (named tkinter?), with the core tkinter functionality in __init__ and the rest in submodules with names conforming to PEP 8; this is one example where that type of organization actually makes sense.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Jan 7, 2008 3:47 PM, Guido van Rossum <guido@python.org> wrote:
On Jan 7, 2008 12:56 PM, Brett Cannon <brett@python.org> wrote:
OK. So an html package could have htmllib for its __init__ (or html.lib), and then have html.entities and html.parser for htmlentitydefs and HTMLParser, respectively.
I'd be very reluctant to have more "asymmetric" packages like os where the package contains functionality at the top level as well as submodules, because it means that anyone using one of the submodules will pay the price of importing the top-level package. In this example, I can easily see someone using htmlentitydefs without needing htmllib.
Fair enough. Then something like html.lib or html.tools could be had for miscellaneous code.
Another example is http: http.lib, http.server.cgi, http.server.base, http.server.simple.
That sounds like a good one.
Great! I think I get what you are after then for the reorg in terms of any new packages coming about.
Both examples are grouped because they make sense, but primarily to make the tail module name simpler.
[...]
Another reason to have a top-level package would be if there are a lot of subpackages/submodules. This applies to the xml package for example.
The only place I can see that coming into play is if there is any desire to group OS-specific modules together. Otherwise I think Tk-specific stuff is the only place where this has not been done before.
Well, that's a little different -- plat-* and lib-tk are not subpackages but subdirectories. For the plat-* subdirs, this is done so that the same logical module name can have different implementations per platform. For lib-tk it was done to make it easier to create a distribution that didn't include any Tk stuff. But the test package structure doesn't follow this lead, and I'm not sure if it still makes sense for lib-tk. OTOH maybe lib-tk could be promoted to being a real package (named tkinter?), with the core tkinter functionality in __init__ and the rest in submodules with names conforming to PEP 8; this is one example where that type of organization actually makes sense.
If the platform-specific stuff is made into their own packages (e.g., unix, mac, win, tkinter, etc.) then this can apply generically across the stdlib (sans Modules, of course, unless we eventually change how we handle building extension modules such that they are kept in /Lib as well). I think that would be nice in terms of organization of the code and the documentation as it makes it more obvious which modules are platform-specific. Does applying this to OS-specific modules work for you like it does for the tkinter stuff?
-Brett
participants (4): Brett Cannon, Fred Drake, Guido van Rossum, Phillip J. Eby