[Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages

Mon Jan 7 21:56:23 CET 2008

On Jan 7, 2008 12:40 PM, Guido van Rossum <guido at python.org> wrote:
>
> On Jan 7, 2008 12:19 PM, Brett Cannon <brett at python.org> wrote:
> > On Jan 6, 2008 8:28 PM, Guido van Rossum <guido at python.org> wrote:
> > > On Jan 6, 2008 7:23 PM, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > > At 04:23 PM 1/6/2008 -0800, Guido van Rossum wrote:
> > > > >Regarding using common words, either the stdlib grabs these, or
> > > > >*nobody* gets to use them (for fear of conflicting with some other 3rd
> > > > >party package grabbing the same).
> > > >
> > > > This isn't quite true; a standalone Python application that isn't
> > > > extensible doesn't need to worry about this.  And it's standalone
> > > > apps that are most likely to claim these common words.  (For example,
> > > > until recently, Chandler's database library packages were in 'repository.*'.)
> > > >
> > > > But of course this is still a pretty minor point overall.  If the
> > > > stdlib does go for deeper nestings, I have a slight preference for
> > > > seeing them under std.* or some such rather than top level.  But I
> > > > don't really see a whole lot of point to doing a major re-org of the
> > > > stdlib space to begin with, for the simple reason that package names
> > > > are not really categories -- they're *names*.  IMO 'from databases
> > > > import sqlite' doesn't add any value over 'import pysqlite3' to begin with.
> > > >
> > > > Worse, it will likely be an attractive nuisance for people saying
> > > > "why don't we have databases.Oracle?" and suchlike.  And you still
> > > > have to remember the names, only now they're longer.  And was it
> > > > database or databases?  Greater uniqueness of names is just another
> > > > reason flat is better than nested.  :)
> > >
> > > Right. Packages are useful if it helps make the sub-names shorter. The
> > > email package is a good example: it uses lots of generic names like
> > > errors, charset, encoders, message, parser, utils, iterators.
> >
> > So only do the 'databases' package if we can change the module names
> > to make it worth it?  So whichdb becomes databases.which, dumbdbm
> > becomes databases.dumb, etc.?
>
> Bad example IMO; these are all about relatively simple "dict-on-disk"
> APIs, not about (relational) databases. I'd be +0 on things like dbm.gnu,
> dbm.any, dbm.dumb, dbm.which.
>

OK. So an html package could have htmllib for its __init__ (or
html.lib), and then have html.entities and html.parser for
htmlentitydefs and HTMLParser, respectively.  Another example is http:
http.lib, http.server.cgi, http.server.base, http.server.simple.

Both examples are grouped because they make sense, but primarily to
make the tail module name simpler.
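
A rough sketch of the "shorter tail name" test, using only module names
mentioned in this thread. The mapping is hypothetical (nothing here has
been decided), and PROPOSED_RENAMES, tail_name and shortened are names
made up for illustration:

```python
# Hypothetical old -> new names taken from the suggestions in this thread.
# This is an illustration of the naming argument, not an agreed plan.
PROPOSED_RENAMES = {
    "htmllib": "html.lib",
    "htmlentitydefs": "html.entities",
    "HTMLParser": "html.parser",
    "whichdb": "dbm.which",
    "dumbdbm": "dbm.dumb",
}


def tail_name(dotted_name):
    """Return the last component of a dotted module name."""
    return dotted_name.rpartition(".")[2]


def shortened(renames):
    """Return the old names whose proposed tail name is shorter."""
    return sorted(
        old for old, new in renames.items()
        if len(tail_name(new)) < len(old)
    )


if __name__ == "__main__":
    for old, new in PROPOSED_RENAMES.items():
        print(f"{old:>15} -> {new:<14} tail: {tail_name(new)}")
```

By this measure a module such as sqlite would gain nothing from a
package, since its tail name would be no shorter than the name it
already has.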

> > And then extend this to any other
> > package that we consider creating? Otherwise leave it out?  How would
> > that follow for sqlite since that is not going to get any shorter
> > thanks to a package?  Should it still go into the package for
> > organizational purposes?
>
> If you're asking me, the "organizational purpose" is 100% misguided.
>

Well, that will make the reorg simpler.  =)

> > In other words, should the stdlib reorg only introduce new packages if
> > the majority of modules that go into the package end up with a shorter
> > name?
>
> See what others say.

This will be interesting.

>
> Another reason to have a top-level package would be if there are a lot
> of subpackages/submodules. This applies to the xml package for
> example.

The only place I can see that coming into play is if there is any
desire to group OS-specific modules together.  Otherwise I think
Tk-specific stuff is the only place where this has not been done
before.

-Brett