[Python-Dev] zipimport.c broken with implicit namespace packages

Sun Jan 3 00:29:52 EST 2016

>>>>> " " == Brett Cannon <brett at python.org> writes:

     > I just wanted to quickly say that Guido's observation as to how
     > a VFS is overkill is right. Imagine implementing a loader using
     > sqlite and you quickly realize that doing a dull VFS is more
     > than necessary to implement what import needs to function.

  I fear I've made a poor choice in calling this abstract class a VFS
(I'm terrible with names).  I'm not thinking of anything along the
lines of a full file system that supports open(), seek(), read() and
everything else.   That for sure would be overkill and way more
complicated than it needs to be.

  All I'm really thinking about is a simple abstract interface that is
used by an importer to actually locate and retrieve the binary objects
that will be loaded.  For the simple case I think just two methods
would/could server a read only "blob/byte database":

  listdir(path)  # returns an iterable container of "files"/"dirs" found
                 # at path

  get(path)      # returns a bytes object for the given path

  I think with those two virtual calls a more high level import layer
can locate and retrieve modules to be loaded without even knowing
where they came from.

  The higher level would know about things such as the difference
between .py and .pyc "files" or the possible existance of __pycache__
directories and what may be found in them.  Right now the zipimporter
contains a list of file extensions to try and load (and in what
order).  It also lacks any knowledge of __pycache__ directories (which
is one of the outstanding bugs with it).   It just seems to me that
this sorta logic would be better moved to a higher layer and the zip
layer just translates paths into reads of byte blobs.

  I mentioned write()/put() for two reasons:

  1)  When I import a .py file then a .pyc file is created on my
      filesystem.  I don't really know what piece of code created it.
      But a write to the filesystem (assuming it is writeable and
      permissions set etc) occurs.   It might be nice for other
      storage systems (zip, sql, etc) could optionally support this.
      They could if the code that crated the .pyc simply did a put()
      to the object that pulled in the .py file.  The interface is
      expanded by two calls (put() and delete()).

  2)  Integration with package data.  I know there are
      modules/packages out there that help a module try and locate
      data files that may be associated with a package.  I think it
      would be kinda cool for a module to instead be able to get a
      handle to the abstract class that loaded it.  it could then use
      the same listdir() get() and possibly write methods the importer
      did.  The writing bit of this may or may not be a good idea :)

  Anyway, hope I did not muddy the waters.  I was just thinking a bit
out loud and none of this may live past my own experiments.   I was/am
just trying to think of why the importers like the zipimporter don't
work like a filesystem importer and how they would be cleaner if they
just dealt with paths and byte blobs to store/get based on those paths.

Mike