[Python-3000] Changing the import machinery; was: Re: [Python-Dev] setuptools in 2.5.

Thu Apr 20 15:06:34 CEST 2006

Sorry, there's so much here that seems poorly thought out that I don't
know where to start.

Getting rid of the existing import syntax in favor of the incredibly
verbose and ugly

  foo = import("foo")

just isn't acceptable.

Importing from remote URLs is a non-starter from a security POV; and
using HTTPS would be too slow. For code that's known to reside
remotely, a better approach is to use setuptools to install that code
once and for all.

How would a module know its own name? How do you deal with packages
(importing a module from a package implies importing/loading the
package's __init__.py)?

I suggest that instead of answering these questions from the
perspective of the solution you're offering here, you tell us a bit
more about the use cases that make you think of this solution. What
are you trying to do? Why are you importing code from a specific file
instead of configuring sys.path so the file will be found naturally? I
suspect that your module-from-a-database use case is actually intended
as a building block for an import hook.
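
A minimal sketch of what such an import hook could look like, assuming
the finder/loader protocol of today's importlib (which postdates this
thread). SOURCES, DictLoader, DictFinder and the "<db:...>" origin
string are hypothetical names, and a plain dict stands in for the
database. The import statement is left unchanged, the module learns
its name in the usual way, and the code is compiled under a
recognisable filename rather than "<string>".

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for a database table: module name -> source text.
SOURCES = {
    "dbmodule": "cache = 1000\ncolor = True\n\ndef limit():\n    return cache\n",
}


class DictLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        source = SOURCES[module.__name__]
        # Compile under the spec's origin so tracebacks report it
        # (with line numbers) instead of "<string>".
        code = compile(source, module.__spec__.origin, "exec")
        exec(code, module.__dict__)


class DictFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # defer to the regular finders
        return importlib.util.spec_from_loader(
            fullname, DictLoader(), origin="<db:%s>" % fullname)


sys.meta_path.insert(0, DictFinder())

import dbmodule  # ordinary import statement, served by the hook

print(dbmodule.__name__, dbmodule.cache, dbmodule.color)
print(dbmodule.limit.__code__.co_filename)
```

Because the module is stored in sys.modules under its ordinary name, a
second "import dbmodule" anywhere else gets the same object.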

I think we ought to redesign the import machinery in such a way that
you could do things like importing from a specific file or from a
database, but without changing the import statement syntax -- instead,
we should change what import actually *does*. While we're at it, we
should also fix the silliness that __import__("foo.bar") imports
foo.bar but returns foo.
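
The __import__ behaviour can be seen with any dotted name; xml.dom from
the standard library is used here purely as an example. For
comparison, the sketch also calls importlib.import_module, a later
addition to the standard library that returns the submodule itself.

```python
import importlib

# __import__ loads xml.dom but hands back the top-level package xml.
top = __import__("xml.dom")

# importlib.import_module returns the module that was actually named.
leaf = importlib.import_module("xml.dom")

print(top.__name__)
print(leaf.__name__)
print(top.dom is leaf)
```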

--Guido

On 4/20/06, Walter Dörwald <walter at livinglogic.de> wrote:
> Guido van Rossum wrote:
>
> > On 4/20/06, Walter Dörwald <walter at livinglogic.de> wrote:
> >> I'd like to be able to simply import a file if I have the filename. And
> >> I'd like to be able to load sourcecode from an Oracle database and have
> >> useful "filename" and line number information if I get a traceback.
> >
> > Great use cases. Could I ask you to elaborate these in the Python-3000
> > list? It would be very useful if you attempted to specify what
> > (approximately) the API for these would look like and how it would
> > work (e.g. the immediate question with import from a file or URL is
> > what happens if another module imports the same thing).
>
> Maybe import should become a function (maybe even a generic function ;))
> so we can do:
>
> cStringIO = import(url("file:/usr/local/lib/python2.5/cStringIO"))
>
> import cx_Oracle
> db = cx_Oracle.connect("...")
>
> mymodule = import(oraclesource(db=db, query="select source from modules
> where name='mymodule'"))
>
> code = """
> cache = 1000
> color = True
> """
>
> options = import(literal(code))
>
> Simple imports could look like this:
> urllib2 = import("urllib2")
>
> (i.e. pass the module name as a string). Unfortunately this means that
> the module name has to be specified twice.
>
> All objects passed to import() should be usable as dictionary keys, so
> that they can be stored as keys in sys.modules which would continue to
> be the module cache.
>
> In a traceback the repr() of this object should be displayed (maybe with
> the exception of import("urllib2") which should display the real filename).
>
> So far this means that we would have to get rid of relative imports.
> Another option would be to make a relative import be the responsibility
> of the module to which the import is relative.
>
> Even better would be if import() would do some kind of dependency
> tracking during import for modules that are recursively imported during
> import of the original module. Then
>
> mymodule = reload(oraclesource(db=db, query="select source from modules
> where name='mymodule'"))
>
> could do the right thing (i.e. import the module if it's not in
> sys.modules or it has changed since the last import or one of the
> modules it uses has changed; otherwise use the cached module). This
> would mean that each import resource would have to provide some kind of
> cookie (a checksum or a timestamp), so that it's possible to detect if
> the source has changed (and of course it needs a method to return the
> real source code string).
>
> I've implemented something like this once, but abandoned the idea
> because tracebacks simply have "<string>" as the filename, and that
> makes debugging a PITA. Anyway the source code for this is here:
>
> http://styx.livinglogic.de/~walter/pythonimport/resload.py
>
> Servus,
>    Walter
>
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)