Re: [Distutils] Freeze and new import architecture
At 18:15 17/12/98 -0800, Greg Stein wrote:
... sys.path not be restricted to path names. sys.path has "strings", and an associated map of "module finders". Thus, a sys.path entry could have a directory name (like now) or .zip file, URL, etc.
I would much prefer to see the module finder instances in the sys.path.
I agree. But that would be a compatibility problem?
Sometimes, it is *very* difficult to map strings to module finders.
Yes. I think this is a serious weakness of the proposal.
1. Finding the module in a specific namespace. 2. Importing a module of a specific type, once it has been found.
I think the separation is bogus.
I don't: I'd _like_ to add two things, as a client of the system:
1) Add a new type. For example, allow .c files to be loaded, by compiling them first.
2) Add a new kind of namespace. For example, an FTP server. Or a hook to Trove. Or a private data structure I designed myself.
(1) has to do with what kinds of things are loaded, whereas (2) has to do with where they are.
Note that the finder must be able to _fetch_ the data to a place that the loader can load it from. If a single place is enough, then the two features are orthogonal, and thus each is separately amenable to Object Oriented development.
So .. it is desirable to build an abstraction in which the functionality is separate.
Regarding 2: the finder currently returns a structure that enables the correct importer to be called later on. Importers that we have are for builtin, frozen, .py/.pyc/.pyo modules, various dll-importers all hiding behind the same interface, PYC-resource importers (mac-only) and PYD-resource importers (mac-only).
Punt this. Just import the dumb thing in one shot.
But you can't: a .dll file is imported by saying dlopen(), whereas a .py file is imported by compiling it to a .pyc file which is then imported. Etc. 'One shot' implies a single function which is not extensible.
Take the example of an HTTP-based import. Separating that into *two* transactions would be painful. It should be imported in one fell swoop. And no, you can't just keep the socket open and pass that to the loader -- that implies that you can defer the passing for a while, but the web server will time out your connection and close it. Conversely, if the intent is *not* to hold the "structure" for a while, then why the heck have two pieces?
The way I see it, the _finder_ is responsible for downloading the file to the local file system, where the loader requires it to be. The loader turns these raw bits into a module.
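The split described here can be sketched roughly as follows, in present-day Python. This is only an illustration of the idea, not code from either proposal: `DictFinder`, `fetch`, `load_py`, `LOADERS` and `two_step_import` are all hypothetical names, and the "private data structure" namespace is modelled as a plain dict of file contents. The finder only knows where things are and copies them to a local file; the loader only knows what to do with a file of a given type.

```python
import os
import tempfile
import types


class DictFinder:
    """Hypothetical finder for a 'private data structure' namespace.

    It knows only *where* modules live. It copies the raw bytes to a
    local file and returns that file's path, or None if nothing matches.
    """

    def __init__(self, files, workdir):
        self.files = files      # maps file name -> bytes
        self.workdir = workdir  # local directory the loaders read from

    def fetch(self, name):
        for filename, data in self.files.items():
            if os.path.splitext(filename)[0] == name:
                local_path = os.path.join(self.workdir, filename)
                with open(local_path, "wb") as f:
                    f.write(data)
                return local_path
        return None


def load_py(name, path):
    """Loader for .py files: compile the source and run it in a new module."""
    with open(path, "rb") as f:
        source = f.read()
    module = types.ModuleType(name)
    module.__file__ = path
    exec(compile(source, path, "exec"), module.__dict__)
    return module


# Loaders are keyed by file type; they know *what* is loaded, not where from.
LOADERS = {".py": load_py}


def two_step_import(name, finders):
    for finder in finders:
        path = finder.fetch(name)          # step 1: find and fetch
        if path is None:
            continue
        loader = LOADERS.get(os.path.splitext(path)[1])
        if loader is not None:
            return loader(name, path)      # step 2: load by type
    raise ImportError("no finder/loader pair could import %r" % name)


workdir = tempfile.mkdtemp()
finder = DictFinder({"greeting.py": b"message = 'hello'\n"}, workdir)
module = two_step_import("greeting", [finder])
print(module.message)
```

Adding a new namespace means writing another finder with a `fetch` method; adding a new type means registering another entry in `LOADERS`. Neither change touches the other, which is the orthogonality being argued for.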
Both of our proposals mean that the entries in sys.path are not necessarily pathnames. If I insert a "foo.zip" or a "http://host.domain.name/pymodules/", then you certainly don't have pathnames.
I believe the biggest issue with my proposal is the fact that the values are no longer strings.
That's easy to fix: have a default 'finder' that is used if the sys.path entry is a string.
-------------------------------------------------------
John Skaller
email: skaller@maxtal.com.au
http://www.maxtal.com.au/~skaller
phone: 61-2-96600850
snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia
I think you may have misunderstood some of my points, so I'll try to clarify below.
John Skaller wrote:
At 18:15 17/12/98 -0800, Greg Stein wrote:
... sys.path not be restricted to path names. sys.path has "strings", and an associated map of "module finders". Thus, a sys.path entry could have a directory name (like now) or .zip file, URL, etc.
I would much prefer to see the module finder instances in the sys.path.
I agree. But that would be a compatibility problem?
Nope. I pointed out that a default installation would only have strings in there. A person would not have a compatibility problem until they chose to start using a special importer. In that case, I don't define it as a problem because they chose to do so. Of course, we would need to update various standard modules to improve their handling of sys.path. But again, that is not a compatibility issue.
...
1. Finding the module in a specific namespace. 2. Importing a module of a specific type, once it has been found.
I think the separation is bogus.
I don't: I'd _like_ to add two things, as a client of the system:
1) Add a new type. For example, allow .c files to be loaded, by compiling them first.
2) Add a new kind of namespace. For example, an FTP server. Or a hook to Trove. Or a private data structure I designed myself.
(1) has to do with what kinds of things are loaded, whereas (2) has to do with where they are.
Yes, it would be nice to do those things. However, I don't see that you need the separate find/import paradigm to do it. If your particular importer that you've placed into sys.path wants to use two steps, then fine. If your importers share functionality using those steps, then kudos to them.
Note that Python's __import__ hook is a single step(!). The two part find/load scheme is an artifact of ihooks, not Python itself. And I disagree that we need to formalize it within Python itself.
I simply maintain that we should have a very simple interface from Python to any import system. In summary, that is placing importers into sys.path and invoking a "do_import" method on them. Simple and clean.
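A rough sketch of that single-step interface, written in present-day Python, might look like the following. It is an illustration under stated assumptions rather than the code from the proposal: only the names `do_import` and `old_import` come from this thread, while `DictImporter`, `import_via_path`, the `do_import(name)` signature and the body of `old_import` are hypothetical. The sketch walks an ordinary list instead of hooking into the interpreter's real sys.path.

```python
import os
import types


def old_import(directory, name):
    """Stand-in for the existing string-based mechanism (hypothetical body).

    Looks for <name>.py in one directory; returns a module or None.
    """
    filename = os.path.join(directory, name + ".py")
    if not os.path.isfile(filename):
        return None
    with open(filename, "rb") as f:
        source = f.read()
    module = types.ModuleType(name)
    module.__file__ = filename
    exec(compile(source, filename, "exec"), module.__dict__)
    return module


class DictImporter:
    """Hypothetical importer object that could sit in the path list."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def do_import(self, name):
        source = self.sources.get(name)
        if source is None:
            return None
        module = types.ModuleType(name)
        exec(compile(source, "<dict:%s>" % name, "exec"), module.__dict__)
        return module


def import_via_path(name, path):
    """Walk the path once; each entry imports in a single step."""
    for entry in path:
        if isinstance(entry, str):
            module = old_import(entry, name)   # plain string: old behaviour
        else:
            module = entry.do_import(name)     # importer object: one call
        if module is not None:
            return module
    raise ImportError("no path entry could import %r" % name)
```

A path such as `["/some/dir", DictImporter({"inmem": "value = 2\n"})]` mixes both kinds of entry. An installation that only ever holds strings never reaches the `do_import` branch, which is the compatibility argument made above.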
Note that the finder must be able to _fetch_ the data to a place that the loader can load it from. If a single place is enough, then the two features are orthogonal, and thus each is separately amenable to Object Oriented development.
So .. it is desirable to build an abstraction in which the functionality is separate.
You can build your importers this way, but I don't think we need to place that mechanism in Python. Personally, I have issues with the style of "put it into a temporary location, then import it". It seems subject to race conditions and/or /tmp hacks.
Regarding 2: the finder currently returns a structure that enables the correct importer to be called later on. Importers that we have are for builtin, frozen, .py/.pyc/.pyo modules, various dll-importers all hiding behind the same interface, PYC-resource importers (mac-only) and PYD-resource importers (mac-only).
Punt this. Just import the dumb thing in one shot.
But you can't: a .dll file is imported by saying dlopen(), whereas a .py file is imported by compiling it to a .pyc file which is then imported. Etc.
Sorry, I meant "punt the whole structure thing". In my little corner of the universe, I don't believe we have two steps, so we don't need to formalize any mechanism for passing state between them. Basically, I see the state thing as a compensation for introducing the two-step find/load into Python's single-step import mechanism.
'One shot' implies a single function which is not extensible.
This is just argumentative. My proposal is just as extensible, and I would maintain that it is simpler for the interpreter, and simpler for many importers (rather than import-writers needing to deal with the funky two-step).
Take the example of an HTTP-based import. Separating that into *two* ...
The way I see it, the _finder_ is responsible for downloading the file to the local file system, where the loader requires it to be. The loader turns these raw bits into a module.
As I mentioned before, I (personally) don't like this style. I'd rather write an importer that loads it straight in from the wire. If it hits the disk, then it would be *very* transitory. The two-step thing that returns state structures implies an indeterminate time between those steps, which I think is wrong.
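To make the "straight in from the wire" style concrete, here is a minimal sketch in present-day Python. It assumes a hypothetical `URLImporter` class with a `do_import(name)` method and a hypothetical convention that module `name` lives at `<base_url>/<name>.py`; none of this is from the proposal itself. The fetch and the module creation happen inside one call, so no state structure is handed back and nothing is written to disk. The `fetch` parameter is there only so the network call can be swapped out.

```python
import types
import urllib.request


def fetch_url(url):
    """Read the whole response body in a single request."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class URLImporter:
    """Hypothetical one-shot importer for modules served over HTTP."""

    def __init__(self, base_url, fetch=fetch_url):
        self.base_url = base_url.rstrip("/") + "/"
        self.fetch = fetch  # callable: url -> bytes (replaceable for testing)

    def do_import(self, name):
        url = self.base_url + name + ".py"   # assumed URL layout
        try:
            source = self.fetch(url)         # one transaction
        except OSError:                      # urllib's URLError is an OSError
            return None
        module = types.ModuleType(name)
        module.__file__ = url
        # Compile and execute in memory; the bytes never touch the disk.
        exec(compile(source, url, "exec"), module.__dict__)
        return module
```

An instance such as `URLImporter("http://host.domain.name/pymodules/")` could then be placed in the path alongside ordinary directory strings. Note that this executes whatever the server returns, so it is only sensible for a source that is trusted.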
Both of our proposals mean that the entries in sys.path are not necessarily pathnames. If I insert a "foo.zip" or a "http://host.domain.name/pymodules/", then you certainly don't have pathnames.
I believe the biggest issue with my proposal is the fact that the values are no longer strings.
That's easy to fix: have a default 'finder' that is used if the sys.path entry is a string.
I meant "values are no longer [only] strings". Please review my proposal again; you'll note in the path processing that I tested for a string and call "old_import" to import things using Python's current string-based mechanism.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
Participants (2): Greg Stein, John Skaller