<br><br><div><span class="gmail_quote">On 9/27/06, <b class="gmail_sendername">Phillip J. Eby</b> &lt;<a href="mailto:pje@telecommunity.com">pje@telecommunity.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:<br>&gt;But it has been suggested here that the import machinery be rewritten in<br>&gt;Python.&nbsp;&nbsp;Now I have never touched the import code since it has always had<br>&gt;the reputation of being less than friendly to work with.&nbsp;&nbsp;I am asking for

<br>&gt;opinions from people who have worked with the import machinery before if<br>&gt;it is so bad that it is worth trying to re-implement the import semantics<br>&gt;in pure Python or if in the name of time to just work with the C

<br>&gt;code.&nbsp;&nbsp;Basically I will end up breaking up built-in, .py, .pyc, and<br>&gt;extension modules into individual importers and then have a chaining class<br>&gt;to act as a combined .pyc/.py combination importer (this will also make

<br>&gt;writing out to .pyc files an optional step of the .py import).<br><br>The problem you would run into here would be supporting zip imports.</blockquote><div><br>I have not looked at zipimport so I don't know the exact issue in terms of how it hooks into the import machinery.&nbsp; But a C level API will most likely be needed.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">&nbsp;&nbsp;It<br>would probably be more useful to have a mapping of file types to &quot;format

<br>handlers&quot;, because then a filesystem importer or zip importer would then be<br>able to work with any .py/.pyc/.pyo/whatever formats, along with any new<br>ones that are invented, without reinventing the wheel.</blockquote>

<div><br>So you are saying the zipimporter would then pull out of the zip file the individual file to import and pass that to the format-specific importer?<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Thus, whether it's file import, zip import, web import, or whatever, the<br>same handlers would be reusable, and when people invent new extensions like<br>.ptl, .kid, etc., they can just register format handlers instead.</blockquote>
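<br>To check that I am reading the "format handlers" idea correctly, here is a rough sketch of such a registry. Every name in it (FORMAT_HANDLERS, register_format, load_from_store, dict_get_data) is made up for illustration and is not part of PEP 302. The only thing assumed of the data store is a get_data(path) callable that raises OSError for a missing path.

```python
import types

# Hypothetical registry: file extension -> callable that builds a module.
FORMAT_HANDLERS = {}


def register_format(extension, handler):
    """Associate an extension such as '.py' with a format handler."""
    FORMAT_HANDLERS[extension] = handler


def load_python_source(fullname, data, origin):
    """Format handler for '.py': build a module from a source string."""
    module = types.ModuleType(fullname)
    module.__file__ = origin
    exec(compile(data, origin, "exec"), module.__dict__)
    return module


register_format(".py", load_python_source)


def load_from_store(fullname, get_data):
    """Probe the store once per registered extension, in registration order."""
    base = fullname.split(".")[-1]
    for extension, handler in FORMAT_HANDLERS.items():
        path = base + extension
        try:
            data = get_data(path)
        except OSError:
            continue  # this format is not present in the store
        return handler(fullname, data, path)
    raise ImportError("no registered format found for %r" % fullname)


def dict_get_data(files):
    """Stand-in data store: a get_data() backed by an in-memory dict."""
    def get_data(path):
        try:
            return files[path]
        except KeyError:
            raise OSError(path)
    return get_data
```

The same load_from_store() would work unchanged with a get_data() that reads from a zip file or a URL, which is the reuse being argued for.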

<div><br>So a separation of data store from data interpretation for importation.&nbsp; My only worry is a possible explosion of checks for the various data types.&nbsp; If you are using the file data store and had .py, .pyc, .so, module.so, .ptl, and .kid registered, that could mean a noticeable performance hit, since a failed lookup would probe the store once per registered type.&nbsp; And I am assuming that a web import would decide based on the extension of the resulting web address?&nbsp; And checking for the various types might not work well for other data store types.&nbsp; I guess you would need a way to register with the data store exactly which types of data interpretation you want it to check for.

<br><br>The other option is to just have the data store do its magic and somehow know what kind of data interpretation is needed for the string returned (e.g., a database data store might implicitly only store .py code and thus know that it will only return a string of source).&nbsp; Then that string and the supposed file extension are passed to the next step of creating a module from that data string.
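<br><br>A minimal sketch of that second option, assuming each data store declares up front which formats it can hold. DictStore, its extensions attribute, the lookups counter and find_data are all hypothetical names, used here only to make the cost of probing visible.

```python
class DictStore:
    """Hypothetical data store; 'extensions' lists the formats it may hold."""

    def __init__(self, files, extensions):
        self.files = files
        self.extensions = tuple(extensions)
        self.lookups = 0  # counts probes so the cost can be inspected

    def get_data(self, path):
        self.lookups += 1
        try:
            return self.files[path]
        except KeyError:
            raise OSError(path)


def find_data(store, fullname):
    """Return (data, extension) for the first declared format that exists."""
    base = fullname.split(".")[-1]
    for extension in store.extensions:
        try:
            return store.get_data(base + extension), extension
        except OSError:
            continue
    return None  # nothing in this store for that module
```

A database-style store that only declares ".py" costs one probe per lookup, while a store declaring five formats can cost up to five, which is the performance worry in concrete terms.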

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Format handlers could of course be based on the PEP 302 protocol, and<br>simply accept a &quot;parent importer&quot; with a get_data() method.&nbsp;&nbsp;So, let's say

<br>you have a PyImporter:<br><br>&nbsp;&nbsp;&nbsp;&nbsp; class PyImporter:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; def __init__(self, parent_importer):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; self.parent = parent_importer<br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; def find_module(self, fullname):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; path = fullname.split('.')[-1]+'.py'<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; source = self.parent.get_data(path)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; except IOError:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return None<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return PySourceLoader(source)

<br><br>See what I mean?&nbsp;&nbsp;The importers and loaders thus don't have to do direct<br>filesystem operations.</blockquote><div><br>I think so.&nbsp; Basically you want a way to stack importers so that the basic importers are just handed the string they are supposed to load from.&nbsp; Other importers higher in the chain can handle getting that string.
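<br><br>To convince myself the stacking works, here is the PyImporter from the quoted message made runnable. PyImporter itself is taken as given; DictImporter and the body of PySourceLoader are my own guesses at what the missing pieces might look like. The objects are called directly rather than installed as import hooks.

```python
import sys
import types


class DictImporter:
    """Hypothetical parent importer: serves data from an in-memory dict."""

    def __init__(self, files):
        self.files = files

    def get_data(self, path):
        try:
            return self.files[path]
        except KeyError:
            raise IOError(path)


class PySourceLoader:
    """Guess at the loader: turns a source string into a module."""

    def __init__(self, source):
        self.source = source

    def load_module(self, fullname):
        # PEP 302 asks loaders to reuse an existing sys.modules entry.
        if fullname in sys.modules:
            return sys.modules[fullname]
        module = types.ModuleType(fullname)
        module.__loader__ = self
        sys.modules[fullname] = module
        try:
            code = compile(self.source, "PySourceLoader:" + fullname, "exec")
            exec(code, module.__dict__)
        except BaseException:
            del sys.modules[fullname]  # do not leave a half-built module
            raise
        return module


class PyImporter:
    def __init__(self, parent_importer):
        self.parent = parent_importer

    def find_module(self, fullname):
        path = fullname.split('.')[-1] + '.py'
        try:
            source = self.parent.get_data(path)
        except IOError:
            return None
        else:
            return PySourceLoader(source)
```

Swapping DictImporter for anything else with a get_data() method (a zip archive, a web fetcher) leaves PyImporter and PySourceLoader untouched.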

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Of course, to fully support .pyc timestamp checking and writeback, you'd<br>need some sort of &quot;stat&quot; or &quot;getmtime&quot; feature on the parent importer, as

<br>well as perhaps an optional &quot;save_data&quot; method.&nbsp;&nbsp;These would be extensions<br>to PEP 302, but welcome ones.</blockquote><div><br>You could instead pass along a string representing the location the data came from.&nbsp; That would allow for the required stat calls for .pyc files as needed without having to implement methods just for this one use case.
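<br><br>Roughly what I mean, as a sketch only: if the loader is handed location strings, it can do the stat calls itself. The function name bytecode_is_stale is hypothetical, and comparing the two files' mtimes is a simplification; the real .pyc check compares the source mtime against the timestamp recorded in the .pyc header. It also assumes the locations are filesystem paths, which is exactly the limitation of this approach for other data stores.

```python
import os


def bytecode_is_stale(source_location, bytecode_location):
    """Return True when the bytecode file is missing or older than its source.

    Both arguments are plain filesystem paths in this sketch.
    """
    try:
        bytecode_mtime = os.stat(bytecode_location).st_mtime
    except OSError:
        return True  # no bytecode yet, so it has to be generated
    return os.stat(source_location).st_mtime > bytecode_mtime
```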

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Anyway, based on my previous work with pkg_resource, pkgutil, zipimport,<br>import.c, etc. I would say this is how I'd want to structure a<br>reimplementation of the core system.&nbsp;&nbsp;And if it were for Py3K, I'd probably<br>treat sys.path and all the import hooks associated with it as a single<br>meta-importer on 

sys.meta_path -- listed after a meta-importer for handling<br>frozen and built-in modules.&nbsp;&nbsp;(I.e., the meta-importer that uses sys.path<br>and its path hooks would be last on sys.meta_path.)</blockquote><div><br>Ah, interesting idea!&nbsp; You could even go as far as removing sys.path and just making it an attribute of the base importer, if you really wanted meta_path to be the only entry point for imports.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

In other words, sys.meta_path is really the only critical import hook from<br>the raw interpreter's point of view.&nbsp;&nbsp;sys.path, however, (along with<br>sys.path_hooks and sys.path_importer_cache) is critical from the<br>perspective of users, applications, etc., as there has to be some way to

<br>get things onto Python's path in the first place.<br><br></blockquote></div><br>Yeah, I think I get it.&nbsp; I don't know how much it simplifies things for users, but I think it might make life easier for people writing alternative importers.
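<br><br>For what it is worth, the "search path as an attribute of a meta-importer" idea can be sketched as follows. PathMetaImporter is a hypothetical name. The sketch leans on the find_spec() protocol and importlib.machinery.PathFinder from current Python 3, neither of which existed when this thread was written, so treat it as an illustration of the shape rather than the proposal itself.

```python
import importlib.machinery


class PathMetaImporter:
    """Hypothetical meta-importer that owns its search path."""

    def __init__(self, path):
        self.path = list(path)

    def find_spec(self, fullname, path=None, target=None):
        # Top-level imports arrive with path=None, so use our own list.
        # Submodule imports pass the parent package's __path__ instead.
        if path is None:
            path = self.path
        # PathFinder consults sys.path_hooks and sys.path_importer_cache
        # for each entry, so existing path hooks keep working.
        return importlib.machinery.PathFinder.find_spec(fullname, path, target)
```

An instance placed last on sys.meta_path would play the role described in the quoted text, with its path attribute standing in for sys.path.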

<br><br>-Brett<br>