[Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source

Phillip J. Eby pje at telecommunity.com
Thu Sep 28 02:41:15 CEST 2006


At 05:26 PM 9/27/2006 -0700, Brett Cannon wrote:
>Ah, OK.  So for importing 'email', the zipimporter would call the .pyc 
>importer and it would ask the zipimporter, "can you get me email.pyc?" and 
>if it said no it would move on to asking the .py importer for email.py, etc.

Yes, exactly.


>That's fine.  Just thinking about how the current situation sucks for NFS 
>but how caching just isn't done.  But obviously this could change.

Well, with this design, you can have a CachingFilesystemImporter as your 
storage mechanism to speed things up.
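
A rough sketch of what such a storage importer could look like, assuming 
the only thing cached is the directory listing.  Nothing below exists 
anywhere; the class body is made up for illustration:

```python
import os


class CachingFilesystemImporter:
    """Hypothetical storage importer that lists its directory only once."""

    def __init__(self, directory):
        self.directory = directory
        self._names = None  # filled lazily on the first lookup

    def _listing(self):
        if self._names is None:
            self._names = frozenset(os.listdir(self.directory))
        return self._names

    def get_data(self, name):
        # Misses are answered from the cached listing, so probing for
        # email.pyc, email.py, etc. costs no extra filesystem round trips.
        if name not in self._listing():
            raise OSError("no such file: %r" % (name,))
        with open(os.path.join(self.directory, name), "rb") as f:
            return f.read()
```

The tradeoff is staleness: a file created after the first lookup stays 
invisible until the importer is rebuilt.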


>> >>Of course, to fully support .pyc timestamp checking and writeback, you'd
>> >>need some sort of "stat" or "getmtime" feature on the parent importer, as
>> >>well as perhaps an optional "save_data" method.  These would be extensions
>> >>to PEP 302, but welcome ones.
>> >
>> >Could pass the string representing the location of where the string came
>> >from.  That would allow for the required stat calls for .pyc files as
>> >needed without having to implement methods just for this one use case.
>>
>>Huh?  In order to know if a .pyc is up to date, you need the st_mtime of
>>the .py file.  That can't be done in the parent importer without giving it
>>format knowledge, which goes against the point of the exercise.
>
>Sorry, I thought .pyc files decided whether they needed to be recompiled 
>based on the stat info on the .py and .pyc files, not on data stored 
>within the .pyc.

It's not just that (although I believe it's also the case that there is a 
timestamp inside .pyc), it's that to do the check in the parent importer, 
the parent importer would have to know that there is such a thing as 
.py-and-.pyc.  The whole point of this design is that the parent importer 
doesn't have to know *anything* about filename extensions OR how those 
files are formatted internally.  In this scheme, adding more child 
importers is sufficient to add all the special handling needed for 
.py/.pyc-style schemes.
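
To make the split concrete, here is a minimal sketch.  Only get_data() 
comes from PEP 302; DictStorage, PySourceFormat and import_from are 
hypothetical names, and the storage is a plain dict so the example runs 
anywhere:

```python
import types


class DictStorage:
    """Hypothetical parent (storage) importer: names to bytes, no formats."""

    def __init__(self, files):
        self.files = files

    def get_data(self, name):
        try:
            return self.files[name]
        except KeyError:
            raise OSError(name)


class PySourceFormat:
    """Hypothetical child (format) importer: all '.py' knowledge is here."""

    def __init__(self, parent):
        self.parent = parent

    def load(self, modname):
        filename = modname + ".py"
        try:
            source = self.parent.get_data(filename)
        except OSError:
            return None  # not ours; let the next format importer try
        module = types.ModuleType(modname)
        module.__file__ = filename
        exec(compile(source, filename, "exec"), module.__dict__)
        return module


def import_from(storage, formats, modname):
    # Ask each format importer in turn; the storage never sees an extension
    # it has to interpret.
    for fmt in formats:
        module = fmt(storage).load(modname)
        if module is not None:
            return module
    raise ImportError(modname)


storage = DictStorage({"hello.py": b"GREETING = 'hi'\n"})
mod = import_from(storage, [PySourceFormat], "hello")
```

Supporting another format would mean adding another class to the list 
passed to import_from(), with DictStorage left untouched.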

Of course, for maximum flexibility, you might want get_stream() and 
get_file() methods optionally available, since a .so loader really needs a 
file, and .pyc might want to read in two stages.  But the child importers 
can be defensively coded so as to be able to live with only a 
parent.get_data(), if necessary, and do the enhanced behaviors only if 
stat() or get_stream() or write_data() etc. attributes are available on the 
parent.
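
Something like the following is what I mean by defensive coding.  The 
getmtime() and write_data() names are just the candidates floated in this 
thread, not an agreed API, and both parent classes are hypothetical:

```python
class MinimalParent:
    """Hypothetical parent offering only get_data()."""

    def __init__(self, files):
        self.files = files

    def get_data(self, name):
        try:
            return self.files[name]
        except KeyError:
            raise OSError(name)


class RichParent(MinimalParent):
    """Hypothetical parent that also offers the optional extensions."""

    def __init__(self, files, mtimes):
        super().__init__(files)
        self.mtimes = mtimes

    def getmtime(self, name):
        try:
            return self.mtimes[name]
        except KeyError:
            raise OSError(name)

    def write_data(self, name, data):
        self.files[name] = data


def source_mtime(parent, name):
    """Return the mtime if the parent can report one, else None."""
    getmtime = getattr(parent, "getmtime", None)
    if getmtime is None:
        return None  # no timestamp support: skip the freshness check
    try:
        return getmtime(name)
    except OSError:
        return None


def write_back(parent, name, data):
    """Save compiled data if the parent allows it; report whether it did."""
    write_data = getattr(parent, "write_data", None)
    if write_data is None:
        return False  # read-only storage: carry on without writeback
    write_data(name, data)
    return True
```

A child importer built on these helpers degrades to plain get_data() 
behavior on a MinimalParent and picks up timestamp checks and writeback 
on a RichParent.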

If we get some standards for these additional attributes, we can document 
them as standard PEP 302 extensions.

The format importer mechanism might want to have something like 
'sys.import_formats' as a list of importer classes (or factories).  Parent 
(storage) importer classes would then create instances to use.

If you add a new format importer to sys.import_formats, you would of course 
need to clear sys.path_importer_cache, so that the individual importers are 
rebuilt on the next import, and thus they will create new child importer 
chains.
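
Roughly, registration could look like this.  sys.path_importer_cache is 
real; 'sys.import_formats' is only the idea above, so a module-level list 
stands in for it, and register_format and PycFormat are made-up names:

```python
import sys

# Stand-in for the proposed 'sys.import_formats'.  That attribute is only
# an idea from this thread and does not exist in Python.
import_formats = []


def register_format(factory):
    """Add a format importer factory and force importers to be rebuilt."""
    import_formats.append(factory)
    # Clearing the cache makes the next import consult sys.path_hooks again,
    # so storage importers are recreated and can build new child chains.
    sys.path_importer_cache.clear()


class PycFormat:
    """Hypothetical format importer; the body is omitted in this sketch."""

    def __init__(self, parent):
        self.parent = parent


register_format(PycFormat)
```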

Yeah, that pretty much ought to do it.


