Re: [Python-Dev] how important is setting co_filename for a module being imported to what __file__ is set to?

31 Aug 2009


On Sun, Aug 30, 2009 at 5:34 PM, Brett Cannon wrote:
...
On Sun, Aug 30, 2009 at 17:24, Guido van Rossum wrote:
...
On Sun, Aug 30, 2009 at 4:28 PM, Brett Cannon wrote:
...
I am going through and running the entire test suite using importlib
to ferret out incompatibilities. I have found a bunch, although all
rather minor (raising a different exception typically; not even sure
they are worth backporting as anyone reliant on the old exceptions
might get a nasty surprise in the next micro release), and now I am
down to my last failing test suite: test_import.
Ignoring the execution bit problem (http://bugs.python.org/issue6526;
I have no clue why this is happening), I am bumping up against
TestPycRewriting.test_incorrect_code_name. Turns out that import
resets co_filename on a code object to __file__ before exec'ing it to
create a module's namespace in order to ignore the file name passed
into compile() for the filename argument. Now I can't change
co_filename from Python as it's a read-only attribute and thus can't
match this functionality in importlib w/o creating some custom code to
allow me to specify the co_filename somewhere (marshal.loads() or some
new function).
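To illustrate the constraint described above, here is a minimal sketch. The path is made up for the example, and the exact exception type and message may differ between Python versions.

```python
# The filename passed to compile() is what ends up in co_filename.
# "/original/name.py" is a hypothetical path used only for illustration.
code = compile("x = 1", "/original/name.py", "exec")
print(code.co_filename)

# Assigning to the attribute from Python code is rejected.
assignment_failed = False
try:
    code.co_filename = "/somewhere/else.py"
except (AttributeError, TypeError) as exc:
    assignment_failed = True
    print("cannot assign:", exc)
```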
My question is how important is this functionality? Do I really need
to go through and add an argument to marshal.loads or some new
function just to set co_filename to something that someone explicitly
set in a .pyc file? Or I can let this go and have this be the one
place where builtins.__import__ and importlib.__import__ differ and
just not worry about it?
ISTR that Bill Janssen once mentioned a file replication mechanism
whereby there were two names for each file: the "canonical" name on a
replicated read-only filesystem, and the longer "writable" name on a
unique master copy. He ended up with the filenames in the .pyc files
being pretty bogus (since not everyone had access to the writable
filesystem). So setting co_filename to match __file__ (i.e. the name
under which the module is being imported) would be a nice service in
this case.
In general this would happen whenever you pre-compile a bunch of .py
files to .pyc/.pyo and then copy the lot to a different location. Not
a completely unlikely scenario.
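A rough sketch of that scenario, assuming only that the code portion of a .pyc is a marshalled code object. The build path is invented for the example, and no real .pyc file is written or read here.

```python
import marshal

src = "def f():\n    raise ValueError('boom')\n"

# "/build/tree/spam.py" is a made-up path standing in for wherever the
# file was originally compiled.
code = compile(src, "/build/tree/spam.py", "exec")
payload = marshal.dumps(code)      # roughly the code portion of a .pyc

# Pretend the bytes were copied elsewhere and loaded from the new location.
loaded = marshal.loads(payload)
nested = [c for c in loaded.co_consts if hasattr(c, "co_filename")]

# Both the module code and the nested function code still carry the
# build-time path, which is what tracebacks would report.
print(loaded.co_filename)
print(nested[0].co_filename)
```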
...
Well, to get this level of compatibility I am going to need to add
some magical API somewhere then to overwrite a code object's "file"
location. Blah.
Agreed, no fun. Unfortunately for core Python it really pays to go the
extra mile...
...
I will either add an argument to marshal.loads to specify an
overriding file path or add an imp.exec that takes a file path
argument to override the code object with.
Remember, there are many code objects created from one pyc file.
Adding it to marshal.load*() makes sense because then it's usable for
other purposes too, and that attacks the issue at the root. (In
import.c it's done by update_compiled_module() right after
read_compiled_module(), which is a thin wrapper around marshal.load().)
I'm not sure how imp.exec would make sure that introspection of the
loaded code objects always gets the right thing.
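The point about many code objects per pyc file can be sketched as follows. This is not what import.c or marshal does internally. It relies on code.replace(), which only exists in Python 3.8 and later (well after this thread), and with_filename is a hypothetical helper name.

```python
import marshal
import types

def with_filename(code, filename):
    """Return a copy of `code` whose co_filename is `filename`.

    Code objects nested in co_consts (functions, classes, lambdas,
    comprehensions) are rewritten recursively as well.
    """
    new_consts = tuple(
        with_filename(c, filename) if isinstance(c, types.CodeType) else c
        for c in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=new_consts)

src = "def outer():\n    def inner():\n        pass\n    return inner\n"
# Both paths below are invented for the example.
stale = marshal.loads(marshal.dumps(compile(src, "/build/spam.py", "exec")))
fixed = with_filename(stale, "/install/spam.py")

ns = {}
exec(fixed, ns)
print(ns["outer"].__code__.co_filename)
print(ns["outer"]().__code__.co_filename)
```

Rewriting only the top-level code object would leave outer and inner pointing at the old path, which is why the walk has to cover the nested constants.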
...
...
(I was going to comment on the execution bit issue but I realized I'm
not even sure if you're talking about import.c or not. :-)
So it turns out a bunch of execution/write bit stuff has come up in
Python 2.7 and importlib has been ignoring it. =) Importlib has simply
been opening up the bytecode files with 'wb' and writing out the file.
But test_import tests that no execution bit gets set or that a write
bit gets added if the source file lacks it. I guess I can use
posix.chmod and posix.stat to copy the source file's read and write
bits and always mask out the execution bits. I hate this low-level
file permission stuff.
It's no fun -- see the layers of #ifdefs in open_exclusive() in
import.c. (Though I think you won't need to worry about VMS. :-) But
it's somewhat important to get it right from a security POV. I would
use os.open() and wrap an io.BufferedWriter around it.
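A minimal sketch of that suggestion, assuming a POSIX-style permission model. open_bytecode_for_writing is a hypothetical name, the function copies the source file's permission bits with the execution bits masked out as described earlier in the thread, and it is not the code importlib actually uses.

```python
import io
import os
import stat

def open_bytecode_for_writing(source_path, bytecode_path):
    """Open `bytecode_path` for binary writing with the source's mode,
    minus all execution bits. Hypothetical helper for illustration."""
    mode = stat.S_IMODE(os.stat(source_path).st_mode)
    mode &= ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    # Remove any stale file first so that O_EXCL creation can succeed.
    try:
        os.unlink(bytecode_path)
    except FileNotFoundError:
        pass

    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    fd = os.open(bytecode_path, flags, mode)
    return io.BufferedWriter(io.FileIO(fd, "w", closefd=True))
```

The mode passed to os.open() is still filtered by the process umask, so the resulting file can end up with fewer permission bits than the source, never more.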

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
