[Python-Dev] how important is setting co_filename for a module being imported to what __file__ is set to?

Guido van Rossum guido at python.org
Mon Aug 31 19:02:24 CEST 2009


On Mon, Aug 31, 2009 at 9:57 AM, Brett Cannon<brett at python.org> wrote:
> On Mon, Aug 31, 2009 at 09:33, Guido van Rossum<guido at python.org> wrote:
>> Hm... I still wonder if there would be bad side effects of making
>> co_filename writable, but I can't think of any, so maybe you can make
>> this work... The next step would be to not write it out when
>> marshalling a code object -- this might save a bit of space in pyc
>> files too! (I guess for compatibility you might want to write it as an
>> empty string.)
>
> I would only want to consider stripping out the filename from the
> marshal format if a filename argument to marshal.load* was required to
> guarantee that code objects are always in some sensible state. Otherwise
> everyone would end up with tracebacks that made no sense by default.
> But adding a required argument to marshal.load* would be quite the
> pain for compatibility.

Well... It would be, but consider this: marshal.load() already takes a
file argument; in most cases you can extract the name from the file
easily. And for marshal.loads(), I'm not sure that the filename baked
into the data is all that reliable anyway.
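
A minimal sketch of the point about the two loaders, using only the
existing marshal API. The paths and file names here are made up for
illustration. It shows that the filename baked into the data is whatever
was passed to compile(), while the file object handed to marshal.load()
carries its own name:

```python
import marshal
import os
import tempfile

# Compile with an arbitrary (hypothetical) path; marshal stores it verbatim.
code = compile("x = 1", "/original/path/mod.py", "exec")
data = marshal.dumps(code)

# marshal.loads(): the only filename available is the one baked into the data.
baked_in = marshal.loads(data).co_filename

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "moved.bin")
    with open(path, "wb") as f:
        f.write(data)
    with open(path, "rb") as f:
        loaded = marshal.load(f)
        # The file object knows where the data actually lives right now.
        name_from_file = f.name

# The loaded code object still reports the stale compile-time path.
stale = loaded.co_filename
```

Here name_from_file reflects the real location, while baked_in and stale
both still say "/original/path/mod.py".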

>> Of course, tracking down all the code objects in the return value of
>> marshal.load*() might be a bit tricky -- API-wise I still think that
>> making it an argument to marshal.load*() might be simpler. Also it
>> would preserve the purity of code objects.
>>
>> (Michael: it would be fine if *other* implementations of Python made
>> co_filename writable, as long as you can't think of security issues
>> with this.)
>
> OK, so what does co_filename get used for? I think it is referenced to
> open files for use in printing out the traceback. Python won't be able
> to open files that you can't as a user, so that shouldn't be a
> security risk. All places where co_filename is referenced would need
> to gain a check or start using some new C function/macro which
> verified that co_filename was a string and not some number or
> something else which wouldn't get null-terminated and thus lead to
> buffer overflow.

You could also do the validation on assignment.
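
As a rough illustration of validating on assignment rather than at each
use, here is a pure-Python analogy. CodeLike is a hypothetical stand-in,
not CPython's code object (which is implemented in C); the idea is only
that the setter rejects non-strings once, so readers of the attribute
never need their own check:

```python
class CodeLike:
    """Hypothetical stand-in for a code object; not CPython's implementation."""

    def __init__(self, filename):
        # Assignment goes through the validating setter below.
        self.co_filename = filename

    @property
    def co_filename(self):
        return self._co_filename

    @co_filename.setter
    def co_filename(self, value):
        # Validate once, at assignment time, so every reader can rely on
        # the attribute always being a string.
        if not isinstance(value, str):
            raise TypeError(
                "co_filename must be a string, not %s" % type(value).__name__
            )
        self._co_filename = value
```

With this shape, the places that read co_filename stay unchanged and
keep the guarantee that the attribute is always a string.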

> A quick grep for co_filename turns up 17 uses in C
> code, although having to add some check would ruin the purity Guido is
> talking about and make a single attribute on code objects something
> people have to be careful about instead of having a guarantee that all
> attributes have some specific type of value.
>
> I'm with Guido; I would rather add an optional argument to
> marshal.load*. It must be a string and, if present, is used to
> override co_filename in the resulting code object. Once we have had
> the argument around we can then potentially make it a required
> argument and have file paths in the marshal data go away (or decide to
> default to some string constant when people don't specify the path
> argument).

Actually that sounds like a fine transitional argument.
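
For illustration only, the optional-argument idea can be approximated
today with a wrapper around marshal.loads(). The name
loads_with_filename and its filename parameter are hypothetical (marshal
has no such argument), and the sketch assumes Python 3.8 or later for
CodeType.replace(). It also has to walk co_consts to reach nested code
objects, which is the "tracking down all the code objects" part:

```python
import marshal
import types


def _with_filename(code, filename):
    """Return a copy of code, and of any nested code objects, with a new co_filename."""
    consts = tuple(
        _with_filename(c, filename) if isinstance(c, types.CodeType) else c
        for c in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=consts)


def loads_with_filename(data, filename=None):
    """Hypothetical wrapper: like marshal.loads(), with an optional filename override."""
    if filename is not None and not isinstance(filename, str):
        raise TypeError("filename must be a string")
    obj = marshal.loads(data)
    # Without an override, or for non-code values, behave like marshal.loads().
    if filename is None or not isinstance(obj, types.CodeType):
        return obj
    return _with_filename(obj, filename)
```

Omitting filename keeps the current behaviour, which is what would make
such an argument usable as a transition step.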

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

