[Python-Dev] how important is setting co_filename for a module being imported to what __file__ is set to?

Brett Cannon brett at python.org
Mon Aug 31 18:57:13 CEST 2009

On Mon, Aug 31, 2009 at 09:33, Guido van Rossum<guido at python.org> wrote:
> On Mon, Aug 31, 2009 at 9:27 AM, Brett Cannon<brett at python.org> wrote:
>> On Mon, Aug 31, 2009 at 08:10, Antoine Pitrou<solipsis at pitrou.net> wrote:
>>> Benjamin Peterson <benjamin <at> python.org> writes:
>>>> > Why can't we simply make co_filename a writable attribute instead
>>>> > of inventing some complicated API?
>>>> Because code objects are supposed to be immutable, hashable objects?
>>> Right, but co_filename is used neither in tp_hash nor in tp_richcompare.
>> I didn't suggest this since I assumed co_filename was made read-only
>> for a reason back when the design decision was made. But if the
>> original safety concerns are not there then I am happy to simply
>> change the attribute to writable.
> Hm... I still wonder if there would be bad side effects of making
> co_filename writable, but I can't think of any, so maybe you can make
> this work... The next step would be to not write it out when
> marshalling a code object -- this might save a bit of space in pyc
> files too! (I guess for compatibility you might want to write it as an
> empty string.)

I would only want to consider stripping out the filename from the
marshal format if a filename argument to marshal.load* was required to
guarantee that code objects are always in some sensible state. Otherwise
everyone would end up with tracebacks that made no sense by default.
But adding a required argument to marshal.load* would be quite the
pain for compatibility.
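
As a small illustration of why the filename matters in the marshal
data today, here is a minimal sketch showing that co_filename survives
a marshal round trip unchanged. The path is a made-up example, not
anything from this thread.

```python
import marshal

# Hypothetical path, used only for illustration.
ORIGINAL_PATH = "/build/tree/spam.py"

# Compile a trivial module and serialize its code object.
code = compile("x = 1\n", ORIGINAL_PATH, "exec")
data = marshal.dumps(code)

# The loaded code object reports the path recorded at compile time,
# no matter where the marshalled data is stored or loaded from later.
loaded = marshal.loads(data)
print(loaded.co_filename)
```

If the path were dropped from the data with nothing to replace it,
loaded.co_filename would have to come from somewhere else, which is
the case the required argument is meant to cover.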

> Of course, tracking down all the code objects in the return value of
> marshal.load*() might be a bit tricky -- API-wise I still think that
> making it an argument to marshal.load*() might be simpler. Also it
> would preserve the purity of code objects.
> (Michael: it would be fine if *other* implementations of Python made
> co_filename writable, as long as you can't think of security issues
> with this.)

OK, so what does co_filename get used for? I think it is used to
open source files when printing out a traceback. Python can't open
files that the user running it can't open, so that shouldn't be a
security risk. But every place where co_filename is referenced would
need to gain a check, or start using some new C function/macro, which
verified that co_filename was a string and not some number or
something else which wouldn't be null-terminated and thus could lead
to a buffer overflow. A quick grep for co_filename turns up 17 uses in
C code. Having to add such a check would ruin the purity Guido is
talking about and make a single attribute on code objects something
people have to be careful about, instead of having a guarantee that
all attributes have some specific type of value.
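
To make the traceback connection concrete, here is a minimal sketch
at the Python level (not the C code paths mentioned above). It uses a
deliberately nonexistent, made-up path to show that the filename in a
traceback entry comes straight from the code object.

```python
import traceback

# Hypothetical path that does not exist on disk; an assumption for illustration.
FAKE_PATH = "/no/such/dir/spam.py"

# The filename given to compile() becomes co_filename on the code object.
code = compile("1 / 0\n", FAKE_PATH, "exec")

try:
    exec(code, {})
except ZeroDivisionError as exc:
    frames = traceback.extract_tb(exc.__traceback__)

# The innermost frame is the one that ran the compiled code.
innermost = frames[-1]
print(innermost.filename)    # taken from the code object's co_filename
print(repr(innermost.line))  # source text looked up via that same path
```

Since the source line is looked up through that path, a stale or
wrong co_filename leaves the traceback pointing at the wrong file.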

I'm with Guido; I would rather add an optional argument to
marshal.load*. It must be a string and, if present, is used to
override co_filename in the resulting code object. Once the argument
has been around for a while, we can potentially make it a required
argument and have file paths in the marshal data go away (or decide to
default to some string constant when people don't specify the path
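

A rough sketch of how the optional argument could behave, written in
pure Python. Everything here is an assumption for illustration:
loads_with_filename and replace_filename are hypothetical helpers,
not part of the marshal module, and the sketch relies on
code.replace(), which only exists in Python 3.8 and later. It rebuilds
code objects rather than mutating them, so they stay immutable.

```python
import marshal
import types


def replace_filename(code, filename):
    """Return a copy of code, and of every nested code object, with co_filename set."""
    new_consts = tuple(
        replace_filename(const, filename) if isinstance(const, types.CodeType) else const
        for const in code.co_consts
    )
    return code.replace(co_filename=filename, co_consts=new_consts)


def loads_with_filename(data, filename=None):
    """Hypothetical helper: marshal.loads() plus an optional filename override."""
    obj = marshal.loads(data)
    if filename is None:
        return obj  # current behavior: keep whatever path was marshalled
    if not isinstance(filename, str):
        raise TypeError("filename must be a string")
    if isinstance(obj, types.CodeType):
        return replace_filename(obj, filename)
    return obj  # non-code values are returned untouched in this sketch


# Paths below are made up for the example.
source = "def f():\n    return 1\n"
data = marshal.dumps(compile(source, "/old/location/spam.py", "exec"))

code = loads_with_filename(data, "/new/location/spam.py")
nested = [c for c in code.co_consts if isinstance(c, types.CodeType)]
print(code.co_filename, [c.co_filename for c in nested])
```

This only handles a code object at the top level of the marshalled
value. Code objects buried inside tuples or other containers would
need the same walk, which is the tracking-down problem Guido mentions.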

