[Python-Dev] Strange artifacts with PEP 3121 and monkey-patching sys.modules (in csv, ElementTree and others)

Antoine Pitrou solipsis at pitrou.net
Sun Aug 11 17:56:53 CEST 2013


On Sun, 11 Aug 2013 08:49:56 -0700
Eli Bendersky <eliben at gmail.com> wrote:

> On Sun, Aug 11, 2013 at 6:40 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> 
> > On Sun, 11 Aug 2013 06:26:55 -0700
> > Eli Bendersky <eliben at gmail.com> wrote:
> > > On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou <solipsis at pitrou.net>
> > wrote:
> > >
> > > >
> > > > Hi Eli,
> > > >
> > > > On Sat, 10 Aug 2013 17:12:53 -0700
> > > > Eli Bendersky <eliben at gmail.com> wrote:
> > > > >
> > > > > Note how doing some sys.modules acrobatics and re-importing suddenly
> > > > > changes the internal state of a previously imported module. This
> > happens
> > > > > because:
> > > > >
> > > > > 1. The first import of 'csv' (which then imports `_csv) creates
> > > > > module-specific state on the heap and associates it with the current
> > > > > sub-interpreter. The list of dialects, amongst other things, is in
> > that
> > > > > state.
> > > > > 2. The 'del's wipe 'csv' and '_csv' from the cache.
> > > > > 3. The second import of 'csv' also creates/initializes a new '_csv'
> > > > module
> > > > > because it's not in sys.modules. This *replaces* the
> > per-sub-interpreter
> > > > > cached version of the module's state with the clean state of a new
> > module
> > > >
> > > > I would say this is pretty much expected.
> > >
> > > I'm struggling to see how it's expected. The two imported csv modules are
> > > different (i.e. different id() of members), and yet some state is shared
> > > between them.
> >
> > There are two csv modules, but there are not two _csv modules.
> > Extension modules are currently immortal until the end of the
> > interpreter:
> >
> > >>> csv = __import__('csv')
> > >>> wcsv = weakref.ref(csv)
> > >>> w_csv = weakref.ref(sys.modules['_csv'])
> > >>> del sys.modules['csv']
> > >>> del sys.modules['_csv']
> > >>> del csv
> > >>> gc.collect()
> > 50
> > >>> wcsv()
> > >>> w_csv()
> > <module '_csv' from
> > '/home/antoine/cpython/default/build/lib.linux-x86_64-3.4-pydebug/_
> > csv.cpython-34dm.so'>
> >
> >
> > So, "sharing" a state is pretty much expected, since you are
> > re-initializating an existing module.
> > (but the module does get re-initialized, which is the point of PEP 3121)
> >
> 
> Yes, you're right - this is an oversight on my behalf. Indeed, the
> extensions dict in import.c keeps it alive once loaded, and only ever gets
> cleaned up in Py_Finalize.

It's not the extensions dict in import.c, it's modules_by_index in the
interpreter state.
(otherwise it wouldn't be per-interpreter)

The extensions dict holds the module *definition* (the struct
PyModuleDef), not the module instance.

Regards

Antoine.


More information about the Python-Dev mailing list