<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Aug 11, 2013 at 3:33 AM, Antoine Pitrou <span dir="ltr"><<a href="mailto:solipsis@pitrou.net" target="_blank">solipsis@pitrou.net</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

Hi Eli,<br>

<div class="im"><br>

On Sat, 10 Aug 2013 17:12:53 -0700<br>

Eli Bendersky <<a href="mailto:eliben@gmail.com">eliben@gmail.com</a>> wrote:<br>

><br>

> Note how doing some sys.modules acrobatics and re-importing suddenly<br>

> changes the internal state of a previously imported module. This happens<br>

> because:<br>

><br>

> 1. The first import of 'csv' (which then imports '_csv') creates<br>

> module-specific state on the heap and associates it with the current<br>

> sub-interpreter. The list of dialects, amongst other things, is in that<br>

> state.<br>

> 2. The 'del's wipe 'csv' and '_csv' from the cache.<br>

> 3. The second import of 'csv' also creates/initializes a new '_csv' module<br>

> because it's not in sys.modules. This *replaces* the per-sub-interpreter<br>

> cached version of the module's state with the clean state of a new module<br>

<br>

</div>I would say this is pretty much expected. </blockquote><div><br></div><div>I'm struggling to see how it's expected. The two imported csv modules are different (i.e. their members have different id() values), and yet some state is shared between them. I think the root cause is that "PyModuleDef _csvmodule" is unique per interpreter, not per module instance.<br>
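A minimal sketch of the kind of sys.modules acrobatics under discussion, for anyone who wants to check the behaviour locally. The helper name reimport_csv and the dialect name "demo" are illustrative only, and what the last three lines print depends on the CPython version, so no expected output is given here.<br>

```python
import sys


def reimport_csv():
    """Import csv, wipe csv and _csv from sys.modules, then import csv again."""
    import csv as first

    # Mutate state owned by the C accelerator module.
    first.register_dialect("demo", delimiter="|")

    # The "acrobatics": drop both the Python module and its C helper
    # from the import cache so the next import builds them anew.
    sys.modules.pop("csv", None)
    sys.modules.pop("_csv", None)

    import csv as second
    return first, second


first, second = reimport_csv()

# The module objects themselves are distinct after the re-import.
print("same module object:", first is second)
# Whether members and dialect state are shared is version dependent.
print("same reader object:", first.reader is second.reader)
print("first dialects:    ", first.list_dialects())
print("second dialects:   ", second.list_dialects())
```

If the two dialect lists agree even though the module objects differ, state is being looked up per interpreter rather than per module instance.<br>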


<br></div><div>Even if dialects were not a PyObject, this would still be problematic, don't you think? And note that here, unlike the ET.ParseError case, I don't think the problem is exporting internal per-module state as a module attribute. The following two are irreconcilable, IMHO:<br>


<br></div><div>1. Wanting to have two instances of the same module in the same interpreter.<br></div><div>2. Using a global shared PyModuleDef between all instances of the same module in the same interpreter.<br></div><div>
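To make the conflict between those two points concrete, here is a toy model in plain Python. It is only a sketch under my own assumptions: ModuleDef, Module and Interpreter are hypothetical stand-ins, not CPython's real data structures. The one property it tries to capture is that the interpreter keeps a single slot per module definition, so a lookup by definition can only ever return the most recently initialized instance.<br>

```python
class ModuleDef:
    """Stand-in for a static PyModuleDef: one object per extension module."""

    def __init__(self, name):
        self.name = name


class Module:
    """Stand-in for a module instance carrying its own private state."""

    def __init__(self, definition):
        self.definition = definition
        self.state = {"dialects": {}}


class Interpreter:
    """Stand-in for an interpreter with one module slot per definition."""

    def __init__(self):
        self.module_by_def = {}

    def init_module(self, definition):
        # A fresh import builds a fresh instance and overwrites the slot.
        module = Module(definition)
        self.module_by_def[definition] = module
        return module

    def find_module(self, definition):
        # Lookup is keyed by definition, not by module instance.
        return self.module_by_def[definition]


csv_def = ModuleDef("_csv")
interp = Interpreter()

first = interp.init_module(csv_def)
first.state["dialects"]["demo"] = object()

second = interp.init_module(csv_def)

# Code that belongs to the first instance but looks its state up through
# the definition now lands on the second instance's clean state.
found = interp.find_module(csv_def)
print(found is second)
print("demo" in found.state["dialects"])
```

In this model the first instance still owns its own state, but nothing that goes through the definition can reach it any more, which is the mismatch between points 1 and 2.<br>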


<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The converse would be a bug<br>

IMO (but perhaps Martin disagrees). PEP 3121's stated goal is not only<br>

subinterpreter support:<br>

<br>

  "Extension module initialization currently has a few deficiencies.<br>

  There is no cleanup for modules, the entry point name might give<br>

  naming conflicts, the entry functions don't follow the usual calling<br>

  convention, and multiple interpreters are not supported well."<br>

<br>

Re-initializing state when importing a module anew makes extension<br>

modules more like pure Python modules, which is a good thing.<br>

<br>

<br>

I think the piece of interpretation you offered yesterday on IRC may be<br>

the right explanation for the ET shenanigans:<br>

<br>

  "Maybe the bug is that ParseError is kept in per-module state, and<br>

  also exported from the module?"<br>

<br>

PEP 3121 doesn't offer any guidelines for using its API, and its<br>

example shows PyObject* fields in a module state.<br>

<br>

I'm starting to think that it might be a bad use of PEP 3121. PyObjects<br>

can, and therefore should, be stored in the extension module dict where<br>

they will participate in normal resource management (i.e. garbage<br>

collection). If they are in the module dict, then they shouldn't be<br>

held alive by the module state too, otherwise the (currently tricky)<br>

lifetime management of extension modules can produce oddities.<br>

<br>

<br>

So, the PEP 3121 "module state" pointer (the optional opaque void*<br>

thing) should only be used to hold non-PyObjects.  PyObjects should go<br>

to the module dict, like they do in normal Python modules.  Now, the<br>

reason our PEP 3121 extension modules abuse the module state pointer to<br>

keep PyObjects is two-fold:<br>

<br>

1. it's surprisingly easier (it's actually a one-liner if you don't<br>

handle errors - a rather bad thing, but all PEP 3121 extension modules<br>

currently don't handle a NULL return from PyState_FindModule...)<br>

<br>

2. it protects the module from any module dict monkeypatching. It's not<br>

important if you are using a generic API on the PyObject, but it is if<br>

the PyObject is really a custom C type with well-defined fields.<br>

<br>

Those two issues can be addressed if we offer an API for it. How about:<br>

<br>

  PyObject *PyState_GetModuleAttr(struct PyModuleDef *def,<br>

                                  const char *name,<br>

                                  PyObject *restrict_type)<br>

<br>

  *def* is a pointer to the module definition.<br>

  *name* is the attribute to look up on the module dict.<br>

  *restrict_type*, if non-NULL, is a type object the looked up attribute<br>

  must be an instance of.<br>

<br>

  Look up an attribute in the current interpreter's extension module<br>

  instance for the module definition *def*.<br>

  Returns a *new* reference (!), or NULL if an error occurred.<br>

  An error can be:<br>

  - no such module exists for the current interpreter (ImportError?<br>

      RuntimeError? SystemError?)<br>

  - no such attribute exists in the module dict (AttributeError)<br>

  - the attribute doesn't conform to *restrict_type* (TypeError)<br>

<br>

So code can be written like:<br>

<br>

  PyObject *dialects = PyState_GetModuleAttr(<br>

      &_csvmodule, "dialects", &PyDict_Type);<br>

  if (dialects == NULL)<br>

      return NULL;<br></blockquote></div><br><br><br><br></div></div>