[Python-Dev] Fwd: Deadlock by a second import in a thread

Facundo Batista facundobatista at gmail.com
Fri Oct 26 15:50:31 CEST 2007


2007/10/26, Christian Heimes <lists at cheimes.de>:

> First of all I don't understand what you mean with "that some imports
> are inside the call() function". Please elaborate on it.

I'm talking about the "call" function defined in the _sre.c file. This
function has a call to PyImport_Import() inside it.

In the "bug" I was pursuing, this call to PyImport_Import() causes the
deadlock when "call" is invoked from line 2314:

    filter = call(
        SRE_PY_MODULE, "_subx",
        PyTuple_Pack(2, self, ptemplate)
        );

My point was just that the import calls are buried so deep that you need
to touch a lot of code to make sure that they're executed in a sane way.

By "sane way", I mean that imports should be executed when the
user writes "import foobar". Importing things when the user calls
time.strptime() or re.sub() breaks the "least surprise" rule,
which is a bad thing, especially for imports, which can
(and actually do!) cause deadlocks.


> I skimmed through the code and found just a handful of modules that are
> imported by name as well as imported more than once through the
> lifetime of a python process. The most noticeable modules are time,
> _strptime, resource, unicodedata and warnings. The other modules like
> __builtin__, zlib and warnings are just loaded once or twice during the
> bootstrapping of the interpreter.

This matches what I said in my first mail: every time you call
time.strptime(), that function tries to import "_strptime.py", which
is, at least, inefficient.


> What do you think about storing the modules in an "extern PyObject
> *PyMod_Spam" variable? I could either store them when they are used the
> first time or I could load them in Py_InitializeEx.

You shouldn't import them when they're used for the first time, as you
would still be breaking the least surprise rule (see above).

However, one possible way to solve this problem is for every
module to import everything it will ever need at init time. Note
that you don't actually need an extern variable here: as you won't
be accessing it from other files, making it global and static would be
ok.

I was about to commit the following change in timemodule.c, which is a
good example of what I just said (I didn't commit it, because I'm not
fully aware of the issue that Brett described):

1. I extracted the import from the strptime function:

@@ -514,13 +517,11 @@
 static PyObject *
 time_strptime(PyObject *self, PyObject *args)
 {
-    PyObject *strptime_module = PyImport_ImportModule("_strptime");
     PyObject *strptime_result;

     if (!strptime_module)
         return NULL;
     strptime_result = PyObject_CallMethod(strptime_module, "strptime", "O", args);
-    Py_DECREF(strptime_module);
     return strptime_result;
 }


2. Created a global static variable that will hold the module:

@@ -98,6 +98,9 @@
 /* For Y2K check */
 static PyObject *moddict;

+/* This will be initialized at module init time */
+static PyObject *strptime_module;
+
 /* Exposed in timefuncs.h. */
 time_t
 _PyTime_DoubleToTimet(double x)


3. Imported the module at init time:

@@ -848,6 +849,8 @@
 	Py_INCREF(&StructTimeType);
 	PyModule_AddObject(m, "struct_time", (PyObject*) &StructTimeType);
 	initialized = 1;
+
+    strptime_module = PyImport_ImportModule("_strptime");
 }

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

