Fwd: Deadlock by a second import in a thread
2007/10/19, Adam Olsen
The solution then is, if your python file will ever be imported, you must write a main function and do all the work there instead. Do not write it in the style of a script (with significant work in the global scope.)
I had this as good coding style, not as something mandatory.

I agree with you that the OP shouldn't be doing that, but note that the main problem arises here because the import inside strptime is completely unpredictable for an external user.

Do you recommend closing the bug as "won't fix", saying something like...

    The deadlock happens because strptime has an import inside it, and recursive imports are not allowed in different threads.

    As a general rule and good coding style, don't run your code when the module is imported, but put it in a function like "main" in the second file, import it and call it from the first one. This will solve your problem.

    Note that this happens to you with strptime, but could happen with a lot of functions that do this internal import of something else. So, you'll never be sure.

What do you think?

Thank you!

--
Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
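Editorial sketch of the "main function" style recommended above. The names parse_date and main are hypothetical, and the thread is only there to mirror the situation in the bug report; this is an illustration under those assumptions, not the OP's code. The point is that importing this file does no work, so no thread is started while the import is still in progress:

```python
import threading
import time


def parse_date(text, results):
    # time.strptime() may trigger an internal import of _strptime,
    # which is the hidden import discussed in this thread.
    results.append(time.strptime(text, "%Y-%m-%d"))


def main():
    # All significant work, including starting threads, happens here,
    # after the import of this module has already finished.
    results = []
    worker = threading.Thread(target=parse_date, args=("2007-10-19", results))
    worker.start()
    worker.join()
    return results[0]


if __name__ == "__main__":
    # Runs only when executed as a script, never on "import".
    print(main().tm_year)
```

A first file would then import this module and call main() explicitly, instead of relying on side effects of the import.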
On 10/19/07, Facundo Batista
2007/10/19, Adam Olsen:
The solution then is, if your python file will ever be imported, you must write a main function and do all the work there instead. Do not write it in the style of a script (with significant work in the global scope.)
I had this as good coding style, not as something mandatory.
I agree with you that the OP shouldn't be doing that, but note that the main problem arises here because the import inside strptime is completely unpredictable for an external user.
Do you recommend closing the bug as "won't fix", saying something like...
The deadlock happens because strptime has an import inside it, and recursive imports are not allowed in different threads.
As a general rule and good coding style, don't run your code when the module is imported, but put it in a function like "main" in the second file, import it and call it from the first one. This will solve your problem.
Note that this happens to you with strptime, but could happen with a lot of functions that do this internal import of something else. So, you'll never be sure.
What do you think?
Whether this is a minor problem due to poor style or a major problem due to a language defect is a matter of perspective. I'm working on redesigning Python's threading support, expecting it to be used a great deal more, which'd push it into the major problem category.

For now I'd leave it open.

--
Adam Olsen, aka Rhamphoryncus
2007/10/19, Adam Olsen
Whether this is a minor problem due to poor style or a major problem due to a language defect is a matter of perspective. I'm working on redesigning Python's threading support, expecting it to be used a great deal more, which'd push it into the major problem category.
For now I'd leave it open.
It's a matter of perspective, yes. But I'll close this bug, because he's hitting the problem in a weird way, doing something that he shouldn't.

The real problem here, if any, is that you cannot make a second import in another thread. Feel free to open a bug for this, but addressing this specifically.

I'd prefer a PEP, though, ;)

Regards,

--
Facundo
Facundo Batista wrote:
It's a matter of perspective, yes. But I'll close this bug, because he's hitting the problem in a weird way, doing something that he shouldn't.
The real problem here, if any, is that you cannot make a second import in another thread. Feel free to open a bug for this, but addressing this specifically.
I had a look into the code. I think it's possible to get rid of most imports by caching the imported module in a static variable. For warnings, time and resource it's even possible to import the module in the module initializer, but not for strptime. It depends on a Python module that imports datetime and time.

I could look into the matter and provide a patch for the trunk.

Christian
2007/10/25, Christian Heimes
I could look into the matter and provide a patch for the trunk.
Feel free to do it. But note that some imports are inside the call() function; this could have more implications than you see (at least than I saw) at first glance.

Regards,

--
Facundo
Facundo Batista wrote:
Feel free to do it. But note that some imports are inside the call() function; this could have more implications than you see (at least than I saw) at first glance.
CC to get Guido's attention

First of all I don't understand what you mean by "that some imports are inside the call() function". Please elaborate on it.

I skimmed through the code and found just a handful of modules that are imported by name as well as imported more than once through the lifetime of a python process. The most noticeable modules are time, _strptime, resource, unicodedata and warnings. The other modules like __builtin__, zlib and warnings are just loaded once or twice during the bootstrapping of the interpreter.

Guido: What do you think about storing the modules in an "extern PyObject *PyMod_Spam" variable? I could either store them when they are used the first time or I could load them in Py_InitializeEx.

Christian

    $ find -name '*.c' | xargs grep PyImport_ImportModule\(\"
    ./Objects/unicodeobject.c: m = PyImport_ImportModule("unicodedata");
    ./Objects/exceptions.c: bltinmod = PyImport_ImportModule("__builtin__");
    ./PC/bdist_wininst/install.c: mod = PyImport_ImportModule("__builtin__");
    ./Modules/_ctypes/callbacks.c: mod = PyImport_ImportModule("ctypes");
    ./Modules/_ctypes/callbacks.c: mod = PyImport_ImportModule("ctypes");
    ./Modules/cPickle.c: if (!( copy_reg = PyImport_ImportModule("copy_reg")))
    ./Modules/cPickle.c: if (!( t=PyImport_ImportModule("__builtin__"))) return -1;
    ./Modules/posixmodule.c: PyObject *m = PyImport_ImportModule("resource");
    ./Modules/zipimport.c: zlib = PyImport_ImportModule("zlib"); /* import zlib */
    ./Modules/datetimemodule.c: PyObject *time = PyImport_ImportModule("time");
    ./Modules/datetimemodule.c: PyObject *time = PyImport_ImportModule("time");
    ./Modules/datetimemodule.c: time = PyImport_ImportModule("time");
    ./Modules/datetimemodule.c: if ((module = PyImport_ImportModule("time")) == NULL)
    ./Modules/timemodule.c: PyObject *strptime_module = PyImport_ImportModule("_strptime");
    ./Modules/timemodule.c: m = PyImport_ImportModule("time");
    ./Modules/gcmodule.c: tmod = PyImport_ImportModule("time");
    ./Modules/main.c: runpy = PyImport_ImportModule("runpy");
    ./Modules/main.c: v = PyImport_ImportModule("readline");
    ./Modules/parsermodule.c: copyreg = PyImport_ImportModule("copy_reg");
    ./Modules/_cursesmodule.c: PyObject *m = PyImport_ImportModule("curses");
    ./Python/mactoolboxglue.c: m = PyImport_ImportModule("MacOS");
    ./Python/pythonrun.c: warnings_module = PyImport_ImportModule("warnings");
    ./Python/pythonrun.c: PyObject *bimod = PyImport_ImportModule("__builtin__");
    ./Python/pythonrun.c: m = PyImport_ImportModule("site");
    ./Python/errors.c: mod = PyImport_ImportModule("warnings");
    ./Python/import.c: zimpimport = PyImport_ImportModule("zipimport");
    ./Doc/tools/sphinx/jinja/_speedups.c: PyObject *datastructure = PyImport_ImportModule("jinja.datastructure");
    ./Mac/Modules/MacOS.c: m = PyImport_ImportModule("macresource");
2007/10/26, Christian Heimes
First of all I don't understand what you mean by "that some imports are inside the call() function". Please elaborate on it.
I'm talking about the "call" function defined in the _sre.c file. This function has a call to PyImport_Import() inside it. In the "bug" I was pursuing, this call to PyImport_Import() causes the deadlock after "call" is called from line 2314:

    filter = call(
        SRE_PY_MODULE, "_subx",
        PyTuple_Pack(2, self, ptemplate)
        );

I was just saying that the calls to "imports" are so deep that you need to touch a lot of code to make sure that they're executed in a sane way.

By "sane way" I mean that the imports should be executed when the user does "import foobar". Importing things when the user calls time.strptime() or re.sub() breaks the "least surprise" rule, which is a bad thing, especially when talking about imports, which can cause (and actually do!) deadlocks.
I skimmed through the code and found just a handful of modules that are imported by name as well as imported more than once through the lifetime of a python process. The most noticeable modules are time, _strptime, resource, unicodedata and warnings. The other modules like __builtin__, zlib and warnings are just loaded once or twice during the bootstrapping of the interpreter.
This goes along with what I said in my first mail: every time you call time.strptime(), that function tries to import "_strptime.py", which is, at the very least, inefficient.
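Editorial sketch for observing the hidden import from Python. The helper modules_imported_by is hypothetical; it only compares sys.modules before and after a call, so it reports modules that were not loaded yet, and the exact list depends on the interpreter and on what was imported earlier:

```python
import sys
import time


def modules_imported_by(func, *args):
    """Return the names of modules that first appear in sys.modules during func(*args)."""
    before = set(sys.modules)
    func(*args)
    return sorted(set(sys.modules) - before)


if __name__ == "__main__":
    # On a fresh CPython interpreter this is expected to include "_strptime",
    # although the caller never wrote an import statement for it.
    print(modules_imported_by(time.strptime, "2007-10-19", "%Y-%m-%d"))
```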
What do you think about storing the modules in an "extern PyObject *PyMod_Spam" variable? I could either store them when they are used the first time or I could load them in Py_InitializeEx.
You shouldn't import them when they're used for the first time, as you are still breaking the least surprise rule (see last paragraph).

However, one possible way to solve this problem is, in every module, to import everything the module will ever need at init time. Note that you actually do not need an extern variable here, as you won't be accessing it from other files; making it global and static would be ok.

I was about to commit the following change in timemodule.c, which is a good example of what I just said (I didn't commit it, because I'm not fully aware of the issue that Brett described):

1. I extracted the import from the strptime function:

    @@ -514,13 +517,11 @@
     static PyObject *
     time_strptime(PyObject *self, PyObject *args)
     {
    -    PyObject *strptime_module = PyImport_ImportModule("_strptime");
         PyObject *strptime_result;

         if (!strptime_module)
             return NULL;
         strptime_result = PyObject_CallMethod(strptime_module, "strptime", "O", args);
    -    Py_DECREF(strptime_module);
         return strptime_result;
     }

2. Created a global static variable that will hold the module:

    @@ -98,6 +98,9 @@
     /* For Y2K check */
     static PyObject *moddict;

    +/* This will be initialized at module init time */
    +static PyObject *strptime_module;
    +
     /* Exposed in timefuncs.h. */
     time_t
     _PyTime_DoubleToTimet(double x)

3. Imported the module at init time:

    @@ -848,6 +849,8 @@
         Py_INCREF(&StructTimeType);
         PyModule_AddObject(m, "struct_time", (PyObject*) &StructTimeType);
         initialized = 1;
    +
    +    strptime_module = PyImport_ImportModule("_strptime");
     }

Regards,

--
Facundo
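Editorial sketch of the same "import everything at init time" idea, expressed in Python so it can be run as is. The names REQUIRED_MODULES, _modules and get_module are hypothetical, and this is an analogy for the C change above, not part of the proposed patch:

```python
import importlib

# Hypothetical list of everything this module will ever need.
REQUIRED_MODULES = ("_strptime",)

# Resolved once, while this module itself is being imported
# (the counterpart of importing in the C module's init function).
_modules = {name: importlib.import_module(name) for name in REQUIRED_MODULES}


def get_module(name):
    # No import happens on this call path; unknown names raise KeyError
    # instead of silently triggering a late import.
    return _modules[name]
```

With this layout the only point at which an import can happen is the user's own "import" of the module, which is the "least surprise" behaviour argued for above.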
participants (3)
- Adam Olsen
- Christian Heimes
- Facundo Batista