Deadlock by a second import in a thread

Hi! I was looking to this bug: http://bugs.python.org/issue1255 It basically creates a deadlock in Python by doing the following: - aa.py imports bb.py - bb.py imports time and generates a thread - the thread uses time.strptime The deadlock is because the strptime function imports another module, line 517 of timemodule.c: PyObject *strptime_module = PyImport_ImportModule("_strptime"); This situation is well known, found a lot of references to this import-thread-import problem in discussions and previous bugs (i.e.: http://bugs.python.org/issue683658). What I did *not* find, and why I'm asking here, is how to solve it. Exists a known solution to this? Thank you! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On 10/19/07, Facundo Batista <facundobatista@gmail.com> wrote:
Hi!
I was looking to this bug: http://bugs.python.org/issue1255
It basically creates a deadlock in Python by doing the following:
- aa.py imports bb.py - bb.py imports time and generates a thread - the thread uses time.strptime
The deadlock is because the strptime function imports another module, line 517 of timemodule.c:
PyObject *strptime_module = PyImport_ImportModule("_strptime");
This situation is well known, found a lot of references to this import-thread-import problem in discussions and previous bugs (i.e.: http://bugs.python.org/issue683658).
What I did *not* find, and why I'm asking here, is how to solve it.
Exists a known solution to this?
When python encounters a recursive import within a single thread it allows you to get access to partially-imported modules, making the assumption that you won't do any significant work until the entire import process completes. Only one thread is allowed to do an import at a time though, as they'll do significant work with it immediately, so being partially-imported would be a race condition. Writing a python file as a script means you do significant work in the body, rather than in a function called after importing completes. Importing this python file then violates the assumption for single-threaded recursive imports, and creating threads then violates their safety assumptions. The solution then is, if your python file will ever be imported, you must write a main function and do all the work there instead. Do not write it in the style of a script (with significant work in the global scope.) -- Adam Olsen, aka Rhamphoryncus

Facundo Batista wrote:
What I did *not* find, and why I'm asking here, is how to solve it.
Exists a known solution to this?
I know a possible solution. You could write a patch that moves the imports in C code to the module init function and stores the modules in a global static variable. Christian

2007/10/19, Christian Heimes <lists@cheimes.de>:
I know a possible solution. You could write a patch that moves the imports in C code to the module init function and stores the modules in a global static variable.
I thought about this. It even will have the good side efect of not shooting the whole "import" machinery to see that you already imported it, every time you do an strptime. One question: The program makes this: PyObject *strptime_module = PyImport_ImportModule("_strptime"); ... Py_DECREF(strptime_module); If I'd import it in the beggining of the file with the following... static PyObject *strptime_module = PyImport_ImportModule("_strptime"); ... I'd never decref it, right? Thank you! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On 10/20/07, Facundo Batista <facundobatista@gmail.com> wrote:
2007/10/19, Christian Heimes <lists@cheimes.de>:
I know a possible solution. You could write a patch that moves the imports in C code to the module init function and stores the modules in a global static variable.
I thought about this. It even will have the good side efect of not shooting the whole "import" machinery to see that you already imported it, every time you do an strptime.
One question: The program makes this:
PyObject *strptime_module = PyImport_ImportModule("_strptime"); ... Py_DECREF(strptime_module);
If I'd import it in the beggining of the file with the following...
static PyObject *strptime_module = PyImport_ImportModule("_strptime");
... I'd never decref it, right?
Right. Otherwise, if the module is removed from sys.modules then it will have a refcount of 0 and be collected, leaving your static variable holding junk memory. One issue with this approach, though, is that the import is a one-time thing, and so replacing what is in sys.modules['_strptime'] will not take affect in the module ever thanks to the caching of the module's dict. Granted this is a small issue as normal Python code does not pick up changes in sys.modules, but it is something to be aware of. -Brett

Facundo Batista wrote:
Hi!
I was looking to this bug: http://bugs.python.org/issue1255
It basically creates a deadlock in Python by doing the following:
- aa.py imports bb.py - bb.py imports time and generates a thread
bb.py is broken - importing a module should never spawn a new thread as a side effect (precisely because it will deadlock if the spawned thread tries to do an import, which can happen in a myriad of ways). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

2007/10/20, Nick Coghlan <ncoghlan@gmail.com>:
bb.py is broken - importing a module should never spawn a new thread as a side effect (precisely because it will deadlock if the spawned thread tries to do an import, which can happen in a myriad of ways).
Agreed. But if we move the import of the _strptime module to do it once when "time" is imported, we enhance the "least surprise" situation (and has a marginal performance benefit of not trying to import it again every time.strptime() call). And even solves this bug, which does not hurt enaybody, ;) Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

2007/10/20, Nick Coghlan <ncoghlan@gmail.com>:
bb.py is broken - importing a module should never spawn a new thread as a side effect (precisely because it will deadlock if the spawned thread tries to do an import, which can happen in a myriad of ways).
Exactly, :(. I changed timeobject.c to import _strptime once at init time. But the problem still happened, because the function strptime needs to parse the format, and this involves a regex sub() call, which finish in a pattern_subx() call in _srec.c, which needs to call to _subx() in the re module through the call() function, which finally issues another import, creating the deadlock again. In short: you can not avoid a second import unless you rewrite *a lot* of internal code. BTW, I'll leave the optimization of importing strptime one time, there's no reason to try to import it everytime strptime() is called. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

2007/10/25, Facundo Batista <facundobatista@gmail.com>:
BTW, I'll leave the optimization of importing strptime one time, there's no reason to try to import it everytime strptime() is called.
No, I'm not. In consideration to the possible warning raised by Brett, I won't commit the change (it does not fix anything, just a marginal optimization, so there's no enough reason to bring a possible issue). For the record, I'm attaching the would-be patch. Thank you all! Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On 10/25/07, Facundo Batista <facundobatista@gmail.com> wrote:
2007/10/25, Facundo Batista <facundobatista@gmail.com>:
BTW, I'll leave the optimization of importing strptime one time, there's no reason to try to import it everytime strptime() is called.
No, I'm not. In consideration to the possible warning raised by Brett, I won't commit the change (it does not fix anything, just a marginal optimization, so there's no enough reason to bring a possible issue).
For the record, I'm attaching the would-be patch.
I honestly have no problem if you do the import once, Facundo. My warning is so marginal I wouldn't worry about it. Plus almost all other code, C and Python alike, assume the import happens only once at import so it shouldn't be a surprise to anyone.
Thank you all!
Thanks for putting the time in to figure this out! -Brett
participants (5)
-
Adam Olsen
-
Brett Cannon
-
Christian Heimes
-
Facundo Batista
-
Nick Coghlan