Inconsistency of PyModule_AddObject()

There are three functions (or at least three documented functions) in C API that "steals" references: PyList_SetItem(), PyTuple_SetItem() and PyModule_AddObject(). The first two "steals" references even on failure, and this is well known behaviour. But PyModule_AddObject() "steals" a reference only on success. There is nothing in the documentation that points on this. Most usages of PyModule_AddObject() in the stdlib don't decref the reference to the value on PyModule_AddObject() failure. The only exceptions are in _json, _io, and _tkinter modules. In many cases, including examples in the documentation, the successfulness of PyModule_AddObject() is not checked either, but this is different issue. We can just fix the documentation but adding a note that PyModule_AddObject() doesn't steal a reference on failure. And add explicit decrefs after PyModule_AddObject() in hundreds of places in the code. But I think it would be better to "fix" PyModule_AddObject() by making it decrefing a reference on failure as expected by most developers. But this is dangerous change, because if the author of third-party code read not only the documentation, but CPython code, and added explicit decref on PyModule_AddObject() failure, we will get a double decrefing. I think that we can resolve this issue by following steps: 1. Add a new function PyModule_AddObject2(), that steals a reference even on failure. 2. Introduce a special macro like PY_SSIZE_T_CLEAN (any suggestions about a name?). If it is defined, define PyModule_AddObject as PyModule_AddObject2. Define this macro before including Python.h in all CPython modules except _json, _io, and _tkinter. 3. Make old PyModule_AddObject to emit a warning about possible leak and a suggestion to define above macro.

On Wed, Apr 27, 2016 at 10:14 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
+1 It would be good to document PyModule_AddObject's current behavior in 3.5+ (already attached a patch).
+1
3. Make old PyModule_AddObject to emit a warning about possible leak and a suggestion to define above macro.
+0

On 04/27/2016 09:14 AM, Serhiy Storchaka wrote:
This inconsistency has caused bugs (or, more fairly, potential leaks) before, see http://bugs.python.org/issue1782 Unfortunately, the suggested Python 3 change to PyModule_AddObject was not accepted.
1. Add a new function PyModule_AddObject2(), that steals a reference even on failure.
This sounds like a good idea, except the name could be prettier :), e.g. PyModule_InsertObject. PyModule_AddObject could be deprecated. Hrvoje

On 27.04.16 15:31, Hrvoje Niksic wrote:
Glad to hear I'm not the first faced with this problem.
Unfortunately, the suggested Python 3 change to PyModule_AddObject was not accepted.
Bad. May be it happened because of the risk to break third-party working code. I propose a gradual path to change PyModule_AddObject.
I have decided to not introduce new public function. But just control the behavior of old function with the macro. This needs minimal changes to user code.

Hrvoje Niksic <hrvoje.niksic <at> avl.com> writes:
First, these "leaks" only potentially show up when you already have much bigger problems (i.e. on Linux the machine would already freeze due to overallocation). Second, these "leaks" don't even show up as "definitely lost" in Valgrind (yes, I checked). On the bright side, Python must be in a very healthy state if we can afford to spend time on issues such as this one. Stefan Krah

On 27 April 2016 at 17:14, Serhiy Storchaka <storchaka@gmail.com> wrote:
I'd suggest a variant on this that more closely matches the PyList_SetItem and PyTuple_SetItem cases: PyModule_SetAttrString The first two match the signature of PySequence_SetItem, but steal the reference instead of making a new one, and the same relationship would exist between PyObject_SetAttrString and the new PyModule_SetAttrString. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27 April 2016 at 23:08, Nick Coghlan <ncoghlan@gmail.com> wrote:
And for the record: that suggestion was prompted by Hrvoje's email suggesting using a more descriptive name, I just went and looked up the name of the corresponding PyObject_* API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27.04.16 16:08, Nick Coghlan wrote:
I think it is better to have relation with PyModule_AddIntConstant() etc than with PyObject_SetAttrString. My patch doesn't introduce new public function, but changes the behavior of the old function. This needs minimal changes to user code that mostly use PyModule_AddObject() incorrectly (not blaming authors).

On 28.04.16 01:24, Case Van Horsen wrote:
No impact except emitting a deprecation warning at build time. But we can remove a deprecation warning and add it in future release if this is annoying. But are you sure, that your code uses PyModule_AddObject() correctly? Only two modules in the stdlib (_json and _tkinter) used it correctly. Other modules have bugs even in tries to use PyModule_AddObject() correctly for some operations.

Serhiy Storchaka <storchaka <at> gmail.com> writes:
Could you perhaps stop labeling this as a bug? Usually we are talking about a *single* "leak" that a) does not even show up in Valgrind and b) only occurs under severe memory pressure when the OOM-killer is already waiting. I'm honestly mystified by your terminology and it's beginning to feel that you need to justify this patch at all costs. Stefan Krah

On 28.04.16 11:38, Stefan Krah wrote:
I say this is a bug because 1. PyModule_AddObject() behavior doesn't match the documentation. 2. Most code that use PyModule_AddObject() doesn't work as intended. Since the bahavior of PyModule_AddObject() contradicts the documentation and is contrintuitive, we can't blame authors in this. I don't say this is a high-impacting bug, I even agree that there is no need to fix the second part in maintained releases. But this is a bug unless you propose different definition for a bug. What can we do with this? 1. Change the documentation of PyModule_AddObject(). I think this is not questionable, and Berker provided a patch in http://bugs.python.org/issue26868 . 2. Update examples in the documentation to correctly handle errors of PyModule_AddObject(). This is more questionable, due to the case (3c) below and because correct error handling code distracts attention from main purpose of examples. 3. One of alternatives: 3a) Fix almost all usages of PyModule_AddObject() in stdlib extension modules. This is hundreds occurrences in over a half-hundred files. 3b) Allow to change the behavior of PyModule_AddObject() to match most authors expectations. This needs to add only one line to switch on new behavior in most files. 3c) Ignore issue. In this case we can not check the result of PyModule_AddObject() at all. But I afraid that correct fixing issues with subinterpreters will need us to return to this issue.

Serhiy Storchaka <storchaka <at> gmail.com> writes:
Why do you think that module authors don't know that? For _decimal, I was aware of the strange behavior. Yes, a single reference can "leak" on failure. The problem is that we don't seem to have any common ground here. Do you accept the following? 1) PyModule_AddObject() can only fail if malloc() fails. a) Normally (for small allocations) this is such a serious problem that the whole application fails anyway. b) Say that you're lucky and the application continues. i) The import fails. In some cases ImportError is caught and a fallback is imported (example _pydecimal). In that case you leak an entire DSO and something small like a single context object. What is the practical difference between the two? ii) The import fails and there's no fallback. Usually the application stops, otherwise DSO+small leak again. iii) Retry the import (I have never seen this): while(1): try: import leftpad except (ImportError, MemoryError): continue break You could have a legitimate leak here, but see a). Module initializations are intricate and boring. I suspect that if we promote wide changes across PyPI packages we'll see more additional segfaults than theoretically plugged memory leaks. Stefan Krah

Serhiy Storchaka <storchaka <at> gmail.com> writes:
For the list, this is the extent of this horrible "bug": diff --git a/Modules/_decimal/_decimal.c b/Modules/_decimal/_decimal.c --- a/Modules/_decimal/_decimal.c +++ b/Modules/_decimal/_decimal.c @@ -5804,8 +5804,7 @@ PyObject_CallObject((PyObject *)&PyDecContext_Type, NULL)); init_basic_context(basic_context_template); Py_INCREF(basic_context_template); - CHECK_INT(PyModule_AddObject(m, "BasicContext", - basic_context_template)); + CHECK_INT(-1); $ valgrind --suppressions=Misc/valgrind-python.supp ./python -c "import decimal" [...] ==16945== LEAK SUMMARY: ==16945== definitely lost: 0 bytes in 0 blocks ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [...] Stefan Krah

On Thu, Apr 28, 2016, at 05:05, Stefan Krah wrote:
Well, the obvious flaw with your test case is that a reference is retained forever in the C static variable basic_context_template. Now, it is arguable that this may be a reasonably common pattern, and that this doesn't actually constitute misuse of the API (the reference count will be wrong, but the object itself is immortal anyway, so it doesn't matter if it's 2 or 1 since it can't be 0 even with correct usage)

On Thu, Apr 28, 2016, at 10:11, Stefan Krah wrote:
For actual users of Valgrind this is patently obvious and was pretty much the point of my post.
A more relevant point would be that _decimal does *not* use the API in a way *which would be broken by the proposed change*, regardless of whether the way in which it uses it is subjectively correct or can cause leaks.

On 04/28/2016 08:26 AM, Stefan Krah wrote:
Random832 writes:
Considering you have to opt-in to the change, why would this be a big deal for you? Or are you saying you'd rather have the PyModule_AddObject deprecated (without removal?), and a new PyWhatever_Whatever to take it's place? -- ~Ethan~

On 27.04.16 10:14, Serhiy Storchaka wrote:
Opened an issue: http://bugs.python.org/issue26871 . Provided patch introduces new macros PY_MODULE_ADDOBJECT_CLEAN that controls the behavior of PyModule_AddObject() as PY_SSIZE_T_CLEAN controls the behavior of PyArg_Parse* functions. If the macro is defined before including "Python.h", PyModule_AddObject() steals a reference unconditionally. Otherwise it steals a reference only on success, and the caller is responsible for decref'ing it on error (current behavior).

On Wed, Apr 27, 2016 at 10:14 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
+1 It would be good to document PyModule_AddObject's current behavior in 3.5+ (already attached a patch).
+1
3. Make old PyModule_AddObject to emit a warning about possible leak and a suggestion to define above macro.
+0

On 04/27/2016 09:14 AM, Serhiy Storchaka wrote:
This inconsistency has caused bugs (or, more fairly, potential leaks) before, see http://bugs.python.org/issue1782 Unfortunately, the suggested Python 3 change to PyModule_AddObject was not accepted.
1. Add a new function PyModule_AddObject2(), that steals a reference even on failure.
This sounds like a good idea, except the name could be prettier :), e.g. PyModule_InsertObject. PyModule_AddObject could be deprecated. Hrvoje

On 27.04.16 15:31, Hrvoje Niksic wrote:
Glad to hear I'm not the first faced with this problem.
Unfortunately, the suggested Python 3 change to PyModule_AddObject was not accepted.
Bad. May be it happened because of the risk to break third-party working code. I propose a gradual path to change PyModule_AddObject.
I have decided to not introduce new public function. But just control the behavior of old function with the macro. This needs minimal changes to user code.

Hrvoje Niksic <hrvoje.niksic <at> avl.com> writes:
First, these "leaks" only potentially show up when you already have much bigger problems (i.e. on Linux the machine would already freeze due to overallocation). Second, these "leaks" don't even show up as "definitely lost" in Valgrind (yes, I checked). On the bright side, Python must be in a very healthy state if we can afford to spend time on issues such as this one. Stefan Krah

On 27 April 2016 at 17:14, Serhiy Storchaka <storchaka@gmail.com> wrote:
I'd suggest a variant on this that more closely matches the PyList_SetItem and PyTuple_SetItem cases: PyModule_SetAttrString The first two match the signature of PySequence_SetItem, but steal the reference instead of making a new one, and the same relationship would exist between PyObject_SetAttrString and the new PyModule_SetAttrString. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27 April 2016 at 23:08, Nick Coghlan <ncoghlan@gmail.com> wrote:
And for the record: that suggestion was prompted by Hrvoje's email suggesting using a more descriptive name, I just went and looked up the name of the corresponding PyObject_* API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 27.04.16 16:08, Nick Coghlan wrote:
I think it is better to have relation with PyModule_AddIntConstant() etc than with PyObject_SetAttrString. My patch doesn't introduce new public function, but changes the behavior of the old function. This needs minimal changes to user code that mostly use PyModule_AddObject() incorrectly (not blaming authors).

On 28.04.16 01:24, Case Van Horsen wrote:
No impact except emitting a deprecation warning at build time. But we can remove a deprecation warning and add it in future release if this is annoying. But are you sure, that your code uses PyModule_AddObject() correctly? Only two modules in the stdlib (_json and _tkinter) used it correctly. Other modules have bugs even in tries to use PyModule_AddObject() correctly for some operations.

Serhiy Storchaka <storchaka <at> gmail.com> writes:
Could you perhaps stop labeling this as a bug? Usually we are talking about a *single* "leak" that a) does not even show up in Valgrind and b) only occurs under severe memory pressure when the OOM-killer is already waiting. I'm honestly mystified by your terminology and it's beginning to feel that you need to justify this patch at all costs. Stefan Krah

On 28.04.16 11:38, Stefan Krah wrote:
I say this is a bug because 1. PyModule_AddObject() behavior doesn't match the documentation. 2. Most code that use PyModule_AddObject() doesn't work as intended. Since the bahavior of PyModule_AddObject() contradicts the documentation and is contrintuitive, we can't blame authors in this. I don't say this is a high-impacting bug, I even agree that there is no need to fix the second part in maintained releases. But this is a bug unless you propose different definition for a bug. What can we do with this? 1. Change the documentation of PyModule_AddObject(). I think this is not questionable, and Berker provided a patch in http://bugs.python.org/issue26868 . 2. Update examples in the documentation to correctly handle errors of PyModule_AddObject(). This is more questionable, due to the case (3c) below and because correct error handling code distracts attention from main purpose of examples. 3. One of alternatives: 3a) Fix almost all usages of PyModule_AddObject() in stdlib extension modules. This is hundreds occurrences in over a half-hundred files. 3b) Allow to change the behavior of PyModule_AddObject() to match most authors expectations. This needs to add only one line to switch on new behavior in most files. 3c) Ignore issue. In this case we can not check the result of PyModule_AddObject() at all. But I afraid that correct fixing issues with subinterpreters will need us to return to this issue.

Serhiy Storchaka <storchaka <at> gmail.com> writes:
Why do you think that module authors don't know that? For _decimal, I was aware of the strange behavior. Yes, a single reference can "leak" on failure. The problem is that we don't seem to have any common ground here. Do you accept the following? 1) PyModule_AddObject() can only fail if malloc() fails. a) Normally (for small allocations) this is such a serious problem that the whole application fails anyway. b) Say that you're lucky and the application continues. i) The import fails. In some cases ImportError is caught and a fallback is imported (example _pydecimal). In that case you leak an entire DSO and something small like a single context object. What is the practical difference between the two? ii) The import fails and there's no fallback. Usually the application stops, otherwise DSO+small leak again. iii) Retry the import (I have never seen this): while(1): try: import leftpad except (ImportError, MemoryError): continue break You could have a legitimate leak here, but see a). Module initializations are intricate and boring. I suspect that if we promote wide changes across PyPI packages we'll see more additional segfaults than theoretically plugged memory leaks. Stefan Krah

Serhiy Storchaka <storchaka <at> gmail.com> writes:
For the list, this is the extent of this horrible "bug": diff --git a/Modules/_decimal/_decimal.c b/Modules/_decimal/_decimal.c --- a/Modules/_decimal/_decimal.c +++ b/Modules/_decimal/_decimal.c @@ -5804,8 +5804,7 @@ PyObject_CallObject((PyObject *)&PyDecContext_Type, NULL)); init_basic_context(basic_context_template); Py_INCREF(basic_context_template); - CHECK_INT(PyModule_AddObject(m, "BasicContext", - basic_context_template)); + CHECK_INT(-1); $ valgrind --suppressions=Misc/valgrind-python.supp ./python -c "import decimal" [...] ==16945== LEAK SUMMARY: ==16945== definitely lost: 0 bytes in 0 blocks ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [...] Stefan Krah

On Thu, Apr 28, 2016, at 05:05, Stefan Krah wrote:
Well, the obvious flaw with your test case is that a reference is retained forever in the C static variable basic_context_template. Now, it is arguable that this may be a reasonably common pattern, and that this doesn't actually constitute misuse of the API (the reference count will be wrong, but the object itself is immortal anyway, so it doesn't matter if it's 2 or 1 since it can't be 0 even with correct usage)

On Thu, Apr 28, 2016, at 10:11, Stefan Krah wrote:
For actual users of Valgrind this is patently obvious and was pretty much the point of my post.
A more relevant point would be that _decimal does *not* use the API in a way *which would be broken by the proposed change*, regardless of whether the way in which it uses it is subjectively correct or can cause leaks.

On 04/28/2016 08:26 AM, Stefan Krah wrote:
Random832 writes:
Considering you have to opt-in to the change, why would this be a big deal for you? Or are you saying you'd rather have the PyModule_AddObject deprecated (without removal?), and a new PyWhatever_Whatever to take it's place? -- ~Ethan~

On 27.04.16 10:14, Serhiy Storchaka wrote:
Opened an issue: http://bugs.python.org/issue26871 . Provided patch introduces new macros PY_MODULE_ADDOBJECT_CLEAN that controls the behavior of PyModule_AddObject() as PY_SSIZE_T_CLEAN controls the behavior of PyArg_Parse* functions. If the macro is defined before including "Python.h", PyModule_AddObject() steals a reference unconditionally. Otherwise it steals a reference only on success, and the caller is responsible for decref'ing it on error (current behavior).
participants (9)
-
Berker Peksağ
-
Case Van Horsen
-
Ethan Furman
-
Guido van Rossum
-
Hrvoje Niksic
-
Nick Coghlan
-
Random832
-
Serhiy Storchaka
-
Stefan Krah