Embedding Python crash on PyTuple_New
Arnaud Loonstra
arnaud at sphaero.org
Wed Nov 24 02:59:15 EST 2021
On 24-11-2021 01:46, MRAB wrote:
> On 2021-11-23 20:25, Arnaud Loonstra wrote:
>> On 23-11-2021 18:31, MRAB wrote:
>>> On 2021-11-23 16:04, Arnaud Loonstra wrote:
>>>> On 23-11-2021 16:37, MRAB wrote:
>>>>> On 2021-11-23 15:17, MRAB wrote:
>>>>>> On 2021-11-23 14:44, Arnaud Loonstra wrote:
>>>>>>> On 23-11-2021 15:34, MRAB wrote:
>>>>>>>> On 2021-11-23 12:07, Arnaud Loonstra wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I've had Python embedded successfully in a program up until now, but
>>>>>>>>> I'm now running into weird GC-related segfaults. I'm currently trying
>>>>>>>>> to debug this, but my understanding of CPython limits me here.
>>>>>>>>>
>>>>>>>>> I'm creating a tuple in C, but after a while it crashes on creation.
>>>>>>>>> It doesn't make sense, which makes me wonder whether something else is
>>>>>>>>> happening. Could it be that it just crashes here because the GC is
>>>>>>>>> cleaning up stuff completely unrelated to the allocation of the new
>>>>>>>>> tuple? How can I troubleshoot this?
>>>>>>>>>
>>>>>>>>> I've got CPython compiled with --with-valgrind --without-pymalloc
>>>>>>>>> --with-pydebug
>>>>>>>>>
>>>>>>>>> In C I'm creating a tuple with the following method:
>>>>>>>>>
>>>>>>>>> static PyObject *
>>>>>>>>> s_py_zosc_tuple(pythonactor_t *self, zosc_t *oscmsg)
>>>>>>>>> {
>>>>>>>>>     assert(self);
>>>>>>>>>     assert(oscmsg);
>>>>>>>>>     char *format = zosc_format(oscmsg);
>>>>>>>>>
>>>>>>>>>     PyObject *rettuple = PyTuple_New((Py_ssize_t) strlen(format));
>>>>>>>>>
>>>>>>>>> It segfaults here (frame 16) after 320 times (consistently)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1 __GI_raise raise.c 49 0x7ffff72c4e71
>>>>>>>>> 2 __GI_abort abort.c 79 0x7ffff72ae536
>>>>>>>>> 3 fatal_error pylifecycle.c 2183 0x7ffff7d84b4f
>>>>>>>>> 4 Py_FatalError pylifecycle.c 2193 0x7ffff7d878b2
>>>>>>>>> 5 _PyObject_AssertFailed object.c 2200 0x7ffff7c93cf2
>>>>>>>>> 6 visit_decref gcmodule.c 378 0x7ffff7dadfd5
>>>>>>>>> 7 tupletraverse tupleobject.c 623 0x7ffff7ca3e81
>>>>>>>>> 8 subtract_refs gcmodule.c 406 0x7ffff7dad340
>>>>>>>>> 9 collect gcmodule.c 1054 0x7ffff7dae838
>>>>>>>>> 10 collect_with_callback gcmodule.c 1240 0x7ffff7daf17b
>>>>>>>>> 11 collect_generations gcmodule.c 1262 0x7ffff7daf3f6
>>>>>>>>> 12 _PyObject_GC_Alloc gcmodule.c 1977 0x7ffff7daf4f2
>>>>>>>>> 13 _PyObject_GC_Malloc gcmodule.c 1987 0x7ffff7dafebc
>>>>>>>>> 14 _PyObject_GC_NewVar gcmodule.c 2016 0x7ffff7daffa5
>>>>>>>>> 15 PyTuple_New tupleobject.c 118 0x7ffff7ca4da7
>>>>>>>>> 16 s_py_zosc_tuple pythonactor.c 366 0x55555568cc82
>>>>>>>>> 17 pythonactor_socket pythonactor.c 664 0x55555568dac7
>>>>>>>>> 18 pythonactor_handle_msg pythonactor.c 862 0x55555568e472
>>>>>>>>> 19 pythonactor_handler pythonactor.c 828 0x55555568e2e2
>>>>>>>>> 20 sphactor_actor_run sphactor_actor.c 855 0x5555558cb268
>>>>>>>>> ... <More>
>>>>>>>>>
>>>>>>>>> Any pointer really appreciated.
[snip]
>>>>
>>> Basically, yes, but I won't be surprised if it was due to too few
>>> INCREFs or too many DECREFs somewhere.
>>>
>>>> https://github.com/hku-ect/gazebosc/blob/505b30c46bf3f78d188c3f575c80e294d3db7e5d/Actors/pythonactor.c#L286
>>>>
>>>>
>>> Incidentally, in s_py_zosc_tuple, you're not doing "assert(rc == 0);"
>>> after "zosc_pop_float" or "zosc_pop_double".
>>
>> Thanks for those pointers! I think your intuition is right. I might have
>> found the bugger. In s_py_zosc I call Py_DECREF on pAddress and pData.
>> However they are acquired by PyTuple_GetItem which returns a borrowed
>> reference. I think pAddress and pData are then also 'decrefed' when the
>> pReturn tuple which contains pAddress and pData is 'decrefed'?
>>
> Yes, members of a container are DECREFed when the container is destroyed.
>
> It's bad practice for a function to DECREF its arguments unless the
> function's sole purpose is cleanup because the function won't know where
> the arguments came from.
>
I'm finding that out now. What strikes me is how hard this was to
debug. I think that was because I had INCREFed the return object. I
guess I did that to work around the wrong DECREF of the data in the
return object. However, that made it hell to debug. I'm really curious
what the best practices are for debugging embedded CPython.
Thanks big time for your feedback!