[Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface

Friedrich Romstedt friedrichromstedt at gmail.com
Tue Feb 16 05:00:34 EST 2021


Hello again,

On Mon, 15 Feb 2021 at 16:57, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
>
> On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote:
> > Last week I updated my example code to be more slim.  There now
> > exists
> > a single-file extension module:
> > https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp
> > .
> > The corresponding test program
> > https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py
> > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as
> > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the
> > ``print``
> > statement contained in the test file is commented out.
>
> I have tried it out, and can confirm that using debugging tools (namely
> valgrind) will allow you to track down the issue (valgrind reports it
> from within python; running a python without debug symbols may
> obfuscate the actual problem; if that is what is limiting you, I can
> post my valgrind output).
> Since you are running a linux system, I am confident that you can run
> it in valgrind to find it yourself.  (There may be other ways.)
>
> Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and
> ignore some errors e.g. when importing NumPy.

From running ``PYTHONMALLOC=malloc valgrind python3
2021-02-11_0909.py`` (with the preceding call of ``print`` in
:file:`2021-02-11_0909.py` commented out) I found a few things:

-   The call might or might not succeed.  It doesn't always lead to a segfault.
-   The first report is "at 0x4A64A73: ??? (in
    /usr/lib/libpython3.9.so.1.0), called by 0x4A64914:
    PyMemoryView_FromObject (in /usr/lib/libpython3.9.so.1.0)", a
    "Conditional jump or move depends on uninitialised value(s)".  One
    more block of valgrind output follows ("Use of uninitialised value of
    size 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0)").  After
    that the run ends in one of two ways: either with "Invalid read of
    size 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0) [...]
    Address 0x1 is not stack'd, malloc'd or (recently) free'd", which
    results in a segfault, or with just another "Use of uninitialised
    value of size 8 at 0x48EEA15: ??? (in /usr/lib/libpython3.9.so.1.0)",
    after which the program completes successfully.
-   All this happens within "PyMemoryView_FromObject".

So I can only guess that the "uninitialised value" is compared to 0x0,
and when it is different (e.g. 0x1), it is used as an address, which
leads via "Address 0x1 is not stack'd, malloc'd or (recently) free'd"
to the segfault observed.

I suppose I need to compile Python and numpy myself to see the debug
symbols instead of the "???" marks? Maybe even with ``-O0``?
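
Independently of debug symbols, valgrind's ``--track-origins=yes`` option
makes memcheck report where an uninitialised value was created, which may
narrow things down even while the frames still show "???".  A minimal
sketch of assembling such a run from Python follows; the helper name
``valgrind_command`` is hypothetical, and it assumes ``valgrind`` and
``python3`` are on the PATH and that the test file is the one linked
above.

```python
import os
import shlex


def valgrind_command(script, track_origins=True):
    """Build argv and environment for a memcheck run (hypothetical helper)."""
    # PYTHONMALLOC=malloc makes CPython use the system allocator, so that
    # valgrind can follow every individual allocation.
    env = dict(os.environ, PYTHONMALLOC="malloc")
    argv = ["valgrind"]
    if track_origins:
        # Ask memcheck to report the origin of uninitialised values.
        argv.append("--track-origins=yes")
    argv += ["python3", script]
    return argv, env


argv, env = valgrind_command("2021-02-11_0909.py")
# Show the equivalent shell command line.
print("PYTHONMALLOC=malloc " + shlex.join(argv))
```

To actually start the run, one could pass the result on with
``subprocess.run(argv, env=env)``.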

Furthermore, the shared object belonging to my code doesn't show up
directly in any of these valgrind reports, so the segfault possibly has
to do with some data I am leaving "uninitialised" at the moment.
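
One plausible candidate for such data, though this is only an assumption
and not confirmed by the valgrind output, is a field of the ``Py_buffer``
structure that the exporter does not set.  In CPython, ``memoryview(obj)``
goes through ``PyMemoryView_FromObject``, so a known-good exporter from
the standard library can serve as a reference for how the fields look
once they are filled in.  The sketch below uses ``array.array`` for that
purpose; it does not touch the extension module in question.

```python
import array

# A standard-library exporter whose buffer fields are known to be valid.
exporter = array.array("d", [1.0, 2.0, 3.0])

# memoryview() requests the buffer from the exporter.
view = memoryview(exporter)

# These attributes mirror fields of the underlying Py_buffer structure.
fields = {
    "format": view.format,
    "itemsize": view.itemsize,
    "ndim": view.ndim,
    "shape": view.shape,
    "strides": view.strides,
    "suboffsets": view.suboffsets,
    "readonly": view.readonly,
    "nbytes": view.nbytes,
}
for name, value in fields.items():
    print(f"{name:>10}: {value!r}")
```

Running the same inspection on an object of the extension type (when it
does not crash) and comparing the two printouts might show which field
looks suspicious.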

Thanks for the other replies as well; for the moment I feel that going
the valgrind way might teach me how to debug errors of this kind
myself.

So far,
Friedrich

