Newly Built Python3 Binary Throws Segfault

All, I recently abandoned my attempts to port Python 2.7.8 to Android in favor of Python 3.4.2. Unfortunately, after using the same configure options in the same environment and modifying setup.py as needed, the newly built binary segfaults when the build reaches the generate-posix-vars step, and also whenever it is run directly (e.g. ./python --help or ./python -E -S -m sysconfig). I took an strace of ./python, but I'm a bit lost when reviewing it. Any ideas as to what may be going on, i.e. why Python 2.7 works but 3.x segfaults? Thanks in advance, Cyd

There could be a million relevant differences (unicode, ints, ...). Perhaps the importlib bootstrap is failing. Perhaps the dynamic loading code changed. Did you get a stack trace? (IIRC strace shows a syscall trace -- also useful, but it doesn't tell you precisely how it segfaulted.) On Wed, Jan 28, 2015 at 6:43 AM, Cyd Haselton <chaselton@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

Apologies...I'm not sure what a stack trace is, but I do have the strace. As near as I can tell, the segfault happens after an open call, though I am probably wrong. Attaching the strace output to this email. I'm going to head back to the documentation and back out some Android-related changes in _localemodule.c. On Wed, Jan 28, 2015 at 9:43 AM, Guido van Rossum <guido@python.org> wrote:

What I see in the strace:
... load libpython3.4m.so.1.0
... load libm
... open /dev/__properties__ and do something to it (what?)
... get current time
... allocate memory
... getuid
... segfault
That's not a lot to go on, but it doesn't look as if it has started to load modules yet. Does /dev/__properties__ ring a bell? Not to me. That stack trace would be really helpful. On Wed, Jan 28, 2015 at 8:34 AM, Cyd Haselton <chaselton@gmail.com> wrote:

On Wed, Jan 28, 2015 at 10:43 AM, Guido van Rossum <guido@python.org> wrote:
https://android.googlesource.com/platform/system/core/+/tools_r22/init/prope... is the code that works with that file. This <http://sssslide.com/www.slideshare.net/tetsu.koba/interprocess-communication...> explains it a bit (slides 24-29). Looks like something to do with interprocess communication. Likely has nothing to do with Python itself. Maybe this <http://www.andrew-kirkpatrick.com/2013/01/get-android-stack-trace-from-devic...> would be useful?
-- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated." Personal reality distortion fields are immune to contradictory evidence. - srean Check out my website: http://kirbyfan64.github.io/

That is interesting.../dev/__properties__ lives in memory, not on the filesystem; apparently processes read global properties from it. It's read-only...not sure why the build or the python binary would access it, or whether that's the cause of the segfault. I have root access on the tablet, so I was able to check for the traces.txt file. There are a number of them, but none contain information about the segfault. On Wed, Jan 28, 2015 at 11:23 AM, Ryan Gonzalez <rymg19@gmail.com> wrote:

Found a patch at bugs.python.org addressing a segfault issue on Android, but even after applying it I'm still getting a segfault. I ran an strace with the verbose option and am attaching it to this update. If that's not helpful, I'll see if I can hook up the debugging bridge to the tablet, but as mentioned earlier there was no helpful info in the anr/traces files. On Wed, Jan 28, 2015 at 11:23 AM, Ryan Gonzalez <rymg19@gmail.com> wrote:

Managed to get this out of logcat:
F(11914) Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 11914 (python) (libc)
[ 01-29 19:30:55.855 23373:23373 F/libc ]
Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 23373 (python)
Less detail than strace, but it seems that python is segfaulting inside libc... On Wed, Jan 28, 2015 at 11:23 AM, Ryan Gonzalez <rymg19@gmail.com> wrote:

Could you try the steps at http://stackoverflow.com/a/11369475/2097780? They allow you to get a better idea of where libc is crashing. Cyd Haselton <chaselton@gmail.com> wrote:

Absolutely. Good thing I have addr2line on device:
/bld/python/Python-3.4.2 $ addr2line -C -f -e /lib/libpython3.4m.so.1.0 0008bbc8
_PyMem_RawStrdup
/bld/python/Python-3.4.2/Objects/obmalloc.c:323
/bld/python/Python-3.4.2 $
On Thu, Jan 29, 2015 at 8:26 PM, Ryan <rymg19@gmail.com> wrote:

There's a related strdup patch for readline.c, mentioned here: http://bugs.python.org/issue21390 and here: https://github.com/rave-engine/python3-android/issues/2. I'm not sure how to adapt that patch to obmalloc.c, though, as (I think) the functions there all belong to Python...they're all prefixed with _PyXx. On Fri, Jan 30, 2015 at 9:05 AM, Cyd Haselton <chaselton@gmail.com> wrote:

I seriously doubt the issue is in that file; _PyMem_RawStrdup crashes when calling strlen. It's that whatever is calling it is likely asking it to duplicate a null pointer. Basically, it's probably the caller's fault. You could always try modifying _PyMem_RawStrdup to return NULL when given a null pointer and see where it then segfaults. On Fri, Jan 30, 2015 at 11:53 AM, Cyd Haselton <chaselton@gmail.com> wrote:
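The experiment suggested above can be sketched roughly as follows. This is a minimal, hypothetical stand-in rather than the real CPython function: the name guarded_strdup is made up for illustration, and plain malloc replaces PyMem_RawMalloc so the snippet compiles outside the Python source tree.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for _PyMem_RawStrdup. It uses malloc() instead of
   PyMem_RawMalloc() so that it builds without the CPython headers. */
char *
guarded_strdup(const char *str)
{
    size_t size;
    char *copy;

    /* The guard: bail out before strlen() can dereference a null pointer. */
    if (str == NULL)
        return NULL;

    size = strlen(str) + 1;      /* +1 for the terminating NUL */
    copy = malloc(size);
    if (copy == NULL)            /* allocation failure */
        return NULL;
    memcpy(copy, str, size);
    return copy;
}
```

With a guard like this the crash should move from strlen to whichever caller uses the NULL result, which is the point of the experiment: it helps identify the caller, it does not fix the underlying problem.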

Unless I'm reading something incorrectly, _PyMem_RawStrdup is currently returning NULL when given a null pointer.
From obmalloc.c:
    char *
    _PyMem_RawStrdup(const char *str)
    {
        size_t size;
        char *copy;
        size = strlen(str) + 1;
        copy = PyMem_RawMalloc(size);
        if (copy == NULL)
            return NULL;
        memcpy(copy, str, size);
        return copy;
    }
On Fri, Jan 30, 2015 at 11:56 AM, Ryan Gonzalez <rymg19@gmail.com> wrote:

No, it returns NULL only if the allocation fails, i.e. if PyMem_RawMalloc gives it a null pointer. Before that, it unconditionally calls strlen on the (possibly null) string argument, and that is where it crashes. Please try the patch I attached in the last email. It *might* fix the issue. Android has crappy locale handling. On Fri, Jan 30, 2015 at 12:09 PM, Cyd Haselton <chaselton@gmail.com> wrote:

Ok...that makes sense. Apologies, I do not do a lot of debugging. My goal was to get Python (then SpiderMonkey) on my device and then start learning languages where I'd need to learn debugging. Tried the patch, see my reply. Agreed about Android's locale handling...at least where native code is concerned. On Fri, Jan 30, 2015 at 12:10 PM, Ryan Gonzalez <rymg19@gmail.com> wrote:

Are you sure the patch was applied correctly? I was SO sure it would work! FYI, you tried the patch I attached to the email message, right? On Fri, Jan 30, 2015 at 12:58 PM, Cyd Haselton <chaselton@gmail.com> wrote:

Is it possible at all to get a stack trace of the crash using gdb? Try the steps here <http://stackoverflow.com/a/10539883/2097780>. That way we can see where Python's own strdup function is getting called. On Fri, Jan 30, 2015 at 9:05 AM, Cyd Haselton <chaselton@gmail.com> wrote:

No... ...but I think I found the issue with grep. Try applying the attached patch to the Python/frozenmain.c. It comments out the locale handling. It seems that Python always calls its strdup function on the locale string. On Android, this can apparently be null (as seen in the bug report you linked to). On Fri, Jan 30, 2015 at 12:00 PM, Cyd Haselton <chaselton@gmail.com> wrote:
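As a rough illustration of the failure mode described above, the sketch below saves and restores the locale while tolerating a null result from setlocale. It assumes the startup code duplicates the string returned by setlocale(LC_ALL, NULL), as this message suggests. The helpers save_current_locale and restore_locale are hypothetical names, not functions from frozenmain.c, and the attached patch itself is not reproduced here.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of the current locale name,
   or NULL if setlocale() reports nothing or the allocation fails. */
char *
save_current_locale(void)
{
    const char *current = setlocale(LC_ALL, NULL);  /* query only */
    size_t size;
    char *copy;

    if (current == NULL)         /* the case reported for Android */
        return NULL;

    size = strlen(current) + 1;
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, current, size);
    return copy;
}

/* Hypothetical helper: restore a locale saved above, if there was one,
   and release the copy. Passing NULL is a no-op. */
void
restore_locale(char *saved)
{
    if (saved != NULL) {
        setlocale(LC_ALL, saved);
        free(saved);
    }
}
```

Checking the setlocale result in the caller keeps the duplication routine itself unchanged, which matches the earlier observation that the fault most likely lies with the caller.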

Android provides only minimal support for locales. Most locale functions return a fake result or do nothing. I'm not sure that it supports any codec. To support Android, we may force UTF-8 for the filesystem encoding, as is done on Mac OS X. Victor 2015-01-30 19:04 GMT+01:00 Ryan Gonzalez <rymg19@gmail.com>:
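The idea of forcing UTF-8 could look something like the sketch below. It is only an illustration under stated assumptions: pick_fs_encoding and the FORCE_UTF8_FS macro are hypothetical names, and this is not how CPython actually selects its filesystem encoding. On other platforms the sketch falls back to the POSIX nl_langinfo(CODESET) query.

```c
#include <stddef.h>

/* Assumption for this sketch: Android and Mac OS X always get UTF-8. */
#if defined(__ANDROID__) || defined(__APPLE__)
#define FORCE_UTF8_FS 1
#else
#define FORCE_UTF8_FS 0
#include <langinfo.h>
#endif

/* Hypothetical helper: choose a name for the filesystem encoding. */
const char *
pick_fs_encoding(void)
{
#if FORCE_UTF8_FS
    /* Locale support is too limited to trust; skip the query entirely. */
    return "utf-8";
#else
    const char *codeset = nl_langinfo(CODESET);

    /* Fall back to UTF-8 if the C library reports nothing usable. */
    if (codeset == NULL || codeset[0] == '\0')
        return "utf-8";
    return codeset;
#endif
}
```

Hard-coding the answer on platforms with unreliable locale support avoids depending on functions that may return fake or null results there.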

participants (5)
- Cyd Haselton
- Guido van Rossum
- Ryan
- Ryan Gonzalez
- Victor Stinner