What is the purpose of the _PyThreadState_Current symbol in Python 3?
Hi Victor,
I understand that you are writing a debugger and you can only *read* memory, not execute code, right?
I'm working on a frame stack sampler that runs independently from the Python process. The project is "Austin" (https://github.com/P403n1x87/austin). Whilst I could, in principle, execute code with other system calls, I prefer not to in this case.
In the master branch, it's now _PyRuntime.gilstate.tstate_current. If you run time.sleep(3600) and look into _PyRuntime.gilstate.tstate_current using gdb, you can see a NULL pointer (tstate_current=0) because Python releases the GIL.
I would like my application to make as few assumptions as possible. The _PyRuntime symbol might not be available if all the symbols have been stripped out of the binaries. That's why I was trying to rely on _PyThreadState_Current, which is in the .dynsym section. Judging by the output of nm -D `which python3` (I'm on Python 3.6.6 at the moment) I cannot see anything more useful than that. My current strategy is to try and make something out of this symbol and then fall back to a brute force approach to scan the .bss section for valid PyInterpreterState instances (which works reliably well and is quite fast too, but a bit ugly).
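The brute-force fallback described above can be sketched roughly as follows. This is not Austin's actual code: the `Memory` class is a hypothetical stand-in for whatever reads the target process's memory, and the field offsets are assumptions loosely modelled on CPython 3.6 on a 64-bit little-endian platform. The idea is to treat every pointer-sized word in .bss as a candidate PyInterpreterState address and keep those whose first thread state points back to them.

```python
import struct

PTR = struct.Struct("<Q")  # assumption: 64-bit little-endian pointers

# Assumed field offsets (CPython 3.6, 64-bit); other versions may differ:
#   PyInterpreterState: next (0), tstate_head (8)
#   PyThreadState:      prev (0), next (8), interp (16)
INTERP_TSTATE_HEAD = 8
TSTATE_INTERP = 16


class Memory:
    """Hypothetical stand-in for a reader of another process's memory."""

    def __init__(self, base, data):
        self.base = base  # address of the first byte of `data`
        self.data = data

    def read_ptr(self, addr):
        """Return the pointer stored at addr, or None if addr is unmapped."""
        off = addr - self.base
        if off < 0 or off + PTR.size > len(self.data):
            return None
        return PTR.unpack_from(self.data, off)[0]


def looks_like_interpreter(mem, addr):
    """Check that addr->tstate_head->interp points back to addr."""
    tstate = mem.read_ptr(addr + INTERP_TSTATE_HEAD)
    if not tstate:
        return False
    return mem.read_ptr(tstate + TSTATE_INTERP) == addr


def scan_bss(mem, bss_start, bss_size):
    """Yield every pointer stored in .bss that passes the sanity check."""
    for slot in range(bss_start, bss_start + bss_size, PTR.size):
        candidate = mem.read_ptr(slot)
        if candidate and looks_like_interpreter(mem, candidate):
            yield candidate
```

A single back-pointer check is cheap but can admit false positives, so a real sampler would probably validate more fields before trusting a candidate.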
There is also _PyGILState_GetInterpreterStateUnsafe() which gives access to the current Python interpreter: _PyRuntime.gilstate.autoInterpreterState. From the interpreter, you can use the linked list of thread states from interp->tstate_head.
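To illustrate the linked-list walk, here is a minimal in-process sketch that uses ctypes and the public C API helpers PyInterpreterState_Head, PyInterpreterState_Next, PyInterpreterState_ThreadHead and PyThreadState_Next. It runs inside the interpreter it inspects, so it only shows the shape of the traversal; an external tool that reads memory without executing code would have to follow the same pointers using struct offsets for the target Python version.

```python
import ctypes

api = ctypes.pythonapi  # in-process handle to the running interpreter

# Public C API helpers for walking interpreter and thread states.
api.PyInterpreterState_Head.restype = ctypes.c_void_p
api.PyInterpreterState_Head.argtypes = []
api.PyInterpreterState_Next.restype = ctypes.c_void_p
api.PyInterpreterState_Next.argtypes = [ctypes.c_void_p]
api.PyInterpreterState_ThreadHead.restype = ctypes.c_void_p
api.PyInterpreterState_ThreadHead.argtypes = [ctypes.c_void_p]
api.PyThreadState_Next.restype = ctypes.c_void_p
api.PyThreadState_Next.argtypes = [ctypes.c_void_p]


def thread_state_addresses():
    """Map each interpreter state address to its thread state addresses."""
    result = {}
    interp = api.PyInterpreterState_Head()
    while interp:
        states = []
        tstate = api.PyInterpreterState_ThreadHead(interp)
        while tstate:  # a NULL pointer comes back as None and ends the walk
            states.append(tstate)
            tstate = api.PyThreadState_Next(tstate)
        result[interp] = states
        interp = api.PyInterpreterState_Next(interp)
    return result
```

Unlike the current-thread pointer, this list does not depend on which thread holds the GIL, so it stays populated while every thread is asleep.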
I hope that I helped :-)
Yes, thanks! Your comment made me realise why I can use _PyThreadState_Current at the very beginning: Python is going through its intensive startup process, which involves, among other things, loading frozen modules (I can clearly see most if not all the steps in the output of Austin, as mentioned in the repo's README). During this phase, the main (and only) thread holds the GIL and is quite busy. The long-running applications that I was trying to attach to have very long wait periods where they sit idle waiting for a timer to trigger the next operations, which run very quickly and put the threads back to sleep again.
If this is what _PyThreadState_Current is designed for, then I guess I cannot really rely on it, especially when attaching Austin to another process.
Best regards,
Gabriele
What information do you wish the interpreter provided, that would make your program simpler and more reliable?
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
On Fri, 28 Sep 2018 at 23:12, Nathaniel Smith <njs@pobox.com> wrote:
What information do you wish the interpreter provided, that would make your program simpler and more reliable?
An exported global variable that points to the head of the PyInterpreterState linked list (i.e. the return value of PyInterpreterState_Head). This way my program could just look this up from the dynsym section instead of scanning a dump of the bss section in memory to find a possible candidate. It would be grand if also the string in the rodata section that gives the Python version could be dereferenced from dynsym, but that's a different question.
Hmm, it looks like in 3.7+, _PyRuntime is marked PyAPI_DATA, which I think should make it exported from dynsym?
https://github.com/python/cpython/blob/4b430e5f6954ef4b248e95bfb4087635dcdef...
And PyInterpreterState_Head is just _PyRuntime.interpreters.head. So maybe this is already done...
-n
-- Nathaniel J. Smith -- https://vorpus.org
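One quick way to test this on a given build is to ask the dynamic linker from inside the interpreter. The sketch below is in-process only and uses `ctypes.c_void_p.in_dll`, which raises ValueError when a symbol cannot be resolved. The helper names are made up for illustration, and the result for _PyRuntime depends on the Python version and on how the binary was built. An external tool would get the equivalent answer from `nm -D`.

```python
import ctypes


def is_exported(name):
    """Return True if the dynamic linker can resolve `name` in this Python."""
    try:
        ctypes.c_void_p.in_dll(ctypes.pythonapi, name)
    except ValueError:  # raised by ctypes when the symbol is not found
        return False
    return True


def python_version_string():
    """Return the version string via the exported Py_GetVersion function."""
    get_version = ctypes.pythonapi.Py_GetVersion
    get_version.restype = ctypes.c_char_p
    get_version.argtypes = []
    return get_version().decode()


if __name__ == "__main__":
    for symbol in ("_PyRuntime", "_PyThreadState_Current", "Py_GetVersion"):
        print(symbol, is_exported(symbol))
    print(python_version_string())
```

Py_GetVersion is a function, not a data symbol, so it helps a tool that can call into the process but not one that only reads memory.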
Ah ok, this might be related to Victor's observation based on the latest sources. I haven't tested 3.7 yet, but if _PyRuntime is available from dynsym then this is already enough.
Thanks,
Gabriele
-- "Egli è scritto in lingua matematica, e i caratteri son triangoli, cerchi, ed altre figure geometriche, senza i quali mezzi è impossibile a intenderne umanamente parola; senza questi è un aggirarsi vanamente per un oscuro laberinto." -- G. Galilei, Il saggiatore.