Hi all,
I've got a pretty good sense of how signal handling works in the
runtime (i.e. via a dance with the eval loop), but still have some
questions:
1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery
and not distinctly?
More details are below. My interest in the topic relates to improving
in-process interpreter isolation.
#1 & #2
-----------
Toward the top of signalmodule.c we find the following comment [1]
(written in 1994):
/*
NOTES ON THE INTERACTION BETWEEN SIGNALS AND THREADS
When threads are supported, we want the following semantics:
- only the main thread can set a signal handler
- any thread can get a signal handler
- signals are only delivered to the main thread
I.e. we don't support "synchronous signals" like SIGFPE (catching
this doesn't make much sense in Python anyway) nor do we support
signals as a means of inter-thread communication, since not all
thread implementations support that (at least our thread library
doesn't).
We still have the problem that in some implementations signals
generated by the keyboard (e.g. SIGINT) are delivered to all
threads (e.g. SGI), while in others (e.g. Solaris) such signals are
delivered to one random thread (an intermediate possibility would
be to deliver it to the main thread -- POSIX?). For now, we have
a working implementation that works in all three cases -- the
handler ignores signals if getpid() isn't the same as in the main
thread. XXX This is a hack.
*/
At the very top of the file we see another relevant comment:
/* XXX Signals should be recorded per thread, now we have thread state. */
That one was written in 1997, right after PyThreadState was introduced.
So is the constraint about the main thread just a historical artifact?
If not, what would be an appropriate explanation for why signals must
be strictly bound to the main thread?
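(For reference, both restrictions are easy to observe from pure Python.
A minimal demonstration, POSIX-only since it uses SIGUSR1, and purely my
own illustration rather than anything from the module:)
```
import os
import signal
import threading
import time

def handler(signum, frame):
    # Regardless of which thread the OS delivered the signal to, the
    # Python-level handler always runs in the main thread.
    print("handler ran in:", threading.current_thread().name)

signal.signal(signal.SIGUSR1, handler)

def register_from_worker():
    # Calling signal.signal() outside the main thread raises ValueError.
    try:
        signal.signal(signal.SIGUSR1, handler)
    except ValueError as exc:
        print("worker thread:", exc)

t = threading.Thread(target=register_from_worker)
t.start()
t.join()

os.kill(os.getpid(), signal.SIGUSR1)
time.sleep(0.1)  # give the eval loop a chance to run the handler in the main thread
```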
#3
-----
Regarding the use of Py_MakePendingCalls() for signal handling, I can
imagine the history there. However, is there any reason signal
handling couldn't be separated from the "pending calls" machinery at
this point? As far as I can tell there is no longer any strong
relationship between the two.
-eric
[1] https://github.com/python/cpython/blob/master/Modules/signalmodule.c#L71
In trying to find the location of a valid instance of PyInterpreterState in
the virtual memory of a running Python (3.6) application (using
process_vm_readv on Linux), I have noticed that I can only rely on
_PyThreadState_Current.interp at the very beginning of the execution. If I
try to attach to a running Python process, then
_PyThreadState_Current.interp doesn't seem to point to anything useful to
derive the currently running threads and the frame stacks for each of them.
This makes me wonder about the purpose of this symbol in the
.dynsym section. Apart from a brute force approach for finding a valid
PyInterpreterState, is there a more reliable approach for the version of
Python that I'm targeting?
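For concreteness, the kind of remote read involved looks roughly like the
sketch below (Linux-only, 64-bit, via ctypes; the helper name and how the
symbol address gets resolved are placeholders, not part of any existing
tool):
```
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

class iovec(ctypes.Structure):
    _fields_ = [("iov_base", ctypes.c_void_p),
                ("iov_len", ctypes.c_size_t)]

def read_remote_pointer(pid, remote_addr):
    # Read one pointer-sized value from another process's address space.
    buf = ctypes.c_uint64()
    local = iovec(ctypes.cast(ctypes.byref(buf), ctypes.c_void_p),
                  ctypes.sizeof(buf))
    remote = iovec(ctypes.c_void_p(remote_addr), ctypes.sizeof(buf))
    nread = libc.process_vm_readv(pid, ctypes.byref(local), 1,
                                  ctypes.byref(remote), 1, 0)
    if nread != ctypes.sizeof(buf):
        raise OSError(ctypes.get_errno(), "process_vm_readv failed")
    return buf.value

# e.g. tstate_ptr = read_remote_pointer(pid, addr_of_PyThreadState_Current)
```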
Thanks,
Gabriele
The new postponed annotations have an unexpected interaction with
dataclasses. Namely, you cannot get the type hints of any of a
dataclass's methods.
For example, I have some code that inspects the type parameters of a
class's `__init__` method. (The real use case is to provide a default
serializer for the class, but that is not important here.)
```
from dataclasses import dataclass
from typing import get_type_hints


class Foo:
    pass


@dataclass
class Bar:
    foo: Foo


print(get_type_hints(Bar.__init__))
```
In Python 3.6 and 3.7, this does what is expected; it prints `{'foo':
<class '__main__.Foo'>, 'return': <class 'NoneType'>}`.
However, if I add `from __future__ import annotations` in Python 3.7, then
this fails with an error:
```
NameError: name 'Foo' is not defined
```
I know why this is happening. The `__init__` method is defined in the
`dataclasses` module, which does not have the `Foo` object in its
environment. The `Foo` annotation is passed to `dataclass` and attached to
`__init__` as the string `"Foo"` rather than as the original object `Foo`,
and `get_type_hints` for the new annotations only does a name lookup in the
module where `__init__` is defined, not where the annotation is defined.
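A workaround that seems to do the trick (not a fix for the underlying
issue, just passing an explicit namespace so the string annotations resolve
against the module that actually defines `Foo`):
```
from __future__ import annotations

import sys
from dataclasses import dataclass
from typing import get_type_hints


class Foo:
    pass


@dataclass
class Bar:
    foo: Foo


# Resolve the stringified annotations against Bar's module, not against
# the globals of the module where __init__ was synthesized.
print(get_type_hints(Bar.__init__,
                     globalns=vars(sys.modules[Bar.__module__])))
```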
I know that the use of lambdas to implement PEP 563 was rejected for
performance reasons. I could be wrong, but I think this was motivated by
variable annotations because the lambda would have to be constructed each
time the function body ran. I was wondering if I could motivate storing the
annotations as lambdas in class bodies and function signatures, where the
environment is already being captured and the code usually runs only once.
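To make the idea concrete, here is a toy illustration (not a proposed
implementation) of why a lambda would not have the lookup problem that a
bare string has:
```
class Foo:
    pass

as_string = "Foo"          # PEP 563 today: resolved later by namespace lookup
as_lambda = lambda: Foo    # the idea: a closure over the defining scope

# The string has to be evaluated against the "right" namespace, which is
# exactly what goes wrong for dataclass-generated methods:
print(eval(as_string, globals()))
# The lambda already captured where Foo lives, so it resolves anywhere:
print(as_lambda())
```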
Python 3.7.1rc1 and 3.6.7rc1 are now available. 3.7.1rc1 is the release
preview of the first maintenance release of Python 3.7, the latest feature
release of Python. 3.6.7rc1 is the release preview of the next maintenance
release of Python 3.6, the previous feature release of Python. Assuming no
critical problems are found prior to 2018-10-06, no code changes are
planned between these release candidates and the final releases. These
release candidates are intended to give you the opportunity to test the
new security and bug fixes in 3.7.1 and 3.6.7. We strongly encourage you
to test your projects and report issues found to bugs.python.org as soon
as possible.
Please keep in mind that these are preview releases and, thus, their use
is not recommended for production environments.
You can find these releases and more information here:
https://www.python.org/downloads/release/python-371rc1/
https://www.python.org/downloads/release/python-367rc1/
--
Ned Deily
nad(a)python.org
Sorry for mailing about a bug instead of putting in a bug tracker
ticket. The bug tracker's login system just sits there for a minute
then says "an error has occurred".
This line of code is broken in Windows:
https://github.com/python/cpython/blob/v2.7.15/Objects/fileobject.c#L721
_lseeki64 only modifies the kernel's seek position, not the cached
position stored in stdio. This sometimes leads to problems where
fgetpos does not return the correct file pointer.
The solution is to not do that "SIZEOF_FPOS_T >= 8" code at all on
Windows, and just do this instead (or make a new HAVE_FSEEKI64 macro):
#elif defined(MS_WINDOWS)
return _fseeki64(fp, offset, whence);
3.x is unaffected because on Windows it uses the Unix-like file-descriptor
API instead of stdio, where its use of _lseeki64 is correct.
Thanks,
Melissa
On 2018-09-25 16:01, Barry Warsaw wrote:
> Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API.
Even the *public* C API is not fully documented. For example, none of
the PyCFunction_... functions appears in the documentation.
What follows is the text of issue 34690:
https://bugs.python.org/issue34690
The PR is here:
https://github.com/python/cpython/pull/9320
I don't know if we should be discussing this here on python-dev, or on
bpo, or on Zulip, or on the soon-to-be-created Discourse. But maybe we
can talk about it somewhere!
//arry/
----
This patch was sent to me privately by Jeethu Rao at Facebook. It's a
change they're working with internally to improve startup time. What
I've been told by Carl Shapiro at Facebook is that we have their
blessing to post it publicly / merge it / build upon it for CPython.
Their patch was written for 3.6; I have massaged it to the point where
it minimally works with 3.8.
What the patch does: it takes all the Python modules that are loaded as
part of interpreter startup and deserializes the marshalled .pyc file
into precreated objects stored as static C data. You add this .c file
to the Python build. Then there's a patch to Python itself (about 250
lines iirc) that teaches it to load modules from these data structures.
I wrote a quick dumb test harness to compare this patch vs 3.8 stock.
It runs a command line 500 times and uses time.perf_counter to time the
process. On a fast quiescent laptop I observe a 21-22% improvement:
cmdline: ['./python', '-c', 'pass']
500 runs:
sm38
average time 0.006302303705982922
best 0.006055746000129147
worst 0.00816565500008437
clean38
average time 0.007969956444008858
best 0.007829047999621253
worst 0.008812210000542109
improvement 0.20924239043734505 %
cmdline: ['./python', '-c', 'import io']
500 runs:
sm38
average time 0.006297688038004708
best 0.005980765999993309
worst 0.0072462130010535475
clean38
average time 0.007996319670004595
best 0.0078091849991324125
worst 0.009175700999549008
improvement 0.21242667903482038 %
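(The harness is nothing fancy; it is essentially the following shape,
though the details here are reconstructed rather than the exact script:)
```
import subprocess
import time

def time_command(argv, runs=500):
    # Run argv repeatedly and record the wall-clock time of each run.
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), min(samples), max(samples)

avg, best, worst = time_command(['./python', '-c', 'pass'])
print('average time', avg)
print('best', best)
print('worst', worst)
```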
The downside of the patch: for these modules it ignores the Python files
on disk--it doesn't even stat them. If you add stat calls you lose half
of the speed improvement. I believe they added a work-around, where you
can set a flag (command-line? environment variable? I don't know, I
didn't go looking for it) that tells Python "don't use the frozen
modules" and it loads all those files from disk.
I don't propose to merge the patch in its current state. I think it
would need a lot of work both in terms of "doing things the way Python
does it" as well as just code smell (the serializer is implemented in
both C and Python and jumps back and forth, also the build process for
the serialized modules is pretty tiresome).
Is it worth working on?
The PSF has received a few inquiries asking the question — “How do I cite
Python?” So, I am reaching out to you all to figure this out.
(For those who don’t know my background, I have been in academia for a bit
as a Ph.D. student and have worked at the Library of Congress writing code
to process MARC records <https://www.loc.gov/marc/bibliographic/>, among
other things.)
IMHO the citation for Python should be decided upon by the Python
developers and should live somewhere on the site.
Two questions to be answered…
1. What format should it take?
2. Where does it live on the site?
To help frame the first one, I quickly wrote this up —
https://docs.google.com/document/d/1R0mo8EYVIPNkmNBImpcZTbk0e78T2oU71ioX5Nv…
tldr; Summary of possibilities…
1. Article for one citation (1 DOI, generated by the publication)
2. No article (many DOIs — one for each major version through Zenodo
<https://zenodo.org/record/1404209> (or similar service))
Discuss.
-Jackie
Jackie Kazil
Board of Directors, PSF