Hello,

First, I would like to say that I have no fundamental problem with this PEP. While I agree with Nathaniel that the rationale given about the CSP concurrency model seems a bit weak, the author is obviously expressing his opinion there and I won't object to that. However, I think the PEP is desirable for other reasons. Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experiments, including possibilities that we on python-dev have not thought about.

The appeal of the PEP for experimentation is manifold:

1) the ability to concurrently run independent execution environments without spawning child processes (which on some platforms and in some situations may not be very desirable: for example on Windows, where the cost of spawning is rather high; also, child processes may crash, and sometimes it is not easy for the parent to recover, especially if a synchronization primitive is left in an unexpected state)

2) the potential for parallelizing CPU-bound pure Python code in a single process, if a per-interpreter GIL is finally implemented

3) easier support for sharing large data between separate execution environments, without the hassle of setting up shared memory or the fragility of relying on fork() semantics

(and, as I said, I hope people find other applications)

As for the argument that we already have asyncio and several other packages, I actually think that combining these different concurrency mechanisms would be interesting for complex applications (such as distributed systems). For that, however, I think the PEP as currently written is a bit lacking; see below.

Now for the detailed comments.

* I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.
* The "association" timing seems quirky and potentially annoying: an interpreter only becomes associated with a channel the first time it calls recv() or send(). How about, instead, associating an interpreter with a channel as soon as that channel is given to it through `Interpreter.run(..., channels=...)` (or received through `recv()`)?

* How hard would it be, in the current implementation, to add buffering to channels? It doesn't have to be infinite: you can choose a fixed buffer size (or make it configurable in the create() function, which allows passing 0 for unbuffered). Like Nathaniel, I think unbuffered channels will quickly be annoying to work with (yes, you can create a helper thread... but now you have one additional thread per channel, which isn't pretty -- especially with the GIL).

* In the same vein, I think channels should allow adding readiness callbacks (that are called whenever a channel becomes ready for sending or receiving, respectively). This would make it easy to plug them into an event loop or other concurrency systems (such as Future-based concurrency). Note that each interpreter "associated" with a channel should be able to set its own readiness callback: so one callback per Python object representing the channel, but potentially multiple callbacks for the underlying channel primitive.

(how would the callback be scheduled for execution in the right interpreter? perhaps using `_PyEval_AddPendingCall()` or a similar mechanism?)

* I think either `interpreters.get_main()` or `interpreters.is_main()` is desirable. Inevitably, the slight differences between main and non-main interpreters will surface in non-trivial applications (finalization issues in distributed systems can really be hairy). It seems this should be mostly costless to provide, so let's do it.

* I do think a minimal synchronization primitive would be nice.
Either a Lock (in the Python sense) or a Semaphore: both should be relatively easy to provide, by wrapping an OS-level synchronization primitive. Then you can recreate all high-level synchronization primitives, like the threading and multiprocessing modules do (using a Lock or a Semaphore, respectively).

(note you should be able to emulate a semaphore using blocking send() and recv() calls, but that's probably not very efficient, and efficiency is important)

Of course, I hope these are all actionable before beta1 :-) If not, here is my preferential priority list:

* High priority: fix association timing
* High priority: either buffering /or/ readiness callbacks
* Middle priority: get_main() /or/ is_main()
* Middle / low priority: a simple synchronization primitive

But I would stress that the more of these we provide, the more we encourage people to experiment without pulling out too much of their hair.

(also, of course, I hope other people read the PEP and emit feedback)

Best regards

Antoine.
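Antoine's parenthetical about emulating a semaphore with blocking send()/recv() looks like this in miniature. Since the proposed interpreters channels don't exist yet, queue.Queue stands in for a buffered channel here; only the token-passing pattern is the point:

```python
import queue


class ChannelSemaphore:
    """A counting semaphore built on a buffered channel.

    queue.Queue is a stand-in for a PEP 554 channel: put() plays the
    role of send() and get() the role of a blocking recv().
    """

    def __init__(self, value=1):
        self._chan = queue.Queue()
        for _ in range(value):
            self._chan.put(None)  # pre-load one token per available slot

    def acquire(self):
        # Blocks until a token is available, like a blocking recv().
        self._chan.get()

    def release(self):
        # Returns a token, like send(); wakes one blocked acquirer.
        self._chan.put(None)


sem = ChannelSemaphore(2)
sem.acquire()
sem.acquire()   # both slots now taken
sem.release()
sem.acquire()   # succeeds because a token was returned
```

As Antoine notes, each acquire/release round-trips through the channel machinery, which is why this is probably less efficient than a dedicated primitive.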
On Sat, 18 Apr 2020 19:02:47 +0200 Antoine Pitrou <solipsis@pitrou.net> wrote:
* I do think a minimal synchronization primitive would be nice. Either a Lock (in the Python sense) or a Semaphore: both should be relatively easy to provide, by wrapping an OS-level synchronization primitive. Then you can recreate all high-level synchronization primitives, like the threading and multiprocessing modules do (using a Lock or a Semaphore, respectively).
By the way, perhaps this could even be implemented as making _threading.Lock shareable. This would probably require some changes in the underlying C Lock structure (e.g. pointing to an atomically-refcounted shared control block), but nothing intractable, and reasonably efficient.

Regards

Antoine.
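A Python-level sketch of the shared-control-block idea Antoine describes; the real change would be in the C Lock structure, and every name here is made up for illustration:

```python
import threading


class _LockControlBlock:
    """Stands in for the atomically-refcounted C control block: one
    OS-level lock plus a count of the interpreters sharing it."""

    def __init__(self):
        self.oslock = threading.Lock()      # the real synchronization primitive
        self.refcount = 1
        self._refmutex = threading.Lock()   # guards refcount updates

    def incref(self):
        with self._refmutex:
            self.refcount += 1


class ShareableLock:
    """Each interpreter would hold its own wrapper around the same block."""

    def __init__(self, block=None):
        if block is None:
            self._block = _LockControlBlock()
        else:
            block.incref()                  # a new interpreter now shares it
            self._block = block

    def share(self):
        # What "sending" the lock through a channel would do under the hood.
        return ShareableLock(self._block)

    def acquire(self, blocking=True):
        return self._block.oslock.acquire(blocking)

    def release(self):
        self._block.oslock.release()


lock_a = ShareableLock()
lock_b = lock_a.share()                     # handed to another interpreter
lock_a.acquire()
contended = lock_b.acquire(blocking=False)  # False: same underlying lock
lock_a.release()
```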
On Sat, Apr 18, 2020 at 11:16 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
* I do think a minimal synchronization primitive would be nice. Either a Lock (in the Python sense) or a Semaphore: both should be relatively easy to provide, by wrapping an OS-level synchronization primitive. Then you can recreate all high-level synchronization primitives, like the threading and multiprocessing modules do (using a Lock or a Semaphore, respectively).
(note you should be able to emulate a semaphore using blocking send() and recv() calls, but that's probably not very efficient, and efficiency is important)
You make a good point about efficiency. The blocking is definitely why I figured we could get away with avoiding a locking primitive.

One reason I wanted to avoid a shareable synchronization primitive is that I've had many bad experiences with something similar in Go (mixing locks, channels, and goroutines). I'll also admit that the ideas in CSP had an impact on this. :) Mixing channels and locks can be a serious pain point. So if we do end up supporting shared locks, I suppose I'd feel better about it if we had an effective way to discourage folks from using them normally. Two possible approaches:

* keep them in a separate module on PyPI that folks could use when experimenting
* add a shareable lock class (to the "interpreters" module) with a name that makes it clear you shouldn't use it normally.

If blocking send/recv were efficient enough, I'd rather not have a shareable lock at all. Or I suppose it could be re-implemented later using a channel. :)

On Sat, Apr 18, 2020 at 11:30 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
By the way, perhaps this could even be implemented as making _threading.Lock shareable. This would probably require some changes in the underlying C Lock structure (e.g. pointing to an atomically-refcounted shared control block), but nothing intractable, and reasonably efficient.
Making _threading.Lock shareable kind of seems like the best way to go. Honestly I was already looking into it relative to the implementation for the low-level channel_send_wait(). [1] However, I got nervous about that as soon as I started looking at how to separate the low-level mutex from the Lock object (so it could be shared). :) So I'd probably want some help on the implementation work. -eric [1] https://www.python.org/dev/peps/pep-0554/#return-a-lock-from-send
On 19/04/20 5:02 am, Antoine Pitrou wrote:
* How hard would it be, in the current implementation, to add buffering to channels?
* In the same vein, I think channels should allow adding readiness callbacks
Of these, I think the callbacks are more fundamental. If you have a non-buffered channel with readiness callbacks, you can implement a buffered channel on top of it. -- Greg
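Greg's claim (that buffering can be layered on top of readiness callbacks) can be demonstrated with a toy pair of classes. Nothing here is the PEP's API; it only shows the layering:

```python
import collections
import threading


class CallbackChannel:
    """Toy unbuffered channel: send() hands the item straight to the
    receive-readiness callback, blocking the sender until one is set."""

    def __init__(self):
        self._callback = None
        self._cv = threading.Condition()

    def set_recv_callback(self, fn):
        with self._cv:
            self._callback = fn
            self._cv.notify_all()

    def send(self, item):
        with self._cv:
            while self._callback is None:   # rendezvous: wait for a receiver
                self._cv.wait()
            self._callback(item)


class BufferedChannel:
    """Buffering built purely on the callback mechanism: the callback
    drains every sent item into a deque, so send() never has to wait."""

    def __init__(self):
        self._raw = CallbackChannel()
        self._buffer = collections.deque()
        self._raw.set_recv_callback(self._buffer.append)

    def send(self, item):
        self._raw.send(item)

    def recv(self):
        return self._buffer.popleft()


chan = BufferedChannel()
for i in range(3):
    chan.send(i)    # returns immediately; items sit in the buffer
```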
On Sat, Apr 18, 2020 at 6:50 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 19/04/20 5:02 am, Antoine Pitrou wrote:
* How hard would it be, in the current implementation, to add buffering to channels?
* In the same vein, I think channels should allow adding readiness callbacks
Of these, I think the callbacks are more fundamental. If you have a non-buffered channel with readiness callbacks, you can implement a buffered channel on top of it.
Some questions:

* Do you think it is worth adding readiness callbacks if we already have channel buffering?
* Would a low-level channel implementation based on callbacks or locks be better (simpler, faster, etc.) than one based on buffering?
* Would readiness callbacks in the high-level API be more or less user-friendly than alternatives: optional blocking, a lock, etc.?

FWIW, I tend to find callbacks a greater source of complexity than alternatives.

Thanks!

-eric
On 21/04/20 8:29 am, Eric Snow wrote:
* Would a low-level channel implementation based on callbacks or locks be better (simpler, faster, etc.) than one based on buffering?
Depends on what you mean by "better". Callbacks are more versatile; a buffered channel just does buffering, but with callbacks you can do other things, e.g. hooking into an event loop.
* Would readiness callbacks in the high-level API be more or less user-friendly than alternatives: optional blocking, a lock, etc.?
I would consider callbacks to be part of a low-level layer that you wouldn't use directly most of the time. Some user-friendly high-level things such as buffered channels would be provided. Efficiency is a secondary consideration. If it turns out to be a problem, that can be addressed later. -- Greg
On Mon, Apr 20, 2020 at 4:19 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 21/04/20 8:29 am, Eric Snow wrote:
* Would a low-level channel implementation based on callbacks or locks be better (simpler, faster, etc.) than one based on buffering?
Depends on what you mean by "better". Callbacks are more versatile; a buffered channel just does buffering, but with callbacks you can do other things, e.g. hooking into an event loop.
Thanks for clarifying. For the event loop case, what is the downside to adapting to the API in the existing proposal?
* Would readiness callbacks in the high-level API be more or less user-friendly than alternatives: optional blocking, a lock, etc.?
I would consider callbacks to be part of a low-level layer that you wouldn't use directly most of the time. Some user-friendly high-level things such as buffered channels would be provided.
Ah, PEP 554 is just about the high-level API. Currently in the low-level API recv() doesn't ever block (instead raising ChannelEmptyError if empty) and channel_send() returns a pre-acquired lock that releases once the object is received. I'm not opposed to a different low-level API, but keep in mind that we're short on time.
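The low-level semantics Eric describes (a never-blocking recv() that raises ChannelEmptyError, and a send() that returns a pre-acquired lock released once the object is received) can be modeled single-process. The class below is a toy stand-in, not the real C implementation; only ChannelEmptyError and the returned-lock behavior come from the discussion:

```python
import queue
import threading


class ChannelEmptyError(Exception):
    """Raised by the low-level recv() when no object is waiting."""


class LowLevelChannel:
    """Toy model of the low-level API: recv() never blocks, and send()
    returns an already-acquired lock that is released on receipt."""

    def __init__(self):
        self._items = queue.Queue()

    def send(self, obj):
        done = threading.Lock()
        done.acquire()                  # hand back a pre-acquired lock
        self._items.put((obj, done))
        return done

    def recv(self):
        try:
            obj, done = self._items.get_nowait()
        except queue.Empty:
            raise ChannelEmptyError from None
        done.release()                  # signals the sender's returned lock
        return obj


chan = LowLevelChannel()
pending = chan.send("hello")            # pending is held until receipt
received = chan.recv()                  # now pending can be acquired again
```

A sender wanting blocking semantics would simply call `chan.send(obj).acquire()`, which returns once the receiver has taken the object.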
Efficiency is a secondary consideration. If it turns out to be a problem, that can be addressed later.
+1 -eric
On 21/04/20 10:35 am, Eric Snow wrote:
For the event loop case, what is the downside to adapting to the API in the existing proposal?
If you mean the suggestion of having async send and receive methods, that's okay. My point was that before responding to requests to add individual features such as buffered channels, it might be better to provide a general mechanism that would allow people to implement their own favourite features themselves.

It seems like you're going to need some kind of callback mechanism under the covers to implement async send and receive anyway, so you may as well expose it as part of the official API.

-- Greg
Thanks for the feedback, Antoine. I've responded inline below and will be making appropriate changes to the PEP. One point I'd like to reinforce before my comments is the PEP's emphasis on minimalism. From PEP 554:

    This proposal is focused on enabling the fundamental capability of multiple isolated interpreters in the same Python process. This is a new area for Python so there is relative uncertainty about the best tools to provide as companions to subinterpreters. Thus we minimize the functionality we add in the proposal as much as possible.

I don't think anything you've mentioned really deviates much from that, and making the module provisional helps. I just want us to be careful not to add stuff that we'll decide we want to remove later. :)

FYI, I'm already updating the PEP based on feedback from the other email thread. I'll let you know once all the updates are done.

On Sat, Apr 18, 2020 at 11:16 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
First, I would like to say that I have no fundamental problem with this PEP. While I agree with Nathaniel that the rationale given about the CSP concurrency model seems a bit weak, the author is obviously expressing his opinion there and I won't object to that. However, I think the PEP is desirable for other reasons. Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experiments, including possibilities that we on python-dev have not thought about.
The experimentation angle is one I didn't consider all that much, but you make a good point.
The appeal of the PEP for experimentation is manifold:

1) the ability to concurrently run independent execution environments without spawning child processes (which on some platforms and in some situations may not be very desirable: for example on Windows, where the cost of spawning is rather high; also, child processes may crash, and sometimes it is not easy for the parent to recover, especially if a synchronization primitive is left in an unexpected state)

2) the potential for parallelizing CPU-bound pure Python code in a single process, if a per-interpreter GIL is finally implemented

3) easier support for sharing large data between separate execution environments, without the hassle of setting up shared memory or the fragility of relying on fork() semantics
(and as I said, I hope people find other applications)
These are covered in the PEP, though not together in the rationale, etc. Should I add explicit mention of experimentation as a motivation in the abstract or rationale sections? Would you like me to add a dedicated paragraph/section covering experimentation?
As for the argument that we already have asyncio and several other packages, I actually think that combining these different concurrency mechanisms would be interesting for complex applications (such as distributed systems). For that, however, I think the PEP as currently written is a bit lacking; see below.
Yeah, that would be interesting. What in particular will help make subinterpreters and asyncio more cooperative?
Now for the detailed comments.
* I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.
Sounds good.
* The "association" timing seems quirky and potentially annoying: an interpreter only becomes associated with a channel the first time it calls recv() or send(). How about, instead, associating an interpreter with a channel as soon as that channel is given to it through `Interpreter.run(..., channels=...)` (or received through `recv()`)?
That seems fine to me. I do not recall the exact reason for tying association to recv() or send(). I only vaguely remember doing it that way for a technical reason. If I determine that reason then I'll bring it up. In the meantime I'll update the PEP to associate interpreters when the channel end is sent.

FWIW, it may have been influenced by the automatic channel closing when no interpreters are associated. If interpreters are associated when channel ends are sent (rather than when used) then interpreters will have to be more careful about releasing channels. That's just a guess as to why I did it that way. :)
* How hard would it be, in the current implementation, to add buffering to channels? It doesn't have to be infinite: you can choose a fixed buffer size (or make it configurable in the create() function, which allows passing 0 for unbuffered). Like Nathaniel, I think unbuffered channels will quickly be annoying to work with (yes, you can create a helper thread... now you have one additional thread per channel, which isn't pretty -- especially with the GIL).
Currently the low-level implementation supports "infinite" channel buffering. The restriction in proposed high-level API was there to allow us to go with a simpler low-level implementation. However, I don't think that is necessary at this point. I'll update the PEP.
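Antoine's suggested create() knob might look roughly like the sketch below, with queue.Queue standing in for real channel ends. The parameter name and the None-for-unbounded convention are assumptions for illustration, not the PEP's API:

```python
import queue


def create(buffersize=0):
    """Sketch of a configurable create(): 0 means unbuffered, None means
    the unbounded buffering the low-level implementation already supports,
    and any other value is a fixed buffer size."""
    if buffersize == 0:
        # queue.Queue has no true rendezvous mode; maxsize=1 is the
        # closest stand-in for an unbuffered channel in this sketch.
        return queue.Queue(maxsize=1)
    if buffersize is None:
        return queue.Queue()            # unbounded, as in the low level
    return queue.Queue(maxsize=buffersize)


chan = create(buffersize=3)
for i in range(3):
    chan.put(i)                         # none of these sends block
full = chan.full()                      # a fourth put() would now block
```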
* In the same vein, I think channels should allow adding readiness callbacks (that are called whenever a channel becomes ready for sending or receiving, respectively). This would make it easy to plug them into an event loop or other concurrency systems (such as Future-based concurrency). Note that each interpreter "associated" with a channel should be able to set its own readiness callback: so one callback per Python object representing the channel, but potentially multiple callbacks for the underlying channel primitive.
Would this be as useful if we have buffered channels? It sounds like you wanted one or the other but not both.
(how would the callback be scheduled for execution in the right interpreter? perhaps using `_PyEval_AddPendingCall()` or a similar mechanism?)
Yeah, the pending call machinery has become my dear friend for several parts of the low-level implementation for the PEP. :)
* I think either `interpreters.get_main()` or `interpreters.is_main()` is desirable. Inevitably, the slight differences between main and non-main interpreters will surface in non-trivial applications (finalization issues in distributed systems can really be hairy). It seems this should be mostly costless to provide, so let's do it.
In the PEP (https://www.python.org/dev/peps/pep-0554/#get-main) I have this listed as a deferred functionality:

    for the basic functionality of a high-level API a get_main() function is not necessary. Furthermore, there is no requirement that a Python implementation have a concept of a main interpreter. So until there's a clear need we'll leave get_main() out.

My preference would be to leave it out, since it's much harder to remove something later than to add it later. However, it isn't a major issue and is one of the deferred bits that I almost kept in the PEP. :) So I'll go ahead and add it to the proposed API.
* I do think a minimal synchronization primitive would be nice. Either a Lock (in the Python sense) or a Semaphore: both should be relatively easy to provide, by wrapping an OS-level synchronization primitive. Then you can recreate all high-level synchronization primitives, like the threading and multiprocessing modules do (using a Lock or a Semaphore, respectively).
(note you should be able to emulate a semaphore using blocking send() and recv() calls, but that's probably not very efficient, and efficiency is important)
I'll address this specific ask in a separate post, to keep the discussion focused.
Of course, I hope these are all actionable before beta1 :-) If not, here is my preferential priority list:
* High priority: fix association timing * High priority: either buffering /or/ readiness callbacks * Middle priority: get_main() /or/ is_main()
These should be doable for beta1 since they're either trivial or already done. :)
* Middle / low priority: a simple synchronization primitive
This might be harder to get done for beta1. That said, with a provisional status we may be able to add it after beta1. :)
But I would stress the more of these we provide, the more we encourage people to experiment without pulling too much of their hair.
Good point. I think the emphasis on experimentation is valuable. Thanks again, -eric
On Mon, 20 Apr 2020 14:22:03 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
The appeal of the PEP for experimentation is manifold:

1) the ability to concurrently run independent execution environments without spawning child processes (which on some platforms and in some situations may not be very desirable: for example on Windows, where the cost of spawning is rather high; also, child processes may crash, and sometimes it is not easy for the parent to recover, especially if a synchronization primitive is left in an unexpected state)

2) the potential for parallelizing CPU-bound pure Python code in a single process, if a per-interpreter GIL is finally implemented

3) easier support for sharing large data between separate execution environments, without the hassle of setting up shared memory or the fragility of relying on fork() semantics
(and as I said, I hope people find other applications)
These are covered in the PEP, though not together in the rationale, etc. Should I add explicit mention of experimentation as a motivation in the abstract or rationale sections? Would you like me to add a dedicated paragraph/section covering experimentation?
I was mostly exposing my thought process here :-) IOW, you don't have to do anything, except if you think that would be helpful.
As for the argument that we already have asyncio and several other packages, I actually think that combining these different concurrency mechanisms would be interesting for complex applications (such as distributed systems). For that, however, I think the PEP as currently written is a bit lacking; see below.
Yeah, that would be interesting. What in particular will help make subinterpreters and asyncio more cooperative?
Readiness callbacks would help wrangle any kind of asynchronous / event-driven framework around subinterpreters.
* In the same vein, I think channels should allow adding readiness callbacks (that are called whenever a channel becomes ready for sending or receiving, respectively). This would make it easy to plug them into an event loop or other concurrency systems (such as Future-based concurrency). Note that each interpreter "associated" with a channel should be able to set its own readiness callback: so one callback per Python object representing the channel, but potentially multiple callbacks for the underlying channel primitive.
Would this be as useful if we have buffered channels? It sounds like you wanted one or the other but not both.
Both are useful at somewhat different levels (though as Greg said, if you have readiness callbacks, you can probably cook up a buffering layer using them). Especially, readiness callbacks (or some other form of push notification) are desirable for reasonable interaction with an event loop.
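Antoine's event-loop point can be made concrete. Everything below is illustrative: NotifyingChannel is a toy stand-in for a channel that supports a recv-readiness callback (no such API exists yet); the point is the call_soon_threadsafe bridging pattern for asyncio:

```python
import asyncio
import queue


class NotifyingChannel:
    """Toy channel with the recv-readiness callback Antoine asks for:
    the callback fires whenever data becomes available."""

    def __init__(self):
        self._items = queue.Queue()
        self._on_ready = None

    def set_ready_callback(self, fn):
        self._on_ready = fn

    def send(self, item):
        self._items.put(item)
        if self._on_ready is not None:
            self._on_ready()            # push notification to the receiver

    def recv_nowait(self):
        return self._items.get_nowait()


async def recv_async(chan):
    """Bridge the callback into asyncio: the readiness callback just
    schedules an Event to be set on the loop's thread."""
    loop = asyncio.get_running_loop()
    ready = asyncio.Event()
    chan.set_ready_callback(lambda: loop.call_soon_threadsafe(ready.set))
    await ready.wait()
    return chan.recv_nowait()


async def main():
    chan = NotifyingChannel()
    loop = asyncio.get_running_loop()
    loop.call_later(0.01, chan.send, "ping")   # a send from "elsewhere"
    return await recv_async(chan)


result = asyncio.run(main())
```

The same callback could just as well resolve a Future or feed a selector-style wakeup pipe; call_soon_threadsafe is simply the thread-safe entry point asyncio already provides for exactly this kind of external notification.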
Of course, I hope these are all actionable before beta1 :-) If not, here is my preferential priority list:
* High priority: fix association timing * High priority: either buffering /or/ readiness callbacks * Middle priority: get_main() /or/ is_main()
These should be doable for beta1 since they're either trivial or already done. :)
Great :-) Best regards Antoine.
On Mon, Apr 20, 2020 at 2:22 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Sat, Apr 18, 2020 at 11:16 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
* The "association" timing seems quirky and potentially annoying: an interpreter only becomes associated with a channel the first time it calls recv() or send(). How about, instead, associating an interpreter with a channel as soon as that channel is given to it through `Interpreter.run(..., channels=...)` (or received through `recv()`)?
That seems fine to me. I do not recall the exact reason for tying association to recv() or send(). I only vaguely remember doing it that way for a technical reason. If I determine that reason then I'll bring it up. In the meantime I'll update the PEP to associate interpreters when the channel end is sent.
FWIW, it may have been influenced by the automatic channel closing when no interpreters are associated. If interpreters are associated when channel ends are sent (rather than when used) then interpreters will have to be more careful about releasing channels. That's just a guess as to why I did it that way. :)
As I've gone to update the PEP for this I'm feeling less comfortable with changing it. There is a subtle difference which concretely manifests in two ways.

Firstly, the programmatic exposure of "associated" (SendChannel.interpreters and RecvChannel.interpreters) would be different. With the current specification, "associated" means "has been used by". With your recommendation it would mean "is accessible by". Is it more useful to think about them one way or the other? Would there be value in making both meanings part of the API separately ("associated" + "bound") somehow?

Secondly, with the current spec channels get automatically closed sooner, effectively as soon as all wrapping objects *that were used* are garbage collected (or released). With your recommendation it only happens as soon as all wrapping objects are garbage collected (or released). In the former case channels could get auto-closed before you expect them to. In the latter case they could leak if users forget to release them when unused. Is there a good way to address both downsides?

-eric
On Mon, Apr 20, 2020 at 4:23 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
As I've gone to update the PEP for this I'm feeling less comfortable with changing it.
Also, the resulting text of the PEP makes it a little harder to follow when an interpreter gets associated. However, this is partly an artifact of the structure of the PEP. (The details of association need to be moved to a separate section.) The same situation would apply to docs. However, I'm not sure it would be a problem in practice. -eric
On 21/04/20 10:47 am, Eric Snow wrote:
On Mon, Apr 20, 2020 at 4:23 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
As I've gone to update the PEP for this I'm feeling less comfortable with changing it.
I don't get this whole business of channels being associated with interpreters, or why there needs to be a distinction between release() and close().

To my mind, a channel reference should be like a file descriptor for a pipe. When you've finished with it, you close() it. When the last reference to a channel is closed or garbage collected, the channel disappears. Why make it any more complicated than that?

You seem to be worried about channels getting leaked if someone forgets to close them. But it's just the same for files and pipes, and nobody seems to worry about that.

-- Greg
On Tue, Apr 21, 2020 at 1:39 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't get this whole business of channels being associated with interpreters, or why there needs to be a distinction between release() and close().
To my mind, a channel reference should be like a file descriptor for a pipe. When you've finished with it, you close() it. When the last reference to a channel is closed or garbage collected, the channel disappears. Why make it any more complicated than that?
You've mostly described what the PEP proposes: when all objects wrapping a channel in an interpreter are destroyed, that channel is automatically released. When all interpreters have released the channel then it is automatically closed.

The main difference is that the PEP also provides a way to explicitly release or close a channel. Providing just "close()" would mean one interpreter could stomp on all other interpreters' use of a channel. Working around that would require clunky coordination (likely through other channels). The alternative ("release()") is much simpler.
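The lifecycle described above (per-interpreter release, automatic close once every interpreter has released, plus an explicit close() that stomps on everyone) can be modeled in a few lines. The class and method names are illustrative, not the PEP's API:

```python
class SharedChannelState:
    """Toy model of the PEP's channel lifecycle: each interpreter
    releases its use independently; the channel closes for everyone only
    when the last interpreter has released it, or on an explicit close()."""

    def __init__(self):
        self.associated = set()
        self.closed = False

    def associate(self, interp_id):
        self.associated.add(interp_id)

    def release(self, interp_id):
        # Per-interpreter: "I'm done with my end of this channel."
        self.associated.discard(interp_id)
        if not self.associated:
            self.closed = True     # auto-close once everyone has released

    def close(self):
        # Global: stomps on every other interpreter's use of the channel.
        self.associated.clear()
        self.closed = True


state = SharedChannelState()
state.associate("main")
state.associate("sub-1")
state.release("main")              # channel stays open for sub-1
still_open = not state.closed
state.release("sub-1")             # last release auto-closes the channel
```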
You seem to be worried about channels getting leaked if someone forgets to close them. But it's just the same for files and pipes, and nobody seems to worry about that.
Fair enough. :) -eric
On 22/04/20 3:57 am, Eric Snow wrote:
The main difference is that the PEP also provides a way to explicitly release or close a channel. Providing just "close()" would mean one interpreter could stomp on all other interpreters' use of a channel.
What I'm suggesting is that close() should do what the PEP defines release() as doing, and release() shouldn't exist.

I don't see why an interpreter needs the ability to close a channel for any *other* interpreter. There is no such ability for files and pipes.

-- Greg
On Tue, Apr 21, 2020 at 11:21 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
What I'm suggesting is that close() should do what the PEP defines release() as doing, and release() shouldn't exist.
I don't see why an interpreter needs the ability to close a channel for any *other* interpreter. There is no such ability for files and pipes.
Ah, thanks for clarifying. One of the main inspirations for the proposed channels is CSP (and, somewhat relatedly, my in-depth experience with Go). Channels are more than just a thread-safe data transport between interpreters. They also provide relatively straightforward mechanisms for managing cooperation in a group of interpreters. Having a distinct "close()" vs. "release()" is part of that. Furthermore, IMHO "release" is better at communicating the per-interpreter nature than "close". "release()" doesn't close the channel. It communicates that that particular interpreter is done using that end of the channel.

I appreciate that you brought up comparisons with other objects and data types. I'm a fan of adapting existing APIs and patterns, especially from proven sources. That said, the comparison with files would be more complete if channels were persistent. With pipes the main difference is how many actors are involved. Pipes involve one sender and one receiver, right?

FWIW, I also looked at other data types. Queues are the closest thing to the proposed channels, and I almost called them that, but there are a few subtle differences from queue.Queue and I didn't want folks inadvertently confusing the two.

-eric
Even then, disconnect seems like the primary use case, with a channel.kill_for_all being a specialized subclass. One argument for leaving it to a subclass is that it isn't clear what other interpreters should do when that happens. Shut down? Start getting exceptions if they happen to use it again, with no information until then?
On 29/04/20 2:12 pm, Eric Snow wrote:
One of the main inspirations for the proposed channels is CSP (and somewhat relatedly, my in-depth experience with Go). Channels are more than just a thread-safe data transport between interpreters.
It's a while since I paid attention to the fine details of CSP. I'll have to do some research on that.
Furthermore, IMHO "release" is better at communicating the per-interpreter nature than "close".
Channels are a similar enough concept to pipes that I think it would be confusing to have "close" mean "close for all interpreters". Everyone understands that "closing" a pipe only means you're closing your reference to one end of it, and they will probably assume closing a channel means the same. Maybe it would be better to have a different name such as "destroy" for a complete shutdown.
With pipes the main difference is how many actors are involved. Pipes involve one sender and one receiver, right?
Not necessarily. Mostly they're used that way, but there's nothing to stop multiple processes having a handle on the reading or writing end of a pipe simultaneously. Of course you have to be careful about how you interleave the reads and writes if you do that. -- Greg
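Greg's pipe point is easy to demonstrate. Here two threads stand in for the multiple processes he mentions, sharing the write end of a single os.pipe(); the descriptor-sharing works the same either way:

```python
import os
import threading

# Nothing limits a pipe end to a single user: both threads below write
# to the same write-end descriptor of one pipe.
r, w = os.pipe()

def writer(msg):
    os.write(w, msg)       # writes of <= PIPE_BUF bytes are atomic (POSIX)

t1 = threading.Thread(target=writer, args=(b"aaaa",))
t2 = threading.Thread(target=writer, args=(b"bbbb",))
t1.start(); t2.start()
t1.join(); t2.join()

data = b""
while len(data) < 8:       # both writers are done, so 8 bytes are waiting
    data += os.read(r, 8 - len(data))
os.close(r)
os.close(w)
```

Because each 4-byte write is atomic, the reader sees one message followed by the other, never an interleaving of their bytes; with larger writes, that careful interleaving Greg mentions becomes the callers' problem.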
On Wed, Apr 29, 2020, 22:05 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Furthermore, IMHO "release" is better at communicating the per-interpreter nature than "close".
Channels are a similar enough concept to pipes that I think it would be confusing to have "close" mean "close for all interpreters". Everyone understands that "closing" a pipe only means you're closing your reference to one end of it, and they will probably assume closing a channel means the same.
FWIW, I'd compare channels more closely to queues than to pipes. -eric
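The queue comparison can be made concrete with `queue.Queue` and threads (an analogy only; the PEP's channels cross interpreters, not threads). Unlike a pipe's byte stream, a queue delivers discrete objects, and multiple producers can feed it without their messages interleaving:

```python
import queue
import threading

q = queue.Queue()   # message-oriented, like the proposed channels

def producer(name):
    # each put() delivers one whole object; messages never interleave
    for i in range(3):
        q.put((name, i))

threads = [threading.Thread(target=producer, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

items = [q.get() for _ in range(6)]
print(sorted(items))
```

The per-object delivery is what makes "close for me" vs. "close for everyone" a real design question: several senders and receivers can hold the same channel at once.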
On 21/04/20 10:23 am, Eric Snow wrote:
with the current spec channels get automatically closed sooner, effectively as soon as all wrapping objects *that were used* are garbage collected (or released).
Maybe I'm missing something, but just because an object hasn't been used *yet* doesn't mean it isn't going to be used in the future, so isn't this wildly wrong? -- Greg
On Tue, 21 Apr 2020 18:27:41 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 21/04/20 10:23 am, Eric Snow wrote:
with the current spec channels get automatically closed sooner, effectively as soon as all wrapping objects *that were used* are garbage collected (or released).
Maybe I'm missing something, but just because an object hasn't been used *yet* doesn't mean it isn't going to be used in the future, so isn't this wildly wrong?
That's my concern indeed. An interpreter may be willing to wait for incoming data in the future, without needing it immediately. (that incoming data may even represent something very trivial, such as a request to terminate itself) Regards Antoine.
On Tue, Apr 21, 2020 at 2:18 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 21 Apr 2020 18:27:41 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 21/04/20 10:23 am, Eric Snow wrote:
with the current spec channels get automatically closed sooner, effectively as soon as all wrapping objects *that were used* are garbage collected (or released).
Maybe I'm missing something, but just because an object hasn't been used *yet* doesn't mean it isn't going to be used in the future, so isn't this wildly wrong?
That's my concern indeed. An interpreter may be willing to wait for incoming data in the future, without needing it immediately.
(that incoming data may even represent something very trivial, such as a request to terminate itself)
Yeah, I had that same realization yesterday, and it didn't change after sleeping on it. I suppose the only question I have left is whether there is value to users in knowing which interpreters have *used* a particular channel. -eric
On Tue, 21 Apr 2020 09:36:22 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Tue, Apr 21, 2020 at 2:18 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 21 Apr 2020 18:27:41 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 21/04/20 10:23 am, Eric Snow wrote:
with the current spec channels get automatically closed sooner, effectively as soon as all wrapping objects *that were used* are garbage collected (or released).
Maybe I'm missing something, but just because an object hasn't been used *yet* doesn't mean it isn't going to be used in the future, so isn't this wildly wrong?
That's my concern indeed. An interpreter may be willing to wait for incoming data in the future, without needing it immediately.
(that incoming data may even represent something very trivial, such as a request to terminate itself)
Yeah, I had that same realization yesterday, and it didn't change after sleeping on it. I suppose the only question I have left is whether there is value to users in knowing which interpreters have *used* a particular channel.
I don't think so :-) Regards Antoine.
On Tue, Apr 21, 2020 at 10:33 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 21 Apr 2020 09:36:22 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
Yeah, I had that same realization yesterday, and it didn't change after sleeping on it. I suppose the only question I have left is whether there is value to users in knowing which interpreters have *used* a particular channel.
I don't think so :-)
Yeah, as with so much with this PEP, if it proves desirable then we can add it later. -eric
Hi, Le sam. 18 avr. 2020 à 19:16, Antoine Pitrou <solipsis@pitrou.net> a écrit :
Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experimentations, including possibilities that we on python-dev have not thought about. (...) * I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.
Would it make sense to start by adding the module as a private "_subinterpreters" module but document it? The "_" prefix would be a reminder that "hey! it's experimental, there is no backward-compatibility guarantee there". We can also add a big warning in the documentation. Victor
On Tuesday, April 21, 2020 9:20 AM Victor Stinner [mailto:vstinner@python.org] wrote
Hi,
Le sam. 18 avr. 2020 à 19:16, Antoine Pitrou <solipsis@pitrou.net> a écrit :
Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experimentations, including possibilities that we on python-dev have not thought about. (...) * I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.
Would it make sense to start by adding the module as a private "_subinterpreters" module but document it? The "_" prefix would be a reminder that "hey! it's experimental, there is no backward-compatibility guarantee there".
We can also add a big warning in the documentation.
Victor
What about requiring "from __future__ import subinterpreters" to use this? According to the docs, the purpose of __future__ is "to document when incompatible changes were introduced", and it does seem that this would be an incompatible change for some C extensions. --Edwin
__future__ imports only have effects on the parser and compiler. PEP 554 is mostly a Python module, currently named "_xxsubinterpreters". Victor Le mar. 21 avr. 2020 à 15:37, Edwin Zimmerman <edwin@211mainstreet.net> a écrit :
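Victor's point is visible in the stdlib itself: `__future__` is an ordinary module that merely records, for each feature, the releases involved and the flag the compiler checks. A short demonstration using the `barry_as_FLUFL` feature (chosen because its effect on the parser is directly observable):

```python
import __future__

# "from __future__ import X" is a compiler/parser directive, not a runtime
# import: the __future__ module only records each feature's compiler flag
# and the releases where it became optional/mandatory.
feat = __future__.barry_as_FLUFL           # easter-egg feature: the "<>" operator
code = compile("1 <> 2", "<demo>", "eval", flags=feat.compiler_flag)
print(eval(code))                          # True: with the flag, "<>" parses as !=

try:
    compile("1 <> 2", "<demo>", "eval")    # without the flag: SyntaxError
except SyntaxError:
    print("SyntaxError without the flag")
```

Since a __future__ import can only toggle compilation behavior like this, it can't gate the availability of a whole runtime module, which is why it doesn't fit the subinterpreters case.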
On Tuesday, April 21, 2020 9:20 AM Victor Stinner [mailto:vstinner@python.org] wrote
Hi,
Le sam. 18 avr. 2020 à 19:16, Antoine Pitrou <solipsis@pitrou.net> a écrit :
Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experimentations, including possibilities that we on python-dev have not thought about. (...) * I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.
Would it make sense to start by adding the module as a private "_subinterpreters" module but document it? The "_" prefix would be a reminder that "hey! it's experimental, there is no backward-compatibility guarantee there".
We can also add a big warning in the documentation.
Victor
What about requiring "from __future__ import subinterpreters" to use this? According to the docs, the purpose of __future__ is "to document when incompatible changes were introduced", and it does seem that this would be an incompatible change for some C extensions. --Edwin
-- Night gathers, and now my watch begins. It shall not end until my death.
On Tue, Apr 21, 2020 at 7:25 AM Victor Stinner <vstinner@python.org> wrote:
Would it make sense to start by adding the module as a private "_subinterpreters" module but document it? The "_" prefix would be a reminder that "hey! it's experimental, there is no backward-compatibility guarantee there".
I would expect a leading underscore to be confusing (as well as conflicting with the name of the low-level module). If we did anything, then it would probably make more sense to name the module something like "interpreters_experimental". However, I'm not sure that offers much benefit.
We can also add a big warning in the documentation.
We will mark it "provisional" in the docs, which I expect will include info on what that means and why it is provisional. -eric
Eric Snow wrote:
We will mark it "provisional" in the docs, which I expect will include info on what that means and why it is provisional.
If you'd like an example format for marking a section of the docs as provisional w/ reST, something like this at the top should suffice (with perhaps something more specific to the subinterpreters module):

.. note::
   This section of the documentation and all of its members have been
   added *provisionally*. For more details, see :term:`provisional api`.

:term:`provisional api` generates a link to https://docs.python.org/3/glossary.html#term-provisional-api. On Tue, Apr 21, 2020 at 12:09 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Tue, Apr 21, 2020 at 7:25 AM Victor Stinner <vstinner@python.org> wrote:
Would it make sense to start by adding the module as a private "_subinterpreters" module but document it? The "_" prefix would be a reminder that "hey! it's experimental, there is no backward-compatibility guarantee there".
I would expect a leading underscore to be confusing (as well as conflicting with the name of the low-level module). If we did anything, then it would probably make more sense to name the module something like "interpreters_experimental". However, I'm not sure that offers much benefit.
We can also add a big warning in the documentation.
We will mark it "provisional" in the docs, which I expect will include info on what that means and why it is provisional.
-eric
On Wed, Apr 22, 2020 at 2:13 AM Kyle Stanley <aeros167@gmail.com> wrote:
If you'd like an example format for marking a section of the docs as provisional w/ reST, something like this at the top should suffice (with perhaps something more specific to the subinterpreters module):
.. note::
   This section of the documentation and all of its members have been
   added *provisionally*. For more details, see :term:`provisional api`.
:term:`provisional api` generates a link to https://docs.python.org/3/glossary.html#term-provisional-api.
Thanks! -eric
participants (7)

- Antoine Pitrou
- Edwin Zimmerman
- Eric Snow
- Greg Ewing
- Jim J. Jewett
- Kyle Stanley
- Victor Stinner