Scope, not context? (was Re: PEP 550 v3 naming)
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context". Unlike a lexical scope, it can't be inferred from the layout of the source code. It's more of a dynamic "execution scope" and the stacking of "local execution scopes" reinforces that for me. You use a key to find a value in the current execution scope, and it chains up until the key is found or you've reached the top of the local execution (defined as the thread start, etc.).
One other suggestion: maybe we shouldn't put these new functions in sys, but instead put them in their own module? It feels analogous to the gc module; all those functions could have gone in sys since they query and affect the Python runtime system, but it makes more sense (and improves the naming) by putting them in their own module. It also segregates the functionality so that sys doesn't become a catchall that overloads you when you're reading through the sys module documentation.
Cheers, -Barry
On Thu, 24 Aug 2017 09:52:47 -0400
Barry Warsaw
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context".
I would call it "environment" myself, but that risks confusion with environment variables. Perhaps "dynamic environment" would remove the confusion.
One other suggestion: maybe we shouldn't put these new functions in sys, but instead put them in their own module? It feels analogous to the gc module; all those functions could have gone in sys since they query and affect the Python runtime system, but it makes more sense (and improves the naming) by putting them in their own module. It also segregates the functionality so that sys doesn't become a catchall that overloads you when you're reading through the sys module documentation.
+1 from me. Regards Antoine.
On Thu, Aug 24, 2017 at 9:52 AM, Barry Warsaw
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
Nobody refers to context managers as "context", though; they are always called one of the following: "context manager", "CM", "context manager protocol". PEP 550 just introduces a concept of "context" that "context managers" will be able to manage.
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context". Unlike a lexical scope, it can't be inferred from the layout of the source code. It's more of a dynamic "execution scope" and the stacking of "local execution scopes" reinforces that for me. You use a key to find a value in the current execution scope, and it chains up until the key is found or you've reached the top of the local execution (defined as the thread start, etc.).
Yes, what PEP 550 proposes can be seen as a new scoping mechanism. But calling it a "scope" or "dynamic scope" would be a mistake (IMHO), as Python scoping is already a complex topic (with locals, nonlocals, globals, etc). Contrary to scoping, the programmer is much less likely to deal with Execution Context. How often do we use "threading.local()"?
One other suggestion: maybe we shouldn't put these new functions in sys, but instead put them in their own module? It feels analogous to the gc module; all those functions could have gone in sys since they query and affect the Python runtime system, but it makes more sense (and improves the naming) by putting them in their own module. It also segregates the functionality so that sys doesn't become a catchall that overloads you when you're reading through the sys module documentation.
I'm myself not a big fan of jamming all PEP 550 APIs into the sys module. We just need to come up with a good name. Yury
On Aug 24, 2017, at 10:23, Yury Selivanov
On Thu, Aug 24, 2017 at 9:52 AM, Barry Warsaw
wrote: Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
Nobody refers to context managers as "context", though; they are always called one of the following: "context manager", "CM", "context manager protocol". PEP 550 just introduces a concept of "context" that "context managers" will be able to manage.
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context". Unlike a lexical scope, it can't be inferred from the layout of the source code. It's more of a dynamic "execution scope" and the stacking of "local execution scopes" reinforces that for me. You use a key to find a value in the current execution scope, and it chains up until the key is found or you've reached the top of the local execution (defined as the thread start, etc.).
Yes, what PEP 550 proposes can be seen as a new scoping mechanism. But calling it a "scope" or "dynamic scope" would be a mistake (IMHO), as Python scoping is already a complex topic (with locals, nonlocals, globals, etc).
Contrary to scoping, the programmer is much less likely to deal with Execution Context. How often do we use "threading.local()"?
Yes, but in conversations about Python, the term “context” (in the context of context managers) comes up way more often than the term “scope”. I actually think Python’s scoping rules are fairly easy to grasp, as there aren’t that many levels or ways to access them, and the natural, common interactions are basically implicit when thinking about the code you’re writing.
So while “context”, “environment”, and “scope” are certainly overloaded terms in Python, the first two have very specific existing, commonplace constructs within Python, and “scope” is both the least overloaded of the three and most closely matches what is actually going on.
A different tack would more closely align with PEP 550’s heritage in thread-local storage, calling these things “execution storage”. I think I read Guido suggest elsewhere using a namespace here so that in common code you’d only have to change the “threading.local()” call to migrate to PEP 550. It might be neat if you could do something like:
    import execution
    els = execution.local()
    els.x = 1
By calling it “execution local storage” you’re also providing a nicer cognitive bridge from “thread local storage”, a concept anybody diving into this stuff will already understand pretty innately. No need to get fancy, we just think “Oh, I know what thread local storage is, and this seems related but slightly different, so now I basically understand what execution local storage is”.
Cheers, -Barry
On 24 August 2017 at 15:38, Barry Warsaw
Yes, but in conversations about Python, the term “context” (in the context of context managers) comes up way more often than the term “scope”. I actually think Python’s scoping rules are fairly easy to grasp, as there aren’t that many levels or ways to access them, and the natural, common interactions are basically implicit when thinking about the code you’re writing.
So while “context”, “environment”, and “scope” are certainly overloaded terms in Python, the first two have very specific existing, commonplace constructs within Python, and “scope” is both the least overloaded of the three and most closely matches what is actually going on.
A different tack would more closely align with PEP 550’s heritage in thread-local storage, calling these things “execution storage”. I think I read Guido suggest elsewhere using a namespace here so that in common code you’d only have to change the “threading.local()” call to migrate to PEP 550. It might be neat if you could do something like:
I strongly agree with Barry's reservations about using the term "context" here. I've not been following the discussion (I was away when it started and there's too many emails to go through to catch up) but I've found the use of the term "context" to be a particular problem in trying to understand what's going on just skimming the messages. I don't have a strong opinion on what name should be used, but I am definitely against using the term "context". Paul
On Thu, Aug 24, 2017 at 10:38 AM, Barry Warsaw
A different tack would more closely align with PEP 550’s heritage in thread-local storage, calling these things “execution storage”. I think I read Guido suggest elsewhere using a namespace here so that in common code you’d only have to change the “threading.local()” call to migrate to PEP 550. It might be neat if you could do something like:
    import execution
    els = execution.local()
    els.x = 1
A couple of relevant updates on this topic in old python-ideas threads (I'm not sure you've seen them):
https://mail.python.org/pipermail/python-ideas/2017-August/046888.html
https://mail.python.org/pipermail/python-ideas/2017-August/046889.html
Unfortunately it's not feasible to re-use the "local()" idea for PEP 550. The "local()" semantics impose many constraints and complexities on the design, while offering only a "users already know it" argument in return.
And even if we had an "execution.local()" API, it would not be safe to always replace every "threading.local()" with it. Sometimes what you need is an actual TLS. Sometimes your design actually depends on it, even if you are not aware of that.
Updating existing libraries to use PEP 550 should be a very conscious decision, simply because it has different semantics/guarantees.
Y
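To make the "different semantics/guarantees" point concrete, here is a rough sketch. It does not use PEP 550's proposed API; as a stand-in it assumes the `contextvars` module that later shipped in Python 3.7, and the names `handler` and `request_id` are made up for illustration. Two asyncio tasks share one OS thread, so they share a `threading.local()` slot but each see their own context variable:

```python
import asyncio
import contextvars
import threading

tls = threading.local()  # one slot per OS thread
request_id = contextvars.ContextVar("request_id", default=None)  # one value per task


async def handler(n, seen):
    # Hypothetical request handler: stores its id in both kinds of storage.
    tls.value = n
    request_id.set(n)
    await asyncio.sleep(0)  # suspend so the other task gets to run
    # Record what this task observes after being resumed.
    seen.append((n, tls.value, request_id.get()))


async def main():
    seen = []
    await asyncio.gather(handler(1, seen), handler(2, seen))
    return seen


result = asyncio.run(main())
# Each tuple is (task id, thread-local value, context-local value).
print(result)
```

The first task finds the second task's value in the thread-local slot once it resumes, while its context variable is untouched. Code that relies on state being per-thread, such as a per-thread connection, would behave differently if it were switched over mechanically.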
Barry Warsaw wrote:
I actually think Python’s scoping rules are fairly easy to grasp,
The problem is that the word "scope", as generally used in relation to programming languages, has to do with visibility of names. A variable is "in scope" at a particular point in the code if you can access it just by writing its name there.
The things we're talking about are never "in scope" in that sense. What we have is something similar to a dynamic scope, but a special action is required to access bindings in it.
I can't think of any established term for things like that. The closest I've seen is one dialect of Scheme that called it a "fluid environment". In that dialect, fluid-let didn't create bindings in the normal scope, and you had to use specific functions to access them.
-- Greg
On Thu, Aug 24, 2017 at 8:23 AM, Yury Selivanov
Contrary to scoping, the programmer is much less likely to deal with Execution Context. How often do we use "threading.local()"?
This is an important point. PEP 550 is targeting library authors, right? Most folks will not be interacting with the new functionality except perhaps indirectly via context managers. So I'm not convinced that naming collisions with more commonly used concepts is as much a concern, particularly because of conceptual similarity. However, if the functionality of the PEP would be used more commonly then the naming would be a stronger concern. -eric
On 08/24/2017 06:52 AM, Barry Warsaw wrote:
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context". Unlike a lexical scope, it can't be inferred from the layout of the source code. It's more of a dynamic "execution scope" and the stacking of "local execution scopes" reinforces that for me. You use a key to find a value in the current execution scope, and it chains up until the key is found or you've reached the top of the local execution (defined as the thread start, etc.).
Scope and dynamic certainly feel like the important concepts in PEP 550, and Guido (IIRC) has already said these functions do not belong in contextlib, so context may not be a good piece of the name. Some various ideas: - ExecutionScope, LocalScope - DynamicScope, ScopeLocals - DynamicExecutionScope, DynamicLocals - DynamicExecutionScope, ExecutionLocals - DynamicExecutionScope, ScopeLocals -- ~Ethan~
On Thu, Aug 24, 2017 at 7:52 AM, Barry Warsaw
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
The precedent (and perhaps inspiration) here lies in decimal.context [1] and ssl.SSLContext [2]. email.Policy [3] qualifies in spirit, as does the import state [4]. Each encapsulates the state of some subsystem. They can be enabled in the current thread via a context manager. (Well, only the first two, but the latter two are strong candidates.) They are each specific to a *logical* execution context. Most notably, each is implicit global state in the current thread of execution. PEP 550, if I've understood right, is all about supporting these contexts in other logical threads of execution than just Python/OS threads (e.g. async, generators). Given all that, "context" is an understandable name to use here. Personally, I'm still on the fence on if "context" fits in the name. :) None of the proposed names have felt quite right thus far, which is probably why we're still talking about it! :)
To me, the functionality proposed in PEP 550 feels more like a "scope" than a "context". Unlike a lexical scope, it can't be inferred from the layout of the source code. It's more of a dynamic "execution scope" and the stacking of "local execution scopes" reinforces that for me. You use a key to find a value in the current execution scope, and it chains up until the key is found or you've reached the top of the local execution (defined as the thread start, etc.).
"scope" fits because it's all about chained lookup in implicit namespaces. However, to me the focus of PEP 550 is on the context (encapsulated state) more than the chaining.
One other suggestion: maybe we shouldn't put these new functions in sys, but instead put them in their own module? It feels analogous to the gc module; all those functions could have gone in sys since they query and affect the Python runtime system, but it makes more sense (and improves the naming) by putting them in their own module. It also segregates the functionality so that sys doesn't become a catchall that overloads you when you're reading through the sys module documentation.
+1
-eric
[1] https://docs.python.org/3/library/decimal.html#context-objects
[2] https://docs.python.org/3/library/ssl.html#ssl.SSLContext
[3] https://docs.python.org/3/library/email.policy.html
[4] https://www.python.org/dev/peps/pep-0406/
Hi, On 08/24/2017 03:52 PM, Barry Warsaw wrote:
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
If it's hard to find a name due to collisions with related meanings or already-taken buckets, then I would suggest just inventing a new word. An artificial one. What do you think? Thanks, --francis
On Aug 24, 2017, at 16:01, francismb
On 08/24/2017 03:52 PM, Barry Warsaw wrote:
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
If it's hard to find a name due to collisions with related meanings or already-taken buckets, then I would suggest just inventing a new word. An artificial one.
What do you think?
I propose RaymondLuxuryYach_t, but we’ll have to pronounce it ThroatwobblerMangrove. import-dinsdale-ly y’rs, -Barry
On Thu, 24 Aug 2017 17:06:09 -0400
Barry Warsaw
On Aug 24, 2017, at 16:01, francismb
wrote: On 08/24/2017 03:52 PM, Barry Warsaw wrote:
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
If it's hard to find a name due to collisions with related meanings or already-taken buckets, then I would suggest just inventing a new word. An artificial one.
What do you think?
I propose RaymondLuxuryYach_t, but we’ll have to pronounce it ThroatwobblerMangrove.
Sorry, but I think we should pronounce it ThroatwobblerMangrove. Regards Antoine.
I propose RaymondLuxuryYach_t, but we’ll have to pronounce it ThroatwobblerMangrove.
Sorry, but I think we should pronounce it ThroatwobblerMangrove.
too hard to pronounce but at least is unique, I would prefer thredarena but I see naming is hard ... :-) Thanks! --francis
On Aug 26, 2017, at 12:43, francismb
I propose RaymondLuxuryYach_t, but we’ll have to pronounce it ThroatwobblerMangrove.
Sorry, but I think we should pronounce it ThroatwobblerMangrove.
too hard to pronounce but at least is unique, I would prefer thredarena but I see naming is hard ... :-)
Oh mollusks, I thought you said bacon. -Barry
Barry Warsaw wrote:
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around,
Not sure whether it helps, but a similar concept in some Scheme dialects is called a "fluid binding". e.g. Section 5.4 of http://www.scheme.com/csug8/binding.html It's not quite the same thing as we're talking about here (the names bound exist in the normal name lookup scope rather than a separate namespace) but maybe some of the terminology could be adapted. -- Greg
On 24 August 2017 at 23:52, Barry Warsaw
Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
While I understand the concern, I think context locals and contextlib are more closely related than folks realise, as one of the main problems that the PEP is aiming to solve is that with statements (and hence context managers) *do not work as expected* when their body includes "yield", "yield from" or "await".
The reason they don't work reliably is because in the absence of frame hacks, context managers are currently limited to manipulating process global and thread local state - there's no current notion of context local storage that interacts nicely with the frame suspension keywords. And hence in the absence of any support for context local state, any state changes made by context managers tend to remain in effect *even if the frame that entered the context manager gets suspended*.
Context local state makes it possible to solve that problem, as it means that encountering one of the frame suspension keywords in the body of a with statement will implicitly revert all changes made to context local variables until such time as that particular frame is resumed, and you have to explicitly opt-in on a per-instance basis to instead allow those state changes to affect the calling context that's suspending & resuming the frame.
That said, I can also see Guido's point that the proposed new APIs have a very different flavour to the existing APIs in contextlib, so it doesn't necessarily make sense to use that module directly.
Building on the "context locals" naming scheme I've been discussing elsewhere, one possible name for a dedicated module would be "contextlocals" giving:
# Read/write access to individual context locals
* context_var = contextlocals.new_context_local(name: str='...')
# Context local storage manipulation
* context = contextlocals.new_context_local_state()
* contextlocals.run_with_context_locals(context: ContextLocalState, func, *args, **kwargs).
* __context_locals__ attribute on generators and coroutines
# Overall execution context manipulation
* current_ec = contextlocals.get_execution_context()
* new_ec = contextlocals.new_execution_context()
* contextlocals.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs).
The remaining connection with "contextlib" would be that @contextlib.contextmanager (and its async counterpart) would implicitly clear the __context_locals__ attribute on the underlying generator instance so that context local state changes made there will affect the context where the context manager is actually being used.
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
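The leak Nick describes can be reproduced with the decimal module alone. The following is a minimal sketch (the generator name `thirds` is made up for illustration): a generator lowers the precision inside a with block, and while it sits suspended at the yield, the caller's own decimal context is still the modified one.

```python
import decimal


def thirds():
    # Hypothetical generator that wants lower precision for its own work only.
    with decimal.localcontext() as ctx:
        ctx.prec = 5
        yield decimal.Decimal(1) / decimal.Decimal(3)


default_prec = decimal.getcontext().prec

gen = thirds()
value = next(gen)  # the generator is now suspended inside the with block

# The caller never asked for prec=5, yet that is what it sees here.
leaked_prec = decimal.getcontext().prec

gen.close()  # finalising the generator finally runs the with block's exit
restored_prec = decimal.getcontext().prec

print(value, leaked_prec, restored_prec == default_prec)
```

Nothing in the calling code hints that its arithmetic is affected between `next(gen)` and `gen.close()`, which is the "does not work as expected" case the PEP is aimed at.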
On Fri, 25 Aug 2017 15:36:55 +1000
Nick Coghlan
On 24 August 2017 at 23:52, Barry Warsaw
wrote: Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
While I understand the concern, I think context locals and contextlib are more closely related than folks realise, as one of the main problems that the PEP is aiming to solve is that with statements (and hence context managers) *do not work as expected* when their body includes "yield", "yield from" or "await" .
If I write:
    def read_chunks(fn, chunk_size=8192):
        with open(fn, "rb") as f:
            while True:
                data = f.read(chunk_size)
                if not data:
                    break
                yield data
The "with" statement here works fine even though its body includes a "yield" (and if there had been an added "await" things would probably not be different).
The class of context managers you're talking about is in my experience a small minority (I've hardly ever used them myself, and I don't think I have ever written one). So I don't think the two concepts are as closely related as you seem to think.
That said, I also think "context" is the best term (barring "environment" perhaps) to describe what PEP 550 is talking about. Qualifying it ("logical", etc.) helps disambiguate.
Regards Antoine.
On 25 August 2017 at 20:28, Antoine Pitrou
On Fri, 25 Aug 2017 15:36:55 +1000 Nick Coghlan
wrote: On 24 August 2017 at 23:52, Barry Warsaw
wrote: Guido van Rossum wrote:
On Tue, Aug 22, 2017 at 7:12 PM, Nathaniel Smith
wrote: I worry that that's going to lead more people astray thinking this has something to do with contextlib, which it really doesn't (it's much more closely related to threading.local()).
This is my problem with using "Context" for this PEP. Although I can't keep up with all names being thrown around, it seems to me that in Python we already have a well-established meaning for "contexts" -- context managers, and the protocols they implement as they participate in `with` statements. We have contextlib which reinforces this. What's being proposed in PEP 550 is so far removed from this concept that I think it's just going to cause confusion (well, it does in me anyway!).
While I understand the concern, I think context locals and contextlib are more closely related than folks realise, as one of the main problems that the PEP is aiming to solve is that with statements (and hence context managers) *do not work as expected* when their body includes "yield", "yield from" or "await" .
If I write:
    def read_chunks(fn, chunk_size=8192):
        with open(fn, "rb") as f:
            while True:
                data = f.read(chunk_size)
                if not data:
                    break
                yield data
The "with" statement here works fine even though its body includes a "yield" (and if there had been an added "await" things would probably not be different).
The class of context managers you're talking about is in my experience a small minority (I've hardly ever used them myself, and I don't think I have ever written one).
I actually agree with this, as the vast majority of context managers are about managing the state of frame locals, rather than messing about with implicitly shared state. It's similar to the way that thread locals are vastly outnumbered by ordinary frame locals. That's part of what motivated my suggested distinction between explicit context (such as your example here) and implicit context (the trickier cases that PEP 550 aims to help handle).
So I don't think the two concepts are as closely related as you seem to think.
Of the 12 examples in https://www.python.org/dev/peps/pep-0343/#examples, two of them relate to manipulating the decimal thread local context, and a third relates to manipulating a hidden "queue signals for later or process them immediately?" flag, so the use cases that PEP 550 covers have always been an intended subset of the use cases that PEP 343 covers.
It's just that the explicit use cases either already work sensibly in the face of frame suspension (e.g. keeping a file open, since you're still using it), or have other reasons why you wouldn't want to suspend the frame after entering that particular context (e.g. if you suspend a frame with a lock held, there's no real way for the interpreter to guess whether that's intentional or not, so it has to assume keeping it locked is intentional, and expect you to release it explicitly if that's what you want).
And while PEP 550 doesn't handle the stream redirection case natively (since it doesn't allow for suspend/resume callbacks the way PEP 525 does), it at least allows for the development of a context-aware output stream wrapper API where:
* you replace the target stream globally with a context-aware wrapper that delegates attribute access to a particular context local if that's set and to the original stream otherwise
* you have a context manager that sets & reverts the context local variable rather than manipulating the process global state directly
That said, I also think "context" is the best term (barring "environment" perhaps) to describe what PEP 550 is talking about. Qualifying it ("logical", etc.) helps disambiguate
+1 Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Aug 25, 2017 at 9:23 AM, Nick Coghlan
And while PEP 550 doesn't handle the stream redirection case natively (since it doesn't allow for suspend/resume callbacks the way PEP 525 does), it at least allows for the development of a context-aware output stream wrapper API where:
PEP 525 can't handle streams redirection -- it can do it only for single-threaded programs. sys.stdout/stderr/stdin are global variables, that's how the API is specified. API users assume that the change is process-wide. Yury
On 25 August 2017 at 23:36, Yury Selivanov
On Fri, Aug 25, 2017 at 9:23 AM, Nick Coghlan
wrote: [..] And while PEP 550 doesn't handle the stream redirection case natively (since it doesn't allow for suspend/resume callbacks the way PEP 525 does), it at least allows for the development of a context-aware output stream wrapper API where:
PEP 525 can't handle streams redirection -- it can do it only for single-threaded programs.
Good point, whereas the hypothetical context-aware wrapper I proposed would be both thread-safe (since the process global would be changed once before the program went multi-threaded and then left alone) *and* potentially lower overhead (one context local lookup per stream attribute access, rather than a global state change every time a frame was suspended or resumed with a stream redirected)
sys.stdout/stderr/stdin are global variables, that's how the API is specified. API users assume that the change is process-wide.
Yeah, I wasn't suggesting any implicit changes to the way those work.
However, it does occur to me that if we did add a new "contextlocals" API, then:
1. It could offer context-aware wrappers for stdin/stdout/stderr
2. It could offer context-aware alternatives to the stream redirection context managers in contextlib
That approach would work even better than replacing sys.stdin/out/err themselves with wrappers since the wrappers wouldn't be vulnerable to being overwritten by other code that mutated the sys module.
Anyway, that's not a serious proposal right now, but I do think it's decent validation of the power and flexibility of the proposed implicit state management model.
Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Right, Nick, I missed the part that you want to have a file-like wrapper stored in sys.std* streams that would redirect lookups/calls to the relevant real file-object in the current context (correct?)
I had a similar idea when I discovered that PEP 550 can't be used directly to fix sys.std* streams redirection. Another idea:
1. We alter PyModule to make it possible to add properties (descriptor protocol, or we implement custom __getattr__). I think we can make it so that only sys module would be able to actually use it, so it's not going to be a new feature -- just a hack for CPython.
2. We make sys.std* attributes properties, with getters and setters.
3. sys.std* setters will: issue a DeprecationWarning; set whatever the user wants to set in a global variable + set a flag (let's call it "sys.__stdout_global_modified") that sys.std* were modified.
4. sys.std* getters will use PEP 550 to lookup when __stdout_global_modified is false. If it's true -- we fallback to globals.
5. We deprecate the current API and add new APIs for the redirection system that uses PEP 550 explicitly.
6. In Python 4 we remove the old sys.std* API.
This is still *very* fragile: any code that writes to sys.stdout breaks all assumptions. But it offers a way to raise a warning when old-API is being used - something that we'll probably need if we add new APIs to fix this problem.
Yury
On Aug 25, 2017, at 10:18, Yury Selivanov
I had a similar idea when I discovered that PEP 550 can't be used directly to fix sys.std* streams redirection. Another idea:
1. We alter PyModule to make it possible to add properties (descriptor protocol, or we implement custom __getattr__). I think we can make it so that only sys module would be able to actually use it, so it's not going to be a new feature -- just a hack for CPython.
2. We make sys.std* attributes properties, with getters and setters.
3. sys.std* setters will: issue a DeprecationWarning; set whatever the user wants to set in a global variable + set a flag (let's call it "sys.__stdout_global_modified") that sys.std* were modified.
4. sys.std* getters will use PEP 550 to look up when __stdout_global_modified is false. If it's true -- we fall back to globals.
5. We deprecate the current API and add new APIs for the redirection system that uses PEP 550 explicitly.
6. In Python 4 we remove the old sys.std* API.
This is still *very* fragile: any code that writes to sys.stdout breaks all assumptions. But it offers a way to raise a warning when the old API is being used - something that we'll probably need if we add new APIs to fix this problem.
It’s ideas like this that do make me think of scopes when talking about global state and execution contexts. I understand that the current PEP 550 invokes an explicit separate namespace, but thinking bigger, if the usual patterns of just writing to sys.std{out,err} still worked and in the presence of single “threaded” execution it just did the normal thing, but in the presence of threads, async, etc. it *also* did the right thing, then code wouldn’t need to change just because you started to adopt async. That implies some scoping rules to make “sys.stdout” refer to the local execution’s sys.stdout if it were set, and the global sys.stdout if it were not. This would of course be a much deeper change to Python, with lots of tricky semantics and corner cases to get right. But it might be worth it to provide an execution model and an API that would be harder to get wrong because Python just Does the Right Thing. It’s difficult because you also have to be able to reason about what’s going on, and it’ll be imperative to be able to debug and examine the state of your execution when things go unexpected. That’s always harder when mixing dynamic scope with lexical scope, which I think is what PEP 550 is ultimately getting at. Cheers, -Barry
On Fri, Aug 25, 2017 at 11:10 AM, Barry Warsaw
It’s ideas like this that do make me think of scopes when talking about global state and execution contexts. I understand that the current PEP 550 invokes an explicit separate namespace, but thinking bigger, if the usual patterns of just writing to sys.std{out,err} still worked and in the presence of single “threaded” execution it just did the normal thing, but in the presence of threads, async, etc. it *also* did the right thing, then code wouldn’t need to change just because you started to adopt async. That implies some scoping rules to make “sys.stdout” refer to the local execution’s sys.stdout if it were set, and the global sys.stdout if it were not.
The problem here is exactly the "usual patterns of just writing to sys.std{out,err}". The usual pattern assumes that it's a global variable, and there are no ways of getting around this. None. There are many applications out there that are already written with the assumption that setting sys.stdout changes it for all threads, which means that we cannot change these already established semantics.
The sys.std* API just needs to be slowly deprecated and replaced with a new API that uses context managers and does things differently under the hood, *if* and only if, we all agree that we even need to solve this problem. This is a completely separate problem from the one that PEP 550 solves, which is providing a better TLS that is aware of generators and async code.
Yury
I think the issue with sys.std* is a distraction for this discussion. The issue also seems overstated, and I wouldn't want to change it. The ability to set these is mostly used in small programs that are also single-threaded. Libraries should never mess with them -- it's easy to explicitly pass an output file around, and for errors you should use logging or in a pinch you can write to sys.stderr, but you shouldn't set it. -- --Guido van Rossum (python.org/~guido)
On Fri, Aug 25, 2017 at 9:10 AM, Barry Warsaw
It’s ideas like this that do make me think of scopes when talking about global state and execution contexts. I understand that the current PEP 550 invokes an explicit separate namespace,
Right. The observation that PEP 550 proposes a separate stack of scopes from the lexical scope is an important one.
but thinking bigger, if the usual patterns of just writing to sys.std{out,err} still worked and in the presence of single “threaded” execution it just did the normal thing, but in the presence of threads, async, etc. it *also* did the right thing, then code wouldn’t need to change just because you started to adopt async. That implies some scoping rules to make “sys.stdout” refer to the local execution’s sys.stdout if it were set, and the global sys.stdout if it were not.
Yeah, at the bottom of the PEP 550 stack there'd need to be a proxy to the relevant global state. While working on the successor to PEP 406 (import state), I realized I'd need something like this.
This would of course be a much deeper change to Python, with lots of tricky semantics and corner cases to get right. But it might be worth it to provide an execution model and an API that would be harder to get wrong because Python just Does the Right Thing. It’s difficult because you also have to be able to reason about what’s going on, and it’ll be imperative to be able to debug and examine the state of your execution when things go unexpected. That’s always harder when mixing dynamic scope with lexical scope, which I think is what PEP 550 is ultimately getting at.
+1 Thankfully, PEP 550 is targeted more at a subset of library authors than at the general population. -eric
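[Editor's sketch] The "proxy to the relevant global state" at the bottom of the stack that Eric describes can be pictured with a chained lookup. This is a toy model, not the PEP 550 design: collections.ChainMap plays the stack of scopes, and ModuleStateProxy is a hypothetical read-only view over selected sys attributes.

```python
import io
import sys
from collections import ChainMap
from collections.abc import Mapping


class ModuleStateProxy(Mapping):
    """Read-only mapping over selected attributes of a module (hypothetical)."""

    def __init__(self, module, names):
        self._module = module
        self._names = tuple(names)

    def __getitem__(self, key):
        if key not in self._names:
            raise KeyError(key)
        # Read the attribute at lookup time so later rebinding is visible.
        return getattr(self._module, key)

    def __iter__(self):
        return iter(self._names)

    def __len__(self):
        return len(self._names)


# Bottom of the stack: the global state. Above it: one empty local scope.
global_state = ModuleStateProxy(sys, ["stdout", "stderr"])
scopes = ChainMap({}, global_state)

# Entering a nested execution scope pushes a new mapping on top.
buffer = io.StringIO()
nested = scopes.new_child({"stdout": buffer})
```

A lookup such as nested["stdout"] finds the local override, while nested["stderr"] falls through every scope and ends at the live sys.stderr, which matches Barry's description of referring to the local execution's value if it is set and to the global one otherwise.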
On Fri, Aug 25, 2017 at 8:18 AM, Yury Selivanov
Another idea:
1. We alter PyModule to make it possible to add properties (descriptor protocol, or we implement custom __getattr__). I think we can make it so that only sys module would be able to actually use it, so it's not going to be a new feature -- just a hack for CPython.
FWIW, I've been toying with a similar problem and solution for a while. I'd like to clean up the sys module, including grouping some of the attributes (e.g. the import state), turn the get/set pairs into properties, and deprecate direct usage of some of the attributes. Though supporting descriptors on module objects would work [1] and be useful (particularly for deprecating module attrs), it's sufficiently worthy of a PEP that I haven't taken the time.
Instead, the approach I settled on was to rename sys to _sys and add a sys written in Python that proxies _sys. Here's a rough first pass: https://github.com/ericsnowcurrently/cpython/tree/sys-module
It's effectively the same thing as ModuleType supporting descriptors, but specific only to sys.
One problem with both approaches is that we'd be changing the type of the sys module. There's a relatively common idiom in the stdlib (and elsewhere) of using "type(sys)" to get ModuleType. Changing the type of the sys module breaks that.
-eric
[1] This is doable with a custom __getattribute__ on ModuleType, though it will impact attribute lookup on all modules. I suppose there could be a subclass that does the right thing... Anyway, this is more python-ideas territory.
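[Editor's sketch] A minimal illustration of the proxy approach and of the type(sys) problem Eric raises. It does not reproduce the code in the linked branch; SysProxy is a hypothetical name, and here the proxy wraps the real sys module instead of a renamed _sys.

```python
import sys
import types


class SysProxy(types.ModuleType):
    """Hypothetical pure-Python module object that forwards to the real one."""

    def __init__(self, real):
        super().__init__(real.__name__, real.__doc__)
        # Store the wrapped module directly in the instance dict so that
        # our own __setattr__ below is not involved.
        self.__dict__["_real"] = real

    def __getattr__(self, name):
        # Only called when normal lookup fails, so forward to the real module.
        return getattr(self.__dict__["_real"], name)

    def __setattr__(self, name, value):
        setattr(self.__dict__["_real"], name, value)


proxy = SysProxy(sys)

# Attribute access is forwarded transparently...
same_value = proxy.version_info is sys.version_info

# ...but the "type(sys)" idiom no longer yields ModuleType for the proxy.
idiom_still_works = type(proxy) is types.ModuleType
still_a_module = isinstance(proxy, types.ModuleType)
```

With this sketch same_value and still_a_module are true, while idiom_still_works is false: code that calls type(sys)("name") to build a fresh module would instead invoke SysProxy, whose constructor expects a module to wrap.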
participants (10)
- Antoine Pitrou
- Barry Warsaw
- Eric Snow
- Ethan Furman
- francismb
- Greg Ewing
- Guido van Rossum
- Nick Coghlan
- Paul Moore
- Yury Selivanov