PEP 4XX: Adding sys.implementation
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
You'll notice some gaps, which I'll be filling in over the
next couple of days. Don't mind the gaps. <wink> They are in less
critical (?) portions and I wanted to get this out to you before the
weekend. Thanks!
-eric
--------------------------------------------------------------
PEP: 4XX
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow
Here's an update to the PEP. Though I have indirect or old feedback
already, I'd love to hear from the other main Python implementations,
particularly regarding the version variable. Thanks.
-eric
-------------------------------------------------------------
PEP: 421
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow
On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow
Proposal
========

We will add ``sys.implementation``, in the ``sys`` module, as a namespace to contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures behaviors don't change between versions which depend on variables in ``sys.implementation``.

``sys.implementation`` will be a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.

<snip>

Open Issues
===========
* What are the long-term objectives for ``sys.implementation``?
- possibly pull in implementation details from the main ``sys`` namespace and elsewhere (PEP 3137 lite).
* Alternatives to the approach dictated by this PEP?
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.
So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence. Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing. Cheers, Chris
On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert
On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow wrote:
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.
So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence.
At this point I'm not aware of the strong justifications either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that. One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want a concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought.
Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing.
In my mind a mapping is a namespace. I don't have a problem changing that to mitigate any confusion. Thanks for the feedback. -eric
On Tue, May 1, 2012 at 12:39 PM, Eric Snow
On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert wrote:
On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow wrote:
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.
So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence.
At this point I'm not aware of the strong justifications either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that.
One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want a concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought.
I think it's a case where practicality beats purity. By using structseq, we get a nice representation and dotted attribute access, just as we have for sys.float_info. Providing this kind of convenience is the same reason collections.namedtuple exists. We should just document that the length of the tuple and the order of items is not guaranteed (either across implementations or between versions), and even the ability to iterate over the items or access them by index is not mandatory in an implementation. Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
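To make the trade-off concrete, here is a minimal sketch of what a struct sequence or namedtuple brings along with dotted access. `sys.float_info` is real; the `Implementation` type and its field values are hypothetical stand-ins, not something the PEP proposes.

```python
import sys
from collections import namedtuple

# sys.float_info is a struct sequence: dotted access plus tuple behaviour.
largest = sys.float_info.max
also_a_sequence = largest in tuple(sys.float_info)

# Hypothetical stand-in for a namedtuple-based sys.implementation.
Implementation = namedtuple("Implementation", ["name", "version", "cache_tag"])
impl = Implementation(name="CPython", version=(3, 3, 0), cache_tag="cpython33")

by_attribute = impl.name   # the convenient part
by_index = impl[0]         # comes along whether or not order is meaningful
field_count = len(impl)    # would change if a field were ever added
```

Indexing and `len()` are the parts that would have to be documented as not guaranteed.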
On Mon, Apr 30, 2012 at 8:57 PM, Nick Coghlan
On Tue, May 1, 2012 at 12:39 PM, Eric Snow wrote:
At this point I'm not aware of the strong justifications either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that.
One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want a concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought.
I think it's a case where practicality beats purity. By using structseq, we get a nice representation and dotted attribute access, just as we have for sys.float_info. Providing this kind of convenience is the same reason collections.namedtuple exists.
That was my original sentiment, partly for the "this is how it's already been done" aspect. Barry made a good point about sys.implementation.get(name) vs. getattr(sys.implementation, name, None). However, having dotted access still seems more correct. (continued below...)
We should just document that the length of the tuple and the order of items is not guaranteed (either across implementations or between versions), and even the ability to iterate over the items or access them by index is not mandatory in an implementation. Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need.
That's a good point. Perhaps it depends on how general we expect the
consumption of sys.implementation to be. If its practicality is
oriented toward internal use then the data structure is not as
critical. However, sys.implementation is intended to have a
non-localized impact across the standard library and the interpreter.
I'd rather not make hacking it become an attractive nuisance,
regardless of our intentions for usage.
This is where I usually defer to those that have been dealing for
Nick Coghlan wrote:
Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need.
I have often wanted a namespace type, with class-like syntax and module-like semantics. In pseudocode:

namespace Spam:
    x = 1
    def ham(a):
        return x+a
    def cheese(a):
        return ham(a)*10

Spam.cheese(5) => returns 60

But I suspect that's not what you're talking about here in context.

-- Steven
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Cool, it's better with a PEP, even if the change looks trivial.
name the name of the implementation (case sensitive).
It would help if the PEP (and the documentation of sys.implementation) lists at least the most common names. I suppose that we would have something like: "CPython", "PyPy", "Jython", "IronPython".
version the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).
Dummy question: what is sys.version/sys.version_info? The version of the implementation or the version of the Python language? The PEP should explain that, and maybe also the documentation of sys.implementation.version (something like "use sys.version_info to get the version of the Python language").
cache_tag
Why not add this information to the imp module? Victor
On Sat, Apr 28, 2012 at 7:39 PM, Victor Stinner
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Cool, it's better with a PEP, even if the change looks trivial.
name the name of the implementation (case sensitive).
It would help if the PEP (and the documentation of sys.implementation) lists at least the most common names. I suppose that we would have something like: "CPython", "PyPy", "Jython", "IronPython".
Good point. I'll do that.
version the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).
Dummy question: what is sys.version/sys.version_info? The version of the implementation or the version of the Python language? The PEP should explain that, and maybe also the documentation of sys.implementation.version (something like "use sys.version_info to get the version of the Python language").
Yeah, sys.version (et al.) is the version of the language. It just happens to be the same as the implementation version for CPython. I'll make that more clear.
cache_tag
Why not add this information to the imp module?
This is certainly something I need to clarify. Either the different implementors set these values in the various modules to which they pertain; or they set them all in one place (sys.implementation). I really think we should avoid having a mix. In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point. -eric
On Tue, May 1, 2012 at 12:50 PM, Eric Snow
In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point.
Personally, I think cache_tag should be part of the initial proposal. Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether. The ultimate goal would be for us to be able to eliminate implementation checks from other parts of the standard library. importlib is a good place to start, since the idea is that, aside from the mechanism used to bootstrap it into place, along with optional acceleration of __import__, importlib itself should be implementation independent. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan
On Tue, May 1, 2012 at 12:50 PM, Eric Snow wrote:
In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point.
Personally, I think cache_tag should be part of the initial proposal. Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether.
Agreed. This is how I was thinking of it. I just wanted to keep things as minimal as possible to start. In importlib we can fall back to name+version if cache_tag isn't there. Still, of the potential variables, cache_tag is the strongest candidate, having a solid (if optional) use-case right now.
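A minimal sketch of that fallback, assuming a plain mapping stands in for sys.implementation. The function name `resolve_cache_tag` and the fallback tag format are hypothetical; the thread only says "name+version". An explicit `None` is treated as "caching disabled", as suggested above.

```python
_ABSENT = object()  # sentinel: distinguishes "not provided" from None


def resolve_cache_tag(implementation):
    """Return the tag to use for cached bytecode, or None to skip caching.

    `implementation` is a plain mapping standing in for sys.implementation.
    """
    tag = implementation.get("cache_tag", _ABSENT)
    if tag is not _ABSENT:
        # An explicit None means the implementation disabled caching.
        return tag
    # Hypothetical fallback format: lowercased name plus major and minor.
    major, minor = implementation["version"][:2]
    return "{}-{}{}".format(implementation["name"].lower(), major, minor)
```

With this shape, importlib would not need to know which implementation it is running on, only whether a tag was supplied.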
The ultimate goal would be for us to be able to eliminate implementation checks from other parts of the standard library. importlib is a good place to start, since the idea is that, aside from the mechanism used to bootstrap it into place, along with optional acceleration of __import__, importlib itself should be implementation independent.
Spot on! -eric
On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan
On Tue, May 1, 2012 at 12:50 PM, Eric Snow wrote:
In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point.
Personally, I think cache_tag should be part of the initial proposal. Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether.
I'm going to leave it as-is for the moment, but I'm leaning toward doing this. -eric
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Thanks for working on this PEP, Eric!
``sys.implementation`` is a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.
I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this

    cache_tag = sys.implementation.get('cache_tag')

is nicer than

    cache_tag = getattr(sys.implementation, 'cache_tag', None)

OTOH, maybe we need a nameddict type!
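The two spellings can be compared side by side. This is only a sketch: both objects below are hypothetical stand-ins, not the real sys.implementation, and `types.SimpleNamespace` is used purely as a convenient attribute container.

```python
from types import SimpleNamespace

# Hypothetical stand-ins; neither is the real sys.implementation.
impl_as_dict = {"name": "CPython"}
impl_as_object = SimpleNamespace(name="CPython")

# Optional key on a dict: .get() returns None when the key is absent.
tag_from_dict = impl_as_dict.get("cache_tag")

# Optional attribute on an object: getattr() needs an explicit default.
tag_from_object = getattr(impl_as_object, "cache_tag", None)

# A field that is known to exist reads more naturally with dotted access.
name_from_dict = impl_as_dict["name"]
name_from_object = impl_as_object.name
```

The dict wins for optional fields, the object for required ones, which is roughly where the disagreement in this thread sits.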
repository the implementation's repository URL.
What does this mean? Oh, I think you mean the URL for the VCS used to develop this version of the implementation. Maybe vcs_url (and even then there could be alternative blessed mirrors in other vcs's). The Debian analogs are the Vcs-* headers (e.g. Vcs-Git, Vcs-Bzr, etc.).
repository_revision the revision identifier for the implementation.
I'm not sure what this is. Is it like the hexgoo you see in the banner of a from-source build that identifies the revision used to build this interpreter? Is this key a replacement for that?
build_toolchain identifies the tools used to build the interpreter.
As a tuple of free-form strings?
url (or website) the URL of the implementation's site.
Maybe 'homepage' (another Debian analog).
site_prefix the preferred site prefix for this implementation.
runtime the run-time environment in which the interpreter is running.
I'm not sure what this means either. ;)
gc_type the type of garbage collection used.
Another free-form string? What would the values be, say, for CPython and Jython?
Version Format --------------
XXX same as sys.version_info?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
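For reference, a hexversion-style integer can be derived from a version_info-style tuple. The sketch below follows the packing CPython documents for sys.hexversion; the helper name `to_hexversion` is hypothetical, and nothing here is proposed by the PEP.

```python
import sys

# Release-level codes used in the sys.hexversion layout.
_LEVEL_CODES = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def to_hexversion(version_info):
    """Pack (major, minor, micro, releaselevel, serial) into one integer."""
    major, minor, micro, level, serial = version_info
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (_LEVEL_CODES[level] << 4)
        | serial
    )


packed = to_hexversion((3, 3, 0, "final", 0))
```

Because the fields are packed most-significant first, two such integers compare in the same order as the tuples they came from.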
* What are the long-term objectives for sys.implementation?
- pull in implementation detail from the main sys namespace and elsewhere (PEP 3137 lite).
That's where this seems to be leaning. Even if it's a good idea, I bet it will be a long time before the old sys names can be removed.
* Alternatives to the approach dictated by this PEP?
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.
Which might make sense, as would perhaps a top-level `implementation` module. IOW, why situate it in sys?
The implementatation of this PEP is covered in `issue 14673`_.
s/implementatation/implementation Nicely done! Let's see how those placeholders shake out. Cheers, -Barry
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
``sys.implementation`` is a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.
I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this
cache_tag = sys.implementation.get('cache_tag')
is nicer than
cache_tag = getattr(sys.implementation, 'cache_tag', None)
That's a good point. Also, a dict better reflects a collection of variables than a dotted-access object, which to me implies the potential for methods as well.
OTOH, maybe we need a nameddict type!
You won't have to convince _me_. :)
repository the implementation's repository URL.
What does this mean? Oh, I think you mean the URL for the VCS used to develop this version of the implementation. Maybe vcs_url (and even then there could be alternative blessed mirrors in other vcs's). The Debian analogs are the Vcs-* headers (e.g. Vcs-Git, Vcs-Bzr, etc.).
Yeah, you got it. For CPython it would be "http://hg.python.org/cpython". You're right that vcs_url is more clear. I'll update it. Perhaps I should clarify "Other Possible Values" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now. Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately.
repository_revision the revision identifier for the implementation.
I'm not sure what this is. Is it like the hexgoo you see in the banner of a from-source build that identifies the revision used to build this interpreter? Is this key a replacement for that?
I was thinking along those lines. For CPython, it could be 76678 or ab63e874265e or both. The decision on any constraints for this one would be subject to further discussion.
build_toolchain identifies the tools used to build the interpreter.
As a tuple of free-form strings?
That would work. I expect it would depend on how it would be used.
url (or website) the URL of the implementation's site.
Maybe 'homepage' (another Debian analog).
Sounds good to me.
site_prefix the preferred site prefix for this implementation.
runtime the run-time environment in which the interpreter is running.
I'm not sure what this means either. ;)
Yeah, it's not so clear there. For Jython it would be something like "jvm X.X", for IronPython it would be ".net CLR X.X" or whatever. Again the actual definition would be subject to more discussion relative to the use case, be it information or otherwise.
gc_type the type of garbage collection used.
Another free-form string? What would the values be, say, for CPython and Jython?
I was imagining a free-form string, like "reference counting" or "mark and sweep". It just depends on what people need it for.
Version Format --------------
XXX same as sys.version_info?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
That's the way I'm leaning. I've covered it a little more in the newer version of the PEP (on python-ideas).
* What are the long-term objectives for sys.implementation?
- pull in implementation detail from the main sys namespace and elsewhere (PEP 3137 lite).
That's where this seems to be leaning. Even if it's a good idea, I bet it will be a long time before the old sys names can be removed.
Yeah, it's definitely not the focus of the PEP, but I think it's a valid long-term goal of which we should be cognizant.
* Alternatives to the approach dictated by this PEP?
* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.
Which might make sense, as would perhaps a top-level `implementation` module. IOW, why situate it in sys?
The implementatation of this PEP is covered in `issue 14673`_.
s/implementatation/implementation
Got it.
Nicely done! Let's see how those placeholders shake out.
Thanks. I'm glad to get this rolling. And yeah, I need to poke the folks with the other implementations to get their feedback (rather than rely on nods from 3 years ago). :) -eric
On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
Perhaps I should clarify "Other Possible Values" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now.
Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately.
I think you could keep it light (but +1 for adding cache_tag now). I'd suggest making it clear that neither the keys, values, nor semantics are actually being proposed in this PEP. The PEP could just include some examples for future additions (and thus de-emphasize that section of the PEP). It might be helpful to describe a mechanism by which future values would be added to sys.implementation. E.g. is a new PEP required for each? (I don't have an opinion on that right now. :) -Barry
On Tue, May 1, 2012 at 4:25 PM, Barry Warsaw
On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
Perhaps I should clarify "Other Possible Values" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now.
Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately.
I think you could keep it light (but +1 for adding cache_tag now).
cache_tag it is.
I'd suggest making it clear that neither the keys, values, nor semantics are actually being proposed in this PEP. The PEP could just include some examples for future additions (and thus de-emphasize that section of the PEP).
It might be helpful to describe a mechanism by which future values would be added to sys.implementation. E.g. is a new PEP required for each? (I don't have an opinion on that right now. :)
This is a good direction. I'll update the PEP. Thanks! -eric
On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this
cache_tag = sys.implementation.get('cache_tag')
is nicer than
cache_tag = getattr(sys.implementation, 'cache_tag', None)
That's a good point. Also, a dict better reflects a collection of variables than a dotted-access object, which to me implies the potential for methods as well.
OTOH, maybe we need a nameddict type!
You won't have to convince _me_. :)
Well, I was being a bit facetious. You can easily implement those semantics in pure Python. 5 minute hack below.

Cheers,
-Barry

-----snip snip-----
#! /usr/bin/python3

_missing = object()

import operator
import unittest


class Implementation:
    cache_tag = 'cpython33'
    name = 'CPython'

    def __getitem__(self, name, default=_missing):
        result = getattr(self, name, default)
        if result is _missing:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                self.__class__.__name__, name))
        return result

    def __setitem__(self, name, value):
        raise TypeError('read only')

    def __setattr__(self, name, value):
        raise TypeError('read only')


implementation = Implementation()


class TestImplementation(unittest.TestCase):
    def test_cache_tag(self):
        self.assertEqual(implementation.cache_tag, 'cpython33')
        self.assertEqual(implementation['cache_tag'], 'cpython33')

    def test_name(self):
        self.assertEqual(implementation.name, 'CPython')
        self.assertEqual(implementation['name'], 'CPython')

    def test_huh(self):
        self.assertRaises(AttributeError, operator.getitem,
                          implementation, 'droids')
        self.assertRaises(AttributeError, getattr, implementation, 'droids')

    def test_read_only(self):
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, setattr,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'cache_tag', 'xpython99')
        self.assertRaises(TypeError, setattr,
                          implementation, 'cache_tag', 'xpython99')
Eric Snow wrote:
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
``sys.implementation`` is a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure. I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this
cache_tag = sys.implementation.get('cache_tag')
is nicer than
cache_tag = getattr(sys.implementation, 'cache_tag', None)
That's a good point. Also, a dict better reflects a collection of variables than a dotted-access object, which to me implies the potential for methods as well.
Dicts have methods, and support iteration.

A dict suggests to me that an arbitrary number of items could be included, rather than suggesting a record-like structure with a fixed number of items. (Even if that number varies from release to release.) On the other hand, a dict supports iteration, and len, so even if you don't know how many fields there are, you can always find them by iterating over the record.

Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field'].

I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access.

http://code.activestate.com/recipes/473786
http://code.activestate.com/recipes/576586

sort of thing.

-- Steven
On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano
Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field'].
I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access.
Whereas I'm thinking it makes sense to explicitly separate out "standard, must be defined by all conforming Python implementations" and "implementation specific extras".

Under that model, we'd add an extra "metadata" field at the standard level to hold implementation specific fields. The initial set of standard fields would then be:

name: the name of the implementation (e.g. "CPython", "IronPython", "PyPy", "Jython")
version: the version of the implementation (in sys.version_info format)
cache_tag: the identifier used by importlib when caching bytecode files in __pycache__ (set to None to disable bytecode caching)
metadata: a dict containing arbitrary additional information about a particular implementation

sys.implementation.metadata would then give a home for information that needs to be builtin, without having to pollute the main sys namespace.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
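The split between standard fields and extras might look like the following sketch. Only the four field names come from the message above; the values, the `REQUIRED_FIELDS` tuple, the `missing_fields` helper, and the choice of `types.SimpleNamespace` as the container are all assumptions made for illustration.

```python
from types import SimpleNamespace

# Hypothetical values; only the four field names come from the proposal.
implementation = SimpleNamespace(
    name="CPython",
    version=(3, 3, 0, "final", 0),
    cache_tag="cpython33",
    metadata={"vcs_url": "http://hg.python.org/cpython"},
)

REQUIRED_FIELDS = ("name", "version", "cache_tag", "metadata")


def missing_fields(impl):
    """List the standard fields that an implementation failed to define."""
    return [field for field in REQUIRED_FIELDS if not hasattr(impl, field)]


# Standard fields use dotted access; extras are looked up in the dict.
vcs_url = implementation.metadata.get("vcs_url")
gc_type = implementation.metadata.get("gc_type")  # None: not provided
```

Required fields get attribute access because they are always present, while optional extras keep the dict-style `.get()` that Barry preferred.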
On Tue, May 1, 2012 at 8:37 PM, Nick Coghlan
On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano wrote:
Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field'].
I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access.
Whereas I'm thinking it makes sense to explicitly separate out "standard, must be defined by all conforming Python implementations" and "implementation specific extras"
Under that model, we'd add an extra "metadata" field at the standard level to hold implementation specific fields. The initial set of standard fields would then be:
name: the name of the implementation (e.g. "CPython", "IronPython", "PyPy", "Jython")
version: the version of the implementation (in sys.version_info format)
cache_tag: the identifier used by importlib when caching bytecode files in __pycache__ (set to None to disable bytecode caching)
metadata: a dict containing arbitrary additional information about a particular implementation
sys.implementation.metadata would then give a home for information that needs to be builtin, without having to pollute the main sys namespace.
I really like this approach, particularly the separation aspect. Presumably sys.implementation would be more struct-like (static-ish, dotted-access namespace). I'll give it a day or two to stew and if it still seems like a good idea I'll weave it into the PEP. One question though: having it be iterable (a la structseq or namedtuple) doesn't seem to be a good fit, but does it matter? Likewise with mutability. Thoughts? -eric
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
Version Format --------------
XXX same as sys.version_info?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
Would it be worth mirroring all 3 (sys.version, sys.version_info, sys.hexversion)? Symmetry is nice, but it also makes sense if each would be as meaningful as they are in sys. -eric
On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
Version Format --------------
XXX same as sys.version_info?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
Would it be worth mirroring all 3 (sys.version, sys.version_info, sys.hexversion)? Symmetry is nice, but it also makes sense if each would be as meaningful as they are in sys.
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other? -- Steven
On 05/02/2012 09:49 PM, Steven D'Aprano wrote:
On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
Version Format --------------
XXX same as sys.version_info?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
Would it be worth mirroring all 3 (sys.version, sys.version_info, sys.hexversion)? Symmetry is nice, but it also makes sense if each would be as meaningful as they are in sys.
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other?
I know at least PyPy has separate "PyPy version" and "Python language compatibility version" numbers. They might choose to do a release that increments the PyPy version (because they've made improvements to the JIT or any number of other implementation-quality issues) but doesn't change the bundled stdlib version or language-compatibility version at all. Seems pretty reasonable to me. Carl
On Wed, May 2, 2012 at 8:49 PM, Steven D'Aprano
On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw wrote:
On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
<snip>
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other?
In the event of an implementation bugfix? The Python version implemented would be unchanged, but the implementation version would be incremented slightly. Cheers, Chris
On Thu, May 3, 2012 at 1:49 PM, Steven D'Aprano
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other?
The PyPy example is the real motivator. It allows "sys.version" to declare what version of Python the implementation intends to implement, while sys.implementation.version may be completely different. For example, a new implementation might declare sys.version_info as (3, 3, etc...) to indicate they're aiming at 3.3 compatibility, while setting sys.implementation.version to (0, 1, etc...) to reflect its actual immaturity as an implementation. Implementations are of course free to set the two numbers in lock step, and CPython, IronPython and Jython will likely continue to do exactly that. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
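To make the distinction concrete, here is a small sketch that reports both numbers side by side. It assumes an interpreter that already provides sys.implementation (it shipped in Python 3.3) and falls back gracefully otherwise; describe_versions and its dictionary keys are names invented for this illustration.

```python
import sys


def describe_versions():
    """Return the language-compatibility and implementation versions.

    Hypothetical helper: the key names are illustrative only.
    """
    language = tuple(sys.version_info[:3])
    impl = getattr(sys, "implementation", None)  # absent before Python 3.3
    impl_name = impl.name if impl is not None else "unknown"
    impl_version = tuple(impl.version[:3]) if impl is not None else None
    return {
        "language": language,
        "implementation": impl_name,
        "implementation_version": impl_version,
        # True when an implementation keeps the two numbers in lock step.
        "in_lock_step": impl_version == language,
    }


print(describe_versions())
```

On an implementation that versions itself independently, as described for PyPy above, the two tuples would be expected to differ.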
Some corrections to the PEP text:

platform.python_implementation() --------------------------------

The following text in the PEP needs to be updated:

""" The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]. However, this approach is fragile. """

The fact is that sys.version parsing is documented to be done by the platform module (see the docs on sys.version), so implementations are free to provide patches if they choose different ways of formatting sys.version.

A sys.implementation record would make things easier for the platform module, though, so it's an improvement.

sys.version -----------

sys.version is defined as "A string containing the version number of the Python interpreter plus additional information on the build number and compiler used. This string is displayed when the interactive interpreter is started. Do not extract version information out of it, rather, use version_info and the functions provided by the platform module."

It's not defined as "version of the Python language" as the PEP appears to indicate.

Other things:

Making sys.implementation a dictionary --------------------------------------

This is not a good idea, since it allows for monkey-patching the values and will also result in new undocumented or per-implementation keys.

Better use a namedtuple like we do for all other such informational resources.

sys.implementation information ------------------------------

I'm not sure whether details such as VCS URLs and revision ids should really be part of a data structure that is supposed to identify the implementation (sys.version is better for that). If you do want to add such information, then please add all of it, not just part of the available build information. See platform._sys_version(), which returns (name, version, branch, revision, buildno, builddate, compiler).

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 03 2012)
On Thu, May 3, 2012 at 2:20 AM, M.-A. Lemburg
Some corrections to the PEP text:
platform.python_implementation() --------------------------------
The following text in the PEP needs to be updated:
""" The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]. However, this approach is fragile. """
Fact is, that sys.version parsing is documented to be done by the platform module (see the docs on sys.version), so implementations are free to provide patches in case they choose different ways of formatting sys.version.
A sys.implementation record would make things easier for the platform module, though, so it's an improvement.
Yeah, I'll update that to be softer and more clear.
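As a rough illustration of how a sys.implementation record would simplify this lookup, the sketch below prefers the record and only falls back to the platform module's parsing when it is missing. It assumes the name attribute as it later shipped in Python 3.3 (a lowercase identifier); implementation_name is a helper invented for this example.

```python
import platform
import sys


def implementation_name():
    """Return a lowercase implementation identifier.

    Hypothetical helper: uses sys.implementation.name when available and
    otherwise falls back to platform.python_implementation(), which
    derives its answer from sys.version and related clues.
    """
    impl = getattr(sys, "implementation", None)
    if impl is not None:
        return impl.name
    return platform.python_implementation().lower()


print(implementation_name(), platform.python_implementation())
```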
sys.version -----------
sys.version is defined as "A string containing the version number of the Python interpreter plus additional information on the build number and compiler used. This string is displayed when the interactive interpreter is started. Do not extract version information out of it, rather, use version_info and the functions provided by the platform module."
It's not defined as "version of the Python language" as the PEP appears to indicate.
This is an excellent point. sys.(version|version_info|hexversion) reflect CPython specifics, rather than the language itself. As far as I know the language does not have a "micro" version, nor a release level or serial. So where does that leave us? Undoubtedly no small number of people already depend on the sys variables for CPython release info, so we can't just change the semantics. I'll clarify the PEP and add this to the open issues list because the PEP definitely needs to be clear here. Any suggestions on this point would be great.
Other things:
Making sys.implementation a dictionary --------------------------------------
This is not a good idea, since it allows for monkey-patching the values and will also result in new undocumented or per-implementation keys.
Better use a namedtuple like we do for all other such informational resources.
Nick Coghlan made a good suggestion on this front that I'm likely going to adopt: sys.implementation as an object (namespace with dotted access) with required attributes. One required attribute would be 'metadata', a dict where optional/per-implementation values could go.

Having it be immutable (to make monkey-patching hard) didn't seem like it mattered, though I'm not opposed. I just don't see that as a convincing reason for it to be a named tuple (structseq, etc.).

To be honest, I'd like to avoid making sys.implementation any kind of sequence. It has no meaning as a sequence (which is why the PEP shifted from named tuple to dict). Unlike other informational sources, we expect that the namespace of required attributes will grow over time. As such, people shouldn't rely on a fixed number of attributes, which a named tuple would imply. I'm also not convinced that the order of the attributes is significant, nor that sequence unpacking is useful here. So in order to send the right message on both points, I'd rather not make it a sequence.

It *could* be meaningful to implement the Mapping ABC, but I'm not going to specify that in the PEP without good reason. (I will add that as an open issue though.)

Unless there is a good reason to use a named tuple, as opposed to a regular object, let's not. However, I'm still quite open to hearing out arguments on this point.

-eric
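The trade-off argued above can be sketched as follows. This is only an illustration of the design under discussion, not the PEP's final API: the attribute values, the 'metadata' contents, and the ImplV1/ImplV2 names are all invented, and types.SimpleNamespace (available from Python 3.3) stands in for whatever concrete type would be used.

```python
from collections import namedtuple
from types import SimpleNamespace

# Object with dotted access: required attributes plus a 'metadata' dict
# for optional, per-implementation values (all values are made up).
impl = SimpleNamespace(
    name="examplepy",
    version=(0, 1, 0, "alpha", 0),
    metadata={"vcs_revision": "abc123"},
)
print(impl.name, impl.metadata.get("vcs_revision"))

# Named tuple alternative: callers may rely on its length via unpacking.
ImplV1 = namedtuple("ImplV1", ["name", "version"])
ImplV2 = namedtuple("ImplV2", ["name", "version", "extra"])  # one field added later

name, version = ImplV1("examplepy", (0, 1, 0))  # works with two fields

try:
    name, version = ImplV2("examplepy", (0, 1, 0), "new-field")
    unpack_ok = True
except ValueError:
    # Adding a required field breaks code that unpacked the old shape.
    unpack_ok = False

print("unpacking survived the new field:", unpack_ok)
```

Attribute access on the namespace keeps working when attributes are added, which is the property the message above is asking for.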
participants (8)
- Barry Warsaw
- Carl Meyer
- Chris Rebert
- Eric Snow
- M.-A. Lemburg
- Nick Coghlan
- Steven D'Aprano
- Victor Stinner