PEP 4XX: Adding sys.implementation

I've written up a PEP for the sys.implementation idea. Feedback is welcome! You'll notice some gaps, which I'll be working to fill in over the next couple of days. Don't mind the gaps. <wink> They are in less critical (?) portions and I wanted to get this out to you before the weekend.

Thanks!

-eric

--------------------------------------------------------------

PEP: 4XX
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Python-Version: 3.3


Abstract
========

This PEP introduces a new variable for the sys module: ``sys.implementation``. The variable holds consolidated information about the implementation of the running interpreter. Thus ``sys.implementation`` is the source to which the standard library may look for implementation-specific information.

The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus has understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter. Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain.

Part of the solution is presented in this PEP: a single namespace on which to consolidate implementation specifics.
This will help focus efforts to differentiate the implementation specifics from the language. Additionally, it will foster a multiple-implementation mindset.


Proposal
========

We will add ``sys.implementation``, in the sys module, as a namespace to contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures behaviors don't change between versions which depend on variables in ``sys.implementation``.

``sys.implementation`` is a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.

The namespace will contain at least the variables described in the `Required Variables`_ section below. However, implementations are free to add other implementation information there. Some possible extra variables are described in the `Other Possible Variables`_ section.

This proposal takes a conservative approach in requiring only two variables. As more become appropriate, they may be added with discretion.


Required Variables
------------------

These are variables in ``sys.implementation`` on which the standard library would rely, meaning they would need to be defined:

name
   the name of the implementation (case sensitive).

version
   the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).


Other Possible Variables
------------------------

These variables could be useful, but don't necessarily have a clear use case presently:

cache_tag
   a string used for the PEP 3147 cache tag (e.g. 'cpython33' for CPython 3.3). The name and version from above could be used to compose this, though an implementation may want something else.
   However, module caching is not a requirement of implementations, nor is the use of cache tags.

repository
   the implementation's repository URL.

repository_revision
   the revision identifier for the implementation.

build_toolchain
   identifies the tools used to build the interpreter.

url (or website)
   the URL of the implementation's site.

site_prefix
   the preferred site prefix for this implementation.

runtime
   the run-time environment in which the interpreter is running.

gc_type
   the type of garbage collection used.


Version Format
--------------

XXX same as sys.version_info?


Rationale
=========

The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It's spread out over different modules or inferred from other information, as we see with ``platform.python_implementation()``.

This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit.

With the single-namespace-under-sys so straightforward, no alternatives have been considered for this PEP.


Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list in 2009, where the reception was broadly positive [1]_. I revived the discussion recently while working on a pure-python ``imp.get_tag()`` [2]_.

The messages in `issue 14673`_ are also relevant.


Use-cases
=========

``platform.python_implementation()``
------------------------------------

"explicit is better than implicit"

The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]_. However, this approach is fragile. Beyond that, it's limited to those implementations that core developers have blessed by special-casing them in the platform module.

With ``sys.implementation`` the various implementations would *explicitly* set the values in their own version of the sys module.
Aside from the guessing, another concern is that the platform module is part of the stdlib, which ideally would minimize implementation details such as would be moved to ``sys.implementation``.

Any overlap between ``sys.implementation`` and the platform module would simply defer to ``sys.implementation`` (with the same interface in platform wrapping it).


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean out of Python/import.c any code that did not need to be there.

The cache tag defined in Python/import.c was hard-coded to ``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it's livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem.

Directly using the platform module in this case is a non-starter. Any module used in the importlib bootstrap must be built-in or frozen, neither of which applies to the platform module. This is the point that led to the recent interest in ``sys.implementation``.

Regardless of how the implementation name is gotten, the version to use for the cache tag is more likely to be the implementation version rather than the language version. That implementation version is not readily identified anywhere in the standard library.


Implementation-Specific Tests
-----------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275


Jython's ``os.name`` Hack
-------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512


Impact on CPython
=================

XXX


Feedback From Other Python Implementors
=======================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

XXX PEP 3139

XXX PEP 399


Open Issues
===========

* What are the long-term objectives for sys.implementation?

  - pull in implementation detail from the main sys namespace and elsewhere (PEP 3139 lite).

* Alternatives to the approach dictated by this PEP?

* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.


Implementation
==============

The implementation of this PEP is covered in `issue 14673`_.


References
==========

.. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. _issue 14673: http://bugs.python.org/issue14673


Copyright
=========

This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

Here's an update to the PEP. Though I have indirect or old feedback already, I'd love to hear from the other main Python implementations, particularly regarding the version variable.

Thanks.

-eric

-------------------------------------------------------------

PEP: 421
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Post-History: 26-April-2012


Abstract
========

This PEP introduces a new variable for the ``sys`` module: ``sys.implementation``. The variable holds consolidated information about the implementation of the running interpreter. Thus ``sys.implementation`` is the source to which the standard library may look for implementation-specific information.

The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus has understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter. Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain.

Part of the solution is presented in this PEP: a single namespace on which to consolidate implementation specifics. This will help focus efforts to differentiate the implementation specifics from the language.
Additionally, it will foster a multiple-implementation mindset.


Proposal
========

We will add ``sys.implementation``, in the ``sys`` module, as a namespace to contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures behaviors don't change between versions which depend on variables in ``sys.implementation``.

``sys.implementation`` will be a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.

The namespace will contain at least the variables described in the `Required Variables`_ section below. However, implementations are free to add other implementation information there. Some possible extra variables are described in the `Other Possible Variables`_ section.

This proposal takes a conservative approach in requiring only two variables. As more become appropriate, they may be added with discretion.


Required Variables
------------------

These are variables in ``sys.implementation`` on which the standard library would rely, meaning implementors must define them:

name
   the name of the implementation (case sensitive).

version
   the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).


Other Possible Variables
------------------------

These variables could be useful, but don't necessarily have a clear use case presently:

cache_tag
   a string used for the PEP 3147 cache tag (e.g. 'cpython33' for CPython 3.3). The name and version from above could be used to compose this, though an implementation may want something else. However, module caching is not a requirement of implementations, nor is the use of cache tags.

repository
   the implementation's repository URL.

repository_revision
   the revision identifier for the implementation.

build_toolchain
   identifies the tools used to build the interpreter.

url (or website)
   the URL of the implementation's site.

site_prefix
   the preferred site prefix for this implementation.

runtime
   the run-time environment in which the interpreter is running.

gc_type
   the type of garbage collection used.


Version Format
--------------

A main point of ``sys.implementation`` is to contain information that will be used in the standard library. In order to facilitate the usefulness of a version variable, its value should be in a consistent format across implementations.

XXX Subject to feedback

As such, the format of ``sys.implementation['version']`` must follow that of ``sys.version_info``, which is effectively a named tuple. It is a familiar format and generally consistent with normal version format conventions.


Rationale
=========

The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It's spread out over different modules or inferred from other information, as we see with ``platform.python_implementation()``.

This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit.

The ``sys`` module should hold the new namespace because ``sys`` is the depot for interpreter-centric variables and functions.

With the single-namespace-under-sys so straightforward, no alternatives have been considered for this PEP.


Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list in 2009, where the reception was broadly positive [1]_. I revived the discussion recently while working on a pure-python ``imp.get_tag()`` [2]_.

The messages in `issue #14673`_ are also relevant.


Use-cases
=========

``platform.python_implementation()``
------------------------------------

"explicit is better than implicit"

The ``platform`` module guesses the python implementation by looking for clues in a couple different ``sys`` variables [3]_. However, this approach is fragile. Beyond that, it's limited to those implementations that core developers have blessed by special-casing them in the ``platform`` module.

With ``sys.implementation`` the various implementations would *explicitly* set the values in their own version of the ``sys`` module.

Aside from the guessing, another concern is that the ``platform`` module is part of the stdlib, which ideally would minimize implementation details such as would be moved to ``sys.implementation``.

Any overlap between ``sys.implementation`` and the ``platform`` module would simply defer to ``sys.implementation`` (with the same interface in ``platform`` wrapping it).


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean out of ``Python/import.c`` any code that did not need to be there.

The cache tag defined in ``Python/import.c`` was hard-coded to ``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it's livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem.

Directly using the ``platform`` module in this case is a non-starter.
Any module used in the importlib bootstrap must be built-in or frozen, neither of which applies to the ``platform`` module. This is the point that led to the recent interest in ``sys.implementation``.

Regardless of the outcome for the implementation name used, another problem relates to the version used in the cache tag. That version is likely to be the implementation version rather than the language version. However, the implementation version is not readily identified anywhere in the standard library.


Implementation-Specific Tests
-----------------------------

Currently there are a number of implementation-specific tests in the test suite under ``Lib/test``. The test support module (`Lib/test/support.py`_) provides some functionality for dealing with these tests. However, like the ``platform`` module, ``test.support`` must do some guessing that ``sys.implementation`` would render unnecessary.


Jython's ``os.name`` Hack
-------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512


Feedback From Other Python Implementors
=======================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

PEP 3139
--------

This PEP from 2008 recommended a clean-up of the ``sys`` module in part by extracting implementation-specific variables and functions into a separate module. PEP 421 is a much lighter version of that idea. While PEP 3139 was rejected, its goals are largely reflected in PEP 421.

PEP 399
-------

This informational PEP dictates policy regarding the standard library, helping to make it friendlier to alternate implementations. PEP 421 is proposed in that same spirit.


Open Issues
===========

* What are the long-term objectives for ``sys.implementation``?

  - possibly pull in implementation details from the main ``sys`` namespace and elsewhere (PEP 3139 lite).

* Alternatives to the approach dictated by this PEP?

* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.


Implementation
==============

The implementation of this PEP is covered in `issue #14673`_.


References
==========

.. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. [5] Examples of implementation-specific handling in test.support:

   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275

.. _issue #14673: http://bugs.python.org/issue14673

.. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py


Copyright
=========

This document has been placed in the public domain.

On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote: <snip>
So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence. Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing. Cheers, Chris

On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert <pyideas@rebertia.com> wrote:
At this point I'm not aware of any strong justification either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that. One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want a concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought.
Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing.
In my mind a mapping is a namespace. I don't have a problem changing that to mitigate any confusion. Thanks for the feedback. -eric

On Tue, May 1, 2012 at 12:39 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
I think it's a case where practicality beats purity. By using structseq, we get a nice representation and dotted attribute access, just as we have for sys.float_info. Providing this kind of convenience is the same reason collections.namedtuple exists. We should just document that the length of the tuple and the order of items is not guaranteed (either across implementations or between versions), and even the ability to iterate over the items or access them by index is not mandatory in an implementation. Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
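As a rough illustration of the convenience described above, the sketch below reads a field of sys.float_info by name and then does the same with a namedtuple. `ImplementationInfo` and its values are hypothetical stand-ins chosen for this example; they are not the structseq the thread is discussing.

```python
import sys
from collections import namedtuple

# Hypothetical stand-in for a structseq-style sys.implementation.
ImplementationInfo = namedtuple("ImplementationInfo", ["name", "version"])
impl = ImplementationInfo(name="CPython", version=(3, 3, 0, "final", 0))

# sys.float_info is a real struct sequence: fields are read by name.
print(sys.float_info.max)

# The stand-in offers the same dotted access and a readable repr.
print(impl.name)
print(impl)
```

A namedtuple still supports indexing and iteration, which is exactly the part that would have to be documented as not guaranteed.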

On Mon, Apr 30, 2012 at 8:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That was my original sentiment, partly for the "this is how it's already been done" aspect. Barry made a good point about sys.implementation.get(name) vs. getattr(sys.implementation, name, None). However, having dotted access still seems more correct. (continued below...)
That's a good point. Perhaps it depends on how general we expect the consumption of sys.implementation to be. If its practicality is oriented toward internal use then the data structure is not as critical. However, sys.implementation is intended to have a non-localized impact across the standard library and the interpreter. I'd rather not make hacking it become an attractive nuisance, regardless of our intentions for usage. This is where I usually defer to those that have been dealing for <non-trivial #> years with the aftermath of these types of decisions. <wink> -eric

Nick Coghlan wrote:
I have often wanted a namespace type, with class-like syntax and module-like semantics. In pseudocode:

    namespace Spam:
        x = 1
        def ham(a):
            return x+a
        def cheese(a):
            return ham(a)*10

    Spam.cheese(5)  => returns 60

But I suspect that's not what you're talking about here in context.

-- Steven

I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Cool, it's better with a PEP! Even if the change looks trivial.
name the name of the implementation (case sensitive).
It would help if the PEP (and the documentation of sys.implementation) lists at least the most common names. I suppose that we would have something like: "CPython", "PyPy", "Jython", "IronPython".
Dummy question: what is sys.version/sys.version_info? The version of the implementation or the version of the Python language? The PEP should explain that, and maybe also the documentation of sys.implementation.version (something like "use sys.version_info to get the version of the Python language").
cache_tag
Why not add this information to the imp module? Victor

On Sat, Apr 28, 2012 at 7:39 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
Good point. I'll do that.
Yeah, sys.version (et al.) is the version of the language. It just happens to be the same as the implementation version for CPython. I'll make that more clear.
cache_tag
Why not add this information to the imp module?
This is certainly something I need to clarify. Either the different implementors set these values in the various modules to which they pertain; or they set them all in one place (sys.implementation). I really think we should avoid having a mix. In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point. -eric

On Tue, May 1, 2012 at 12:50 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Personally, I think cache_tag should be part of the initial proposal. Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether. The ultimate goal would be for us to be able to eliminate implementation checks from other parts of the standard library. importlib is a good place to start, since the idea is that, aside from the mechanism used to bootstrap it into place, along with optional acceleration of __import__, importlib itself should be implementation independent. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Agreed. This is how I was thinking of it. I just wanted to keep things as minimal as possible to start. In importlib we can fall back to name+version if cache_tag isn't there. Still, of the potential variables, cache_tag is the strongest candidate, having a solid (if optional) use-case right now.
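The fallback described above might look something like the following sketch. It assumes the dict form of sys.implementation from the draft PEP; `derive_cache_tag`, the lowercasing of the name, and the sample mappings are all assumptions made for illustration, not importlib code.

```python
def derive_cache_tag(implementation):
    """Return a cache tag for a dict-style implementation mapping.

    `implementation` is a hypothetical stand-in for the proposed
    sys.implementation dict, not a real interpreter object.
    """
    if "cache_tag" in implementation:
        # An explicit value wins; None is passed through so a caller
        # could treat it as "caching disabled".
        return implementation["cache_tag"]
    # Fall back to name + major + minor, mirroring 'cpython33'.
    # Lowercasing the name is an assumption of this sketch.
    major, minor = implementation["version"][:2]
    return "{}{}{}".format(implementation["name"].lower(), major, minor)


with_tag = {"name": "CPython", "version": (3, 3, 0, "final", 0),
            "cache_tag": "cpython33"}
without_tag = {"name": "PyPy", "version": (1, 9, 0, "final", 0)}

print(derive_cache_tag(with_tag))
print(derive_cache_tag(without_tag))
```

Checking for the key rather than for a non-None value keeps an explicit None distinguishable from a missing entry.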
Spot on! -eric

On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Thanks for working on this PEP, Eric!
I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this

    cache_tag = sys.implementation.get('cache_tag')

is nicer than

    cache_tag = getattr(sys.implementation, 'cache_tag', None)

OTOH, maybe we need a nameddict type!
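To make the comparison concrete, here is a small sketch that runs both lookups against stand-in objects. `as_dict` and `as_object` are hypothetical substitutes for sys.implementation, and neither defines cache_tag, so both lookups take the default path.

```python
from types import SimpleNamespace

# Two hypothetical shapes for the same data.
as_dict = {"name": "CPython"}
as_object = SimpleNamespace(name="CPython")

# Optional field: a mapping has .get(), an object needs getattr().
dict_tag = as_dict.get("cache_tag")
object_tag = getattr(as_object, "cache_tag", None)
print(dict_tag, object_tag)

# Required field: subscript versus dotted access.
print(as_dict["name"], as_object.name)
```

The trade-off only shows up for optional fields; for fields known to exist, dotted access is the shorter spelling.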
repository the implementation's repository URL.
What does this mean? Oh, I think you mean the URL for the VCS used to develop this version of the implementation. Maybe vcs_url (and even then there could be alternative blessed mirrors in other vcs's). A Debian analog is the Vcs-* headers (e.g. Vcs-Git, Vcs-Bzr, etc.).
repository_revision the revision identifier for the implementation.
I'm not sure what this is. Is it like the hexgoo you see in the banner of a from-source build that identifies the revision used to build this interpreter? Is this key a replacement for that?
build_toolchain identifies the tools used to build the interpreter.
As a tuple of free-form strings?
url (or website) the URL of the implementation's site.
Maybe 'homepage' (another Debian analog).
I'm not sure what this means either. ;)
gc_type the type of garbage collection used.
Another free-form string? What would be the values say, for CPython and Jython?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
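For reference, a hexversion-style integer can be derived from a version_info-style tuple, as in the sketch below. The bit layout follows the documented encoding of sys.hexversion; `to_hexversion` is a hypothetical helper, not something the PEP proposes.

```python
import sys

# Release-level nibbles used by sys.hexversion.
_RELEASE_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def to_hexversion(version):
    """Pack (major, minor, micro, releaselevel, serial) into one int."""
    major, minor, micro, releaselevel, serial = version
    return ((major << 24) | (minor << 16) | (micro << 8)
            | (_RELEASE_LEVELS[releaselevel] << 4) | serial)


print(hex(to_hexversion((3, 3, 0, "final", 0))))
print(to_hexversion(sys.version_info) == sys.hexversion)
```

Because the result is a plain integer, two versions can be compared with a single `<` or `>=`.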
That's where this seems to be leaning. Even if it's a good idea, I bet it will be a long time before the old sys names can be removed.
Which might make sense, as would perhaps a top-level `implementation` module. IOW, why situate it in sys?
The implementatation of this PEP is covered in `issue 14673`_.
s/implementatation/implementation Nicely done! Let's see how those placeholders shake out. Cheers, -Barry

On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw <barry@python.org> wrote:
That's a good point. Also, a dict better reflects a collection of variables than a dotted-access object, which to me implies the potential for methods as well.
OTOH, maybe we need a nameddict type!
You won't have to convince _me_. :)
Yeah, you got it. For CPython it would be "http://hg.python.org/cpython". You're right that vcs_url is more clear. I'll update it. Perhaps I should clarify "Other Possible Values" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now. Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately.
I was thinking along those lines. For CPython, it could be 76678 or ab63e874265e or both. The decision on any constraints for this one would be subject to further discussion.
That would work. I expect it would depend on how it would be used.
Sounds good to me.
Yeah, it's not so clear there. For Jython it would be something like "jvm X.X", for IronPython it would be ".net CLR X.X" or whatever. Again the actual definition would be subject to more discussion relative to the use case, be it information or otherwise.
I was imagining a free-form string, like "reference counting" or "mark and sweep". It just depends on what people need it for.
That's the way I'm leaning. I've covered it a little more in the newer version of the PEP (on python-ideas).
Yeah, it's definitely not the focus of the PEP, but I think it's a valid long-term goal of which we should be cognizant.
Got it.
Nicely done! Let's see how those placeholders shake out.
Thanks. I'm glad to get this rolling. And yeah, I need to poke the folks with the other implementations to get their feedback (rather than rely on nods from 3 years ago). :) -eric

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
I think you could keep it light (but +1 for adding cache_tag now). I'd suggest making it clear that neither the keys, values, nor semantics are actually being proposed in this PEP. The PEP could just include some examples for future additions (and thus de-emphasize that section of the PEP). It might be helpful to describe a mechanism by which future values would be added to sys.implementation. E.g. is a new PEP required for each? (I don't have an opinion on that right now. :) -Barry

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
Well, I was being a bit facetious. You can easily implement those semantics in pure Python. 5 minute hack below.

Cheers,
-Barry

-----snip snip-----
#! /usr/bin/python3

_missing = object()

import operator
import unittest


class Implementation:
    cache_tag = 'cpython33'
    name = 'CPython'

    def __getitem__(self, name, default=_missing):
        result = getattr(self, name, default)
        if result is _missing:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                self.__class__.__name__, name))
        return result

    def __setitem__(self, name, value):
        raise TypeError('read only')

    def __setattr__(self, name, value):
        raise TypeError('read only')


implementation = Implementation()


class TestImplementation(unittest.TestCase):
    def test_cache_tag(self):
        self.assertEqual(implementation.cache_tag, 'cpython33')
        self.assertEqual(implementation['cache_tag'], 'cpython33')

    def test_name(self):
        self.assertEqual(implementation.name, 'CPython')
        self.assertEqual(implementation['name'], 'CPython')

    def test_huh(self):
        self.assertRaises(AttributeError, operator.getitem,
                          implementation, 'droids')
        self.assertRaises(AttributeError, getattr,
                          implementation, 'droids')

    def test_read_only(self):
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, setattr,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'cache_tag', 'xpython99')
        self.assertRaises(TypeError, setattr,
                          implementation, 'cache_tag', 'xpython99')

Eric Snow wrote:
Dicts have methods, and support iteration. A dict suggests to me that an arbitrary number of items could be included, rather than suggesting a record-like structure with a fixed number of items. (Even if that number varies from release to release.) On the other hand, a dict supports iteration, and len, so even if you don't know how many fields there are, you can always find them by iterating over the record. Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field']. I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access, along the lines of these recipes:

http://code.activestate.com/recipes/473786
http://code.activestate.com/recipes/576586

-- Steven
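As a rough illustration of the "dict with dotted name access" idea, here is a minimal sketch. The class name `AttrDict` and the sample values are hypothetical; this is neither one of the linked recipes nor anything proposed in the PEP.

```python
class AttrDict(dict):
    """Hypothetical dict subclass whose keys are also readable as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .get() and .keys() keep working unchanged.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Sample values for illustration only.
impl = AttrDict(name="CPython", cache_tag="cpython33")

print(impl.name)            # dotted access
print(impl["cache_tag"])    # mapping access
print(sorted(impl))         # iteration still finds every field
```

One caveat with this approach is that a key sharing its name with a dict method (for example "keys") would be reachable only through subscripting, since normal attribute lookup finds the method first.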

On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Whereas I'm thinking it makes sense to explicitly separate out "standard, must be defined by all conforming Python implementations" and "implementation specific extras". Under that model, we'd add an extra "metadata" field at the standard level to hold implementation specific fields. The initial set of standard fields would then be:

name: the name of the implementation (e.g. "CPython", "IronPython", "PyPy", "Jython")
version: the version of the implementation (in sys.version_info format)
cache_tag: the identifier used by importlib when caching bytecode files in __pycache__ (set to None to disable bytecode caching)
metadata: a dict containing arbitrary additional information about a particular implementation

sys.implementation.metadata would then give a home for information that needs to be builtin, without having to pollute the main sys namespace.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
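The four-field layout described above could look roughly like the following sketch. It uses `types.SimpleNamespace` (available from Python 3.3) purely as a stand-in; the object name `implementation`, the helper `bytecode_caching_enabled`, and all values are illustrative assumptions, not what any interpreter actually exposes.

```python
from types import SimpleNamespace

# Illustrative stand-in for the proposed object; values are made up.
implementation = SimpleNamespace(
    name="CPython",
    version=(3, 3, 0, "final", 0),      # sys.version_info-style tuple
    cache_tag="cpython33",              # None would disable bytecode caching
    metadata={"vcs_url": "http://hg.python.org/cpython"},  # per-implementation extras
)


def bytecode_caching_enabled(impl):
    """Hypothetical helper: caching is on unless cache_tag is None."""
    return impl.cache_tag is not None


print(implementation.name, implementation.version[:2])
print(bytecode_caching_enabled(implementation))
print(implementation.metadata.get("gc_type", "unspecified"))
```

The standard fields are reached with dotted access, while anything optional lives in the `metadata` dict, where `.get()` with a default is the natural way to probe for it.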

On Tue, May 1, 2012 at 8:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I really like this approach, particularly the separation aspect. Presumably sys.implementation would be more struct-like (static-ish, dotted-access namespace). I'll give it a day or two to stew and if it still seems like a good idea I'll weave it into the PEP. One question though: having it be iterable (a la structseq or namedtuple) doesn't seem to be a good fit, but does it matter? Likewise with mutability. Thoughts? -eric

On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other? -- Steven

On 05/02/2012 09:49 PM, Steven D'Aprano wrote:
I know at least PyPy has separate "PyPy version" and "Python language compatibility version" numbers. They might choose to do a release that increments the PyPy version (because they've made improvements to the JIT or any number of other implementation-quality issues) but doesn't change the bundled stdlib version or language-compatibility version at all. Seems pretty reasonable to me. Carl

On Thu, May 3, 2012 at 1:49 PM, Steven D'Aprano <steve@pearwood.info> wrote:
The PyPy example is the real motivator. It allows "sys.version" to declare what version of Python the implementation intends to implement, while sys.implementation.version may be completely different. For example, a new implementation might declare sys.version_info as (3, 3, etc...) to indicate they're aiming at 3.3 compatibility, while setting sys.implementation.version to (0, 1, etc...) to reflect its actual immaturity as an implementation. Implementations are of course free to set the two numbers in lock step, and CPython, IronPython and Jython will likely continue to do exactly that. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
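To see the distinction between the two version numbers on a given interpreter, something like the following sketch can be run. It assumes an interpreter that already provides `sys.implementation` (Python 3.3 or later) and falls back gracefully otherwise; the variable names are arbitrary.

```python
import sys

# Language-level version the interpreter claims to implement.
language_version = tuple(sys.version_info[:3])

# sys.implementation is absent on interpreters older than Python 3.3.
impl = getattr(sys, "implementation", None)

if impl is None:
    print("sys.implementation is not available on this interpreter")
else:
    impl_version = tuple(impl.version[:3])
    print(impl.name, impl_version, "implements Python", language_version)
    # Equal on implementations that keep the two numbers in lock step.
    print("in lock step:", impl_version == language_version)
```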

Some corrections to the PEP text:

platform.python_implementation()
--------------------------------

The following text in the PEP needs to be updated:

"""
The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]. However, this approach is fragile.
"""

The fact is that sys.version parsing is documented to be done by the platform module (see the docs on sys.version), so implementations are free to provide patches in case they choose different ways of formatting sys.version. A sys.implementation record would make things easier for the platform module, though, so it's an improvement.

sys.version
-----------

sys.version is defined as "A string containing the version number of the Python interpreter plus additional information on the build number and compiler used. This string is displayed when the interactive interpreter is started. Do not extract version information out of it, rather, use version_info and the functions provided by the platform module." It's not defined as "version of the Python language" as the PEP appears to indicate.

Other things:

Making sys.implementation a dictionary
--------------------------------------

This is not a good idea, since it allows for monkey-patching the values and will also result in new undocumented or per-implementation keys. Better to use a namedtuple like we do for all other such informational resources.

sys.implementation information
------------------------------

While I'm not sure whether details such as VCS URLs and revision ids should really be part of a data structure that is supposed to identify the implementation (sys.version is better for that), if you do want to add such information, then please add all of it, not just part of the available build information. See what platform._sys_version() returns: (name, version, branch, revision, buildno, builddate, compiler).

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 03 2012)

On Thu, May 3, 2012 at 2:20 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Yeah, I'll update that to be softer and more clear.
This is an excellent point. sys.(version|version_info|hexversion) reflect CPython specifics, rather than the language itself. As far as I know the language does not have a "micro" version, nor a release level or serial. So where does that leave us? Undoubtedly no small number of people already depend on the sys variables for CPython release info, so we can't just change the semantics. I'll clarify the PEP and add this to the open issues list because the PEP definitely needs to be clear here. Any suggestions on this point would be great.
Nick Coghlan made a good suggestion on this front that I'm likely going to adopt: sys.implementation as an object (namespace with dotted access) with required attributes. One required attribute would be 'metadata', a dict where optional/per-implementation values could go. Having it be immutable (to make monkey-patching hard) didn't seem like it mattered, though I'm not opposed. I just don't see that as a convincing reason for it to be a named tuple (structseq, etc.). To be honest, I'd like to avoid making sys.implementation any kind of sequence. It has no meaning as a sequence (which is why the PEP shifted from named tuple to dict). Unlike other informational sources, we expect that the namespace of required attributes will grow over time. As such, people shouldn't rely on a fixed number of attributes, which a named tuple would imply. As well, I'm not convinced that the order of the attributes is significant, nor that sequence unpacking is useful here. So in order to send the right message on both points, I'd rather not make it a sequence. It *could* be meaningful to implement the Mapping ABC, but I'm not going to specify that in the PEP without good reason. (I will add that as an open issue though.) Unless there is a good reason to use a named tuple, as opposed to a regular object, let's not. However, I'm still quite open to hearing out arguments on this point. -eric
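The concern about sequence semantics can be made concrete with a small sketch. The two named tuple types below (`ImplV1`, `ImplV2`) are hypothetical stand-ins for "the same record before and after a required attribute is added"; they are not part of the proposal.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical record layouts: a later release adds a third field.
ImplV1 = namedtuple("ImplV1", "name version")
ImplV2 = namedtuple("ImplV2", "name version cache_tag")

v1 = ImplV1("CPython", (3, 3, 0))
v2 = ImplV2("CPython", (3, 3, 0), "cpython33")

# Code written against the two-field layout unpacks the record...
name, version = v1

# ...and breaks as soon as a field is added.
unpack_error = None
try:
    name, version = v2
except ValueError as exc:
    unpack_error = exc
print("unpacking failed:", unpack_error is not None)

# A plain attribute namespace offers no unpacking to rely on, so adding an
# attribute cannot break existing callers in this way.
ns = SimpleNamespace(name="CPython", version=(3, 3, 0), cache_tag="cpython33")
print(ns.name, ns.version)
```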

Here's an update to the PEP. Though I have indirect or old feedback already, I'd love to hear from the other main Python implementations, particularly regarding the version variable. Thanks.

-eric

-------------------------------------------------------------

PEP: 421
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Post-History: 26-April-2012


Abstract
========

This PEP introduces a new variable for the ``sys`` module: ``sys.implementation``. The variable holds consolidated information about the implementation of the running interpreter. Thus ``sys.implementation`` is the source to which the standard library may look for implementation-specific information.

The proposal in this PEP is in line with a broader emphasis on making Python friendlier to alternate implementations. It describes the new variable and the constraints on what that variable contains. The PEP also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language and CPython (the reference implementation) has been growing. Most of this change is due to the emergence of Jython, IronPython, and PyPy as viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python (i.e. most of its existence). That focus had understandably contributed to quite a few CPython-specific artifacts both in the standard library and exposed in the interpreter. Though the core developers have made an effort in recent years to address this, quite a few of the artifacts remain.

Part of the solution is presented in this PEP: a single namespace on which to consolidate implementation specifics. This will help focus efforts to differentiate the implementation specifics from the language. Additionally, it will foster a multiple-implementation mindset.


Proposal
========

We will add ``sys.implementation``, in the ``sys`` module, as a namespace to contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter execution and through the course of an implementation version. This ensures that behaviors which depend on variables in ``sys.implementation`` don't change between versions.

``sys.implementation`` will be a dictionary, as opposed to any form of "named" tuple (a la ``sys.version_info``). This is partly because it doesn't have meaning as a sequence, and partly because it's a potentially more variable data structure.

The namespace will contain at least the variables described in the `Required Variables`_ section below. However, implementations are free to add other implementation information there. Some possible extra variables are described in the `Other Possible Variables`_ section.

This proposal takes a conservative approach in requiring only two variables. As more become appropriate, they may be added with discretion.


Required Variables
------------------

These are variables in ``sys.implementation`` on which the standard library would rely, meaning implementors must define them:

name
   the name of the implementation (case sensitive).

version
   the version of the implementation, as opposed to the version of the language it implements. This would use a standard format, similar to ``sys.version_info`` (see `Version Format`_).


Other Possible Variables
------------------------

These variables could be useful, but don't necessarily have a clear use case presently:

cache_tag
   a string used for the PEP 3147 cache tag (e.g. 'cpython33' for CPython 3.3). The name and version from above could be used to compose this, though an implementation may want something else. However, module caching is not a requirement of implementations, nor is the use of cache tags.

repository
   the implementation's repository URL.

repository_revision
   the revision identifier for the implementation.

build_toolchain
   identifies the tools used to build the interpreter.

url (or website)
   the URL of the implementation's site.

site_prefix
   the preferred site prefix for this implementation.

runtime
   the run-time environment in which the interpreter is running.

gc_type
   the type of garbage collection used.


Version Format
--------------

A main point of ``sys.implementation`` is to contain information that will be used in the standard library. In order to facilitate the usefulness of a version variable, its value should be in a consistent format across implementations.

XXX Subject to feedback

As such, the format of ``sys.implementation['version']`` must follow that of ``sys.version_info``, which is effectively a named tuple. It is a familiar format and generally consistent with normal version format conventions.


Rationale
=========

The status quo for implementation-specific information gives us that information in a more fragile, harder to maintain way. It's spread out over different modules or inferred from other information, as we see with ``platform.python_implementation()``.

This PEP is the main alternative to that approach. It consolidates the implementation-specific information into a single namespace and makes explicit that which was implicit.

The ``sys`` module should hold the new namespace because ``sys`` is the depot for interpreter-centric variables and functions. With the single-namespace-under-sys so straightforward, no alternatives have been considered for this PEP.


Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list in 2009, where the reception was broadly positive [1]_. I revived the discussion recently while working on a pure-python ``imp.get_tag()`` [2]_. The messages in `issue #14673`_ are also relevant.


Use-cases
=========

``platform.python_implementation()``
------------------------------------

"explicit is better than implicit"

The ``platform`` module guesses the python implementation by looking for clues in a couple different ``sys`` variables [3]_. However, this approach is fragile. Beyond that, it's limited to those implementations that core developers have blessed by special-casing them in the ``platform`` module.

With ``sys.implementation`` the various implementations would *explicitly* set the values in their own version of the ``sys`` module.

Aside from the guessing, another concern is that the ``platform`` module is part of the stdlib, which ideally would minimize implementation details such as would be moved to ``sys.implementation``.

Any overlap between ``sys.implementation`` and the ``platform`` module would simply defer to ``sys.implementation`` (with the same interface in ``platform`` wrapping it).


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file names. The importlib bootstrap code, frozen into the Python binary as of 3.3, uses the cache tags during the import process. Part of the project to bootstrap importlib has been to clean out of `Python/import.c` any code that did not need to be there.

The cache tag defined in `Python/import.c` was hard-coded to ``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either hard-coding it in the same way, or guessing the implementation in the same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it's livable. However, inasmuch as other Python implementations use the importlib code to work with the module cache, a hard-coded tag would become a problem.

Directly using the ``platform`` module in this case is a non-starter. Any module used in the importlib bootstrap must be built-in or frozen, neither of which apply to the ``platform`` module. This is the point that led to the recent interest in ``sys.implementation``.

Regardless of the outcome for the implementation name used, another problem relates to the version used in the cache tag. That version is likely to be the implementation version rather than the language version. However, the implementation version is not readily identified anywhere in the standard library.


Implementation-Specific Tests
-----------------------------

Currently there are a number of implementation-specific tests in the test suite under ``Lib/test``. The test support module (`Lib/test/support.py`_) provides some functionality for dealing with these tests. However, like the ``platform`` module, ``test.support`` must do some guessing that ``sys.implementation`` would render unnecessary.


Jython's ``os.name`` Hack
-------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512


Feedback From Other Python Implementers
=======================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

PEP 3139
--------

This PEP from 2008 recommended a clean-up of the ``sys`` module in part by extracting implementation-specific variables and functions into a separate module. PEP 421 is a much lighter version of that idea. While PEP 3139 was rejected, its goals are reflected in PEP 421 to a large extent, though with a much lighter approach.


PEP 399
-------

This informational PEP dictates policy regarding the standard library, helping to make it friendlier to alternate implementations. PEP 421 is proposed in that same spirit.


Open Issues
===========

* What are the long-term objectives for ``sys.implementation``?

  - possibly pull in implementation details from the main ``sys`` namespace and elsewhere (PEP 3139 lite).

* Alternatives to the approach dictated by this PEP?

* ``sys.implementation`` as a proper namespace rather than a dict. It would be its own module or an instance of a concrete class.


Implementation
==============

The implementation of this PEP is covered in `issue #14673`_.


References
==========

.. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. [5] Examples of implementation-specific handling in test.support:

   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
   | http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275

.. _issue #14673: http://bugs.python.org/issue14673

.. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py


Copyright
=========

This document has been placed in the public domain.

On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote: <snip>
So, what's the justification for it being a dict rather than an object with attributes? The PEP merely (sensibly) concludes that it cannot be considered a sequence. Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing. Cheers, Chris

On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert <pyideas@rebertia.com> wrote:
At this point I'm not aware of any strong justifications either way. However, sys.implementation is currently intended as a simple collection of variables. A dict reflects that. One obvious concern is that if we start off with a dict we're binding ourselves to that interface. If we later want a concrete class with dotted lookup, we'd be looking at backwards-incompatibility. This is the part of the PEP that still needs more serious thought.
Relatedly, I find the PEP's use of the term "namespace" in reference to a dict to be somewhat confusing.
In my mind a mapping is a namespace. I don't have a problem changing that to mitigate any confusion. Thanks for the feedback. -eric

On Tue, May 1, 2012 at 12:39 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
I think it's a case where practicality beats purity. By using structseq, we get a nice representation and dotted attribute access, just as we have for sys.float_info. Providing this kind of convenience is the same reason collections.namedtuple exists. We should just document that the length of the tuple and the order of items is not guaranteed (either across implementations or between versions), and even the ability to iterate over the items or access them by index is not mandatory in an implementation. Would it be better if we had a separate "namespace" type in CPython that simply *disallowed* iteration and indexing? Perhaps, but we've survived long enough without it that I have my doubts about the practical need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
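For reference, the structseq behaviour being described can be observed on `sys.float_info`. This short sketch only uses attributes that exist on that object; it shows the dotted access alongside the tuple behaviour that would need to be documented as "not guaranteed" if sys.implementation followed the same pattern.

```python
import sys

info = sys.float_info

# Dotted access by field name, with a readable repr.
print(info.max)
print(info.epsilon)

# The same object is also a tuple: it can be indexed, iterated and measured.
print(info[0] == info.max)   # max is the first field
print(isinstance(info, tuple))
print(len(info) == len(list(info)))
```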

On Mon, Apr 30, 2012 at 8:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That was my original sentiment, partly for the "this is how it's already been done" aspect. Barry made a good point about sys.implementation.get(name) vs. getattr(sys.implementation, name, None). However, having dotted access still seems more correct. (continued below...)
That's a good point. Perhaps it depends on how general we expect the consumption of sys.implementation to be. If its practicality is oriented toward internal use then the data structure is not as critical. However, sys.implementation is intended to have a non-localized impact across the standard library and the interpreter. I'd rather not make hacking it become an attractive nuisance, regardless of our intentions for usage. This is where I usually defer to those that have been dealing for <non-trivial #> years with the aftermath of these types of decisions. <wink> -eric

Nick Coghlan wrote:
I have often wanted a namespace type, with class-like syntax and module-like semantics. In pseudocode:

    namespace Spam:
        x = 1
        def ham(a):
            return x+a
        def cheese(a):
            return ham(a)*10

    Spam.cheese(5)
    => returns 60

But I suspect that's not what you're talking about here in context. -- Steven

I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Cool, it's better with a PEP! Even if the change looks trivial.
name the name of the implementation (case sensitive).
It would help if the PEP (and the documentation of sys.implementation) lists at least the most common names. I suppose that we would have something like: "CPython", "PyPy", "Jython", "IronPython".
Dummy question: what is sys.version/sys.version_info? The version of the implementation or the version of the Python language? The PEP should explain that, and maybe also the documentation of sys.implementation.version (something like "use sys.version_info to get the version of the Python language").
cache_tag
Why not add this information to the imp module? Victor

On Sat, Apr 28, 2012 at 7:39 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
Good point. I'll do that.
Yeah, sys.version (et al.) is the version of the language. It just happens to be the same as the implementation version for CPython. I'll make that more clear.
cache_tag
Why not add this information to the imp module?
This is certainly something I need to clarify. Either the different implementors set these values in the various modules to which they pertain; or they set them all in one place (sys.implementation). I really think we should avoid having a mix. In my mind sys.implementation makes more sense. For example, in the case of cache_tag (which is merely a potential future variable), its value is an implementation detail used by importlib. Having it in sys.implementation would emphasize this point. -eric

On Tue, May 1, 2012 at 12:50 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Personally, I think cache_tag should be part of the initial proposal. Implementations may want to use different cache tags depending on additional information that importlib shouldn't need to care about, and I think it would also be reasonable to allow "cache_tag=None" to disable the implicit caching altogether. The ultimate goal would be for us to be able to eliminate implementation checks from other parts of the standard library. importlib is a good place to start, since the idea is that, aside from the mechanism used to bootstrap it into place, along with optional acceleration of __import__, importlib itself should be implementation independent. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Agreed. This is how I was thinking of it. I just wanted to keep things as minimal as possible to start. In importlib we can fall back to name+version if cache_tag isn't there. Still, of the potential variables, cache_tag is the strongest candidate, having a solid (if optional) use-case right now.
Spot on! -eric
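The fallback described in this exchange (use cache_tag when present, compose one from name and version otherwise, and treat None as "caching disabled") might be sketched as follows. The function name `resolve_cache_tag`, the lower-casing of the name, and the sample namespaces are assumptions for illustration; this is not importlib's actual code.

```python
from types import SimpleNamespace

_MISSING = object()


def resolve_cache_tag(impl):
    """Hypothetical helper returning the bytecode cache tag for impl.

    Returns None when the implementation has explicitly disabled caching
    by setting cache_tag to None.
    """
    tag = getattr(impl, "cache_tag", _MISSING)
    if tag is _MISSING:
        # No cache_tag at all: compose one from name + major/minor version,
        # modelled on the 'cpython33' example in the PEP (assumed format).
        major, minor = impl.version[:2]
        return "{}{}{}".format(impl.name.lower(), major, minor)
    return tag


# Made-up namespaces standing in for sys.implementation.
no_tag = SimpleNamespace(name="CPython", version=(3, 3, 0, "final", 0))
custom = SimpleNamespace(name="PyPy", version=(1, 9, 0, "final", 0),
                         cache_tag="pypy-19")
disabled = SimpleNamespace(name="Other", version=(0, 1, 0, "alpha", 0),
                           cache_tag=None)

print(resolve_cache_tag(no_tag))
print(resolve_cache_tag(custom))
print(resolve_cache_tag(disabled))
```

A sentinel is used rather than a default of None so that "attribute missing" and "attribute deliberately set to None" stay distinguishable.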

On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
I've written up a PEP for the sys.implementation idea. Feedback is welcome!
Thanks for working on this PEP, Eric!
I agree that sequence semantics are meaningless here. Presumably, a dictionary is proposed because this

    cache_tag = sys.implementation.get('cache_tag')

is nicer than

    cache_tag = getattr(sys.implementation, 'cache_tag', None)

OTOH, maybe we need a nameddict type!
repository the implementation's repository URL.
What does this mean? Oh, I think you mean the URL for the VCS used to develop this version of the implementation. Maybe vcs_url (and even then there could be alternative blessed mirrors in other VCSs). The Debian analogs are the Vcs-* headers (e.g. Vcs-Git, Vcs-Bzr, etc.).
repository_revision the revision identifier for the implementation.
I'm not sure what this is. Is it like the hexgoo you see in the banner of a from-source build that identifies the revision used to build this interpreter? Is this key a replacement for that?
build_toolchain identifies the tools used to build the interpreter.
As a tuple of free-form strings?
url (or website) the URL of the implementation's site.
Maybe 'homepage' (another Debian analog).
I'm not sure what this means either. ;)
gc_type the type of garbage collection used.
Another free-form string? What would be the values say, for CPython and Jython?
Why not? :) It might be useful also to have something similar to sys.hexversion, which I often find convenient.
That's where this seems to be leaning. Even if it's a good idea, I bet it will be a long time before the old sys names can be removed.
Which might make sense, as would perhaps a top-level `implementation` module. IOW, why situate it in sys?
The implementatation of this PEP is covered in `issue 14673`_.
s/implementatation/implementation Nicely done! Let's see how those placeholders shake out. Cheers, -Barry
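On the suggestion of having something similar to sys.hexversion: the single-integer encoding can be derived from any sys.version_info-style tuple. The sketch below follows the documented layout of `sys.hexversion` (major, minor, micro, release level, serial packed into 32 bits); the function name `to_hexversion` is made up for this example.

```python
import sys

# Release-level codes used in the sys.hexversion encoding.
_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def to_hexversion(version):
    """Pack a (major, minor, micro, releaselevel, serial) tuple into an int."""
    major, minor, micro, releaselevel, serial = version
    return (major << 24 | minor << 16 | micro << 8
            | _LEVELS[releaselevel] << 4 | serial)


# The running interpreter's own tuple round-trips to sys.hexversion.
print(to_hexversion(sys.version_info) == sys.hexversion)
print(hex(to_hexversion((3, 3, 0, "final", 0))))
```

The appeal of this form is that versions compare correctly with plain integer comparison.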

On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw <barry@python.org> wrote:
That's a good point. Also, a dict better reflects a collection of variables than a dotted-access object, which to me implies the potential for methods as well.
OTOH, maybe we need a nameddict type!
You won't have to convince _me_. :)
Yeah, you got it. For CPython it would be "http://hg.python.org/cpython". You're right that vcs_url is clearer. I'll update it. Perhaps I should clarify "Other Possible Variables" in the PEP? I'd intended it as a list of meaningful names, most of which others had suggested, that could be considered at some later point. That's part of why I didn't develop the descriptions there too much. Rather, I wanted to focus on the two primary names for now. Should those potential names be considered more seriously right now? I was hoping to keep it light to start out, just the things we'd use immediately.
I was thinking along those lines. For CPython, it could be 76678 or ab63e874265e or both. The decision on any constraints for this one would be subject to further discussion.
That would work. I expect it would depend on how it would be used.
Sounds good to me.
Yeah, it's not so clear there. For Jython it would be something like "jvm X.X", for IronPython it would be ".net CLR X.X" or whatever. Again the actual definition would be subject to more discussion relative to the use case, be it information or otherwise.
I was imagining a free-form string, like "reference counting" or "mark and sweep". It just depends on what people need it for.
That's the way I'm leaning. I've covered it a little more in the newer version of the PEP (on python-ideas).
Yeah, it's definitely not the focus of the PEP, but I think it's a valid long-term goal of which we should be cognizant.
Got it.
Nicely done! Let's see how those placeholders shake out.
Thanks. I'm glad to get this rolling. And yeah, I need to poke the folks with the other implementations to get their feedback (rather than rely on nods from 3 years ago). :) -eric

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
I think you could keep it light (but +1 for adding cache_tag now). I'd suggest making it clear that neither the keys, values, nor semantics are actually being proposed in this PEP. The PEP could just include some examples for future additions (and thus de-emphasize that section of the PEP). It might be helpful to describe a mechanism by which future values would be added to sys.implementation. E.g. is a new PEP required for each? (I don't have an opinion on that right now. :) -Barry

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
Well, I was being a bit facetious. You can easily implement those semantics in pure Python. 5 minute hack below. Cheers, -Barry -----snip snip----- #! /usr/bin/python3 _missing = object() import operator import unittest class Implementation: cache_tag = 'cpython33' name = 'CPython' def __getitem__(self, name, default=_missing): result = getattr(self, name, default) if result is _missing: raise AttributeError("'{}' object has no attribute '{}'".format( self.__class__.__name__, name)) return result def __setitem__(self, name, value): raise TypeError('read only') def __setattr__(self, name, value): raise TypeError('read only') implementation = Implementation() class TestImplementation(unittest.TestCase): def test_cache_tag(self): self.assertEqual(implementation.cache_tag, 'cpython33') self.assertEqual(implementation['cache_tag'], 'cpython33') def test_name(self): self.assertEqual(implementation.name, 'CPython') self.assertEqual(implementation['name'], 'CPython') def test_huh(self): self.assertRaises(AttributeError, operator.getitem, implementation, 'droids') self.assertRaises(AttributeError, getattr, implementation, 'droids') def test_read_only(self): self.assertRaises(TypeError, operator.setitem, implementation, 'droids', 'looking') self.assertRaises(TypeError, setattr, implementation, 'droids', 'looking') self.assertRaises(TypeError, operator.setitem, implementation, 'cache_tag', 'xpython99') self.assertRaises(TypeError, setattr, implementation, 'cache_tag', 'xpython99')

Eric Snow wrote:
Dicts have methods, and support iteration. A dict suggests to me that an arbitrary number of items could be included, rather than suggesting a record-like structure with an fixed number of items. (Even if that number varies from release to release.) On the other hand, a dict supports iteration, and len, so even if you don't know how many fields there are, you can always find them by iterating over the record. Syntax-wise, dotted name access seems right to me for this, similar to sys.float_info. If you know a field exists, sys.implementation.field is much nicer than sys.implementation['field']. I hate to admit it, but I'm starting to think that the right solution here is something like a dict with dotted name access. http://code.activestate.com/recipes/473786 http://code.activestate.com/recipes/576586 sort of thing. -- Steven

On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Whereas I'm thinking it makes sense to explicitly separate out "standard, must be defined by all conforming Python implementations" and "implementation specific extras".

Under that model, we'd add an extra "metadata" field at the standard level to hold implementation specific fields. The initial set of standard fields would then be:

  name: the name of the implementation (e.g. "CPython", "IronPython", "PyPy", "Jython")
  version: the version of the implementation (in sys.version_info format)
  cache_tag: the identifier used by importlib when caching bytecode files in __pycache__ (set to None to disable bytecode caching)
  metadata: a dict containing arbitrary additional information about a particular implementation

sys.implementation.metadata would then give a home for information that needs to be builtin, without having to pollute the main sys namespace.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
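A rough sketch of that layout, assuming a plain namespace object for the standard fields and an ordinary dict for the extras. The VersionInfo tuple, the field values, the "build_note" key, and the helper function are all hypothetical and exist only to illustrate the separation.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical stand-in for the sys.version_info format.
VersionInfo = namedtuple(
    "VersionInfo", "major minor micro releaselevel serial")

# Standard fields live directly on the namespace; anything specific to
# one implementation goes into the metadata dict.
implementation = SimpleNamespace(
    name="CPython",
    version=VersionInfo(3, 3, 0, "alpha", 0),      # illustrative values
    cache_tag="cpython33",
    metadata={"build_note": "illustrative only"},  # hypothetical key
)


def bytecode_caching_enabled(impl):
    """Per the proposal, cache_tag set to None disables bytecode caching."""
    return impl.cache_tag is not None


print(implementation.name, implementation.version.major)
print(bytecode_caching_enabled(implementation))
print(bytecode_caching_enabled(SimpleNamespace(cache_tag=None)))
```

Code that only needs the standard fields never has to look inside metadata, so an implementation can add extras there without changing the documented attribute set.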

On Tue, May 1, 2012 at 8:37 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I really like this approach, particularly the separation aspect. Presumably sys.implementation would be more struct-like (static-ish, dotted-access namespace). I'll give it a day or two to stew and if it still seems like a good idea I'll weave it into the PEP. One question though: having it be iterable (a la structseq or namedtuple) doesn't seem to be a good fit, but does it matter? Likewise with mutability. Thoughts? -eric

On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
I am still unclear what justification there is for having a separate sys.version (from PEP 421: "the version of the Python language") and sys.implementation.version ("the version of the Python implementation"). Under what circumstances will one change but not the other? -- Steven

On 05/02/2012 09:49 PM, Steven D'Aprano wrote:
I know at least PyPy has separate "PyPy version" and "Python language compatibility version" numbers. They might choose to do a release that increments the PyPy version (because they've made improvements to the JIT or any number of other implementation-quality issues) but doesn't change the bundled stdlib version or language-compatibility version at all. Seems pretty reasonable to me. Carl

On Thu, May 3, 2012 at 1:49 PM, Steven D'Aprano <steve@pearwood.info> wrote:
The PyPy example is the real motivator. It allows "sys.version" to declare what version of Python the implementation intends to implement, while sys.implementation.version may be completely different. For example, a new implementation might declare sys.version_info as (3, 3, etc...) to indicate they're aiming at 3.3 compatibility, while setting sys.implementation.version to (0, 1, etc...) to reflect its actual immaturity as an implementation. Implementations are of course free to set the two numbers in lock step, and CPython, IronPython and Jython will likely continue to do exactly that. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
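As a small illustration of the two numbers diverging, the sketch below models a brand-new implementation that targets 3.3 compatibility while being at release 0.1 itself. Nothing here reads the real sys module; the VersionInfo tuple, both version values, and the two helper functions are assumptions made up for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for the sys.version_info format.
VersionInfo = namedtuple(
    "VersionInfo", "major minor micro releaselevel serial")

# Plays the role of sys.version_info: the language level being targeted.
language_version = VersionInfo(3, 3, 0, "final", 0)

# Plays the role of sys.implementation.version: the implementation's own
# release number, which is free to differ completely.
implementation_version = VersionInfo(0, 1, 0, "alpha", 0)


def supports_language(required, declared=language_version):
    """Feature checks should compare against the language version."""
    return tuple(declared[:2]) >= tuple(required)


def is_mature(impl_version=implementation_version):
    """Maturity checks should look at the implementation version."""
    return impl_version.major >= 1


print(supports_language((3, 3)))
print(supports_language((3, 4)))
print(is_mature())
```

An implementation that keeps the two in lock step would simply assign the same tuple to both names, and both checks would still behave sensibly.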

Some corrections to the PEP text:

platform.python_implementation()
--------------------------------

The following text in the PEP needs to be updated:

"""
The platform module guesses the python implementation by looking for clues in a couple different sys variables [3]. However, this approach is fragile.
"""

The fact is that sys.version parsing is documented to be done by the platform module (see the docs on sys.version), so implementations are free to provide patches in case they choose different ways of formatting sys.version.

A sys.implementation record would make things easier for the platform module, though, so it's an improvement.

sys.version
-----------

sys.version is defined as "A string containing the version number of the Python interpreter plus additional information on the build number and compiler used. This string is displayed when the interactive interpreter is started. Do not extract version information out of it, rather, use version_info and the functions provided by the platform module."

It's not defined as "version of the Python language" as the PEP appears to indicate.

Other things:

Making sys.implementation a dictionary
--------------------------------------

This is not a good idea, since it allows for monkey-patching the values and will also result in new undocumented or per-implementation keys.

Better use a namedtuple like we do for all other such informational resources.

sys.implementation information
------------------------------

While I'm not sure whether details such as VCS URLs and revision ids should really be part of a data structure that is supposed to identify the implementation (sys.version is better for that), if you do want to add such information, then please add all of it, not just part of the available build information.

See platform._sys_version(), which returns (name, version, branch, revision, buildno, builddate, compiler).

--
Marc-Andre Lemburg
eGenix.com Professional Python Services directly from the Source (#1, May 03 2012)
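The monkey-patching concern raised above can be shown in a few lines. This is only a sketch: the ImplInfo type, its two fields, and the try_set helper are hypothetical, and the values are the sample ones used elsewhere in the thread.

```python
from collections import namedtuple

# Dict version: any caller can overwrite values or add undocumented keys.
as_dict = {"name": "CPython", "cache_tag": "cpython33"}
as_dict["name"] = "Patched"
as_dict["undocumented"] = True

# Named tuple version (ImplInfo is a hypothetical name).
ImplInfo = namedtuple("ImplInfo", "name cache_tag")
as_tuple = ImplInfo("CPython", "cpython33")


def try_set(obj, attr, value):
    """Return True if the attribute assignment succeeded."""
    try:
        setattr(obj, attr, value)
    except AttributeError:
        return False
    return True


print(as_dict)                                  # both changes went through
print(try_set(as_tuple, "name", "Patched"))     # existing field is read-only
print(try_set(as_tuple, "undocumented", True))  # new fields are rejected
```

The named tuple blocks both kinds of change, at the cost of also being a sequence with a fixed length and field order.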

On Thu, May 3, 2012 at 2:20 AM, M.-A. Lemburg <mal@egenix.com> wrote:
Yeah, I'll update that to be softer and more clear.
This is an excellent point. sys.(version|version_info|hexversion) reflect CPython specifics, rather than the language itself. As far as I know the language does not have a "micro" version, nor a release level or serial. So where does that leave us? Undoubtedly no small number of people already depend on the sys variables for CPython release info, so we can't just change the semantics. I'll clarify the PEP and add this to the open issues list because the PEP definitely needs to be clear here. Any suggestions on this point would be great.
Nick Coghlan made a good suggestion on this front that I'm likely going to adopt: sys.implementation as an object (namespace with dotted access) with required attributes. One required attribute would be 'metadata', a dict where optional/per-implementation values could go.

Having it be immutable (to make monkey-patching hard) didn't seem like it mattered, though I'm not opposed. I just don't see that as a convincing reason for it to be a named tuple (structseq, etc.).

To be honest, I'd like to avoid making sys.implementation any kind of sequence. It has no meaning as a sequence (hence why the PEP shifted from named tuple to dict). Unlike other informational sources, we expect that the namespace of required attributes will grow over time. As such, people shouldn't rely on a fixed number of attributes, which a named tuple would imply. As well, I'm not convinced that the order of the attributes is significant, nor that sequence unpacking is useful here. So in order to send the right message on both points, I'd rather not make it a sequence.

It *could* be meaningful to implement the Mapping ABC, but I'm not going to specify that in the PEP without good reason. (I will add that as an open issue though.)

Unless there is a good reason to use a named tuple, as opposed to a regular object, let's not. However, I'm still quite open to hearing out arguments on this point.

-eric
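The worry about sequences can be illustrated with a short sketch in which a required attribute is added between two releases. ImplV1, ImplV2, the field values, and the unpack helper are all hypothetical names invented for this example.

```python
from collections import namedtuple
from types import SimpleNamespace

# Two hypothetical releases: the second one adds a required field.
ImplV1 = namedtuple("ImplV1", "name version")
ImplV2 = namedtuple("ImplV2", "name version cache_tag")


def unpack(impl):
    """Caller code that relies on the record being a two-item sequence."""
    name, version = impl
    return name, version


print(unpack(ImplV1("CPython", (3, 3))))

try:
    unpack(ImplV2("CPython", (3, 3), "cpython33"))
except ValueError as exc:
    # Sequence unpacking breaks as soon as a field is added.
    print("unpacking failed:", exc)

# With a plain namespace, callers can only use attribute access, so adding
# a field cannot break them.
ns = SimpleNamespace(name="CPython", version=(3, 3), cache_tag="cpython33")
print(ns.name, ns.version)
```

Attribute access by name works identically on both the named tuples and the namespace; only code that depends on length or order is affected.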
participants (8)
- Barry Warsaw
- Carl Meyer
- Chris Rebert
- Eric Snow
- M.-A. Lemburg
- Nick Coghlan
- Steven D'Aprano
- Victor Stinner