[Python-Dev] PEP 514: Python registration in the Windows registry
Paul Moore
p.f.moore at gmail.com
Sat Jul 16 15:54:02 EDT 2016
On 15 July 2016 at 23:20, Steve Dower <steve.dower at python.org> wrote:
> Hi all
>
> I'd like to get this PEP approved (status changed to Active, IIUC).
Some comments below.
> So far (to my knowledge), Anaconda is writing out the new metadata and
> Visual Studio is reading it. Any changes to the schema now will require
> somewhat public review anyway, so I don't see any harm in approving the PEP
> right now.
>
> To reiterate, this doesn't require changing anything about CPython at all
> and has no backwards compatibility impact on official releases (but
> hopefully it will stop alternative distros from overwriting our essential
> metadata and causing problems).
Certainly there's nothing that impacts existing releases. I've noted
an issue around sys.winver below that, as an absolute minimum, needs a
clarification in the 3.6 docs (the documented behaviour of sys.winver
isn't explicit enough to provide the uniqueness guarantees this PEP
needs). It may in fact need a code change or a PEP change if
sys.winver doesn't actually distinguish between 32-bit and 64-bit
builds (I've not been able to confirm that either way, unfortunately).
[...]
> Motivation
> ==========
>
> When installed on Windows, the official Python installer creates a registry
> key for discovery and detection by other applications. This allows tools such
> as installers or IDEs to automatically detect and display a user's Python
> installations.
The PEP seems quite strongly focused on GUI tools, where the normal
mode of operation would be to present the user with a list of
"available installations" (with extra data where possible, not just a
bare list of names) and ask for a selection. I'd like to see console
tools considered as well.
Basically, I'd like to avoid tool developers reading this section and
thinking "it only applies to GUI tools or OS integration, not to me".
For example, virtualenv introspects the available Python installations
- see https://github.com/pypa/virtualenv/blob/master/virtualenv.py#L86
- to support the "-p <interpreter>" flag. To handle this well, it
would be useful to allow distributions to register a "short tag", so
that as well as "-p 3.5" or "-p 2", virtualenv could support (say) "-p
conda3.4" or "-p pypy2". (The short tag should be at the Company
level, so "conda" or "pypy", and the version gets added to that).
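
A minimal sketch of how a console tool might split such a "-p" argument
into a short tag and a version. The short tag is only the idea proposed
above, not something the PEP defines; the function name, the regular
expression and the "python" default are all made up for illustration.

```python
import re

# Hypothetical grammar: optional alphabetic short tag, then x or x.y.
_SPEC = re.compile(r"^(?P<short>[A-Za-z]+)?(?P<version>\d+(?:\.\d+)?)$")


def parse_interpreter_spec(spec, default_short="python"):
    """Split a '-p' argument such as 'conda3.4' into (short_tag, version)."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError("unrecognised interpreter spec: %r" % (spec,))
    # A bare version ("3.5", "2") falls back to the assumed default tag.
    short = (match.group("short") or default_short).lower()
    return short, match.group("version")


if __name__ == "__main__":
    for spec in ("3.5", "2", "conda3.4", "pypy2"):
        print(spec, "->", parse_interpreter_spec(spec))
```

A tool would then look up the short tag at the Company level and match
the version against the registered environments.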
Another place where this might be useful is the py.exe launcher (it's
not in scope for this PEP, but having the data needed to allow the
launcher to invoke any available installation could be useful for
future enhancements).
Another key motivation for me would be to define clearly what
information tools can rely on being able to get from the available
registry entries describing what's installed. Whenever I've needed to
scan the registry, the things I've needed to find out are where I find
the Python interpreter, what Python version it is, and whether it's
32-bit or 64-bit. The first so that I can run Python, and the latter
two so that I can tell if this is a version I support *without*
needing to run the interpreter. For me, everything else in this PEP is
about UI, but those 3 items plus the "short tag" idea are more about
what capabilities I can provide.
[...]
> On 64-bit Windows, ``HKEY_LOCAL_MACHINE\Software\Wow6432Node`` is a special
> key that 32-bit processes transparently read and write to rather than
> accessing the ``Software`` key directly.
It might be worth being more explicit here that 32-bit and 64-bit
processes see the registry keys slightly differently. More on this
below.
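
To make the difference concrete, here is a deliberately simplified model
of the redirection the PEP text describes. It assumes that only keys
under HKEY_LOCAL_MACHINE\Software are redirected and that
HKEY_CURRENT_USER is seen identically by both kinds of process; real
Windows has further exceptions that this sketch ignores. The function
name is made up.

```python
def effective_key(path, process_bits, windows_bits=64):
    """Return the key a process would really touch for a logical path.

    Simplified model: only the HKLM Software key is redirected, and only
    for 32-bit processes running on 64-bit Windows.
    """
    prefix = "HKEY_LOCAL_MACHINE\\Software\\"
    redirected = prefix + "Wow6432Node\\"
    if windows_bits == 64 and process_bits == 32:
        lowered = path.lower()  # registry key names are case-insensitive
        if (lowered.startswith(prefix.lower())
                and not lowered.startswith(redirected.lower())):
            return redirected + path[len(prefix):]
    return path


if __name__ == "__main__":
    key = "HKEY_LOCAL_MACHINE\\Software\\Python\\PythonCore\\3.5"
    print(effective_key(key, process_bits=32))
    print(effective_key(key, process_bits=64))
```

On Windows itself, a tool does not need to build these paths by hand:
the winreg module's KEY_WOW64_64KEY and KEY_WOW64_32KEY access flags
select the 64-bit or 32-bit view explicitly when opening a key.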
> Backwards Compatibility
> -----------------------
>
> Python 3.4 and earlier did not distinguish between 32-bit and 64-bit builds
> in ``sys.winver``. As a result, it is possible to have valid side-by-side
> installations of both 32-bit and 64-bit interpreters.
(As Nick pointed out, "it is not possible to have valid...". I'd also
add "under the rules described above").
Also, Python 3.5 doesn't appear to include the architecture in
sys.winver either.
>py
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900
64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.winver
'3.5'
(Unless it adds -32 for 32-bit, and reserves the bare version for
64-bit. I've skimmed the CPython source but can't confirm that). The
documentation of sys.winver makes no mention of whether it
distinguishes 32- and 64-bit builds. In fact, it states "The value is
normally the first three characters of version". If we're relying on
sys.winver being unique by version/architecture, the docs need to say
so (so that future changes don't accidentally violate that).
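
For anyone who wants to check their own builds, a small sketch that
reports sys.winver next to the interpreter's actual pointer size.
Running it under both a 32-bit and a 64-bit install of the same version
would show whether the two values differ. The function name is made up;
sys.winver only exists on Windows, hence the getattr fallback.

```python
import struct
import sys


def describe_interpreter():
    """Report what the running interpreter says about itself."""
    return {
        # sys.winver is only defined on Windows builds.
        "winver": getattr(sys, "winver", None),
        "version": "%d.%d" % sys.version_info[:2],
        # Pointer size in bits, independent of whatever winver contains.
        "bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    print(describe_interpreter())
```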
> To ensure backwards compatibility, applications should treat environments
> listed under the following two registry keys as distinct, even when the Tag
> matches::
>
> HKEY_LOCAL_MACHINE\Software\Python\PythonCore\<Tag>
> HKEY_LOCAL_MACHINE\Software\Wow6432Node\Python\PythonCore\<Tag>
>
> Environments listed under ``HKEY_CURRENT_USER`` may be treated as distinct
> from both of the above keys, potentially resulting in three environments
> discovered using the same Tag. Alternatively, a tool may determine whether
> the per-user environment is 64-bit or 32-bit and give it priority over the
> per-machine environment, resulting in a maximum of two discovered
> environments.
>
> It is not possible to detect side-by-side installations of both 64-bit and
> 32-bit versions of Python prior to 3.5 when they have been installed for the
> current user. Python 3.5 and later always uses different Tags for 64-bit and
> 32-bit versions.
From what I can see, this last claim isn't true. I presume that 64-bit
uses no suffix, but 32-bit uses a "-32" suffix? This should probably
be made explicit. At a minimum, if I were writing a tool to list all
installed Python versions, with only what I have available to go on
(the PEP and a 64-bit Python 3.5) I wouldn't be able to write correct
code, as I don't have all the information I need.
Also, if we expect to be able to distinguish 32 and 64 bit
implementations in this way, that's putting a new restriction on
sys.winver, that it returns a different value for 32-bit and 64-bit
builds. If that's the case, I'd rather see that explicitly documented,
both here and in the sys.winver documentation.
I'd actually prefer a more explicit mechanism going forward, but as
this is a "backward compatibility" section I'll save that for later.
> Environments registered under other Company names must use distinct Tags to
> support side-by-side installations. Tools consuming these registrations are
> not required to disambiguate tags other than by preferring the user's
> setting.
Clarification needed here? "Environments registered under other
Company names have no backward compatibility requirements, and thus
each distinct environment must use a distinct Tag, to support
side-by-side installations."
> Company
> -------
>
> The Company part of the key is intended to group related environments and to
> ensure that Tags are namespaced appropriately. The key name should be
> alphanumeric without spaces and likely to be unique. For example, a
> trademarked name, a UUID, or a hostname would be appropriate::
>
> HKEY_CURRENT_USER\Software\Python\ExampleCorp
> HKEY_CURRENT_USER\Software\Python\6C465E66-5A8C-4942-9E6A-D29159480C60
> HKEY_CURRENT_USER\Software\Python\www.example.com
I'd suggest adding "Human-readable Company values are preferred".
UUIDs seem like a horrible idea in practice.
> If a string value named ``DisplayName`` exists, it should be used to identify
> the environment category to users. Otherwise, the name of the key should be
> used.
>
> If a string value named ``SupportUrl`` exists, it may be displayed or
> otherwise used to direct users to a web site related to the environment.
The next few sections are talking about what data gets included in the
registry. Much of this is optional, which is perfectly OK, but there
are some defaulting rules here as well. I think we should clearly note
those data items that tools which read the data can rely on having
available. For example, the "DisplayName" can always be obtained,
either directly or from the Company key. But the support URL may or
may not exist. This is important IMO, as it provides a guide for tool
writers over what details they are entitled to assume they know about
a distribution. This becomes more important later, when the technical
information starts appearing.
It's also worth noting that "DisplayName" isn't actually as useful as
it sounds, in practice. A tool that relies on it would report the
python.org installers as being provided by "PythonCore", which isn't
particularly user friendly. Maybe we need something in the "Backward
Compatibility" section going into a bit more detail as to how tools
should deal with that, and maybe we need to add a "DisplayName" in
3.6+.
> The Tag part of the key is intended to uniquely identify an environment
> within those provided by a single company. The key name should be
> alphanumeric without spaces and stable across installations. For example, the
> Python language version, a UUID or a partial/complete hash would be
> appropriate; an integer counter that increases for each new environment may
> not::
>
> HKEY_CURRENT_USER\Software\Python\ExampleCorp\3.6
> HKEY_CURRENT_USER\Software\Python\ExampleCorp\6C465E66
Again, I'd add a recommendation that human readable Tag values be used
whenever possible.
> If a string value named ``DisplayName`` exists, it should be used to
> identify the environment to users. Otherwise, the name of the key should be used.
To an extent there's the same comment here as for DisplayName for
Company - it needs to be defined with consideration for how it will be
used. This is, of course, more of a "quality of implementation" matter
than a standards one. But the PEP might benefit from an example of
use, maybe showing the output from a hypothetical command line tool
that lists all installations on the machine.
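
As a rough illustration of such a tool, the sketch below applies the
DisplayName fallback rule to made-up sample data rather than a real
registry. The function names, the tuple layout and the ExampleCorp
values are all hypothetical.

```python
def display_name(key_name, values):
    """Use DisplayName when present, otherwise fall back to the key name."""
    return values.get("DisplayName") or key_name


def format_listing(environments):
    """environments: iterable of (company, company_values, tag, tag_values)."""
    lines = []
    for company, company_values, tag, tag_values in environments:
        lines.append("%-20s %s" % (display_name(company, company_values),
                                   display_name(tag, tag_values)))
    return "\n".join(lines)


# Made-up sample data, not read from any real registry.
SAMPLE = [
    ("PythonCore", {}, "3.5", {}),
    ("ExampleCorp", {"DisplayName": "Example Corp"}, "3.6",
     {"DisplayName": "Example Py 3.6 (64-bit)"}),
]

if __name__ == "__main__":
    print(format_listing(SAMPLE))
```

With no DisplayName registered, the first entry is listed under the raw
key names "PythonCore" and "3.5", which is the user-friendliness
problem noted earlier.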
> If a string value named ``Version`` exists, it should be used to identify the
> version of the environment. This is independent from the version of Python
> implemented by the environment.
>
> If a string value named ``SysVersion`` exists, it must be in ``x.y`` or
> ``x.y.z`` format matching the version returned by ``sys.version_info`` in the
> interpreter. Otherwise, if the Tag matches this format it is used. If not,
> the Python version is unknown.
I'm not too happy with this. What's the benefit of allowing an
installation to *not* provide the Python version? Instead, I'd prefer
to say:
1. All installations must provide the Python version. They are free to
use x.y or x.y.z format (i.e., the micro version is optional -
although again what's the benefit? Why not mandate x.y for
consistency?). The rule given in SysVersion is fine (without the final
sentence).
2. If CPython *does*, as I'm assuming, use 3.5-32, then that's an
issue, because CPython doesn't follow the PEP. Maybe we should allow
the Tag to be version-architecture.
3. Following on from (2) we should include a string value
SysArchitecture for the architecture (32 or 64) as well. Again, this
should always be available from the value or the Tag.
The reason I think that the interpreter version and architecture
should be mandatory is because otherwise a tool that (for example)
only supports Python 3.4 or greater, or only 64-bit, has no way to
exclude unsupported installations.
So in summary:

    SysVersion = x.y
    SysArchitecture = 32 or 64

If SysArchitecture is missing, Tag must end in -32 or -64, and the
part after the "-" is the architecture.

If SysVersion is missing, Tag must be x.y or x.y-NN and the version is x.y.

For backward compatibility, if Company is "PythonCore",
SysArchitecture is missing, and Tag doesn't end in -NN, then
SysArchitecture is 32 if the registry key is under Wow6432Node.
Otherwise, it's 64 if we're a 64-bit process and 32 if we're a 32-bit
process. This final heuristic could be wrong, though, and code that
cannot cope with getting the wrong value (for example, it's planning
on loading the Python DLL into its address space) MUST take other
measures to check, or ignore any ambiguous entries.
(I'm open to the above being corrected - I didn't check any references
when writing it down).
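
The rules summarised above can be sketched as a single function. This
follows the proposal in this message, not the PEP as written:
SysArchitecture is not a value the PEP defines. The function name and
parameters are made up, and flagging only the process-based fallback as
a guess is one reading of "this final heuristic".

```python
import re

_TAG_VERSION = re.compile(r"^(\d+\.\d+)(?:-\d+)?$")  # x.y or x.y-NN
_TAG_ARCH = re.compile(r"-(32|64)$")                 # trailing -32 / -64


def resolve_version_and_arch(company, tag, sys_version=None,
                             sys_architecture=None,
                             under_wow6432node=False, process_bits=64):
    """Return (version, architecture, guessed).

    version or architecture is None when it cannot be determined;
    guessed is True when the architecture came from the process-bitness
    fallback and therefore may be wrong.
    """
    version = sys_version
    if version is None:
        match = _TAG_VERSION.match(tag)
        version = match.group(1) if match else None

    guessed = False
    arch_match = _TAG_ARCH.search(tag)
    if sys_architecture is not None:
        arch = int(sys_architecture)
    elif arch_match:
        arch = int(arch_match.group(1))
    elif company == "PythonCore":
        # Backward compatibility rule for python.org installers.
        if under_wow6432node:
            arch = 32
        else:
            arch = process_bits
            guessed = True
    else:
        arch = None
    return version, arch, guessed


if __name__ == "__main__":
    print(resolve_version_and_arch("PythonCore", "3.5-32"))
    print(resolve_version_and_arch("PythonCore", "3.4"))
```

A tool that only supports, say, 64-bit Python 3.4 or later could then
skip any entry whose version or architecture is None, and treat entries
with guessed set to True as ambiguous.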
BTW, is there any reason why the python.org installers couldn't be
modified to provide *all* the information suggested in this PEP,
rather than just sticking with what we've traditionally provided? It
would be a good example of how to register yourself "properly", as
well as avoiding the sort of ambiguity we see above.
> Note that each of these values is recommended, but optional.
SysVersion and SysArchitecture (or a Tag that works as a fallback)
should be mandatory. Otherwise I'm OK with this statement.
> Beneath the environment key, an ``InstallPath`` key must be created. This key
> is always named ``InstallPath``, and the default value must match
> ``sys.prefix``::
>
> HKEY_CURRENT_USER\Software\Python\ExampleCorp\3.6\InstallPath
> (Default) = "C:\ExampleCorpPy36"
>
> If a string value named ``ExecutablePath`` exists, it must be a path to the
> ``python.exe`` (or equivalent) executable. Otherwise, the interpreter
> executable is assumed to be called ``python.exe`` and exist in the directory
> referenced by the default value.
>
> If a string value named ``WindowedExecutablePath`` exists, it must be a path
> to the ``pythonw.exe`` (or equivalent) executable. Otherwise, the windowed
> interpreter executable is assumed to be called ``pythonw.exe`` and exist in
> the directory referenced by the default value.
These two items assume implicitly that a Python installation must
provide python.exe and pythonw.exe. I'm inclined to make this
explicit. Specifically, I think it's crucial that tools can read the
(console or windowed) executable path as described here, and run that
executable with standard Python command line arguments, and expect it
to work. Otherwise there's little point in the installation
registering its existence.
I can see an argument for a distribution providing just python.exe and
omitting pythonw.exe (or even the other way around). But I can't see
how I could write generic code to work with such a distribution. So
let's disallow that possibility until someone comes up with a concrete
use case.
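
For reference, the fallback rule quoted above is small enough to sketch
directly. The function name and parameters are made up; ntpath is used
so that the Windows-style joins behave the same on any platform.

```python
import ntpath


def resolve_executables(install_path, executable_path=None,
                        windowed_executable_path=None):
    """Return (console_exe, windowed_exe) using the PEP's fallback rule.

    install_path is the default value of the InstallPath key; the other
    two arguments are the optional ExecutablePath and
    WindowedExecutablePath values, or None when they are absent.
    """
    console = executable_path or ntpath.join(install_path, "python.exe")
    windowed = (windowed_executable_path
                or ntpath.join(install_path, "pythonw.exe"))
    return console, windowed


if __name__ == "__main__":
    print(resolve_executables(r"C:\ExampleCorpPy36"))
```

Note that this only computes where the executables should be. Whether
both actually exist and accept standard Python command line arguments
is exactly the guarantee being asked for above.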
[...]
> Other Keys
> ----------
>
> Some other registry keys are used for defining or inferring search paths
> under certain conditions. A third-party installation is permitted to define
> these keys under their Company-Tag key, however, the interpreter must be
> modified and rebuilt in order to read these values. Alternatively, the
> interpreter may be modified to not use any registry keys for determining
> search paths. Making such changes is a decision for the third party; this PEP
> makes no recommendation either way.
I think we need to be clearer here. First of all, it should probably
clearly state that any subkey of the <Company>\<Tag> key (and any
value of that key, I guess), unless explicitly documented in this PEP,
is free for any use by the vendor (although this may make later
expansion of this PEP hard - do we want to worry about that?). We
should also note that PythonCore has a number of such "private" keys,
and tools should not assume any particular meaning for them. Secondly,
I think we should be more explicit about the search path issue. Maybe
something like the following (this is based on my memory of the issue,
so apologies for any inaccuracy):
"""
The Python core has traditionally used certain other keys under the
PythonCore\<Tag> key to set interpreter paths and similar. This usage
is considered historical, and is retained mainly for backward
compatibility[1]. Third party installations are permitted to use a
similar approach under their own <Company>\<Tag> namespace, but the
interpreter must be modified and rebuilt in order to read these
values. Alternatively, the interpreter may be modified to not use any
registry keys (not even the PythonCore ones) for determining search
paths. Making such changes is a decision for the third party; this PEP
makes no recommendation either way. It should be noted, however, that
without modification, the Python interpreter's behaviour will be based
on the values under the PythonCore namespace, not under the vendor's
namespace.
"""
[1] Is this sentence true? IIRC, nothing new is using that feature,
and older stuff that did, such as pywin32, is removing it. But I know
of no actual plans to rip it out at any point.
Paul