distutils.util.get_platform() for Windows
Many of the distutils "commands" use distutils.util.get_platform() as the basis for file and directory names used to package up extensions. On Windows, this returns the value of sys.platform. On all (desktop) Windows versions, this currently returns 'win32'.

This causes a problem when trying to create a 64bit version of an extension. For example, using bdist_msi, the pywin32 extensions end up with a filename of 'pywin32-211.win32-py2.5.msi' for both 32bit and 64bit versions. This is not desirable for (hopefully) obvious reasons.

I'd like to propose that an (untested and against 2.5) patch similar to the following be adopted in distutils:

Index: util.py
===================================================================
--- util.py (revision 56286)
+++ util.py (working copy)
@@ -29,8 +29,19 @@
        irix-5.3
        irix64-6.2
 
-    For non-POSIX platforms, currently just returns 'sys.platform'.
+    For Windows, the result will be one of 'win32', 'amd64' or 'itanium'
+
+    For other non-POSIX platforms, currently just returns 'sys.platform'.
     """
+    if os.name == 'nt':
+        # copied from msvccompiler - find the processor architecture
+        prefix = " bit ("
+        i = string.find(sys.version, prefix)
+        if i == -1:
+            return sys.platform
+        j = string.find(sys.version, ")", i)
+        return sys.version[i+len(prefix):j].lower()
+
     if os.name != "posix" or not hasattr(os, 'uname'):
         # XXX what about the architecture? NT is Intel or Alpha,
         # Mac OS is M68k or PPC, etc.

This will result in both the final version of most bdist_* installations having the architecture in the filename. It also has the nice side effect of having the temp directories used by these commands include the architecture in their names, meaning it's possible to build multiple Windows architectures from the same build tree, although that is not the primary motivation.
Also note that bdist_msi has 'win32' hard-coded in one place, where a call to get_platform() would be more appropriate, but I'm assuming that is a bug (ie, bdist_msi should use get_platform() regardless of the outcome of this discussion about what get_platform() should return)

Note that this issue is quite different than, but ultimately impacted by, the cross-compiling issue. It's quite different as even when building x64 natively on x64, 'win32' is used in the generated filename and this patch fixes that. It is impacted by cross-compiling, as it assumes the host environment is the target environment - but so does the rest of that function on all platforms.

Any objections?

Cheers, Mark
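The lookup in the proposed patch can be tried outside distutils. What follows is only a minimal sketch in present-day Python, not the patch itself: the function name arch_from_version and the two sample banner strings are assumptions made for illustration, written to resemble what MSVC builds of Python put in sys.version. If 32bit builds really do end their banner with "32 bit (Intel)", as assumed here, the logic as posted would yield 'intel' rather than the 'win32' the new docstring promises, which seems worth checking before any patch is finalised.

```python
def arch_from_version(version, fallback="win32"):
    """Return the lower-cased token that follows ' bit (' in a version banner.

    Mirrors the lookup in the proposed patch; 'fallback' stands in for
    sys.platform when the marker is not present.
    """
    prefix = " bit ("
    i = version.find(prefix)
    if i == -1:
        return fallback
    j = version.find(")", i)
    return version[i + len(prefix):j].lower()


# Assumed sample banners (illustrative only, not captured from real builds).
SAMPLE_32 = "2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]"
SAMPLE_64 = "2.5.1 (r251:54863, Apr 18 2007, 09:04:46) [MSC v.1400 64 bit (AMD64)]"

if __name__ == "__main__":
    for banner in (SAMPLE_32, SAMPLE_64, "no compiler banner"):
        print(arch_from_version(banner))
```

Passing the banner in as an argument, rather than reading sys.version directly, keeps the sketch testable on any platform.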
At 04:00 PM 7/18/2007 +1000, Mark Hammond wrote:
This will result in both the final version of most bdist_* installations having the architecture in the filename. It also has the nice side effect of having the temp directories used by these commands include the architecture in their names, meaning it's possible to build multiple Windows architectures from the same build tree, although that is not the primary motivation.
I presume the intention of this is to have it end up as either 'win32' or 'win64', yes?
Also note that bdist_msi has 'win32' hard-coded in one place, where a call to get_platform() would be more appropriate, but I'm assuming that is a bug (ie, bdist_msi should use get_platform() regardless of the outcome of this discussion about what get_platform() should return)
Well, if it becomes possible to build msi's on other platforms, they're still going to target Windows. Currently you can build a bdist_wininst on Linux, for example, especially if it's only pure Python contents.
At 04:00 PM 7/18/2007 +1000, Mark Hammond wrote:
This will result in both the final version of most bdist_* installations having the architecture in the filename. It also has the nice side effect of having the temp directories used by these commands include the architecture in their names, meaning it's possible to build multiple Windows architectures from the same build tree, although that is not the primary motivation.
I presume the intention of this is to have it end up as either 'win32' or 'win64', yes?
Probably 'win32', 'amd64' or 'itanium' - I'm not worried about the specific strings, but there would need to be different ones for each of the 64bit architectures.
Also note that bdist_msi has 'win32' hard-coded in one place, where a call to get_platform() would be more appropriate, but I'm assuming that is a bug (ie, bdist_msi should use get_platform() regardless of the outcome of this discussion about what get_platform() should return)
Well, if it becomes possible to build msi's on other platforms, they're still going to target Windows. Currently you can build a bdist_wininst on Linux, for example, especially if it's only pure Python contents.
If the feature you refer to was capable of making packages with extension modules (ie, a kind of cross-compile environment), it seems to me that the same problem would exist - regardless of the architecture, the generated filename would always be identical. There would be no way to identify the architecture from just the name of the file.

On the other hand, if this ability to create an MSI on Linux will forever be limited to *only* pure-python packages, I'd think the needs of people who need to package multiple architectures on Windows trump this feature.

It's not clear from your reply, but do you believe that all architectures having identical filenames is a problem? If so, how do you think we should approach it?

Cheers, Mark
At 09:13 AM 7/19/2007 +1000, Mark Hammond wrote:
It's not clear from your reply, but do you believe that all architectures having identical filenames is a problem? If so, how do you think we should approach it?
Mostly, I'm just interested in understanding how to update setuptools' platform API functions: http://peak.telecommunity.com/DevCenter/PkgResources#platform-utilities In particular, the difference here between the supported platform vs. build platform, and how to tell whether two platform strings are compatible. Setuptools uses these functions to know what eggs will work on the current system, as well as what filenames to build eggs with.
At 09:13 AM 7/19/2007 +1000, Mark Hammond wrote:
It's not clear from your reply, but do you believe that all architectures having identical filenames is a problem? If so, how do you think we should approach it?
I'm still not sure what the answer to the question is though :)
Mostly, I'm just interested in understanding how to update setuptools' platform API functions:
http://peak.telecommunity.com/DevCenter/PkgResources#platform-utilities
That sounds worthwhile - but I'm unsure why we wouldn't simply update distutils in similar ways, and then have setuptools borrow that implementation, especially if the requirements are similar - which they seem to be. Does setuptools have unique requirements in this regard, or is there some other reason I'm missing why we can't kill multiple birds with a single stone?

I assume that 'distutils' is still the 'officially preferred' way of building extensions? Or maybe distutils used directly really is considered dead, so I'm wasting my time even discussing changes to distutils itself?

I'm really just trying to make it simple for the next person trying to build for 64bit Windows platforms (I've got a build - it just uses lots of hacks that may not be obvious to others), but I'm no longer sure what we expect this next person to be using when they build their extensions...

Mark
At 10:06 AM 7/19/2007 +1000, Mark Hammond wrote:
At 09:13 AM 7/19/2007 +1000, Mark Hammond wrote:
It's not clear from your reply, but do you believe that all architectures having identical filenames is a problem? If so, how do you think we should approach it?
I'm still not sure what the answer to the question is though :)
I just want to make sure that I understand your proposal well enough to ensure that 1) it won't cause trouble for eggs and 2) I can implement the necessary changes, if any, to setuptools.
Mostly, I'm just interested in understanding how to update setuptools' platform API functions:
http://peak.telecommunity.com/DevCenter/PkgResources#platform-utilities
That sounds worthwhile - but I'm unsure why we wouldn't simply update distutils in similar ways,
Because distutils' get_platform() isn't really clear about the distinctions that setuptools makes explicit, about version compatibility. In a sense, distutils' get_platform assumes that no platform can run any other platform's code, even though this isn't so for at least Mac OS.

For example, if a 'win32' package can be used on 'win64', setuptools needs to know this -- but distutils has no clue and doesn't care, because distutils doesn't *do* anything with platform information except generate filenames (including build directory file names).

Notice that setuptools has *two* different get_*_platform APIs, because of this distinction, plus a "compatible_platforms" function; distutils doesn't have any of that.

So if we move away from win32 as a platform designator, I need to understand *precisely* what values will replace it, so that compatible_platforms will be able to support them correctly.

In general, platform strings are an underspecified area of the distutils, so if we change anything about them, I'd like to improve the specification. I'm not happy with adding more code whose behavior is implementation-defined, as it leaves me with no way to figure out what platforms are compatible with what.
Phillip writes:
Because distutils' get_platform() isn't really clear about the distinctions that setuptools makes explicit, about version compatibility. In a sense, distutils' get_platform assumes that no platform can run any other platform's code, even though this isn't so for at least Mac OS.
For example, if a 'win32' package can be used on 'win64', setuptools needs to know this -- but distutils has no clue and doesn't care, because distutils doesn't *do* anything with platform information except generate filenames (including build directory file names).
IIUC, this is a little tricky - in this specific example, it depends on what Python architecture is installed. On an x64 system, a 32bit Python and 32bit extensions can be installed - but an x64 extension can not. However, if an x64 version of Python itself is installed, 32bit extensions can not be. Note that both x64 and x32 builds of Python on Windows report 'win32' for sys.platform and 'nt' for os.name. So from this perspective, it seems no compatibility can be recorded between the architectures, as they can never be mixed.
Notice that setuptools has *two* different get_*_platform APIs, because of this distinction, plus a "compatible_platforms" function; distutils doesn't have any of that.
I understand that is how distutils works now. I'm questioning if that is the way distutils should keep working in the future. Is there some reason I'm missing why we can't *add* this functionality to distutils and have setuptools consume it? Or is the (de facto) intent that cross-compilation only be supported via setuptools? I've no objection to this being the case, but "explicit is better than implicit" <wink>
So if we move away from win32 as a platform designator, I need to understand *precisely* what values will replace it, so that compatible_platforms will be able to support them correctly.
Yes, part of this process would be deciding the precise strings that would be returned.
In general, platform strings are an underspecified area of the distutils, so if we change anything about them, I'd like to improve the specification. I'm not happy with adding more code whose behavior is implementation-defined, as it leaves me with no way to figure out what platforms are compatible with what.
I've no problem with that - but best I can tell, no compatibility could be recorded for this specific example, so this problem doesn't really depend on a better way of capturing compatibility between platforms. Cheers, Mark
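The point that a 32bit Python and a 64bit Python can never share extensions can be written down as a compatibility rule. This is a hypothetical sketch only: platforms_compatible and installable are names invented here, they are not the setuptools compatible_platforms implementation, and the strings 'win32' and 'amd64' are simply the candidates floated earlier in the thread. The treatment of None as "no platform restriction" (a pure-Python package) is likewise an assumption of the sketch.

```python
def platforms_compatible(provided, required):
    """Hypothetical rule: Windows platform strings match only when identical.

    None is assumed to mean "no platform restriction", e.g. a package
    that contains only pure-Python code.
    """
    if provided is None or required is None:
        return True
    return provided == required


def installable(candidates, interpreter_platform):
    """Return names of the (name, platform) pairs usable by the interpreter."""
    return [
        name
        for name, platform in candidates
        if platforms_compatible(platform, interpreter_platform)
    ]


if __name__ == "__main__":
    packages = [("ext-32", "win32"), ("ext-64", "amd64"), ("pure", None)]
    # A 64bit interpreter can use the 64bit build and the pure-Python package.
    print(installable(packages, "amd64"))
```

Under this rule nothing needs to be recorded about cross-architecture compatibility, because the compatibility set for each Windows platform string is just the string itself.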
Mark Hammond wrote:
At 04:00 PM 7/18/2007 +1000, Mark Hammond wrote:
This will result in both the final version of most bdist_* installations having the architecture in the filename. It also has the nice side effect of having the temp directories used by these commands include the architecture in their names, meaning it's possible to build multiple Windows architectures from the same build tree, although that is not the primary motivation.
I presume the intention of this is to have it end up as either 'win32' or 'win64', yes?
Probably 'win32', 'amd64' or 'itanium' - I'm not worried about the specific strings, but there would need to be different ones for each of the 64bit architectures.
Why would you use processor type IDs to indicate Windows-specific platforms? Lots of systems running on AMD64 boxen don't run windows (can't say "lots" and "Itanium" in the same sentence, I guess, but I know for a fact that OpenVMS is running on Itanium, at least).

Tres.
--
Tres Seaver, Palladion Software, http://palladion.com
Tres writes:
Mark Hammond wrote:
At 04:00 PM 7/18/2007 +1000, Mark Hammond wrote:
This will result in both the final version of most bdist_* installations having the architecture in the filename. It also has the nice side effect of having the temp directories used by these commands include the architecture in their names, meaning it's possible to build multiple Windows architectures from the same build tree, although that is not the primary motivation.
I presume the intention of this is to have it end up as either 'win32' or 'win64', yes?
Probably 'win32', 'amd64' or 'itanium' - I'm not worried about the specific strings, but there would need to be different ones for each of the 64bit architectures.
Why would you use processor type IDs to indicate Windows-specific platforms? Lots of systems running on AMD64 boxen don't run windows (can't say "lots" and "Itanium" in the same sentence, I guess, but I know for a fact that OpenVMS is running on Itanium, at least).
Yes, I agree with that. I'm not too worried about what the specific strings are, and I agree they should include the OS *and* the architecture (eg, 'win32', 'win-amdx64' or 'win-itanium' might be suitable, or maybe win64-amd/win64-itanium). However, I'm just trying to take things one step at a time - if we can agree that having the same string for all architectures is bad, we can then move forward into a "bike-shed" discussion of what the new strings should be :) I'm yet to hear anyone explicitly agree with me that the current situation needs changing though. Cheers, Mark
I wrote:
Many of the distutils "commands" use distutils.util.get_platform() as the basis for file and directory names used to package up extensions. On Windows, this returns the value of sys.platform. On all (desktop) Windows versions, this currently returns 'win32'.
This causes a problem when trying to create a 64bit version of an extension. For example, using bdist_msi, the pywin32 extensions end up with a filename of 'pywin32-211.win32-py2.5.msi' for both 32bit and 64bit versions. This is not desirable for (hopefully) obvious reasons.
I'd like to propose that an (untested and against 2.5) patch similar to the following be adopted in distutils:
Thanks to all for the comments so far. It seems that there are no real objections, nor any real barriers to making this change. While Phillip's notes regarding dependencies are well founded, I believe it's clear that no dependencies are possible between 32 and 64bit versions for Windows, so there are no broader architectural requirements that would block us moving ahead with this.

Tres rightly points out that the name of the OS should be embedded in the name of the package - so rather than including a new patch, I will simply propose the specific strings that should be returned.

Specifically, I propose that distutils.util.get_platform return the following:

* On 32bit Windows, continue to return 'win32'.
* On 64bit Windows, return a string of the format 'win64-{architecture}'

A good value for 'architecture' isn't clear. Either 'AMD' or 'Itanium' appeals at first glance, but I'm a little concerned that (say) a "casual" user with a new Intel Core Duo processor will not know they should use something labelled as "AMD" (eg, "I explicitly asked for an Intel chip, not an AMD one"). An alternative could be 'x64' or 'i64', but I'm not sure that casual user would be any better off, and with only a single letter distinguishing them, there is scope for confusion. On my final (mutant) hand, it seems that Itanium will be a historical footnote and demand for Itanium 64bit packages will be tiny (I've had a reasonable number of requests for x64 versions of pywin32, but zero for i64), so I doubt many packages will bother with Itanium.

So, I'm leaning towards 'win64-x64' and 'win64-i64', with the expectation that the 'i64' variant will be rarely seen in the wild. Alternatively, 'win-x64' and 'win-i64' appear reasonable - it doesn't seem necessary that '64' appear twice in the name.
Depending on the feedback (or even lack of, which I'll take as meaning there are no objections to my proposal), I'll create a patch, solicit review, and assuming no further objections, check the changes in to the trunk. All comments welcome! Cheers, Mark
Mark Hammond wrote:
A good value for 'architecture' isn't clear. Either 'AMD' or 'Itanium' appeals at first glance, but I'm a little concerned that (say) a "casual" user with a new Intel Core Duo processor will not know they should use something labelled as "AMD" (eg, "I explicitly asked for an Intel chip, not an AMD one"). An alternative could be 'x64' or 'i64', but I'm not sure that casual user would be any better off, and with only a single letter distinguishing them, there is scope for confusion. On my final (mutant) hand, it seems that Itanium will be a historical footnote and demand for Itanium 64bit packages will be tiny (I've had a reasonable number of requests for x64 versions of pywin32, but zero for i64), so I doubt many packages will bother with Itanium.
So, I'm leaning towards 'win64-x64' and 'win64-i64', with the expectation that the 'i64' variant will be rarely seen in the wild. Alternatively, 'win-x64' and 'win-i64' appear reasonable - it doesn't seem necessary that '64' appear twice in the name.
I could be mistaken, but I believe the standard abbreviations for these architectures are 'x86_64' (intel/amd chipset) and 'ia64' (itanium) -- see the first two paragraphs of the wikipedia article on Itanium for an example: http://en.wikipedia.org/wiki/Itanium

So how about something like:

win-x86_64
win-ia64

-- Dave
Dave writes:
I could be mistaken, but I believe the standard abbreviations for these architectures are 'x86_64' (intel/amd chipset) and 'ia64' (itanium) -- see the first two paragraphs of the wikipedia article on Itanium for an example: http://en.wikipedia.org/wiki/Itanium
So how about something like: win-x86_64 win-ia64
Good point - but I'm not that keen on 'x86_64' - it seems a little too long, especially as we will not be using 'x86_32', and the extra characters don't seem like they will help resolve confusion (ie, if someone doesn't know what architecture they are using, the extra '86' isn't going to help). http://en.wikipedia.org/wiki/X86-64 notes that 'x64' is a common name, so how does 'win-x64' and 'win-ia64' sound as a compromise? I'm happy to let any other informal "votes" make a final decision though... Cheers, Mark
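If the compromise names were adopted, the remaining step would be a lookup from the architecture token to the platform string. The table below is a sketch under that assumption: ARCH_TO_PLATFORM and windows_platform are hypothetical names, and the tokens 'intel', 'amd64' and 'itanium' are assumed to be what the earlier patch would extract from sys.version. Nothing here is taken from distutils itself.

```python
# Hypothetical mapping from an architecture token to the platform strings
# floated in this thread; 32bit Windows keeps its existing 'win32' name.
ARCH_TO_PLATFORM = {
    "intel": "win32",
    "amd64": "win-x64",
    "itanium": "win-ia64",
}


def windows_platform(arch_token, default="win32"):
    """Map an architecture token to a platform string.

    Unknown tokens fall back to 'default', so an unrecognised build is
    treated the same way as today.
    """
    return ARCH_TO_PLATFORM.get(arch_token.lower(), default)


if __name__ == "__main__":
    for token in ("Intel", "AMD64", "Itanium", "something-else"):
        print(token, "->", windows_platform(token))
```

Keeping the names in one table means a later change to the strings, should the informal vote go another way, touches a single place.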
At 08:12 AM 7/24/2007 +1000, Mark Hammond wrote:
Dave writes:
I could be mistaken, but I believe the standard abbreviations for these architectures are 'x86_64' (intel/amd chipset) and 'ia64' (itanium) -- see the first two paragraphs of the wikipedia article on Itanium for an example: http://en.wikipedia.org/wiki/Itanium
So how about something like: win-x86_64 win-ia64
Good point - but I'm not that keen on 'x86_64' - it seems a little too long, especially as we will not be using 'x86_32', and the extra characters don't seem like they will help resolve confusion (ie, if someone doesn't know what architecture they are using, the extra '86' isn't going to help).
http://en.wikipedia.org/wiki/X86-64 notes that 'x64' is a common name, so how does 'win-x64' and 'win-ia64' sound as a compromise? I'm happy to let any other informal "votes" make a final decision though...
win-x64 and win-ia64 sound fine to me, for whatever that's worth. :)
Phillip J. Eby wrote:
So how about something like: win-x86_64 win-ia64

http://en.wikipedia.org/wiki/X86-64 notes that 'x64' is a common name, so how does 'win-x64' and 'win-ia64' sound as a compromise? I'm happy to let any other informal "votes" make a final decision though...

win-x64 and win-ia64 sound fine to me, for whatever that's worth. :)
Along with win-x32?

(I suggest win32-x86, win64-x86 and win64-ia, but as I don't plan on using more than one of them, I don't particularly care. win32, (or was that win-32?,) win-x64 and win-ia64 seem fine, if that's what gets chosen.)

Later, Blake.
Phillip J. Eby wrote:
So how about something like: win-x86_64 win-ia64

http://en.wikipedia.org/wiki/X86-64 notes that 'x64' is a common name, so how does 'win-x64' and 'win-ia64' sound as a compromise? I'm happy to let any other informal "votes" make a final decision though...

win-x64 and win-ia64 sound fine to me, for whatever that's worth. :)
Along with win-x32?
The plan is to leave 32bit windows alone - it will continue to return 'win32'
(I suggest win32-x86, win64-x86 and win64-ia, but as I don't plan on using more than one of them, I don't particularly care. win32, (or was that win-32?,) win-x64 and win-ia64 seem fine, if that's what gets chosen.)
I'd be inclined to agree if 32bit windows was also up for a change - but I see no good reason to do that, and a number of bad ones. It does mean there will be a slight inconsistency in the names, but I think we can live with that - ie:

pywin32-211.win32-py2.5.msi
pywin32-211.win-x64-py2.5.msi
pywin32-211.win-ia64-py2.5.msi

Would be the names pywin32 uses for the relevant platforms, and although there is an extra dash in the 64bit versions, I think it looks quite reasonable.

Cheers, Mark
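The three filenames above follow one pattern, which can be sketched as a small helper. The function installer_filename is a hypothetical name and the pattern is inferred only from the example filenames in this message; it is not a copy of what bdist_msi does internally.

```python
def installer_filename(name, version, platform, pyversion, ext="msi"):
    """Build '<name>-<version>.<platform>-py<pyversion>.<ext>'.

    The pattern is inferred from the pywin32 examples in this thread.
    """
    return "%s-%s.%s-py%s.%s" % (name, version, platform, pyversion, ext)


if __name__ == "__main__":
    # One distinct installer name per proposed platform string.
    for platform in ("win32", "win-x64", "win-ia64"):
        print(installer_filename("pywin32", "211", platform, "2.5"))
```

With a distinct platform string per architecture, the three installers no longer collide on a single name, which was the original complaint.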
participants (5)
- Blake Winton
- Dave Peterson
- Mark Hammond
- Phillip J. Eby
- Tres Seaver