[issue35348] Problems with handling the file command output in platform.architecture()

STINNER Victor report at bugs.python.org
Mon Dec 17 05:15:35 EST 2018


STINNER Victor <vstinner at redhat.com> added the comment:

> I don't agree. Platform.architecture() is defined to look at a specified binary, not the currently running process. That can lead to inconsistencies like this and is not something you can avoid.

architecture() looks at the running Python executable by default, and documents a special case when executable equals sys.executable:
"(...) then only if the executable points to the Python interpreter. Reasonable defaults are used when the above needs are not met."
https://docs.python.org/dev/library/platform.html#platform.architecture
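
For reference, a minimal sketch of the default call. It assumes only the documented signature, in which the executable argument defaults to sys.executable; the values returned vary by platform and are not shown here:

```python
import platform
import sys

# With no argument, architecture() inspects sys.executable.
bits, linkage = platform.architecture()

# Passing sys.executable explicitly takes the same documented path.
explicit = platform.architecture(sys.executable)

print(bits, linkage, explicit)
```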


> This doesn't necessarily need a new function, platform.architecture could also return something like "32bit,64bit".

As a user, I don't need this information.

architecture() already contains a note:

"""
On Mac OS X (and perhaps other platforms), executable files may be universal files containing multiple architectures.

To get at the “64-bitness” of the current interpreter, it is more reliable to query the sys.maxsize attribute:

is_64bits = sys.maxsize > 2**32
"""
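
The check from that note can be run without any external program. A minimal sketch, where interpreter_bits and is_64bit are hypothetical helper names chosen for illustration:

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size in bytes of a C "void *" for this build.
    return struct.calcsize("P") * 8


def is_64bit():
    """Same test as the documentation note quoted above."""
    return sys.maxsize > 2**32


print(interpreter_bits(), is_64bit(), sys.byteorder)
```

Both helpers describe the current process only; they say nothing about other executable files on disk.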


> But as I mentioned in my previous message I don't know why anyone would want to use this function in the first place. There are better ways to determine information about the current process (struct.calcsize, sys.maxsize, sys.byteorder), and I have never had a need to determine information about executable files that I couldn't get in a better way using other libraries (like macholib and pyelftools)

platform.architecture() has multiple issues:

* It relies on the external program "file". That program is not available on Windows, and it is likely missing in small Linux containers. The function doesn't report an error if the program is missing on a platform where it "should be available".
* Calling an external program can lead to security issues.
* Parsing file output is not reliable, even if PR 11159 should make the parsing more reliable. For example, file output is locale dependent and the -b option is not standard.
* The expected output on a macOS universal binary is unclear.
* The purpose of the function seems to be unclear to most developers...
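
Two of these points (the missing program and the locale-dependent output) can be illustrated with a short sketch. This is not how platform.architecture() is implemented; describe_with_file is a hypothetical helper, and it deliberately avoids the non-standard -b option:

```python
import os
import shutil
import subprocess
import sys


def describe_with_file(path):
    """Return the output of the "file" program for path, or None.

    Hypothetical helper for illustration; not part of the platform module.
    """
    file_cmd = shutil.which("file")
    if file_cmd is None:
        # Make the missing dependency explicit instead of guessing.
        return None
    # Pin the locale so that the output is not translated.
    env = dict(os.environ, LC_ALL="C")
    try:
        proc = subprocess.run(
            [file_cmd, path],
            env=env,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
            errors="replace",
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout.strip()


print(describe_with_file(sys.executable))
```

Even with these precautions, the caller still has to parse free-form text, which is the unreliable part.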

Another solution is to deprecate the function. I agree with Ronald that sys.maxsize is enough for most use cases (get "bits"). For more accurate information, platform.architecture() is wrong and a third-party module is required.

By the way, platform.architecture() is not used in the stdlib, which is a sign that maybe the function is not really helpful. Moreover, sysconfig and distutils.util contain the following code:

            # We can't use "platform.architecture()[0]" because a
            # bootstrap problem. We use a dict to get an error
            # if some suspicious happens.
            bitness = {2147483647:"32bit", 9223372036854775807:"64bit"}
            machine += ".%s" % bitness[sys.maxsize]
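
The same lookup can be tried on its own. A minimal sketch, where bitness_from_maxsize is a hypothetical name and not a stdlib function:

```python
import sys

# Same two keys as in the snippet above, written as powers of two.
BITNESS = {2**31 - 1: "32bit", 2**63 - 1: "64bit"}


def bitness_from_maxsize(maxsize=sys.maxsize):
    """Map a sys.maxsize value to "32bit" or "64bit".

    An unexpected value raises KeyError rather than being guessed.
    """
    return BITNESS[maxsize]


print(bitness_from_maxsize())
```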

Serhiy, Ronald: What do you think of deprecating platform.architecture() instead of trying to fix it?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35348>
_______________________________________

