On Sat, May 1, 2021 at 10:49 AM Gregory Szorc <gregory.szorc@gmail.com> wrote:
The way it works today, if you have an application embedding Python, your sys.argv[0] is (likely) your main executable and sys.executable is probably None or the empty string (per the stdlib docs which say not to set sys.executable if there isn't a path to a known `python` executable).

Unfortunately, since sys.executable is a str, the executable it points to must behave as `python` does. This means that your application embedding and distributing its own Python must provide a `python` or `python`-like standalone executable and use it for sys.executable and this executable must be independent from your main application because the run-time behavior is different. (Yes, you can employ symlink hacks and your executable can sniff argv[0] and dispatch to your app or `python` accordingly. But symlinks aren't reliable on Windows and this still requires multiple files/executables.) **This limitation effectively prevents the existence of single file application binaries that also want to expose a full `python`-like environment, as there's no standard way to advertise a mechanism to invoke `python` that isn't a standalone executable with no arguments.**

minor nit: I wouldn't use the words "must behave" above... We've been using sys.executable = None at work for the past five years.  The issues we run into are predominantly in unit tests that try to launch an interpreter via subprocess of sys.executable, and the bulk of that is in CPython's own test suite (which I vote "doesn't really count").

regardless, not needing to tweak even those would be a convenience and it could open up doors for some more application frameworks that make such environment assumptions and are thus hard to distribute stand-alone.  ex: It'd open the door for multiprocessing spawn mode within stand alone embedded binaries.

While applications embedding Python may not have an explicit `python` executable, they do likely have the capability to instantiate a `python`-like environment at run-time: they have the interpreter after all, they "just" need to provide a mechanism to invoke Py_RunMain() with an interpreter config initialized using the "python" profile.

**I'd like to propose a long-term replacement to sys.executable that enables applications embedding Python to advertise a mechanism for invoking the same executable such that they get a `python` experience.**

The easiest way to do this is to introduce a list[str] variant. Let's call it sys.python_interpreter. Here's how it would work.

Say I've produced myapp.exe, a Windows application. If you run `myapp.exe python --`, the executable behaves like `python`. e.g. `myapp.exe python -- -c 'print("hello, world")'` would be equivalent to `python -c 'print("hello, world")'`. The app would set `sys.python_interpreter = ["myapp.exe", "python", "--"]`. Then Python code wanting to invoke a Python interpreter would do something like `subprocess.run(sys.python_interpreter)` and automatically dispatch through the same executable.
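A minimal sketch of the calling side described above. It assumes the proposed `sys.python_interpreter` attribute, which does not exist in any released Python; since there is no `myapp.exe` to run here, the sketch substitutes `[sys.executable]` as a stand-in for `["myapp.exe", "python", "--"]`.

```python
import subprocess
import sys

# sys.python_interpreter is the *proposed* attribute and is not part of
# Python today. Emulate what an embedding application would set, using the
# running interpreter as a stand-in for ["myapp.exe", "python", "--"].
if not hasattr(sys, "python_interpreter"):
    sys.python_interpreter = [sys.executable]

# Callers append ordinary `python` arguments to the advertised prefix.
result = subprocess.run(
    sys.python_interpreter + ["-c", 'print("hello, world")'],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout, end="")
```

The only change for callers is concatenating onto a list instead of writing `[sys.executable, ...]`.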

yep, that seems reasonable.  unfortunately the command line arguments are a global namespace, but choosing a unique "launch me as a standalone python interpreter" arg when building a standalone python executable app that will never conflict with an application, at build time, is doable.  Nobody's application wants this specific unique per build ---$(uuid) flag in argv[1] right? ;) ...

There's still an API challenge to decide on here: people using sys.executable also expect to pass flags to the python interpreter.  Do we make an API guarantee that the final flag in sys.python_interpreter is always a terminator that separates python flags from application flags (-- or otherwise)?
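To make the terminator question concrete, here is a rough sketch of the application side of the convention, written in Python for illustration only (a real embedding application would do this in its native entry point before deciding whether to call Py_RunMain()). The name `dispatch` and the `PYTHON_MODE_PREFIX` value are hypothetical; the prefix stands for whatever unique argument sequence is chosen at build time.

```python
# Hypothetical marker chosen at build time; it mirrors the tail of
# sys.python_interpreter == ["myapp.exe", "python", "--"].
PYTHON_MODE_PREFIX = ["python", "--"]


def dispatch(argv, prefix=PYTHON_MODE_PREFIX):
    """Classify a command line as interpreter mode or application mode.

    Returns ("python", interpreter_args) when the arguments after argv[0]
    start with the marker, otherwise ("app", app_args).
    """
    rest = list(argv[1:])
    n = len(prefix)
    if rest[:n] == list(prefix):
        # Everything after the terminator belongs to the interpreter,
        # including flags such as -I or -X.
        return "python", rest[n:]
    return "app", rest
```

If the last element of the advertised list is guaranteed to be the terminator, callers can keep appending interpreter flags exactly as they do after sys.executable today.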

For applications not wanting to expose a `python`-like capability, they would simply set sys.python_interpreter to None or [], just like they do with sys.executable today.

Yep.  Though that should be done at stand alone python application build time to avoid any command line of the binary possibly launching as a plain interpreter.  (this isn't security, anyone with access to read the stand alone executable can figure out how to construct a raw interpreter usable in their environment from that)
 
In fact, I imagine Python's initialization would automatically set sys.python_interpreter to [sys.executable] by default and applications would have to opt in to a more advanced PyConfig field to make sys.python_interpreter different. This would make sys.python_interpreter behaviorally backwards compatible, so code bases could use sys.python_interpreter as a modern substitute for sys.executable, if available, without that much risk.
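A sketch of how a code base might adopt the new attribute "if available", under the semantics described above. `interpreter_argv` is a hypothetical helper name, and `sys.python_interpreter` is the proposed attribute, not an existing one.

```python
import sys

_MISSING = object()


def interpreter_argv():
    """Return the argv prefix for a `python`-like process, or None.

    Prefers the proposed sys.python_interpreter; on Pythons without it,
    falls back to the classic sys.executable.
    """
    advertised = getattr(sys, "python_interpreter", _MISSING)
    if advertised is _MISSING:
        # Attribute not present: behave as code does today.
        return [sys.executable] if sys.executable else None
    # Attribute present: None or [] means "no interpreter is exposed".
    return list(advertised) if advertised else None
```

Callers would treat a None result the same way they treat an empty sys.executable today.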

+1

-gps
 

Some applications may want more advanced mechanisms than command line arguments to dispatch off of. For example, maybe you want to key off an environment variable to activate "Python mode."  This scenario is a bit harder to implement, as it would require yet another advertisement on how to invoke `python`. If subprocess had a "builder" interface for iteratively constructing a process invocation, we could expose a stdlib function to return a builder preconfigured to invoke `python`. But since such an interface doesn't exist, there's not as clean a solution for cases that require something more advanced than additional process arguments. Maybe we could make sys.python_interpreter a tuple[list[str], dict[str, str]] where that dict is environment variables to set. Doable. But I'm unconvinced the complexity is warranted, especially since the application has full control over interpreter initialization and can set most of the settings that they'd want to set through environment variables (e.g. PYTHONHOME) as part of initializing the `python`-like environment.
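For comparison, a sketch of what consuming the tuple[list[str], dict[str, str]] variant could look like. The helper `run_python_with_env`, the `EXAMPLE_SPEC` value, and the `MYAPP_PYTHON_MODE` variable are all hypothetical and exist only to illustrate the shape of the idea.

```python
import os
import subprocess
import sys

# Hypothetical advertisement: argv prefix plus environment variables that
# switch the executable into "Python mode". The running interpreter stands
# in for the embedding application's binary.
EXAMPLE_SPEC = ([sys.executable], {"MYAPP_PYTHON_MODE": "1"})


def run_python_with_env(spec, args, **kwargs):
    """Launch a `python`-like process from an (argv_prefix, env) pair."""
    argv_prefix, extra_env = spec
    # Layer the advertised variables over the current environment rather
    # than replacing it wholesale.
    env = dict(os.environ)
    env.update(extra_env)
    return subprocess.run(list(argv_prefix) + list(args), env=env, **kwargs)
```

Every caller now has to merge environments as well as argument lists, which is the extra complexity questioned above.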

Yes, there will be a long tail of applications needing to adapt to the reality that sys.python_interpreter exists and is a list. Checks like `if sys.executable == sys.argv[0]` will need to become more complicated. Maybe we could expose a simple "am I a Python interpreter process" in the stdlib? (The inverse "am I not a Python interpreter executable" question could also benefit from stdlib standardization, as there are unofficial mechanisms like sys.frozen and sys._MEIPASS attempting to answer this question.)
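For reference, the unofficial checks mentioned above usually look something like this sketch. `sys.frozen` is set by freezing tools and `sys._MEIPASS` by PyInstaller; neither is part of the stdlib contract, and both helper names are hypothetical.

```python
import sys


def is_frozen_app():
    """Heuristic: True when a freezing tool has marked this process."""
    # sys.frozen is a convention among freezers, not a stdlib guarantee.
    return bool(getattr(sys, "frozen", False))


def pyinstaller_bundle_dir():
    """Return PyInstaller's bundle directory, or None outside such a bundle."""
    # sys._MEIPASS is specific to PyInstaller.
    return getattr(sys, "_MEIPASS", None)
```

A stdlib-level answer would let code stop probing attributes that only some tools set.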

Anyway, as it stands, sys.executable just doesn't work for applications embedding Python that want to expose a full `python`-like environment from single executable distributions. I think the introduction of a new API to allow applications to "self-dispatch" to a Python interpreter could eventually lead to significant ergonomic wins for embedded Python applications. This would make Python a more attractive target for embedding, which benefits the larger Python ecosystem.

Thoughts?

(I rarely post here. So if this idea is actionable, please inform me of next steps to make it become a reality.)
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/O66N56PB4U6AGICGBSRFD2OWA5JWMFC6/