<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body><div><div style="font-family: Calibri,sans-serif; font-size: 11pt;">As much as I dislike sniping into threads like this, my gut feeling pushes strongly towards defining the Python interface in the PEP and keeping command-line interfaces private.<br><br>I don't have any new evidence, but pickle and binary stdio (not to mention TCP/HTTP for doing things remotely) are reliable cross-platform where CLIs are not, so you're going to have a horrible time locking down something that will work across multiple OS/shell combinations. There are also limits on command-line length that may be hit when passing many long paths (if those end up in there).<br><br>Might be nice to have an in-proc option for builders too, so I can handle the IPC in my own way. Maybe that's not useful, but with a Python interface it's trivial to enable.<br><br>Cheers,<br>Steve<br><br>Top-posted from my Windows Phone</div></div><div dir="ltr"><hr><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">From: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;"><a href="mailto:njs@pobox.com">Nathaniel Smith</a></span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">Sent: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;">11/11/2015 4:18</span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">To: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;"><a href="mailto:robertc@robertcollins.net">Robert Collins</a></span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">Cc: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;"><a href="mailto:distutils-sig@python.org">DistUtils mailing list</a></span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">Subject: 
</span><span style="font-family: Calibri,sans-serif; font-size: 11pt;">Re: [Distutils] command line versus python API for build system abstraction (was Re: build system abstraction PEP)</span><br><br></div>In case it's useful to make this discussion more concrete, here's a<br>sketch of what the pip code for dealing with a build system defined by<br>a Python API might look like:<br><br> https://gist.github.com/njsmith/75818a6debbce9d7ff48<br><br>Obviously there's room to build on this to get much fancier, but<br>AFAICT even this minimal version is already enough to correctly handle<br>all the important stuff -- schema version checking, error reporting,<br>full args/kwargs/return values. (It does assume that we'll only use<br>JSON-serializable data structures for argument and return values, but<br>that seems like a good plan anyway. Pickle would probably be a bad<br>idea because we're crossing between two different Python environments<br>that may have different or incompatible packages/classes available.)<br><br>-n<br><br>On Wed, Nov 11, 2015 at 1:04 AM, Nathaniel Smith &lt;njs@pobox.com&gt; wrote:<br>> On Tue, Nov 10, 2015 at 11:27 PM, Robert Collins<br>> &lt;robertc@robertcollins.net&gt; wrote:<br>>> On 11 November 2015 at 19:49, Nick Coghlan &lt;ncoghlan@gmail.com&gt; wrote:<br>>>> On 11 November 2015 at 16:19, Robert Collins &lt;robertc@robertcollins.net&gt; wrote:<br>>> ...<br>>>>> pip is going to be invoking a CLI *no matter what*. 
That's a hard<br>>>>> requirement unless Python's very fundamental import behaviour changes.<br>>>>> Slapping a Python API on things is lipstick on a pig here IMO: we're<br>>>>> going to have to downgrade any richer interface; and by specifying the<br>>>>> actual LCD as the interface it is then amenable to direct exploration<br>>>>> by users without them having to reverse engineer an undocumented thunk<br>>>>> within pip.<br>>>><br>>>> I'm not opposed to documenting how pip talks to its worker CLI - I<br>>>> just share Nathan's concerns about locking that down in a PEP vs<br>>>> keeping *that* CLI within pip's boundary of responsibilities, and<br>>>> having a documented Python interface used for invoking build systems.<br>>><br>>> I'm also very wary of something that would be an attractive nuisance.<br>>> I've seen nothing suggesting that a Python API would be anything but:<br>>> - it won't be usable [it requires the glue to set up an isolated<br>>> context, which is buried in pip] in the general case<br>><br>> This is exactly as true of a command line API -- in the general case<br>> it also requires the glue to set up an isolated context. People who go<br>> ahead and run 'flit' from their global environment instead of in the<br>> isolated build environment will experience exactly the same problems<br>> as people who go ahead and import 'flit.build_system_api' in their<br>> global environment, so I don't see how one is any more of an<br>> attractive nuisance than the other?<br>><br>> AFAICT the main difference is that "setting up a specified Python<br>> context and then importing something and exploring its API" is<br>> literally what I do all day as a Python developer. Either way you have<br>> to set stuff up, and then once you do, in the Python API case you get<br>> stuff like tab completion, IPython introspection (? and ??), etc. 
for<br>> free.<br>><br>>> - no matter what we do, pip can't benefit from it beyond the<br>>> subprocess interface pip needs, because pip *cannot* import and use<br>>> the build interface<br>><br>> Not sure what you mean by "benefit" here. At best this is an argument<br>> that the two options have similar capabilities, in which case I would<br>> argue that we should choose the one that leads to simpler and thus<br>> more probably bug-free specification language.<br>><br>> But even this isn't really true -- the difference between them is that<br>> either way you have a subprocess API, but with a Python API, the<br>> subprocess interface that pip uses has the option of being improved<br>> incrementally over time -- including, potentially, to take further<br>> advantage of the underlying richness of the Python semantics. Sure,<br>> maybe the first release would just take all exceptions and map them<br>> into some text printed to stderr and a non-zero return code, and<br>> that's all that pip would get. But if someone had an idea for how pip<br>> could do better than this by, I dunno, encoding some structured<br>> metadata about the particular exception that occurred and passing this<br>> back up to pip to do something intelligent with it, they absolutely<br>> could write the code and submit a PR to pip, without having to write a<br>> new PEP.<br>><br>>> tl;dr - I think making the case that the layer we define should be a<br>>> Python protocol rather than a subprocess protocol requires some really<br>>> strong evidence. We're *not* dealing with the same moving parts that<br>>> typical Python stuff requires.<br>><br>> I'm very confused and honestly do not understand what you find<br>> attractive about the subprocess protocol approach. Even your arguments<br>> above aren't really even trying to be arguments that it's good, just<br>> arguments that the Python API approach isn't much better. 
I'm sure<br>> there is some reason you like it, and you might even have said it but<br>> I missed it because I disagreed or something :-). But literally the<br>> only reason I can think of right now for why one would prefer the<br>> subprocess approach is that it lets one remove 50 lines of "worker<br>> process" code from pip and move them into the individual build<br>> backends instead, which I guess is a win if one is focused narrowly on<br>> pip itself. But surely there is more I'm missing?<br>><br>> (And even this lines-of-code argument is actually pretty dubious --<br>> right now your draft PEP is importing-by-reference an entire existing<br>> codebase (!) for shell variable expansion in command lines, which is<br>> code that simply doesn't need to exist in the Python API approach. I'd<br>> be willing to bet that your approach requires more code in pip than<br>> mine :-).)<br>><br>>>> However, I've now realised that we're not constrained even if we start<br>>>> with the CLI interface, as there's still a migration path to a Python<br>>>> API based model:<br>>>><br>>>> Now: documented CLI for invoking build systems<br>>>> Future: documented Python API for invoking build systems, default<br>>>> fallback invokes the documented CLI<br>>><br>>> Or we just issue an updated bootstrap schema, and there's no fallback<br>>> or anything needed.<br>><br>> Oh no! But this totally gives up the most brilliant part of your<br>> original idea! :-)<br>><br>> In my original draft, I had each hook specified separately in the<br>> bootstrap file, e.g. (super schematically):<br>><br>> build-requirements = flit-build-requirements<br>> do-wheel-build = flit-do-wheel-build<br>> do-editable-build = flit-do-editable-build<br>><br>> and you counterproposed that instead there should just be one line like<br>><br>> build-system = flit-build-system<br>><br>> and this is exactly right, because it means that if some new<br>> capability is added to the spec (e.g. 
a new hook -- like<br>> hypothetically imagine if we ended up deferring the equivalent of<br>> egg-info or editable-build-mode to v2), then the new capability just<br>> needs to be implemented in pip and in flit, and then all the projects<br>> that use flit immediately gain superpowers without anyone having to go<br>> around and manually change all the bootstrap files in every project<br>> individually.<br>><br>> But for this to work it's crucial that the pip&lt;-&gt;build-system<br>> interface have some sort of versioning or negotiation beyond the<br>> bootstrap file's schema version.<br>><br>>>> So the CLI documented in the PEP isn't *necessarily* going to be the<br>>>> one used by pip to communicate into the build environment - it may be<br>>>> invoked locally within the build environment.<br>>><br>>> No, it totally will be. Exactly as setup.py is today. That's<br>>> deliberate: The *new* thing we're setting out to enable is abstract<br>>> build systems, not reengineering pip.<br>>><br>>> The future - sure, someone can write a new thing, and the necessary<br>>> capability we're building in to allow future changes will allow a new<br>>> PEP to slot in easily and take on that [non trivial and substantial<br>>> chunk of work]. (For instance, how do you do compiler and build system<br>>> specific options when you have a CLI to talk to pip with)?<br>><br>> I dunno, that seems pretty easy? My original draft just suggested that<br>> the build hook would take a dict of string-valued keys, and then we'd<br>> add some options to pip like "--project-build-option foo=bar" that<br>> would set entries in that dict, and that's pretty much sufficient to<br>> get the job done. To enable backcompat you'd also want to map the old<br>> --install-option and --build-option switches to add entries to some<br>> well-known keys in that dict. 
But none of the details here need to be<br>> specified, because it's up to individual projects/build-systems to<br>> assign meaning to this stuff and individual build-frontends like pip<br>> to provide an interface to it -- at the build-frontend/build-backend<br>> interface layer we just need some way to pass through the blobs.<br>><br>> I admit that this is another case where the Python API approach is<br>> making things trivial though ;-). If you want to pass arbitrary<br>> user-specified data through a command-line API, while avoiding things<br>> like potential namespace collisions between user-defined switches and<br>> standard-defined switches, then you have to do much more work than<br>> just saying "there's another argument that's a dict".<br>><br>> -n<br>><br>> --<br>> Nathaniel J. Smith -- http://vorpus.org<br><br><br><br>-- <br>Nathaniel J. Smith -- http://vorpus.org<br>_______________________________________________<br>Distutils-SIG maillist - Distutils-SIG@python.org<br>https://mail.python.org/mailman/listinfo/distutils-sig<br></body></html>